Code Savvy Presents: Data Visualization with Python
In this month’s episode of the Code Savvy Presents podcast, we talk with data analyst and visualization sherpa Kathy Hurchla. Kathy gives us the scoop on how she uses the Python programming language to create helpful visualizations of all sorts of data.
Kathy covers the importance of understanding your data, some helpful Python libraries, and how to use visualizations effectively to communicate the impact of your data, no matter what story it tells.
Watch the full episode to learn more!
Kathy also gave us a TON of resources to share with you:
You can find more about me at https://dadeda.design, the home of Data Design Dimension, and my profile on its About page.
From there, jump to my LinkedIn, Medium, or Twitter accounts, where I post what I write and other goings-on.
Keep an eye on fantasy.co to expand your idea of what data, dataviz, and Python can inform and dig up insights for (design!)
Don’t miss Nightingale (https://nightingaledvs.com/), the journal of the Data Visualization Society!
I learn as much from there as I contribute as an author or editorial team member (probably much more!)
“5 Routes for Going from Zero to Viz in Data Science” is a blog post I recently wrote for the Anaconda and Anaconda Nucleus community blogs (https://www.anaconda.com/blog/zero-to-viz). It is specifically about doing data visualization with Python, but it also gives some high-level context about the visual analytical questions to ask.
https://pyviz.org/ is a community source of information on Python libraries for data visualization, and a great place to see the names of many different libraries and how they are similar and different (https://pyviz.org/overviews/index.html).
Plotly’s community forum is a hub for folks of all levels of experience with Python for dataviz to learn and support each other—I am sometimes active there: https://community.plotly.com/
Learn Plotly and Dash to make analytical data viz charts and web applications with Charming Data’s YouTube channel https://www.youtube.com/c/CharmingData
Search your public library’s website or just ask them if you can access LinkedIn Learning tutorials as a library card holder. Lots of great short courses there.
When it comes to learning Python, go with the most current resources. The language and its libraries are in rapid, active development, and many great time-saving updates are released every year, or even every month for some libraries.
Keep an eye on the date any course, book, or even blog post was published, because things can change quickly.
For books, I like to send folks to choices rather than straight to Amazon. So here are a few titles, authors, and details, followed by some search options. Note that, in contrast to what I said about Python learning resources, an old but good resource is often fine when you’re learning more foundational concepts!
Database Design for Mere Mortals
Michael J. Hernandez
This is the most recent “25th Anniversary Edition” but if you find an older version, that’s probably just as good. I would wager that very little has changed in this relational data space, which is foundational but not necessarily cutting edge!
The Cartoon Guide to Statistics
Larry Gonick & Woollcott Smith (on author’s site: http://www.larrygonick.com/titles/science/the-cartoon-guide-to-statistics/)
Not a textbook, but a good stats read to accompany any other statistics learning you do, like a LinkedIn Learning course, YouTube videos, or classes. A good place to go when you keep hearing a term and don’t really get what people are talking about.
For deeper learning in applicable data/math
Essential Math for Data Science
Take Control of Your Data with Fundamental Linear Algebra, Probability, and Statistics
Thomas Nield (publisher: O’Reilly)
Still, the concepts are focused, and the book includes Python code for applying them.
Can I find it in a library near me?
By ISBN with options:
Social book search results
(with peer reviews and many different versions/e-books/ways to “Get a copy” on not just Amazon)—keep in mind Amazon owns this social media site for readers, but it’s fun and another way to find book/e-book resources.
In addition to Kathy’s golden nuggets of information within the episode, please check out this Data Visualization with Python overview written by Code Savvy intern, Saanvi Malhotra.
What is Data Visualization?
Data visualization is defined as “the practice of translating information into a visual context, such as a map or graph, to make data easier for the human brain to understand and pull insights from.” While it has roots in the 1700s, the age of technology has greatly expanded how widely the technique is used across many different sectors. Today we primarily use it to identify trends and patterns in huge data sets more easily. Machine learning algorithms and advanced predictive analytics have also made it much easier to communicate, in visual form, the trends and analysis drawn from enormous amounts of data to the everyday user.
Why is Data Visualization so important?
There are a great number of reasons why data visualization has become such a prominent technique in today’s computer-driven world. Some of the many benefits of data visualization include:
Making hard-to-comprehend data more memorable for users and the audience
Improving the ability to maintain an audience's interest
Making data more accessible and understandable
Improving the audience’s and companies' insights by helping point out trends that would be harder to see in a non-visual context
Examples of Data Visualization:
Data visualization is a tool used in many sectors, such as healthcare, IT, politics, and finance. In healthcare, professionals often use a choropleth map, which shades each geographic region according to the value of a variable. Its main objective is to help healthcare professionals see how a certain variable changes across specific areas of the country. A good example of such a variable would be the lung cancer mortality rate across the United States.
In finance, experts often use graphs such as candlestick charts to analyze how the financial market has changed over time and how stock prices have moved over certain periods. These techniques help experts recognize trends in the financial world.
Politics is also a big field in which data visualization is commonly used. A good example is the geographical maps that show how each state, city, and county voted in a particular election. These maps reveal trends that would be hard to see if the data were simply in a numerical format.
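Each candle in a candlestick chart summarizes one period with four numbers: the opening, highest, lowest, and closing price. The sketch below shows one way those four numbers could be computed with Pandas. The prices are invented for a hypothetical stock and are only there to make the example runnable.

```python
import pandas as pd

# Hypothetical intraday prices for a made-up stock (illustration only).
prices = pd.Series(
    [100.0, 102.0, 99.0, 101.0, 104.0, 105.0, 102.0, 103.0],
    index=pd.to_datetime([
        "2024-01-02 09:30", "2024-01-02 11:00",
        "2024-01-02 13:00", "2024-01-02 15:30",
        "2024-01-03 09:30", "2024-01-03 11:00",
        "2024-01-03 13:00", "2024-01-03 15:30",
    ]),
)

# Group the prices by calendar day and compute open/high/low/close per day.
daily = prices.resample("D").ohlc()

# A candle is usually drawn as "up" when the close is at or above the open.
daily["direction"] = (daily["close"] >= daily["open"]).map(
    {True: "up", False: "down"}
)

print(daily)
```

A table shaped like this is what a candlestick plotting function, such as Plotly's go.Candlestick, expects as its input.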
Python for Data Visualization
One of the most popular languages in data science today is Python. The language offers many advantages that have drawn data scientists to it. First, it is widely regarded as easy to learn and is therefore popular both with those new to coding and with people switching over from other languages. Python also supports many libraries that make data visualization and data science easier to pursue, including TensorFlow, NumPy, Pandas, and SciPy. Its readability, numerous helpful libraries, and accessibility have earned Python a significant data science ecosystem.
An introduction to the Pandas library for data science:
Among the many libraries that Python offers for data science, today we are going to focus on one of the most popular and easiest to get started with: Pandas. Pandas is an open-source library used in many different fields, such as finance, statistics, and retail. Its popularity across such a diverse array of fields can be attributed to its easy-to-use data structures, high performance, and data analysis tools.
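To give a feel for those data structures, here is a minimal sketch of a Pandas DataFrame and a simple analysis step. The sales records, column names, and numbers are all hypothetical and made up for this example.

```python
import pandas as pd

# Hypothetical sales records, made up purely for illustration.
sales = pd.DataFrame({
    "region": ["North", "South", "North", "South"],
    "units": [10, 4, 6, 8],
    "price": [2.5, 3.0, 2.5, 3.0],
})

# Add a new column computed from existing ones (row by row, no loop needed).
sales["revenue"] = sales["units"] * sales["price"]

# Summarize: total revenue for each region.
revenue_by_region = sales.groupby("region")["revenue"].sum()

print(revenue_by_region)
```

The DataFrame behaves like a table with named columns, and the groupby step is the kind of summary that often comes right before making a chart.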
Pandas also works with Plotly Express, which can build charts directly from a Pandas DataFrame.
Here is an article that would be helpful if you wish to get started with Pandas: https://www.simplilearn.com/tutorials/python-tutorial/python-pandas . If you prefer a video-based format to learn about this library, this popular YouTube tutorial is great for a beginner: https://www.youtube.com/watch?v=vmEHCJofslg&ab_channel=KeithGalli . If you are curious to see how exactly Pandas is used in the real world, here is a great example of YouTuber Keith Galli solving real-world tasks in the field of data science by using Pandas: https://www.youtube.com/watch?v=eMOA1pPVUc4&ab_channel=KeithGalli .
Sources used in this article: