Makeover
dataviz
chart-challenge
TIL
#30DayChartChallenge Day 3
Note
This post is my contribution to this year’s #30DayChartChallenge. Check out my Day 1 post to learn more.
Makeover
Makeover day has arrived, folks! Time to dust off those old relics from the ancient data catacombs and give them a fresh coat of paint. Today’s lucky candidate? A vintage scatter matrix plot taken from an introductory Data Science course I took many years ago.
The original scatter matrix plot, seen in the “Before” image, combines histograms and scatter plots in a four-panel chart to show the relationship between pairs of numerical variables. I want to spruce this chart up so that the insight it is trying to share comes through more clearly.
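For context, a chart in the style of the “Before” image can be sketched with pandas' `scatter_matrix`. This is only an illustration: the `toy` DataFrame and its values are made up for the example and are not the course data, and the Agg backend is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption) so no display is needed

import pandas as pd
from pandas.plotting import scatter_matrix

# Hypothetical stand-in values; NOT the real drinks data.
toy = pd.DataFrame({
    "beer_servings": [10, 50, 120, 200, 300, 30],
    "spirit_servings": [5, 40, 90, 150, 60, 10],
})

# Two numeric columns give a 2x2 grid of panels:
# histograms on the diagonal, scatter plots off the diagonal.
axes = scatter_matrix(toy, diagonal="hist", figsize=(6, 6))
```

With two numeric columns the result is a 2x2 grid of axes, which matches the four panels described above.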
Code
from lets_plot import *
from lets_plot.bistro.joint import *

LetsPlot.setup_html()

# `drinks` is assumed to be a pandas DataFrame loaded earlier
# (the loading step is not shown in this post).

# Replace continent codes with readable names for the legend
drinks['continent'] = drinks['continent'].replace({
    'AS': 'Asia',
    'EU': 'Europe',
    'AF': 'Africa',
    'NAm': 'North America',
    'SA': 'South America',
    'OC': 'Oceania',
})

(
    # Scatter plot with marginal distributions, coloured by continent
    joint_plot(
        data=drinks,
        x='beer_servings',
        y='spirit_servings',
        color_by='continent',
        reg_line=False,
    )
    + theme_minimal2()
    + labs(
        title='Distribution of average Beer vs Spirit servings across Continents',
        x='Beer Servings',
        y='Spirit Servings',
        caption='#30DayChartChallenge #Day3 Makeover\nData: General Assembly DS Course\nMade by: www.ddanieltan.com',
    )
    + theme(
        legend_position='top',
        plot_caption=element_text(size=12, color='grey'),
        plot_title=element_text(size=18),
    )
    # Lay the legend out in two rows via the scale's guide parameter
    + scale_color_brewer(palette='Dark2', guide=guide_legend(nrow=2))
    + ggsize(width=700, height=600)
)
Improvements made
- Graphics are sharper courtesy of Lets-Plot’s more modern rendering engine
- I like the use of a single panel instead of four for clarity
- The distribution lines in the margins of the chart mirror the relationship shown in the scatter plot: most countries have low beer and spirit servings
- The additional layer of mapping colour to continent adds more insights to the chart
- Proper titles, axis labels and legends make a dramatic difference to readability
TIL
- I learnt the updated syntax for calling pandas’ scatter matrix via its documentation
- I had the chance to explore Lets-Plot’s bistro plots documentation
- I discovered the guides layer for Lets-Plot (see its documentation), although for today’s plot I decided to pass my guide param into the scale layer for brevity
Citation
BibTeX citation:
@online{tan2024,
author = {Tan, Daniel},
title = {Makeover},
date = {2024-04-03},
url = {https://www.ddanieltan.com/posts/30-day-chart-3},
langid = {en}
}
For attribution, please cite this work as:
Tan, Daniel. 2024. “Makeover.” April 3, 2024. https://www.ddanieltan.com/posts/30-day-chart-3.