Making the complex simple in data viz

Creating graphics that convey the intended message, are easy to interpret, and are also beautiful can be a daunting task. This talk demonstrates how to use The Grammar of Graphics framework to conceptualize the elements of any graphic in Python.


When building a data visualization in Python from scratch, we quickly run into a series of questions: "What type of plot should I use?" "How should I scale the axes?" "How many dimensions should I show?" "Should I use colors?" In trying to answer these, we can easily get lost in repeated trips to Stack Overflow, documentation pages, example galleries, and tutorials. The result is often code that uses three different libraries, runs to more than 50 lines, and leaves us feeling that our graphic lacks structure.

Part of the problem is the abundance of data visualization packages available in Python and their very different syntactic patterns. Each package has its own way of changing elements such as axis features, labels, annotations, titles, or even gridlines. Even the required structure of the input data can differ. As a result, the analyst is often more absorbed in figuring out how to adjust each visual detail than in thinking of the graphic as a system of logically structured elements and mapping the data to each of them separately.

Enter The Grammar of Graphics, a framework conceptualized by L. Wilkinson in 1999, which helps us better understand the underlying structure of every graphic. The talk will introduce the framework by deconstructing a simple chart into its constituent "grammatical elements": Aesthetics, Algebra, Scales, Statistics, Geometry, and Coordinates. I will discuss each element and explain how it translates directly into the decisions we make when designing a graphic.

Then I will demonstrate in practice how plotting with a grammar can be highly liberating, as it makes otherwise complex plots easy to think about and then to create. Although The Grammar of Graphics (and its sibling, The Layered Grammar of Graphics, Wickham 2010) is most famously implemented in R's ggplot2, the framework is not language-specific: it can be applied with any Python visualization package. To demonstrate, I will build a chart with the Python plotnine package and then explain how the grammar can guide us in building the same chart in matplotlib, one grammatical element at a time.
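
To give a flavor of what "one grammatical element at a time" could look like, here is a minimal sketch in matplotlib. It is not the chart from the talk: the data values, the column names (weight, height, group), and the helper build_chart are all made up for illustration, and the way each grammatical element is matched to a matplotlib call is only one possible reading of the framework.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch also runs headless
import matplotlib.pyplot as plt

# Data: a small made-up table (hypothetical values, for illustration only)
rows = [
    {"weight": 1.0, "height": 2.0, "group": "a"},
    {"weight": 10.0, "height": 3.5, "group": "a"},
    {"weight": 100.0, "height": 5.0, "group": "a"},
    {"weight": 2.0, "height": 1.0, "group": "b"},
    {"weight": 20.0, "height": 2.5, "group": "b"},
    {"weight": 200.0, "height": 4.5, "group": "b"},
]


def build_chart(rows):
    """Hypothetical helper: build a scatter plot one grammatical element at a time."""
    fig, ax = plt.subplots()

    # Algebra: decide which variables are combined (here weight is paired with height).
    # Aesthetics: map variables to visual channels (x, y, and color by group).
    palette = {"a": "tab:blue", "b": "tab:orange"}

    # Geometry: draw the data as points, one scatter layer per group.
    for group, color in palette.items():
        xs = [r["weight"] for r in rows if r["group"] == group]
        ys = [r["height"] for r in rows if r["group"] == group]
        ax.scatter(xs, ys, color=color, label=group)

    # Statistics: a summary computed from the data (overall mean height),
    # drawn as a horizontal reference line.
    mean_height = sum(r["height"] for r in rows) / len(rows)
    ax.axhline(mean_height, color="gray", linestyle="--", label="mean height")

    # Scales: how data values are translated to positions, plus their guides.
    ax.set_xscale("log")
    ax.set_xlabel("weight")
    ax.set_ylabel("height")
    ax.legend(title="group")

    # Coordinates: matplotlib's default Cartesian system; only the y-limits are set.
    ax.set_ylim(0, 6)

    return fig, ax


fig, ax = build_chart(rows)
fig.savefig("grammar_sketch.png")
```

In plotnine, the same decisions would roughly correspond to composing ggplot(...) and aes(...) with layers such as geom_point(), geom_hline(), and scale_x_log10(). The point of the sketch is that the order of decisions stays the same even when the syntax changes.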


Domains:

Data Science, Visualisation

Domain Expertise:

some

Python Skill Level:

basic

Abstract as a tweet:

Creating graphics that convey the intended message, are easy to interpret, and are also beautiful can be a daunting task. Come to this talk to learn how to use The Grammar of Graphics to make any complex graphic simple in Python.