Data Visualisation is Evil

Thursday June 11th, 2015

With apologies to Edward Tufte

What do Jeremy Clarkson and Data Visualisation have in common?

Both are very successful and speak directly to the “heart”, bypassing the cognitive system. Both radiate an aura of authority that one has to unravel layer by layer, because their very premise could be accurate or faulty. Unfortunately, like most physical systems, humans follow the path of least resistance, which, coupled with our era of permanent busyness, makes Data Visualisation Evil.

In his short essay “PowerPoint is Evil”, Edward Tufte discusses how PowerPoint has become so ubiquitous and misused that it gives presenters and audiences a false sense of narrative (slide presentations are inherently sequential) and of understanding (as all complexity is reduced to a short series of bullet points). Data Visualisations are no different: they give an intuitive understanding of what may be a complex problem. They can, in fact, be so convincing that rather than complementing the analysis, data visualisation becomes the analysis.

An example of best-practice visualisation is Charles Minard’s classic graph describing Napoleon’s retreat from Russia. It displays complex multi-dimensional information in a digestible manner, while not shying away from the complexity of the data. Most current graphs and visualisations, however, are designed from an aesthetic standpoint. Unfortunately, through carelessness or malice, visualisations can often detract from and override the science and analytics.

To illustrate, see the bar charts below. Both contain the same data, and the left-hand one is the default created by Excel. The story one would tell (or expect) from each of these two graphs is, however, very different. Where the left-hand chart suggests an obvious difference between the performances of the two algorithms, the right-hand chart indicates almost identical performance. This example may seem far-fetched, but charts like these are commonly seen at all levels, together with the corresponding, expected story. When a story and its illustration go hand in hand, it takes a strong skeptic to question the outcome and the validity of the analysis.
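For readers who want to put a number on this effect, here is a minimal sketch. It assumes the difference between the two charts comes from where the value axis starts, and it uses made-up scores rather than the data behind the charts. The helper `drawn_height_ratio` and the variable names are hypothetical, introduced only for illustration.

```python
def drawn_height_ratio(value_a, value_b, axis_min=0.0):
    """Ratio of the drawn bar heights when the value axis starts at axis_min."""
    if axis_min >= min(value_a, value_b):
        raise ValueError("axis_min must lie below both values")
    # A bar is drawn from the axis origin up to its value, so its visible
    # height is the value minus the axis minimum.
    return (value_a - axis_min) / (value_b - axis_min)


# Hypothetical accuracy scores (in %) for two algorithms.
# These are illustrative only, not the data used in this post.
score_a, score_b = 92.0, 90.0

zero_based = drawn_height_ratio(score_a, score_b, axis_min=0.0)
truncated = drawn_height_ratio(score_a, score_b, axis_min=89.0)

print(f"axis from 0:  bar A is {zero_based:.2f}x the height of bar B")
print(f"axis from 89: bar A is {truncated:.2f}x the height of bar B")
```

With these made-up numbers, a two-point gap is drawn as a bar three times taller once the axis starts at 89, while on a zero-based axis the two bars are nearly the same height. The data are identical; only the story changes.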

In a previous blog post, I discussed the scientific method and its application in the data analytics workflow. Within that context, data visualisation is no substitute for testing or validation. It is an additional tool to explain the data/model/theory in an easy-to-understand manner, and as such is one more tool in the storytelling toolbox. Because the human brain is so good at visual pattern matching, visualisation can become an unwelcome shortcut through the scientific method, replacing the testing and the validation steps.

Similarly, modern graphical user interfaces hide the complexity of both the usage and the theory of data science. As a result, analytics effectively becomes a beauty contest where multiple models are scored not on statistical significance, cross-validation and background hypothesis, but rather on how compelling the output graph looks. The downside is that by the time the analysis is implemented and tested in the field, it is too late to go back, and difficult to point out where the model went wrong.

Going back to Minard’s graph, we see one of the most overlooked and underappreciated properties of great scientific illustrations: they must be seen together with the analysis to be meaningful, which intrinsically prevents them from being separated from the science.

Illustrations can exude a false sense of confidence in the analysis, but also a false sense of confidence in the results. Compelling visualisations can be made to fit a narrative by emphasising the points one wants to make. However, visualisations are based on data, and before making a decision one should always look at the data and ensure the answers come from a rigorous application of the scientific method.

The challenges of a visualisation-heavy analytic world are exactly the same as the ones posed by the ubiquity of photographic imagery: the image gets mistaken for the content. Moreover, because of the mass of possible ways to view the data, analytical illustrations, like photographs, have followed a “design” trend, where eye-catching potential and aesthetic aspects are slowly eroding the essence of what data visualisation is: a support to help communicate data and its analysis.

Not convinced? Ask yourself this: when was the last time you, your group or your company went for the insights and recommendations from the worse-looking presentation or graphs because the science was better?

Whether we want it or not, data visualisation does matter (too much), but style over substance has never been a long-term success in the analytics space.

Clément Fredembach is a data scientist with Teradata Australia and New Zealand Advance Analytics group. With a background in Colour Science, Computational Photography and Computer Vision, Clement has designed and built perceptual statistical experiments and models for the past 10 years.

Clement Fredembach

Data Scientist at Teradata
Clement is a data scientist with Teradata Australia and New Zealand Advance Analytics group. With a background in Color Science, Computational Photography and Computer Vision, Clement has designed and built perceptual statistical experiments and models for the past 10 years. Clement strives to combine his psychometric, perceptual and statistical knowledge to deliver insights, and the story behind them, in a form that is understandable and actionable for non-technical audiences. Prior to joining Teradata, Clement collaborated with several Fortune 500 companies and academic institutions as a researcher, publishing and patenting large portions of his research along the way. Clement holds an MSc in Communication Systems from EPFL (Switzerland) on Image Classification and a PhD from UEA (UK) on Computational Imaging. His interests range from behavioral psychology to graph theory and photography.

4 thoughts on “Data Visualisation is Evil”

  1. Max Galka

    At first I questioned whether Excel actually makes that first chart by default, but indeed it does. Truncating the axis of a bar chart is plainly wrong, and Excel should fix that, though point taken.

    I agree that data visualization can be misleading. But so can any form of communication. Is the point that there is a better way of presenting information or that people just need to be more responsible with how they use it?

    1. Roger Fried

      Neither chart is wrong. Visualization is just a tool to tell a story and stories always emphasize one feature over another. If the audience has the context and the storyteller is not trying to deceive then either chart might be perfectly appropriate.

    2. Clement Fredembach (Post author)

      I agree with Roger in that neither chart is intrinsically wrong; they just ‘appear’ different.

      It’s not only the party presenting information, but also the party being presented to, that should share this awareness. Visualisation (and images in general) is very effective but shouldn’t become a substitute for analysis.
