The role of data in data storytelling

Monday August 29th, 2016

Storytelling is a natural human trait. We are hardwired to create and follow narratives; this is how we understand, memorise and recall complex concepts. Telling stories is so natural that a five-year-old can do it; in fact, they often do.

Yet an (alarmingly) large number of comments and opinions describe at great length how people in technical professions are unable to explain their experiments and findings, or to tell their story. Have we regressed so far that something as natural as storytelling has disappeared from our skillset? Not really.

While people's communication abilities vary, when someone says “my technical staff is unable to explain what they do” it actually means “I don’t understand what they do and it’s their fault for not telling me properly”. The issue is context: most scientists and inventors are excellent communicators when the audience is made up of peers. Non-technical people, on the other hand, expect a narrative that they can follow with the knowledge they already have.

Should a writer adapt to their audience, or should the audience adjust to the writer? The challenge is that there are only a limited number of options for people with different backgrounds to communicate effectively:
– Non-technical people could learn the technical background. While this is unlikely to happen, understanding uncertainty, causality, or Bayesian probabilities should be a prerequisite for anyone aiming to make data-driven decisions.
– The writer could eschew the complexities and focus the story on… the story, picking only some data to play a supporting role while pushing the analytics into the background. This is the preferred communication method of business, politics, news, and snake oil merchants alike.

This latter approach has a problem: we humans are hardwired to understand and be compelled by stories and narratives, but we are pretty rubbish at understanding risk and uncertainty. As a consequence, the narrative completely overrides the data: facts are not allowed to get in the way of a good story. This can be a recipe for disaster; see the illustration below.

[Figure: Clement Fredembach - Storytelling 1]

Above: The same data points support a number of different models and “stories” depending on methodology, intent, and audience. Because reality can differ significantly from those models, proper data-centric storytelling must convey not only the story but also underlying assumptions, hypotheses, and context.
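The point of the caption can be tried out with a few lines of code. What follows is only a minimal sketch, not the method behind the illustration: the observations are invented for the purpose, and the two candidate "stories" (a straight line and an exponential curve, both fitted by ordinary least squares) are assumptions chosen for simplicity. The names `fit_line`, `linear_model`, `exponential_model` and `rmse` are hypothetical helpers.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical observations, invented for illustration only.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.0, 3.4, 3.9, 5.6, 6.4, 9.0]

# Story 1: "steady growth". Assumes y is linear in x.
a_lin, b_lin = fit_line(xs, ys)


def linear_model(x):
    return a_lin + b_lin * x


# Story 2: "accelerating growth". Assumes log(y) is linear in x.
a_log, b_log = fit_line(xs, [math.log(y) for y in ys])


def exponential_model(x):
    return math.exp(a_log + b_log * x)


def rmse(model):
    """Root-mean-square error of a model on the observed points."""
    squared = [(model(x) - y) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))


# Compare the fit on the observed range with a forecast well outside it.
for name, model in [("linear", linear_model), ("exponential", exponential_model)]:
    print(f"{name:12s} rmse={rmse(model):.2f}  forecast at x=12: {model(12):.1f}")
```

On these made-up points, both models stay close to the observations, yet their forecasts at x=12 differ by a wide margin. Which one gets told as "the story" depends on an assumption about the shape of the curve that the data alone cannot settle, which is why the assumption has to be stated alongside the narrative.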

“Big data” and analytics allow us to build more resilient and accurate models; they enable the identification of new features and behaviours. But models are, at best, imperfect representations of reality that can support a number of dissonant narratives.

Dissociating the data from the story, and understanding the implicit assumptions made by models, requires enough technical skill to make an informed decision[1]. Without these skills, decision makers are asking to be lied to: constructing a compelling narrative is easy enough that my five-year-old niece can do it, but that doesn’t mean its contents are accurate or indeed truthful.

If the goal is for complex analytics to deliver accurate insights and tangible, long-term value, then data (and data processing techniques) has to be at the centre of the conversation. Storytelling, like data visualisation, is an essential communication tool. But it is not a substitute for understanding the underlying data or science.



[1] This is similar to informed consent in medicine: procedures can be described, but are hardly ever truly understood by people who are not medical practitioners.

Category: Clement Fredembach

About Clement Fredembach

Clement is a data scientist with Teradata Australia and New Zealand Advance Analytics group. With a background in Color Science, Computational Photography and Computer Vision, Clement has designed and built perceptual statistical experiments and models for the past 10 years. Clement strives to combine his psychometric, perceptual and statistical knowledge to deliver insights, and their story, in a form that non-technical audiences can understand and act on. Prior to joining Teradata, Clement collaborated with several Fortune 500 companies and academic institutions as a researcher, publishing and patenting large portions of his research along the way. Clement holds an MSc in Communication Systems from EPFL (Switzerland) on Image Classification and a PhD from UEA (UK) on Computational Imaging. His interests range from behavioral psychology to graph theory and photography.
