Working with data
The steps I think should be followed when working with data:
1 – find out as much as you can about the data, from its original source
2 – collect and clean the data
3 – explore the data, get to know it
4 – find some stories
5 – tell some stories
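For the first three steps, a rough sketch of what they might look like in code — the dataset, column names and cleaning rules here are all hypothetical, standing in for whatever the real source provides:

```python
import csv
import io
import statistics

# Hypothetical raw data standing in for a downloaded CSV (step 1 would
# mean reading the source's own documentation before touching it).
raw = """school,score
A,62
B,
C,58
D,975
E,64
"""

# Step 2: collect and clean - drop blank values and implausible outliers.
rows = [r for r in csv.DictReader(io.StringIO(raw)) if r["score"]]
scores = [int(r["score"]) for r in rows]
scores = [s for s in scores if 0 <= s <= 100]  # crude validity filter

# Step 3: explore - simple summaries to get to know the data.
print("n =", len(scores))
print("mean =", statistics.mean(scores))
print("spread =", max(scores) - min(scores))
```

Steps 4 and 5 — finding and telling stories — are judgement calls rather than code; the summaries only suggest where to look.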
As for the visual? Don’t think of it as a single visual. There’ll be visuals used in exploring, different and better-suited ones for finding stories, and simpler ones again for telling.
Don’t be tempted to throw all the data at the audience just to prove the process you’ve been through. It could even be that what you’ve found out is in fact best followed up in words, pictures or video.
Thinking as I write, I suppose the steps above are in fact the steps of data journalism. The process may result in a data visualisation, but it doesn’t have to.
I was finally prompted to write the above by related talks I heard at The Design of Understanding last week. You’ll find more discussion of them below if you’re interested.
In this world of all things data, Michael Blastland is a guiding voice worth listening to. When creating data visualisations, people often get so caught up in the process of creating the visual that they forget to spend time finding out about the data upon which it is based. Seeing data pinned down by a visualisation has the effect of making the data appear concrete. But if you haven’t bothered to find out about the provenance of that data, then in reality the visualisation is likely to be misleading at best, wrong at worst.
The news panel raised the question of which part of the team should provide the interpretation of the data, and so, by implication, also tackle the validity problem. The current scenario is that, in the face of deadlines and a shortage of data-handling expertise, everyone tends to bury their head in the sand on this.
By way of example, BERG’s Schooloscope project was put in the spotlight. While it’s an unrivalled success in terms of making a dry, dense table of data digestible and friendly, Blastland pointed out that the margins of uncertainty for a large chunk of the data are so wide that they invalidate it. But no one offered any solutions for how to visualise uncertainty. I feel a conference coming on… The Design of Uncertainty?
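For what it’s worth, one simple starting point is to carry an interval alongside every point estimate rather than printing the bare number — a visual would draw it as an error bar. A sketch with made-up figures (Schooloscope’s real problem was wide official margins rather than sample size, but the principle is the same):

```python
import math
import statistics

# Made-up samples of some school measure - purely illustrative.
samples = [54, 61, 58, 70, 49, 66, 63, 57]

mean = statistics.mean(samples)
# Approximate 95% interval: mean +/- 1.96 standard errors.
se = statistics.stdev(samples) / math.sqrt(len(samples))
low, high = mean - 1.96 * se, mean + 1.96 * se

# Never report the mean without its interval.
print(f"{mean:.1f} (95% CI {low:.1f} to {high:.1f})")
```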
Accuracy of the data aside, Jack Schulze of BERG offered an analogy for working with data: treat it as a material – only by knowing its properties will you know how best to work with it. This is something I strongly agree with, as per the top of this post.