Charts that Show How Much to Trust Them

Graphs and charts often mislead by obscuring the unreliability of their source data. But even if a graph-maker wants to do better, it can be hard to present such information intelligibly, without long or technical sidebars. Here is one approach for visually displaying both the primary data, and their reliability, in one graph.


Above is a simple bar graph. Let’s imagine the data are missile tests conducted by the military: different guidance systems are installed and fired, and the likelihood of their hitting a target is recorded. B appears the clear winner, while A and C seem to be complete duds.

We might ask how reliable these data are. Various statistics attempt to measure that — from standard deviation to p-values — but a very basic figure remains invaluable: the sample size in each category; that is, the number of data points recorded to create aggregate scores (hence the n= notation).
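As a rough illustration of where an n-value comes from, the sketch below aggregates a small, invented log of test outcomes into a hit rate and a sample size per guidance system. The records and the `summarize` helper are hypothetical and exist only for this example; they are not the data behind the graphs in this article.

```python
from collections import defaultdict

# Hypothetical raw test log: (guidance system, did it hit?) pairs.
# These records are invented for illustration only.
tests = [
    ("A", False), ("A", False),
    ("B", True),
    ("D", True), ("D", False), ("D", True), ("D", False),
]


def summarize(records):
    """Return {category: (hit_rate, n)} from (category, hit) records."""
    counts = defaultdict(lambda: [0, 0])  # category -> [hits, n]
    for category, hit in records:
        counts[category][0] += int(hit)
        counts[category][1] += 1
    return {c: (hits / n, n) for c, (hits, n) in counts.items()}


summary = summarize(tests)
for category, (rate, n) in sorted(summary.items()):
    print(f"{category}: {rate:.0%} hit rate (n={n})")
```

The aggregate score alone hides the fact that one category rests on a single record, which is why the n-value is worth carrying along with it.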


You can see from the second graph that we should seriously doubt some of these numbers: half the guidance systems have not been tested very thoroughly; and our outstanding outlier, B, has only had a single test!

Typical approaches have many drawbacks

How do we best inform a reader about the reliability of the test results?

  • We cannot present the reader with raw n-values: most readers will not know what they mean, and will ignore them. Graphing them separately does not fundamentally resolve that problem, and could add confusion by looking like a second set of tests.
  • Statisticians have developed visual apparatus like error bars, but these are far too technical. (And my experience has shown that even scientists believe statistically insignificant results quite readily, so these are no panacea.)
  • A very small data set could be shown completely, in a dot or scatter plot, but large datasets become hopelessly messy (and methods to reduce overplotting have their own drawbacks).

An alternative method

What we need is a visual cue for reliability that is integrated into other graphs, with a form that implies its meaning: that is, there is a visual metaphor at work, so that unreliable results appear less reliable.

In the plot below, we have a set of “fuel gauges” where the size of a gauge indicates how seriously the reading should be taken, and the fill mark indicates its value. I think the dubious nature of B becomes clear, even as its value remains obviously high, while D and E appear equally reliable despite E’s higher value.


What I’ve done is basically combine the two previous graphs. Each category gets a rectangular box and a rectangular fill.

  • The top of the fill is mapped to a y-scale value, like on the regular chart.
  • The size of each box corresponds to sample size. The scaling of the boxes requires some finesse: in this case, I’ve simply made the greatest n-value equivalent to half the graph height (which guarantees it will fit), and scaled the others proportionately. You could also set a scale according to a significance test, or some other standard.
  • The vertical position of each box is adjusted so that the fill value is also reflected as a percentage of the box. In effect, each box has its own scale, but all the box scales are aligned.
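The three rules above can be sketched as a small layout function. This is one reading of the description, not a definitive implementation: the `Gauge` class, the `gauge_layout` function, and its parameters are names made up for this sketch, values are assumed to be fractions between 0 and 1, and the sample inputs are invented.

```python
from dataclasses import dataclass


@dataclass
class Gauge:
    bottom: float    # y-coordinate of the bottom of the box
    height: float    # box height, proportional to sample size
    fill_top: float  # y-coordinate of the fill mark (the plotted value)


def gauge_layout(values, sizes, chart_height=1.0, max_box_fraction=0.5):
    """Compute box and fill geometry for each category.

    values: fractions in [0, 1]; sizes: sample size (n) per category.
    The largest n gets a box of max_box_fraction * chart_height.
    """
    n_max = max(sizes)
    gauges = []
    for value, n in zip(values, sizes):
        # Box size scales with sample size relative to the largest sample.
        height = (n / n_max) * max_box_fraction * chart_height
        # The fill mark sits at the value on the shared y-scale.
        fill_top = value * chart_height
        # Shift the box so the fill also covers `value` of the box itself.
        bottom = fill_top - value * height
        gauges.append(Gauge(bottom, height, fill_top))
    return gauges


# Invented example: an empty gauge, a tiny full gauge, and a large half-full one.
gauges = gauge_layout(values=[0.0, 1.0, 0.5], sizes=[10, 1, 20])
for g in gauges:
    print(g)
```

The resulting rectangles can be handed to any plotting library: draw each box as an outline from `bottom` to `bottom + height`, and the fill as a solid rectangle from `bottom` to `fill_top`.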

Low values can never go below the chart, since their empty boxes extend above them; and vice versa for very high values, where the full gauge sits below the fill mark. Below you can see a range of values, from 0% to 100%, with equal sample sizes.
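The claim that every gauge stays inside the chart can be checked with a little algebra. The symbols here are assumptions introduced for this note: H is the chart height, h is the box height (with 0 < h ≤ H), and v is the plotted value as a fraction between 0 and 1.

```latex
\begin{align*}
y &= vH && \text{(top of the fill)} \\
b &= y - vh = v(H - h) \ge 0 && \text{(bottom of the box)} \\
t &= b + h = vH + (1 - v)h \le vH + (1 - v)H = H && \text{(top of the box)}
\end{align*}
```

At v = 0 the box rests on the chart floor and extends upward; at v = 1 its top meets the chart ceiling and the whole box hangs below the fill mark.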


Is redundancy always bad?

Some visualization purists will dislike the redundancy shown here, with percentages reflected in two ways. But this redundancy allows the reader to make both direct comparisons between values, and direct comparisons between reliability scores. And hopefully this can be done intuitively.

There are good reasons to generally strive for economy, and avoid chart-junk. But I think redundancy should be less of a sin when presenting to a lay audience: repetition sometimes has merits for learning, and if it’s done to achieve a different goal, it’s worth the extra ink. In this case, we gain a visual metaphor for reliability that may require less explanation, and be more relatable to non-specialists.

The next time you need a graph intended for a general audience, try to encode reliability data as part of it, and if appropriate, try out this “fuel-gauge” method.