Why is data visualization necessary?

Okay…tell me what is happening to the debt level of the United States government over time.

Is it rising? Falling? Staying the same?

Source: Federal Reserve Bank of St. Louis

You may have been able to give me an answer by looking at the data. But it probably took you a while. Also, the screenshot shows only part of the data, so your answer would have been incomplete.

Furthermore, this was a small, simple dataset. How would you have done it if the dataset had millions of data points in it? What if there were one hundred categories? What if the numbers weren’t as easy to compare?

The approach you took to solve the above problem is impractical for the huge amounts of data that we encounter in the real world.

Okay, now tell me what’s happening to the level of U.S. debt over time.

Ahhhh, much better isn’t it?

The awesome thing about data visualizations is that they can tell you about datasets (collections of data) with hundreds, thousands, millions or more data points.

Software like Excel has become so sophisticated that you can visualize large amounts of data with a few clicks, allowing you to figure things out about your data that were hidden when all you saw was a sea of information.

And sometimes, that visualization is necessary.

You might not feel the need to look at visualizations. Maybe you skim the executive summary of the reports you get, or rely on Excel’s summary functions to get to know a dataset.

Those aren’t enough.

To know your data, you have to see it.


Anscombe’s quartet

To show this, let’s look at the four datasets below.

They have the same summary statistics.

Don’t worry about what the terms for these properties mean. They’re just ways of describing the characteristics of data. You might be tempted to say these datasets are similar because the statistics are similar. But when you graph them…

Oh. Wow.

The data are very different from one another!

I didn’t make these numbers up. This is a famous set of data called Anscombe’s Quartet. Created by statistician Francis Anscombe in 1973, it underscores the need to chart your data in order to properly understand it.
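If you’d like to check the “same summary statistics” claim yourself, here is a minimal sketch in Python. It assumes the quartet’s values as they are commonly reproduced from Anscombe’s 1973 paper, and the names `QUARTET` and `summarize` are my own, made up for illustration. It computes the mean and variance of x and y, plus the correlation between them, for each of the four datasets.

```python
from math import sqrt
from statistics import mean, variance

# Anscombe's quartet, as commonly reproduced from the 1973 paper.
# Datasets I, II and III share the same x values.
X_123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]

QUARTET = {
    "I": (X_123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(xs, ys):
    """Return the usual summary statistics for one x/y dataset."""
    mx, my = mean(xs), mean(ys)
    # Sample covariance, using the same n - 1 denominator as variance().
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    # Pearson correlation: covariance scaled by both standard deviations.
    corr = cov / sqrt(variance(xs) * variance(ys))
    return {
        "mean_x": round(mx, 3),
        "mean_y": round(my, 3),
        "var_x": round(variance(xs), 3),
        "var_y": round(variance(ys), 3),
        "corr": round(corr, 3),
    }


for name, (xs, ys) in QUARTET.items():
    print(name, summarize(xs, ys))
```

Run it and the four lines that print are nearly identical, even though the charts look nothing alike. That is exactly the trap: the summary numbers can’t tell these datasets apart, but your eyes can.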

In your job, if you look only at the summary statistics of a dataset, you might make a decision that doesn’t reflect the data. That decision might not be very good, or worse, it could be harmful.



With visualizations, you see your data. You understand its shape. You see the relationships between the variables. All the information present in the visualization helps you get to know your data on a deeper level than just looking at it on a spreadsheet.

All this is why visualizations are important, and why spending time to understand them is the best way to get to know the data they are made from.