Bar Chart or Histogram?

They look so alike, yet so different. Let’s find out the differences!

Chetna Khanna
5 min read · Apr 26, 2021
Image by Dil on Unsplash

The bar chart and the histogram are two graphs commonly used in data analysis. They seem alike: both use bars to display the data, and both have an x-axis and a y-axis. In fact, they look so similar that people often get confused about which one to use when. 🤔

First, let us understand why we need these graphs. 🙋

Analysts create diagrammatic representations of data and provide analysis based on those diagrams. Statistical analysis can be performed well if the data used in the analysis is presented properly. Diagrammatic representation is one of the best, most attractive, and most widely used ways of presenting data. 🤗 The best part of diagrammatic representation is that it even caters to an audience that is not familiar with the data and its properties. Yeeeeeeeee! 💃

Enough about diagrammatic representations of data! Let’s now find out the key differences between the two most commonly used diagrammatic representations: the bar chart and the histogram. We will also plot these graphs in Python so that we can see what input each of them takes. 😎

Difference you can observe 👀

The most visible difference between a histogram and a bar chart is the spacing between the bars. In a histogram, adjacent bars touch each other, and any gap indicates a range of values that did not occur in the data. In a bar chart, however, all the bars are drawn with gaps in between. The reason is that a histogram presents data using bins, and there is no gap between consecutive bins, whereas the categories of a bar chart are separate entities.

Note: The pie chart is an alternative to the bar chart.

Let us now move on and see more fundamental differences between these graphs. 🏃🏻

Key Differences ☝️

  1. A bar chart is used to present categorical data whereas a histogram is used to present quantitative data with ranges of data grouped into bins.
  2. In a bar chart, the x-axis represents the different categories of a factor variable whereas, in a histogram, the x-axis represents values of a single variable on a numeric scale. In simple words, a bar chart is used to compare different categories of data, whereas a histogram is used to show the frequency of numerical data of a single variable (or simply, distribution of data).
  3. We cannot reorder the bars in a histogram whereas reordering bars in a bar graph is not an issue. In a histogram, the ordering indicates the numerical scale and is of special importance. On the other hand, in a bar graph, we can arrange bars based on frequency (ascending or descending), alphabetical order, etc.
  4. In a bar chart, the bars are of the same width. In a histogram, though we generally see bars of the same width, the width can also vary depending on the size of the class intervals.
  5. In a bar chart, elements are taken as individual entities whereas, in a histogram, elements are grouped and considered as ranges.
  6. The x-axis of a histogram has a low end and a high end, which is why we usually talk about skewness when discussing histograms. In a bar chart, the labels on the x-axis are categorical and thus have no low end or high end.
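
Point 4 is easier to see with numbers. Here is a minimal sketch, assuming some made-up data and made-up class intervals of unequal width (neither comes from a real dataset). With unequal bins, it is usually the area of a bar, not its height, that should reflect the share of the data, and that is what NumPy's `density=True` option computes.

```python
import numpy as np

# Hypothetical data and bin edges, chosen only for illustration
data = np.array([0.5, 1.2, 1.8, 2.5, 3.1, 4.7, 6.0, 9.5])
edges = np.array([0, 1, 2, 5, 10])  # class intervals of width 1, 1, 3 and 5

# Raw frequency per class interval
counts, _ = np.histogram(data, bins=edges)

# Density per class interval: count / (total count * bin width)
density, _ = np.histogram(data, bins=edges, density=True)
widths = np.diff(edges)

print(counts)            # frequency in each class interval
print(density * widths)  # bar areas, i.e. the share of data in each interval
```

The bar areas add up to 1, so a wide bin does not look more important just because it is wide.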

Note: Arranging the bars of a bar graph by height often makes it more informative, and data analysts commonly do so.
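
Reordering the bars by height takes only a couple of lines. The sketch below reuses the example labels and values from the bar chart section of this article; the helper names `pairs`, `sorted_labels`, and `sorted_values` are just illustrative choices.

```python
import matplotlib.pyplot as plt

# Example categories and values (the same ones used in the bar chart section)
labels = ["A", "Y", "C", "H", "B", "L"]
values = [20, 45, 76, 30, 12, 50]

# Pair each label with its value and sort the pairs by value, largest first
pairs = sorted(zip(labels, values), key=lambda pair: pair[1], reverse=True)
sorted_labels = [label for label, _ in pairs]
sorted_values = [value for _, value in pairs]

# Plot the bars in descending order of height
plt.bar(x=sorted_labels, height=sorted_values, color="blue", edgecolor="black")
plt.show()
```

Doing the same thing to a histogram would not make sense, because its x-axis is a numeric scale and the order of the bins carries meaning.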

So, we talked a lot about the theory behind histograms and bar charts and how they are different from each other. I hope next time we will not get confused about which one to plot when. 🏄🏼

Let us now move to the next section of this article. Here, we will together write the code to plot these graphs using Python’s Matplotlib library. The code will be extremely simple. The only motivation for writing this code is to understand how the requirements for making a bar chart differ from those for a histogram. 😀

Importing the required libraries.

import numpy as np
import matplotlib.pyplot as plt

To plot a histogram, we only need a single vector or array of numbers. The bins are either created by the matplotlib library (or any other library) itself or can be explicitly defined. Bins are nothing but consecutive, non-overlapping intervals and are depicted on the x-axis. We then count the values which fall into each of these intervals.
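
To make the idea of bins concrete, here is a small sketch that counts values into intervals by hand and compares the result with `np.histogram`. The values and bin edges are made up for illustration. One detail to be aware of: in NumPy's convention, each bin includes its left edge and excludes its right edge, except the last bin, which includes both.

```python
import numpy as np

# Hypothetical values and bin edges, chosen only for illustration
values = np.array([1, 2, 2, 3, 5, 7, 8, 8, 9])
edges = [0, 3, 6, 9]  # three bins: [0, 3), [3, 6), [6, 9]

# Count by hand: left edge included, right edge excluded,
# except for the last bin, which also includes its right edge
manual_counts = []
for i in range(len(edges) - 1):
    left, right = edges[i], edges[i + 1]
    is_last_bin = i == len(edges) - 2
    if is_last_bin:
        in_bin = (values >= left) & (values <= right)
    else:
        in_bin = (values >= left) & (values < right)
    manual_counts.append(int(in_bin.sum()))

# Let NumPy do the same counting
counts, _ = np.histogram(values, bins=edges)

print(manual_counts)
print(counts.tolist())
```

Both lists should match, which shows that a histogram is nothing more than these per-interval counts drawn as bars.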

Let us look at the code to plot a histogram without explicitly defining the bins.

n = 50000
#generate random numbers
x_values = np.random.randn(n)
#plot the random numbers as histogram
plt.hist(x=x_values, color="red", edgecolor="black")
plt.show()
Image by author

We can see that matplotlib used its default of 10 bins and counted the values falling in each bin. The height of each bar depicts the frequency of data in that bin.
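
If you would rather not count the bars by eye, `plt.hist` also returns the numbers it plotted. The sketch below assumes a fixed random seed (so the run is repeatable) and uses the illustrative names `counts`, `bin_edges`, and `patches` for the three returned values.

```python
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(0)  # fixed seed (an assumption) so the run is repeatable
n = 50000
x_values = np.random.randn(n)

# plt.hist returns the bar heights, the bin edges and the drawn bar objects
counts, bin_edges, patches = plt.hist(x=x_values, color="red", edgecolor="black")

print(len(counts))        # number of bins
print(len(bin_edges))     # one more edge than there are bins
print(int(counts.sum()))  # every value lands in exactly one bin
plt.show()
```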

Let us now explicitly define the number of bins as 20 to see the difference.

n_bins = 20
plt.hist(x=x_values, bins=n_bins, color="green", edgecolor="black")
plt.show()
Image by author

This time, the number of bins is 20 and the bars are quite thin. You can change this parameter as per your requirements and create your histogram. In addition to this, matplotlib has some great options to beautify your graph.
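
The `bins` parameter is not limited to a fixed number. As far as I know, it also accepts an explicit sequence of bin edges or the name of a binning rule such as `"auto"`. The sketch below uses NumPy's `np.histogram_bin_edges` to compare the edges produced by the two styles; the fixed seed is an assumption made only for repeatability.

```python
import numpy as np

np.random.seed(0)  # fixed seed (an assumption) so the run is repeatable
x_values = np.random.randn(50000)

# An integer asks for that many equal-width bins between the min and the max
edges_20 = np.histogram_bin_edges(x_values, bins=20)

# A rule name such as "auto" lets NumPy choose the number of bins from the data
edges_auto = np.histogram_bin_edges(x_values, bins="auto")

print(len(edges_20) - 1)    # number of bins requested explicitly
print(len(edges_auto) - 1)  # number of bins chosen by the "auto" rule
```

Either array of edges can then be passed as `bins` to `plt.hist` to draw the histogram with exactly those bins.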

Let us now move on to the bar chart.

To plot a bar chart, we need the categorical data and the value corresponding to each category. These values determine the heights of the bars, one bar per category.

#x values
labels = ["A","Y","C","H","B","L"]
#value corresponding to each label
values = [20,45,76,30,12,50]
#plot the categorical data as bar chart
plt.bar(x=labels, height=values, color="blue", edgecolor="black")
plt.show()
Image by author
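
In practice, the values for a bar chart often have to be computed from raw records first. Here is a minimal sketch, assuming a made-up list of observations with one category label per record, that counts the categories with `collections.Counter` to get the labels and the bar heights.

```python
from collections import Counter

# Hypothetical raw observations: one category label per record
observations = ["A", "C", "A", "B", "C", "C", "A", "C"]

# Count how often each category occurs
category_counts = Counter(observations)

# Build the two inputs a bar chart needs: categories and their values
labels = sorted(category_counts)  # alphabetical order
values = [category_counts[label] for label in labels]

print(labels)
print(values)
```

These `labels` and `values` can be passed to `plt.bar(x=labels, height=values)` just like the hand-written lists above.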

Now that we have seen the differences between a bar chart and a histogram both diagrammatically and theoretically, I am sure that the next time you have data in hand, you won’t get confused about which one to plot. 😀

Thank you, everyone, for reading this. Do share your valuable feedback or suggestions. Happy reading! 📗 🖌
