Example of how to create and plot a simple histogram from a dataset with matplotlib and Python:
Plot a simple histogram using matplotlib
Note: see, for example, Histograms vs. Bar Charts to understand the differences between the two kinds of plot.
A simple histogram can be created in matplotlib with the function hist(), for example:
import matplotlib.pyplot as plt
data = [1,2,2,3,3,3,4,4,5]
plt.hist(data)
plt.title('How to plot a simple histogram in matplotlib ?', fontsize=10)
plt.savefig("plot_simple_histogramme_matplotlib_01.png")
plt.show()
Histogram normalisation
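To normalize the histogram, pass the option density=True (older matplotlib versions called this option "normed"; it was deprecated and then removed in matplotlib 3.1). The bar heights are rescaled so that the total area of the bars equals 1:
plt.hist(data, density=True)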
plt.title('How to plot a simple histogram in matplotlib ?', fontsize=10)
plt.savefig("plot_simple_histogramme_matplotlib_02.png")
plt.show()
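To check what the normalisation does, the sketch below computes the same histogram with numpy.histogram (used here instead of plt.hist so that no figure is needed) and sums height times width over all bars. It assumes the default of 10 bins, which is also matplotlib's default unless configured otherwise.

```python
import numpy as np

data = [1, 2, 2, 3, 3, 3, 4, 4, 5]

# density=True rescales the counts so that the total bar area equals 1
heights, edges = np.histogram(data, bins=10, density=True)

# Width of each bin, computed from consecutive edges
widths = np.diff(edges)

# Total area under the normalised histogram (1 up to rounding error)
area = float(np.sum(heights * widths))
print(area)
```

Note that it is the area, not the sum of the bar heights, that is normalised: with narrow bins the individual heights can exceed 1.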
Define the number of classes
By default, matplotlib chooses the number of classes (10 bins unless configured otherwise). To set the number of classes yourself, use the option "bins", for example with bins = 2:
plt.hist(data, bins = 2)
plt.title('How to plot a simple histogram in matplotlib ?', fontsize=10)
plt.savefig("plot_simple_histogramme_matplotlib_03.png")
plt.show()
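When bins is an integer, the range from the smallest to the largest value is split into that many equal-width classes. A minimal sketch of this, using numpy (which computes the bin edges the same way) rather than the plotting call itself:

```python
import numpy as np

data = [1, 2, 2, 3, 3, 3, 4, 4, 5]

# Edges of 2 equal-width bins spanning [min(data), max(data)] = [1, 5]
edges = np.histogram_bin_edges(data, bins=2)

# Number of values falling in each of the 2 bins
counts, _ = np.histogram(data, bins=2)

print(edges)
print(counts)
```

Here the edges are 1, 3 and 5, so the first class [1,3[ holds 3 values and the second class [3,5] holds the remaining 6.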
Define the number of classes and intervals
It is also possible to define both the number of classes and their intervals by passing a list of bin edges to the option bins instead of a single number. For example, bins = [0,2,4,6] gives 3 classes with intervals [0,2[, [2,4[ and [4,6] (every interval excludes its right edge except the last one, which includes it):
plt.hist(data, bins = [0,2,4,6])
plt.title('How to plot a simple histogram in matplotlib ?', fontsize=10)
plt.savefig("plot_simple_histogramme_matplotlib_04.png")
plt.show()
It is instructive to compare with the histogram of another dataset using the same bins:
data = [1,1,2,3,3,4,4,5,5]
plt.hist(data, bins = [0,2,4,6])
plt.title('How to plot a simple histogram in matplotlib ?', fontsize=10)
plt.savefig("plot_simple_histogramme_matplotlib_05.png")
plt.show()
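The difference between the two datasets comes from how values lying exactly on a bin edge are assigned. The sketch below counts both datasets with numpy.histogram, which follows the same edge convention as hist(); the variable names are only illustrative.

```python
import numpy as np

edges = [0, 2, 4, 6]
first = [1, 2, 2, 3, 3, 3, 4, 4, 5]
second = [1, 1, 2, 3, 3, 4, 4, 5, 5]

# Each bin includes its left edge and excludes its right edge,
# except the last bin, which includes both edges.
counts_first, _ = np.histogram(first, bins=edges)
counts_second, _ = np.histogram(second, bins=edges)

# A value of 2 goes to the second bin; a value of 6 (the last edge)
# is still counted in the last bin.
counts_edge, _ = np.histogram([2, 6], bins=edges)

print(counts_first)
print(counts_second)
print(counts_edge)
```

The first dataset gives counts of 1, 5 and 3, while the second gives 2, 3 and 4, even though both contain nine values between 1 and 5.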
Retrieve the histogram parameters
In addition, the function plt.hist() returns the parameters of the histogram, for example:
n, bins, patches = plt.hist(data, bins=[0,2,4,6])
print(n)
print(bins)
print(patches)
prints (in recent matplotlib versions the last line reads <BarContainer object of 3 artists> instead):
[ 2. 3. 4.]
[0 2 4 6]
<a list of 3 Patch objects>
where n holds the class frequencies, bins the edges of the intervals and patches the drawn bars of the histogram (see matplotlib.patches.Patch for more information).
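The returned values can be used to customise the plot. The following is a minimal sketch, not the only way to do it: it writes each class frequency above its bar and recolours the last bar. The Agg backend, the orange colour and the use of the fig/ax interface are choices made for this illustration so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt

data = [1, 1, 2, 3, 3, 4, 4, 5, 5]

fig, ax = plt.subplots()
n, bins, patches = ax.hist(data, bins=[0, 2, 4, 6])

# Write the frequency of each class above the corresponding bar
for count, patch in zip(n, patches):
    x_center = patch.get_x() + patch.get_width() / 2
    ax.text(x_center, count, str(int(count)), ha="center", va="bottom")

# Each patch is a rectangle that can be styled individually
patches[-1].set_facecolor("tab:orange")
```

Calling fig.savefig() or plt.show() afterwards writes or displays the annotated figure as in the earlier examples.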
References
Links | Site |
---|---|
hist() | matplotlib doc |
bar | matplotlib doc |
Histogramme (histogram) | wikipedia |
Diagramme en bâtons (bar chart) | wikipedia |
pylab_examples example code: histogram_demo.py | matplotlib doc |
pylab_examples example code: histogram_demo_extended.py | matplotlib doc |
Matplotlib - label each bin | stackoverflow |