Examples of how to calculate and plot a cumulative distribution function in Python
1 -- Generate random numbers
As an example, let's generate random numbers from a standard normal distribution:
import numpy as np
import matplotlib.pyplot as plt
N = 100000
data = np.random.randn(N)
2 -- Create a histogram with matplotlib
# hx holds the bar heights (densities) and hy the bin edges
hx, hy, _ = plt.hist(data, bins=50, density=True, color="lightblue")
plt.ylim(0.0,max(hx)+0.05)
plt.title('Generate random numbers \n from a standard normal distribution with python')
plt.grid()
plt.savefig("cumulative_density_distribution_01.png", bbox_inches='tight')
#plt.show()
plt.close()
3 -- Option 1: Calculate the cumulative distribution function using the histogram
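# Bin width (the 50 bins created above all have the same width)
dx = hy[1] - hy[0]
# Running sum of density times bin width approximates the CDF at each right bin edge
F1 = np.cumsum(hx)*dx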
plt.plot(hy[1:], F1)
plt.title('How to calculate and plot a cumulative distribution function ?')
plt.savefig("cumulative_density_distribution_02.png", bbox_inches='tight')
plt.close()
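The histogram approach can also be sketched without any plotting, using np.histogram to obtain the heights and bin edges. This is a minimal illustration, not a definitive implementation: the seed, the variable names and the use of np.diff (so that bins of unequal width would also be handled) are assumptions that are not part of the original example.

```python
import numpy as np

# Seeded generator, used here only so that the run is reproducible (assumption)
rng = np.random.default_rng(0)
sample = rng.standard_normal(100_000)

# Heights of a normalized histogram and the corresponding bin edges
density, edges = np.histogram(sample, bins=50, density=True)

# Width of every bin; np.diff also works if the bins are not equally wide
widths = np.diff(edges)

# cdf_hist[i] approximates the CDF at the right edge of bin i, i.e. edges[i + 1]
cdf_hist = np.cumsum(density * widths)

# Linear interpolation between bin edges gives a CDF estimate at any point
cdf_at_zero = np.interp(0.0, edges[1:], cdf_hist)

print(cdf_hist[-1])  # total area under the histogram
print(cdf_at_zero)   # should be close to 0.5 for a standard normal sample
```

Because the histogram is normalized, the last value of the cumulative sum equals the total area under the bars, which gives a quick sanity check on the result.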
4 -- Option 2: Sort the data
X2 = np.sort(data)
# Empirical CDF values 0/N, 1/N, ..., (N-1)/N assigned to the sorted data
F2 = np.arange(N) / float(N)
plt.plot(X2, F2)
plt.title('How to calculate and plot a cumulative distribution function ?')
plt.savefig("cumulative_density_distribution_03.png", bbox_inches='tight')
plt.close()
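The sorting approach can be wrapped in a small helper. The sketch below is only one possible way to do it: the function name ecdf is hypothetical, and it assigns the values 1/N, ..., N/N to the sorted data, a common convention that differs by 1/N from the 0, ..., (N-1)/N values used above.

```python
import numpy as np

def ecdf(values):
    """Return the sorted values and an empirical CDF value for each of them.

    Hypothetical helper for illustration; the i-th smallest value gets i / n.
    """
    x = np.sort(np.asarray(values, dtype=float))
    n = x.size
    f = np.arange(1, n + 1) / n
    return x, f

# Tiny usage example with three observations
x_sorted, f_values = ecdf([3.0, 1.0, 2.0])
print(x_sorted)
print(f_values)
```

With either convention the curve is a step function of the data, so no choice of bins is needed, unlike the histogram approach.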
5 -- Use the cdf function for normally distributed data
If the data is known to follow a normal distribution, the theoretical cumulative distribution function can be evaluated directly with norm.cdf() from scipy.stats:
from scipy.stats import norm
x = np.linspace(-10,10,100)
y = norm.cdf(x)
plt.plot(x, y)
plt.title('How to calculate and plot a cumulative distribution function ?')
plt.savefig("cumulative_density_distribution_04.png", bbox_inches='tight')
plt.close()
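To see how the theoretical curve relates to the data, the empirical CDF obtained by sorting can be compared with norm.cdf evaluated at the same points. The sketch below assumes a sample with mean 2 and standard deviation 3 (arbitrary values chosen for illustration) and passes the estimated mean and standard deviation through the loc and scale arguments of norm.cdf. The seed and variable names are also assumptions.

```python
import numpy as np
from scipy.stats import norm

# Seeded generator, used only for reproducibility (assumption)
rng = np.random.default_rng(0)
n = 100_000
sample = rng.normal(loc=2.0, scale=3.0, size=n)

# Empirical CDF from the sorted sample
xs = np.sort(sample)
empirical = np.arange(1, n + 1) / n

# Theoretical normal CDF with parameters estimated from the sample
mu_hat = sample.mean()
sigma_hat = sample.std()
theoretical = norm.cdf(xs, loc=mu_hat, scale=sigma_hat)

# Largest vertical gap between the two curves
max_diff = np.max(np.abs(empirical - theoretical))
print(max_diff)
```

A small maximum gap suggests the normal model describes the sample well. For data that is not normally distributed, the empirical approaches of the earlier sections remain applicable.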