An example of how to calculate the mean squared error (MSE) in Python for a linear regression model:
\begin{equation}
y = \theta_1 x + \theta_0
\end{equation}
Plot the data
Let's generate 1000 noisy data points scattered around the line:
\begin{equation}
y = 3x + 2
\end{equation}

    import matplotlib.pyplot as plt
    import numpy as np

    # 1000 points with x uniform in [0, 4)
    X = 4 * np.random.rand(1000,1)
    # Add a column of ones so that X_b.dot(theta) = theta_0 + theta_1 * x
    X_b = np.c_[np.ones((1000,1)), X]
    # y = 3x + 2 plus standard normal noise
    Y = 2 + 3 * X + np.random.randn(1000,1)

    plt.plot(X,Y,'.')
    plt.xlim(0,4)
    plt.ylim(0,15)
    plt.xlabel(r'x',fontsize=8)
    plt.ylabel(r'y',fontsize=8)
    plt.title('How to calculate the mean squared error in python ?',fontsize=8)
    plt.savefig("mean_squared_error_01.png", bbox_inches='tight')

Linear model
Let's now consider the following linear model:
\begin{equation}
y = \theta_1 x + \theta_0
\end{equation}
with $\theta_0=-1.4$ and $\theta_1=5.0$:

    #----- Let's take one random linear model
    theta = np.array([[-1.4],[5.0]])   # [[theta_0], [theta_1]]

    # Two points are enough to draw the line between x = 0 and x = 4
    X_new = np.array([[0],[4]])
    X_new_b = np.c_[np.ones((2,1)), X_new]

    plt.plot(X_new, X_new_b.dot( theta ), '-')
    plt.xlim(0,4)
    plt.ylim(0,15)
    plt.xlabel(r'x',fontsize=8)
    plt.ylabel(r'y',fontsize=8)
    plt.title('How to calculate the mean squared error in python ?',fontsize=8)
    plt.savefig("mean_squared_error_02.png", bbox_inches='tight')
    plt.close()

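Calculate the mean squared error
The mean squared error of this model on the data can then be calculated in Python:
\begin{equation}
\mathrm{MSE} = \frac{1}{m} \sum_{i=1}^{m}\left(\theta^T \mathbf{x}^{(i)} - y^{(i)}\right)^2
\end{equation}

    # Predictions of the linear model for every data point
    Y_predict = X_b.dot( theta )
    print(Y_predict.shape, X_b.shape, theta.shape)

    # Sum of squared residuals divided by the number of points (m = 1000)
    mse = np.sum( (Y_predict-Y)**2 ) / 1000.0
    print('mse: ', mse)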
Another solution is to use the `mean_squared_error` function from the Python module sklearn:

    from sklearn.metrics import mean_squared_error

    print('mse (sklearn): ', mean_squared_error(Y,Y_predict))
which returns, for example:

    mse: 6.75308540424
    mse (sklearn): 6.75308540424
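The formula can be checked by hand on a tiny data set. The sketch below is an illustration only: the helper name `mse_linear` and the three noise-free points are assumptions made for this example and are not part of the original script. It also shows that a root mean squared error, if that is the quantity wanted, is simply the square root of the MSE.

```python
import numpy as np

def mse_linear(theta, X_b, Y):
    """Mean squared error of the linear model X_b @ theta against Y.

    Hypothetical helper: theta has shape (2, 1), X_b has shape (m, 2)
    with a leading column of ones, Y has shape (m, 1).
    """
    residuals = X_b @ theta - Y
    return float(np.mean(residuals ** 2))

# Three noise-free points on y = 3x + 2 (illustrative data)
x_small = np.array([[0.0], [1.0], [2.0]])
y_small = 3.0 * x_small + 2.0
x_small_b = np.c_[np.ones((3, 1)), x_small]

# Same model as above: theta_0 = -1.4, theta_1 = 5.0
theta_small = np.array([[-1.4], [5.0]])

mse_small = mse_linear(theta_small, x_small_b, y_small)
rmse_small = np.sqrt(mse_small)  # root mean squared error

print('mse: ', mse_small)
print('rmse:', rmse_small)
```

For these three points the residuals are -3.4, -1.4 and 0.6, so the MSE is (11.56 + 1.96 + 0.36) / 3 = 13.88 / 3, about 4.63, and the root mean squared error is about 2.15.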
Calculate the mean squared error for an ensemble of linear models
An example of how to calculate the mean squared error for an ensemble of linear models (grid search over $\theta_0$ and $\theta_1$):

    #----- Calculate the mse using a grid search
    from matplotlib import cm
    from matplotlib.colors import LogNorm

    # 100 x 100 grid: theta_0 varies along columns, theta_1 along rows
    theta_0, theta_1 = np.meshgrid(np.arange(0, 10, 0.1), np.arange(0, 10, 0.1))
    # One model per column, shape (2, 10000)
    theta = np.vstack((theta_0.ravel(), theta_1.ravel()))

    # Predictions for every model, shape (1000, 10000)
    Y_predict = X_b @ theta
    mse = np.sum( (Y_predict-Y)**2, axis=0 ) / 1000.0
    mse = mse.reshape(100,100)   # mse[i, j]: theta_1 = 0.1*i, theta_0 = 0.1*j

    plt.imshow(mse, origin='lower', norm=LogNorm(), extent=[0,10,0,10], cmap=cm.jet)
    plt.title('How to calculate the mean squared error in python ?',fontsize=8)
    plt.xlabel(r'$\theta_0$',fontsize=8)
    plt.ylabel(r'$\theta_1$',fontsize=8)
    plt.savefig("mean_squared_error_03.png", bbox_inches='tight')
    #plt.show()
    plt.close()
One can see that the linear model that minimizes the mean squared error is around $\theta_0=2$ and $\theta_1=3$, the values used to generate the data.
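The position of the minimum can also be extracted numerically instead of being read from the figure. The following is a minimal sketch, not part of the original script: it regenerates the data with a fixed seed (an assumption, for reproducibility), locates the smallest grid value with `np.argmin`, and compares it with the ordinary least-squares solution from `np.linalg.lstsq`, which is a technique added here for comparison. The names `best_grid`, `theta_ls` and `best_ls` are illustrative.

```python
import numpy as np

# Regenerate data like the article's; the seed is an assumption for reproducibility
rng = np.random.default_rng(0)
m = 1000
X = 4 * rng.random((m, 1))
X_b = np.c_[np.ones((m, 1)), X]
Y = 2 + 3 * X + rng.standard_normal((m, 1))

# Same 100 x 100 grid as in the grid search
grid = np.arange(0, 10, 0.1)
theta_0, theta_1 = np.meshgrid(grid, grid)
theta = np.vstack((theta_0.ravel(), theta_1.ravel()))
mse = np.mean((X_b @ theta - Y) ** 2, axis=0).reshape(theta_0.shape)

# Row/column index of the smallest MSE on the grid
i, j = np.unravel_index(np.argmin(mse), mse.shape)
best_grid = (theta_0[i, j], theta_1[i, j])

# Least-squares solution, for comparison with the grid minimum
theta_ls, *_ = np.linalg.lstsq(X_b, Y, rcond=None)
best_ls = theta_ls.ravel()

print('grid minimum  (theta_0, theta_1):', best_grid)
print('least squares (theta_0, theta_1):', best_ls)
```

The grid minimum is limited by the 0.1 step of the grid, so it can only approximate the least-squares solution; both should land close to the generating values 2 and 3.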

We can also plot the variation of the MSE versus $\theta_1$ for a given $\theta_0$ (here column 20 of the grid, i.e. $\theta_0=2$):

    plt.plot(mse[:,20])   # column 20 <-> theta_0 = 2.0
    plt.title('How to calculate the mean squared error in python ?',fontsize=8)
    plt.xlabel(r'$\theta_1$',fontsize=8)
    plt.ylabel(r'mean squared error',fontsize=8)

    # Label the x axis with theta_1 values instead of grid indices
    positions = [i*10 for i in range(10)]
    labels = [i for i in range(10)]
    plt.xticks(positions, labels)

    plt.grid(linestyle='--')
    plt.savefig("mean_squared_error_04.png", bbox_inches='tight')
    #plt.show()
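For a fixed $\theta_0$, the MSE is a quadratic function of $\theta_1$, so a slice through the grid at constant $\theta_0$ is a parabola. As a short sketch, writing the model as $\theta_1 x^{(i)} + \theta_0$ and setting the derivative with respect to $\theta_1$ to zero gives the position of its minimum:
\begin{equation}
\frac{\partial\,\mathrm{MSE}}{\partial \theta_1} = \frac{2}{m}\sum_{i=1}^{m} x^{(i)}\left(\theta_1 x^{(i)} + \theta_0 - y^{(i)}\right) = 0
\quad\Rightarrow\quad
\theta_1 = \frac{\sum_{i=1}^{m} x^{(i)}\left(y^{(i)} - \theta_0\right)}{\sum_{i=1}^{m} \left(x^{(i)}\right)^2}
\end{equation}
With $\theta_0 = 2$ and data generated from $y = 3x + 2$ plus zero-mean noise, this expression gives a value close to 3.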

Source code

    import matplotlib.pyplot as plt
    import numpy as np

    X = 4 * np.random.rand(1000,1)
    X_b = np.c_[np.ones((1000,1)), X]
    Y = 2 + 3 * X + np.random.randn(1000,1)

    plt.plot(X,Y,'.')
    plt.xlim(0,4)
    plt.ylim(0,15)
    plt.xlabel(r'x',fontsize=8)
    plt.ylabel(r'y',fontsize=8)
    plt.title('How to calculate the mean squared error in python ?',fontsize=8)
    plt.savefig("mean_squared_error_01.png", bbox_inches='tight')

    #----- Let's take one random linear model
    theta = np.array([[-1.4],[5.0]])
    X_new = np.array([[0],[4]])
    X_new_b = np.c_[np.ones((2,1)), X_new]

    plt.plot(X_new, X_new_b.dot( theta ), '-')
    plt.xlim(0,4)
    plt.ylim(0,15)
    plt.xlabel(r'x',fontsize=8)
    plt.ylabel(r'y',fontsize=8)
    plt.title('How to calculate the mean squared error in python ?',fontsize=8)
    plt.savefig("mean_squared_error_02.png", bbox_inches='tight')
    plt.close()

    #----- using python
    Y_predict = X_b.dot( theta )
    print(Y_predict.shape, X_b.shape, theta.shape)
    mse = np.sum( (Y_predict-Y)**2 ) / 1000.0
    print('mse: ', mse)

    #----- using sklearn
    from sklearn.metrics import mean_squared_error

    print('mse (sklearn): ', mean_squared_error(Y,Y_predict))

    #----- Calculate the mse using a grid search
    from matplotlib import cm
    from matplotlib.colors import LogNorm

    theta_0, theta_1 = np.meshgrid(np.arange(0, 10, 0.1), np.arange(0, 10, 0.1))
    theta = np.vstack((theta_0.ravel(), theta_1.ravel()))
    Y_predict = X_b @ theta
    mse = np.sum( (Y_predict-Y)**2, axis=0 ) / 1000.0
    mse = mse.reshape(100,100)

    plt.imshow(mse, origin='lower', norm=LogNorm(), extent=[0,10,0,10], cmap=cm.jet)
    plt.title('How to calculate the mean squared error in python ?',fontsize=8)
    plt.xlabel(r'$\theta_0$',fontsize=8)
    plt.ylabel(r'$\theta_1$',fontsize=8)
    plt.savefig("mean_squared_error_03.png", bbox_inches='tight')
    #plt.show()
    plt.close()

    #----- plot theta_1 for a given theta_0
    plt.plot(mse[:,20])
    plt.title('How to calculate the mean squared error in python ?',fontsize=8)
    plt.xlabel(r'$\theta_1$',fontsize=8)
    plt.ylabel(r'mean squared error',fontsize=8)
    positions = [i*10 for i in range(10)]
    labels = [i for i in range(10)]
    plt.xticks(positions, labels)
    plt.grid(linestyle='--')
    plt.savefig("mean_squared_error_04.png", bbox_inches='tight')
    #plt.show()

References
| Link | Site |
|---|---|
| sklearn.metrics.mean_squared_error | scikit-learn.org |
| Régression linéaire | Wikipedia |
| What is the purpose of meshgrid in Python / NumPy? | Stack Overflow |
| Erreur quadratique moyenne | Wikipedia |
| How to merge mesh grid points from two rectangles in python? | Stack Overflow |
