# How to calculate the mean of a dataframe column with pandas in Python?

Published: June 18, 2020

Examples of how to calculate the mean of a dataframe column with pandas in Python:

### Create a dataframe

Let's consider the following dataframe:

```
import pandas as pd

data = {'Name': ['Ben', 'Anna', 'Zoe', 'Tom', 'John', 'Steve'],
        'Age': [20, 27, 43, 30, 12, 21]}

df = pd.DataFrame(data)

print(df)
```

returns

```
    Name  Age
0    Ben   20
1   Anna   27
2    Zoe   43
3    Tom   30
4   John   12
5  Steve   21
```

### Calculate the mean

To calculate the mean of the 'Age' column defined above, one solution is to use mean():

```
df['Age'].mean()
```

returns

```
25.5
```
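As a quick sanity check, the mean is simply the sum of the values divided by how many there are. The sketch below rebuilds the same dataframe and compares both approaches; `manual_mean` is a name introduced here for illustration only.

```python
import pandas as pd

data = {'Name': ['Ben', 'Anna', 'Zoe', 'Tom', 'John', 'Steve'],
        'Age': [20, 27, 43, 30, 12, 21]}
df = pd.DataFrame(data)

# Sum of the column divided by the number of non-missing values
manual_mean = df['Age'].sum() / df['Age'].count()

print(manual_mean)        # 25.5  (153 / 6)
print(df['Age'].mean())   # 25.5
```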

### Another example with a NaN value in the column

```
import pandas as pd
import numpy as np

data = {'Name': ['Ben', 'Anna', 'Zoe', 'Tom', 'John', 'Steve', 'Bob'],
        'Age': [20, 27, 43, 30, 12, 21, np.nan]}

df = pd.DataFrame(data)

print(df)
#     Name   Age
# 0    Ben  20.0
# 1   Anna  27.0
# 2    Zoe  43.0
# 3    Tom  30.0
# 4   John  12.0
# 5  Steve  21.0
# 6    Bob   NaN

df['Age'].mean()
```

returns

```
25.5
```
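The result is the same as before because mean() ignores missing values by default (`skipna=True`). If you would rather have a missing value propagate to the result, you can pass `skipna=False`. A minimal sketch, with variable names chosen here for illustration:

```python
import numpy as np
import pandas as pd

ages = pd.Series([20, 27, 43, 30, 12, 21, np.nan], name='Age')

mean_skipping_nan = ages.mean()            # NaN ignored: 153 / 6
mean_with_nan = ages.mean(skipna=False)    # NaN propagates to the result

print(mean_skipping_nan)   # 25.5
print(mean_with_nan)       # nan
print(ages.count())        # 6 (count() only counts non-missing values)
```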

### Example with normally distributed data

Generate normally distributed data (mean=27.0; std=2.0):

```
import numpy as np
import pandas as pd

mu = 27.0
sigma = 2.0

# Draw 100000 samples from a normal distribution N(mu, sigma^2)
data = np.random.randn(100000) * sigma + mu

df = pd.DataFrame(data, columns=['age'])

print(df)
# Example output (values differ from run to run):
#              age
# 0      31.238531
# 1      28.685002
# 2      27.811728
# 3      25.102273
# 4      23.525331
# ...          ...
# 99995  25.406317
# 99996  25.248491
# 99997  25.555941
# 99998  27.037278
# 99999  27.461417
```

Calculate the mean:

```
df['age'].mean()
```

returns

```
26.998999150576736
```
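The sample mean is close to mu but not exactly equal to it. For n independent draws, the standard error of the mean is sigma / sqrt(n), which is about 0.0063 here, so a gap of about 0.001 is well within what you would expect. The sketch below makes this check explicit; the fixed seed and the use of `np.random.default_rng` are choices made here for reproducibility and are not part of the example above.

```python
import numpy as np
import pandas as pd

mu, sigma, n = 27.0, 2.0, 100000

rng = np.random.default_rng(42)  # fixed seed, chosen arbitrarily
df = pd.DataFrame(rng.normal(loc=mu, scale=sigma, size=n), columns=['age'])

sample_mean = df['age'].mean()
standard_error = sigma / np.sqrt(n)   # about 0.0063

print(sample_mean)
# Distance from mu in units of standard error (usually below about 3)
print(abs(sample_mean - mu) / standard_error)
```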

It can be useful to visualize the distribution of the data:

```
import matplotlib.pyplot as plt

df['age'].hist()

plt.title("How to calculate a column mean with pandas ?")

plt.savefig("pandas_column_mean.png", bbox_inches='tight')
```

Note: if the data are censored, see "How to estimate the mean with a truncated dataset using python for data generated from a normal distribution ?"
