The number of non-NaN values in each row of a pandas DataFrame can be counted with the count() method by passing axis=1. Example:
Case study
import pandas as pd
import numpy as np
np.random.seed(42)
data = np.random.uniform(10,80, size=(4,6))
n = 15
index = np.random.choice(data.size, n, replace=False)
data.ravel()[index] = np.nan
df = pd.DataFrame(data,columns=['2023-02-01','2023-02-02','2023-02-03','2023-02-04','2023-02-05','2023-02-06'])
print(df)
Output
   2023-02-01  2023-02-02  2023-02-03  2023-02-04  2023-02-05  2023-02-06
0         NaN         NaN   61.239576         NaN   20.921305         NaN
1   14.065853         NaN   52.078051         NaN         NaN    77.89369
2         NaN         NaN   22.727748         NaN         NaN    46.73295
3   40.236151    30.38604         NaN         NaN         NaN         NaN
Count the number of non-NaN values in every row
count() is a DataFrame method, so the DataFrame is not passed in as an argument. Call it on the DataFrame with axis=1 and it returns the number of non-NaN values in each row (the default, axis=0, counts per column instead):
res = df.count(axis=1)
print(res)
Output
0 2
1 3
2 2
3 2
dtype: int64
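As a cross-check, the same per-row counts can be obtained by summing a boolean mask of non-missing cells. The following is a minimal sketch on a small hand-built DataFrame; the `toy` frame and its column names are made up for illustration and are not part of the case study above:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, chosen so the counts are easy to verify by eye.
toy = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [np.nan, np.nan, 6.0],
    "c": [7.0, 8.0, 9.0],
})

# count(axis=1) counts the non-NaN cells in each row.
row_counts = toy.count(axis=1)

# Alternative formulation: mark non-missing cells as True, then sum per row.
row_counts_alt = toy.notna().sum(axis=1)

print(row_counts.tolist())      # [2, 1, 3]
print(row_counts_alt.tolist())  # [2, 1, 3]
```

Both approaches should agree; count(axis=1) is simply the shorter spelling.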
The count() method returns a pandas Series:
type(res)
Output
pandas.core.series.Series
Get the values:
res.values
Output
array([2, 3, 2, 2])
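Besides the .values attribute, a Series also offers to_numpy() and tolist(); the pandas documentation recommends to_numpy() over .values in new code. A minimal sketch, again on a made-up `toy` frame rather than the case-study data:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data (not the DataFrame from the case study).
toy = pd.DataFrame({
    "a": [1.0, np.nan],
    "b": [np.nan, np.nan],
    "c": [5.0, 6.0],
})

counts = toy.count(axis=1)

# to_numpy() returns the counts as a NumPy array.
as_array = counts.to_numpy()

# tolist() returns the counts as a plain Python list.
as_list = counts.tolist()

print(as_array)  # [2 1]
print(as_list)   # [2, 1]
```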
Count only for specific columns
To restrict the count to specific columns, select those columns first and then call count(axis=1) on the selection. For example, to count only the columns '2023-02-01' and '2023-02-02', you could use the following code:
df[['2023-02-01','2023-02-02']].count(axis=1)
Output
0 0
1 1
2 0
3 2
dtype: int64
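When the columns of interest are contiguous, a label slice with .loc is another way to make the same selection. The sketch below assumes a small made-up `toy` frame with date-like column names; note that label slices in .loc include both endpoints:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with date-like column labels.
toy = pd.DataFrame({
    "2023-02-01": [np.nan, 14.0, np.nan],
    "2023-02-02": [np.nan, np.nan, 30.0],
    "2023-02-03": [61.0, 52.0, np.nan],
})

# Explicit list of column names.
cols = ["2023-02-01", "2023-02-02"]
subset_counts = toy[cols].count(axis=1)

# Label slice over the same columns; both endpoints are included.
range_counts = toy.loc[:, "2023-02-01":"2023-02-02"].count(axis=1)

print(subset_counts.tolist())  # [0, 1, 1]
print(range_counts.tolist())   # [0, 1, 1]
```

The list form is the more flexible of the two, since it also works for columns that are not adjacent.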
References
Links | Site |
---|---|
pandas.DataFrame.count | pandas.pydata.org |
How to count the number of NaN in a pandas dataframe column ? | moonbooks.org |
How to randomly insert NaN in a matrix with numpy in python ? | moonbooks.org |