Examples of how to drop (remove) dataframe rows that contain NaN with pandas:
Create a dataframe with pandas
Let's consider the following dataframe
import pandas as pd
import numpy as np

A = np.random.randint(1, 100, size=(10, 3))
A = A * 1.0  # convert to float so the array can hold NaN
n = 6  # number of NaN values to insert
index = np.random.choice(A.size, n, replace=False)
A.ravel()[index] = np.nan
df = pd.DataFrame(A)
print(df)
returns
      0     1     2
0  60.0  42.0  43.0
1  47.0  87.0  99.0
2  80.0  44.0  48.0
3  48.0   NaN  46.0
4   NaN  90.0   NaN
5  99.0  61.0  63.0
6   NaN  35.0   NaN
7  95.0  56.0  13.0
8  29.0  80.0  52.0
9  83.0   NaN  87.0
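Because the values are drawn at random, each run produces a different dataframe. To follow along with the exact values printed above, one option is to build the dataframe explicitly. The sketch below simply copies the values from that output; the variable name `data` is only illustrative.

```python
import numpy as np
import pandas as pd

nan = np.nan

# Values copied from the printed dataframe above
data = [
    [60.0, 42.0, 43.0],
    [47.0, 87.0, 99.0],
    [80.0, 44.0, 48.0],
    [48.0, nan, 46.0],
    [nan, 90.0, nan],
    [99.0, 61.0, 63.0],
    [nan, 35.0, nan],
    [95.0, 56.0, 13.0],
    [29.0, 80.0, 52.0],
    [83.0, nan, 87.0],
]
df = pd.DataFrame(data)
print(df)
```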
Find rows with NaN
First, to find the indexes of rows with NaN, a solution is to do:
index_with_nan = df.index[df.isnull().any(axis=1)]
print(index_with_nan)
which returns here (pandas 2.0 and later print Index([3, 4, 6, 9], dtype='int64') instead):
Int64Index([3, 4, 6, 9], dtype='int64')
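The same boolean mask can also be used to display the affected rows themselves before deciding whether to drop them. The following is a minimal sketch that rebuilds the dataframe from the values printed earlier; the names `mask` and `rows_with_nan` are only illustrative.

```python
import numpy as np
import pandas as pd

nan = np.nan

# Same values as the dataframe printed earlier
df = pd.DataFrame([
    [60.0, 42.0, 43.0], [47.0, 87.0, 99.0], [80.0, 44.0, 48.0],
    [48.0, nan, 46.0], [nan, 90.0, nan], [99.0, 61.0, 63.0],
    [nan, 35.0, nan], [95.0, 56.0, 13.0], [29.0, 80.0, 52.0],
    [83.0, nan, 87.0],
])

# True for every row that contains at least one NaN
mask = df.isnull().any(axis=1)

# Boolean indexing keeps only the rows where the mask is True
rows_with_nan = df[mask]
print(rows_with_nan)
```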
Find the number of NaN per row
It is also possible to get the number of NaNs per row:
print(df.isnull().sum(axis=1))
returns
0    0
1    0
2    0
3    1
4    2
5    0
6    2
7    0
8    0
9    1
dtype: int64
Drop rows with NaN
To drop rows with NaN:
# axis must be passed by keyword: pandas 2.0 no longer accepts it positionally
df.drop(index_with_nan, axis=0, inplace=True)
print(df)
returns
      0     1     2
0  60.0  42.0  43.0
1  47.0  87.0  99.0
2  80.0  44.0  48.0
5  99.0  61.0  63.0
7  95.0  56.0  13.0
8  29.0  80.0  52.0
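pandas also provides DataFrame.dropna, which should give the same result in a single call without computing the index first. The sketch below rebuilds the dataframe from the values printed earlier; `df_clean` is an illustrative name. Unlike the in-place drop above, dropna returns a new dataframe by default and leaves `df` unchanged.

```python
import numpy as np
import pandas as pd

nan = np.nan

# Same values as the dataframe printed earlier
df = pd.DataFrame([
    [60.0, 42.0, 43.0], [47.0, 87.0, 99.0], [80.0, 44.0, 48.0],
    [48.0, nan, 46.0], [nan, 90.0, nan], [99.0, 61.0, 63.0],
    [nan, 35.0, nan], [95.0, 56.0, 13.0], [29.0, 80.0, 52.0],
    [83.0, nan, 87.0],
])

# Defaults are axis=0 and how='any': drop every row with at least one NaN
df_clean = df.dropna()
print(df_clean)  # rows 3, 4, 6 and 9 are removed
```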
Drop rows with NaN in a given column
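Another example: starting again from the original dataframe (before any rows were dropped), remove only the rows that have a NaN in the column at position 1. First, test that column for NaN: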
print( df.iloc[:,1].isnull() )
gives
0    False
1    False
2    False
3     True
4    False
5    False
6    False
7    False
8    False
9     True
Name: 1, dtype: bool
and then
index_with_nan = df.index[df.iloc[:, 1].isnull()]
# axis must be passed by keyword: pandas 2.0 no longer accepts it positionally
df.drop(index_with_nan, axis=0, inplace=True)
print(df)
returns
      0     1     2
0  60.0  42.0  43.0
1  47.0  87.0  99.0
2  80.0  44.0  48.0
4   NaN  90.0   NaN
5  99.0  61.0  63.0
6   NaN  35.0   NaN
7  95.0  56.0  13.0
8  29.0  80.0  52.0
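The column-specific case can likewise be written with dropna and its subset parameter, which restricts the NaN check to the listed column labels. This is a minimal sketch that assumes the default integer column labels (0, 1, 2), so the column at position 1 also has the label 1; `df_col1` is an illustrative name.

```python
import numpy as np
import pandas as pd

nan = np.nan

# Same values as the dataframe printed earlier
df = pd.DataFrame([
    [60.0, 42.0, 43.0], [47.0, 87.0, 99.0], [80.0, 44.0, 48.0],
    [48.0, nan, 46.0], [nan, 90.0, nan], [99.0, 61.0, 63.0],
    [nan, 35.0, nan], [95.0, 56.0, 13.0], [29.0, 80.0, 52.0],
    [83.0, nan, 87.0],
])

# Only NaN values in the column labelled 1 cause a row to be dropped;
# NaN values in other columns (rows 4 and 6) are kept
df_col1 = df.dropna(subset=[1])
print(df_col1)  # rows 3 and 9 are removed
```

Note that subset takes column labels, not positions, so with named columns the label would be passed instead of the integer.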
