How to drop (remove) dataframe rows that contain NaN with pandas?


Examples of how to drop (remove) dataframe rows that contain NaN with pandas:

Create a dataframe with pandas

Let's consider the following dataframe, built from a 10x3 array of random values in which 6 randomly chosen entries are replaced by NaN:

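import pandas as pd
import numpy as np

# 10x3 array of random integers in [1, 100), converted to float so it can hold NaN
A = np.random.randint(1, 100, size=(10, 3))
A = A * 1.0

# number of entries to replace with NaN
n = 6

# pick n distinct positions in the flattened array
index = np.random.choice(A.size, n, replace=False)

# ravel() returns a view here, so the assignment modifies A itself
A.ravel()[index] = np.nan

df = pd.DataFrame(A)

print(df)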

returns, for example (the values are random, so your output will differ)

      0     1     2
0  60.0  42.0  43.0
1  47.0  87.0  99.0
2  80.0  44.0  48.0
3  48.0   NaN  46.0
4   NaN  90.0   NaN
5  99.0  61.0  63.0
6   NaN  35.0   NaN
7  95.0  56.0  13.0
8  29.0  80.0  52.0
9  83.0   NaN  87.0

Find rows with NaN

First, to find the indexes of rows with NaN, a solution is to do:

index_with_nan = df.index[df.isnull().any(axis=1)]

print(index_with_nan)

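which returns here (pandas 2.0 and later print Index([3, 4, 6, 9], dtype='int64') instead, since Int64Index was removed):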

Int64Index([3, 4, 6, 9], dtype='int64')
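To look at the rows themselves rather than only their indexes, the same boolean mask can be used directly for indexing. The following is a minimal sketch; the dataframe is hard-coded with the values printed above so that the result is reproducible, and the names mask and rows_with_nan are only illustrative:

```python
import numpy as np
import pandas as pd

# Same values as the example dataframe above, hard-coded for reproducibility
df = pd.DataFrame([
    [60.0, 42.0, 43.0],
    [47.0, 87.0, 99.0],
    [80.0, 44.0, 48.0],
    [48.0, np.nan, 46.0],
    [np.nan, 90.0, np.nan],
    [99.0, 61.0, 63.0],
    [np.nan, 35.0, np.nan],
    [95.0, 56.0, 13.0],
    [29.0, 80.0, 52.0],
    [83.0, np.nan, 87.0],
])

# Boolean Series: True for every row that contains at least one NaN
mask = df.isnull().any(axis=1)

# Boolean indexing keeps only the rows where the mask is True
rows_with_nan = df[mask]

print(rows_with_nan)
```

With these values, the selected rows are those of index 3, 4, 6 and 9.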

Find the number of NaN per row

It is also possible to get the number of NaNs per row:

print(df.isnull().sum(axis=1))

returns

0    0
1    0
2    0
3    1
4    2
5    0
6    2
7    0
8    0
9    1
dtype: int64
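The per-row count can also serve as a filter, for example to keep only the rows with fewer than 2 NaN. This is a minimal sketch under the same assumptions (hard-coded copy of the dataframe printed above; the threshold of 2 and the names nan_per_row and df_filtered are chosen only for illustration):

```python
import numpy as np
import pandas as pd

# Same values as the example dataframe above, hard-coded for reproducibility
df = pd.DataFrame([
    [60.0, 42.0, 43.0],
    [47.0, 87.0, 99.0],
    [80.0, 44.0, 48.0],
    [48.0, np.nan, 46.0],
    [np.nan, 90.0, np.nan],
    [99.0, 61.0, 63.0],
    [np.nan, 35.0, np.nan],
    [95.0, 56.0, 13.0],
    [29.0, 80.0, 52.0],
    [83.0, np.nan, 87.0],
])

# Number of NaN in each row
nan_per_row = df.isnull().sum(axis=1)

# Keep the rows with fewer than 2 NaN (rows 4 and 6 are removed)
df_filtered = df[nan_per_row < 2]

print(df_filtered)
```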

Drop rows with NaN

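To drop the rows with NaN, pass their indexes to drop (axis=0 selects rows and must be given as a keyword in pandas 2.0 and later; inplace=True modifies df directly):

df.drop(index_with_nan, axis=0, inplace=True)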

print(df)

returns

      0     1     2
0  60.0  42.0  43.0
1  47.0  87.0  99.0
2  80.0  44.0  48.0
5  99.0  61.0  63.0
7  95.0  56.0  13.0
8  29.0  80.0  52.0
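pandas also provides the dropna method, which removes the rows with NaN in a single call without computing the indexes first. A minimal sketch, again with a hard-coded copy of the dataframe printed above (df_clean is an illustrative name):

```python
import numpy as np
import pandas as pd

# Same values as the example dataframe above, hard-coded for reproducibility
df = pd.DataFrame([
    [60.0, 42.0, 43.0],
    [47.0, 87.0, 99.0],
    [80.0, 44.0, 48.0],
    [48.0, np.nan, 46.0],
    [np.nan, 90.0, np.nan],
    [99.0, 61.0, 63.0],
    [np.nan, 35.0, np.nan],
    [95.0, 56.0, 13.0],
    [29.0, 80.0, 52.0],
    [83.0, np.nan, 87.0],
])

# dropna() drops every row that contains at least one NaN
# and returns a new dataframe; df itself is left unchanged
df_clean = df.dropna()

print(df_clean)
```

Since dropna returns a new dataframe by default, the original df stays available for further examples. The remaining rows keep their original index labels (0, 1, 2, 5, 7, 8).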

Drop rows with NaN in a given column

Another example, starting again from the original dataframe (before any rows were dropped): removing only the rows with NaN in the column of index 1:

print( df.iloc[:,1].isnull() )

gives

0    False
1    False
2    False
3     True
4    False
5    False
6    False
7    False
8    False
9     True
Name: 1, dtype: bool

and then

index_with_nan = df.index[df.iloc[:,1].isnull()]

df.drop(index_with_nan, axis=0, inplace=True)

print(df)

returns

      0     1     2
0  60.0  42.0  43.0
1  47.0  87.0  99.0
2  80.0  44.0  48.0
4   NaN  90.0   NaN
5  99.0  61.0  63.0
6   NaN  35.0   NaN
7  95.0  56.0  13.0
8  29.0  80.0  52.0
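The same result can be obtained with dropna and its subset parameter, which restricts the NaN check to the listed columns. A minimal sketch, assuming the default integer column labels 0, 1, 2 and a hard-coded copy of the original dataframe (df_col1 is an illustrative name):

```python
import numpy as np
import pandas as pd

# Same values as the original example dataframe, hard-coded for reproducibility
df = pd.DataFrame([
    [60.0, 42.0, 43.0],
    [47.0, 87.0, 99.0],
    [80.0, 44.0, 48.0],
    [48.0, np.nan, 46.0],
    [np.nan, 90.0, np.nan],
    [99.0, 61.0, 63.0],
    [np.nan, 35.0, np.nan],
    [95.0, 56.0, 13.0],
    [29.0, 80.0, 52.0],
    [83.0, np.nan, 87.0],
])

# subset takes column labels: only NaN in the column labeled 1 trigger a drop,
# so rows 4 and 6 (NaN in columns 0 and 2 only) are kept
df_col1 = df.dropna(subset=[1])

print(df_col1)
```

Note that subset expects column labels, whereas iloc[:,1] selects by position; the two coincide here only because the columns are labeled 0, 1, 2.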
