How to loop through the first n rows of a pandas DataFrame in Python?

Published: November 29, 2022

Updated: February 20, 2023

Tags: Python; Pandas; Dataframe;


A common task when working with data in Python is iterating over the first n rows of a pandas DataFrame. There are several ways to do this, depending on whether you only need to select those rows or also need to process them one at a time.

Synthetic data

To start, let's generate a DataFrame using synthetic data:

import pandas as pd
import numpy as np

data = np.arange(1,31)
data = data.reshape(10,3)

df = pd.DataFrame(data, columns=['A','B','C'])

print(df)

The code displayed above will generate:

    A   B   C
0   1   2   3
1   4   5   6
2   7   8   9
3  10  11  12
4  13  14  15
5  16  17  18
6  19  20  21
7  22  23  24
8  25  26  27
9  28  29  30

Select first n rows

Using head()

A first solution is to use the pandas head() method, which returns the first n rows as a new DataFrame:

n = 4

df.head(n)

This displays the first 4 rows:

    A   B   C
0   1   2   3
1   4   5   6
2   7   8   9
3  10  11  12
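
Two edge cases of head() are worth knowing before looping over its result. The sketch below rebuilds the same synthetic DataFrame; the variable names too_many and all_but_last_two are only illustrative:

```python
import numpy as np
import pandas as pd

# Same synthetic DataFrame as above: 10 rows, columns A, B, C
df = pd.DataFrame(np.arange(1, 31).reshape(10, 3), columns=['A', 'B', 'C'])

# Asking for more rows than exist returns the whole DataFrame, no error
too_many = df.head(50)
print(len(too_many))  # 10

# A negative n returns every row except the last |n| rows
all_but_last_two = df.head(-2)
print(len(all_but_last_two))  # 8
```

So a loop over df.head(n) never fails when n is larger than the DataFrame; it just visits fewer rows.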

Using iloc

Another solution is to use the iloc indexer, which selects rows by integer position (note the square brackets: iloc is not a method):

df.iloc[:n,:]

The code displayed above will generate:

    A   B   C
0   1   2   3
1   4   5   6
2   7   8   9
3  10  11  12
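
It is easy to confuse iloc with loc here. As a minimal sketch on the same synthetic data, the following compares the two; the relabeled DataFrame and its index values 100 to 109 are made up for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(1, 31).reshape(10, 3), columns=['A', 'B', 'C'])
n = 4

by_position = df.iloc[:n]  # positions 0..n-1, end point excluded
by_label = df.loc[:n]      # labels 0..n, end point included

print(len(by_position))  # 4
print(len(by_label))     # 5

# With a non-default index, iloc still returns the first n rows
relabeled = df.copy()
relabeled.index = range(100, 110)
print(relabeled.iloc[:n].index.tolist())  # [100, 101, 102, 103]
```

Because iloc ignores the index labels, it is the safer choice when "first n rows" means the first n rows in their current order.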

Loop through the initial n rows using iterrows()

With iterrows(), we can loop through the first n rows, getting each row as an (index, Series) pair. This is convenient for a handful of rows, but it is slow on large DataFrames:

for index, row in df.head(n).iterrows():
    print(index)
    print(row)
    print()

The code displayed above will then generate:

0
A    1
B    2
C    3
Name: 0, dtype: int64

1
A    4
B    5
C    6
Name: 1, dtype: int64

2
A    7
B    8
C    9
Name: 2, dtype: int64

3
A    10
B    11
C    12
Name: 3, dtype: int64

The same output is obtained with iloc:

for index, row in df.iloc[:n, :].iterrows():
    print(index)
    print(row)
    print()
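
A variation that is not covered above, shown only as a sketch: instead of slicing the DataFrame first, the iterator returned by iterrows() can be cut short with itertools.islice from the standard library. The row_sums dictionary is an illustrative name:

```python
from itertools import islice

import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(1, 31).reshape(10, 3), columns=['A', 'B', 'C'])
n = 4

# islice stops the iterrows() generator after n (index, Series) pairs
row_sums = {}
for index, row in islice(df.iterrows(), n):
    row_sums[index] = int(row.sum())
    print(f"row {index}: sum = {row_sums[index]}")
```

Both approaches visit the same rows; islice simply avoids creating the intermediate n-row DataFrame.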

Loop through the initial n rows using itertuples()

Another way to iterate over the first n rows is to use the itertuples() method. It returns an iterator that yields one tuple per row, and it is generally faster than iterrows(). With index=False and name=None, each row comes back as a plain tuple of column values without the index. For example, to iterate over the first three rows of a DataFrame named df:

for row in df.iloc[:3,:].itertuples(index=False, name=None):
    print(row)

which prints:

(1, 2, 3)
(4, 5, 6)
(7, 8, 9)
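
When itertuples() is called with its default arguments, each row is a named tuple whose first field is Index, followed by one field per column. A small sketch on the same synthetic data (the totals list is an illustrative name):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(1, 31).reshape(10, 3), columns=['A', 'B', 'C'])

totals = []
for row in df.head(3).itertuples():
    # Fields are accessed by name: row.Index, row.A, row.B, row.C
    totals.append(int(row.A + row.B + row.C))
    print(row.Index, row.A, row.B, row.C)

print(totals)  # [6, 15, 24]
```

Attribute access keeps the loop body readable when the DataFrame has many columns, provided the column names are valid Python identifiers.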

References

Links Site
head() pandas.pydata.org
iloc() pandas.pydata.org
iterrows() pandas.pydata.org
itertuples() pandas.pydata.org