How to copy a dataframe with pandas in Python?

Published: November 14, 2019

Examples of how to copy a dataframe with pandas in Python:

Create a dataframe

To start, let's create a simple dataframe filled with random integers:

>>> import pandas as pd
>>> import numpy as np
>>> data = np.random.randint(100, size=(10,5))
>>> df = pd.DataFrame(data=data,columns=['a','b','c','d','e'])
>>> df
    a   b   c   d   e
0  42  94   3  22  28
1   0  85  93  43  18
2  70  10  98  19  26
3  54  72  89  51  61
4  13  44  94  28  34
5  79   4  89  33  81
6  69  37  84  89  59
7  17  82  84   2  60
8  79  78  44   0  60
9  84   2  82  27  27

Create a copy of the dataframe

To create a copy of the dataframe, a solution is to use the pandas method pandas.DataFrame.copy, which makes a deep copy by default (deep=True):

>>> df2 = df.copy()
>>> df2
    a   b   c   d   e
0  42  94   3  22  28
1   0  85  93  43  18
2  70  10  98  19  26
3  54  72  89  51  61
4  13  44  94  28  34
5  79   4  89  33  81
6  69  37  84  89  59
7  17  82  84   2  60
8  79  78  44   0  60
9  84   2  82  27  27

If a row of the copy is then changed:

>>> df2.iloc[3,:] = 0
>>> df2
    a   b   c   d   e
0  42  94   3  22  28
1   0  85  93  43  18
2  70  10  98  19  26
3   0   0   0   0   0
4  13  44  94  28  34
5  79   4  89  33  81
6  69  37  84  89  59
7  17  82  84   2  60
8  79  78  44   0  60
9  84   2  82  27  27

it will not impact the original dataframe:

>>> df
    a   b   c   d   e
0  42  94   3  22  28
1   0  85  93  43  18
2  70  10  98  19  26
3  54  72  89  51  61
4  13  44  94  28  34
5  79   4  89  33  81
6  69  37  84  89  59
7  17  82  84   2  60
8  79  78  44   0  60
9  84   2  82  27  27

Another example, this time setting two columns of the copy to zero:

>>> df2.iloc[:,[2,4]] = 0
>>> df2
    a   b  c   d  e
0  42  94  0  22  0
1   0  85  0  43  0
2  70  10  0  19  0
3   0   0  0   0  0
4  13  44  0  28  0
5  79   4  0  33  0
6  69  37  0  89  0
7  17  82  0   2  0
8  79  78  0   0  0
9  84   2  0  27  0

the original data frame is not modified:

>>> df
    a   b   c   d   e
0  42  94   3  22  28
1   0  85  93  43  18
2  70  10  98  19  26
3  54  72  89  51  61
4  13  44  94  28  34
5  79   4  89  33  81
6  69  37  84  89  59
7  17  82  84   2  60
8  79  78  44   0  60
9  84   2  82  27  27
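The independence of a deep copy can also be checked programmatically instead of comparing printed output by eye. The following is a minimal sketch, assuming a small illustrative dataframe; the names snapshot and unchanged are hypothetical and only used for this example:

```python
import numpy as np
import pandas as pd

# Small illustrative dataframe (not the random one used above)
df = pd.DataFrame(np.arange(6).reshape(3, 2), columns=["a", "b"])

snapshot = df.copy()   # reference copy kept untouched for comparison
df2 = df.copy()        # deep copy (deep=True is the default)

# Modify one row of the copy only
df2.iloc[1, :] = 0

# DataFrame.equals compares shape and values
unchanged = df.equals(snapshot)
print("original unchanged:", unchanged)
print("copy differs from original:", not df2.equals(df))
```

DataFrame.equals is used here because it returns a single boolean, which is convenient in a script or a test.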

Shallow copy: two dataframes sharing the same data

If the option deep is set to False:

>>> df3 = df.copy(deep=False)
>>> df3.iloc[[0,1,2],:] = 0

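the result is a shallow copy: a new dataframe object that shares its data and index with the original instead of duplicating them. So, in pandas versions where Copy-on-Write is not enabled, any change to the values of the copy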

>>> df3
    a   b   c   d   e
0   0   0   0   0   0
1   0   0   0   0   0
2   0   0   0   0   0
3  54  72  89  51  61
4  13  44  94  28  34
5  79   4  89  33  81
6  69  37  84  89  59
7  17  82  84   2  60
8  79  78  44   0  60
9  84   2  82  27  27

will also impact the original data frame:

>>> df
    a   b   c   d   e
0   0   0   0   0   0
1   0   0   0   0   0
2   0   0   0   0   0
3  54  72  89  51  61
4  13  44  94  28  34
5  79   4  89  33  81
6  69  37  84  89  59
7  17  82  84   2  60
8  79  78  44   0  60
9  84   2  82  27  27
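Whether a change made through a shallow copy reaches the original depends on the pandas version and on whether Copy-on-Write is enabled, so it can be worth checking on the installed version. The following is a minimal sketch with an illustrative dataframe; the names shallow and propagated are hypothetical:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
shallow = df.copy(deep=False)

# Write a value through the shallow copy, then look at the original
shallow.iloc[0, 0] = 100
propagated = bool(df.iloc[0, 0] == 100)

# The result depends on the installed pandas version and its settings
print(pd.__version__, "change propagated to original:", propagated)
```

If the change is not propagated, the shallow copy behaves like an independent dataframe on write, and df.copy() remains the explicit way to get an independent copy in every case.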

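Note: the operator = does not copy anything. It only binds a second name to the same dataframe object:

>>> df4 = df

so every change made through df4, including adding or removing columns, is also visible through df.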
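The difference between plain assignment, a shallow copy, and a deep copy can be sketched with an identity check and a column insertion. This is an illustrative example only; the names alias, shallow, and deep and the columns f and g are assumptions made for the sketch:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})

alias = df                      # plain assignment: same object, new name
shallow = df.copy(deep=False)   # new dataframe object, data initially shared
deep = df.copy()                # new dataframe object, data duplicated

print(alias is df)     # True: both names point to one object
print(shallow is df)   # False: a distinct dataframe object
print(deep is df)      # False

# Adding a column through the alias changes df itself
alias["f"] = 0
# Adding a column to the shallow copy leaves df's columns alone
shallow["g"] = 0

print(list(df.columns))       # ['a', 'f']
print(list(shallow.columns))  # ['a', 'g']
print(list(deep.columns))     # ['a']
```

So assignment never creates a second dataframe, while even a shallow copy has its own set of columns.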

References