Examples of how to copy a dataframe with pandas in Python:
Create a dataframe
To start, let's create a simple dataframe filled with random integers (the values will differ from run to run):
>>> import pandas as pd
>>> import numpy as np
>>> data = np.random.randint(100, size=(10,5))
>>> df = pd.DataFrame(data=data,columns=['a','b','c','d','e'])
>>> df
a b c d e
0 42 94 3 22 28
1 0 85 93 43 18
2 70 10 98 19 26
3 54 72 89 51 61
4 13 44 94 28 34
5 79 4 89 33 81
6 69 37 84 89 59
7 17 82 84 2 60
8 79 78 44 0 60
9 84 2 82 27 27
Create a copy of the dataframe
To create a copy of the dataframe, one solution is to use the pandas method pandas.DataFrame.copy, which makes a deep copy by default (deep=True):
>>> df2 = df.copy()
>>> df2
a b c d e
0 42 94 3 22 28
1 0 85 93 43 18
2 70 10 98 19 26
3 54 72 89 51 61
4 13 44 94 28 34
5 79 4 89 33 81
6 69 37 84 89 59
7 17 82 84 2 60
8 79 78 44 0 60
9 84 2 82 27 27
If a row of the copy is changed:
>>> df2.iloc[3,:] = 0
>>> df2
a b c d e
0 42 94 3 22 28
1 0 85 93 43 18
2 70 10 98 19 26
3 0 0 0 0 0
4 13 44 94 28 34
5 79 4 89 33 81
6 69 37 84 89 59
7 17 82 84 2 60
8 79 78 44 0 60
9 84 2 82 27 27
it will not impact the original dataframe:
>>> df
a b c d e
0 42 94 3 22 28
1 0 85 93 43 18
2 70 10 98 19 26
3 54 72 89 51 61
4 13 44 94 28 34
5 79 4 89 33 81
6 69 37 84 89 59
7 17 82 84 2 60
8 79 78 44 0 60
9 84 2 82 27 27
Another example, this time editing two columns of the copy:
>>> df2.iloc[:,[2,4]] = 0
>>> df2
a b c d e
0 42 94 0 22 0
1 0 85 0 43 0
2 70 10 0 19 0
3 0 0 0 0 0
4 13 44 0 28 0
5 79 4 0 33 0
6 69 37 0 89 0
7 17 82 0 2 0
8 79 78 0 0 0
9 84 2 0 27 0
the original data frame is not modified:
>>> df
a b c d e
0 42 94 3 22 28
1 0 85 93 43 18
2 70 10 98 19 26
3 54 72 89 51 61
4 13 44 94 28 34
5 79 4 89 33 81
6 69 37 84 89 59
7 17 82 84 2 60
8 79 78 44 0 60
9 84 2 82 27 27
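The independence of a deep copy can also be checked programmatically instead of by eye. The following is a minimal sketch, assuming a fixed random seed and a snapshot variable named original (both are additions for illustration, not part of the session above):

```python
import numpy as np
import pandas as pd

# Fixed seed so the run is reproducible (an assumption of this sketch)
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 100, size=(10, 5)), columns=list("abcde"))

original = df.copy()   # snapshot used only for the comparison below
df2 = df.copy()        # deep copy (deep=True is the default)

# Same edits as in the examples above, applied to the copy only
df2.iloc[3, :] = 0
df2.iloc[:, [2, 4]] = 0

# DataFrame.equals compares shape, values and dtypes
unchanged = df.equals(original)
print(unchanged)  # True: the original was not touched
```

Here DataFrame.equals is used rather than ==, because it returns a single boolean for the whole dataframe.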
One dataframe with multiple names
If the option deep is set to False, copy returns a shallow copy:
>>> df3 = df.copy(deep=False)
>>> df3.iloc[[0,1,2],:] = 0
A shallow copy is not an independent copy of the data: it is a new DataFrame object that shares its data and index with the original. In pandas versions without Copy-on-Write enabled, any change to the values of the copy
>>> df3
a b c d e
0 0 0 0 0 0
1 0 0 0 0 0
2 0 0 0 0 0
3 54 72 89 51 61
4 13 44 94 28 34
5 79 4 89 33 81
6 69 37 84 89 59
7 17 82 84 2 60
8 79 78 44 0 60
9 84 2 82 27 27
will also impact the original data frame:
>>> df
a b c d e
0 0 0 0 0 0
1 0 0 0 0 0
2 0 0 0 0 0
3 54 72 89 51 61
4 13 44 94 28 34
5 79 4 89 33 81
6 69 37 84 89 59
7 17 82 84 2 60
8 79 78 44 0 60
9 84 2 82 27 27
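Whether a write to a shallow copy propagates like this depends on the pandas version and settings. The pandas documentation states that with Copy-on-Write enabled (described there as the default from pandas 3.0), changes to a shallow copy are no longer reflected in the original. The following minimal sketch, using a small made-up dataframe, reports which behaviour the installed version shows rather than assuming one:

```python
import pandas as pd

# Small illustrative dataframe (not the one used above)
df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

df3 = df.copy(deep=False)   # shallow copy
df3.iloc[0, 0] = 100        # write through the shallow copy

# True on versions where the write reaches the original,
# False when Copy-on-Write is active
propagated = bool(df.iloc[0, 0] == 100)
print(pd.__version__, propagated)
```

If the code has to behave the same way across pandas versions, df.copy() with the default deep=True is the safer choice.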
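Note: the assignment operator = does not copy anything; it only binds a second name to the same DataFrame object:
>>> df4 = df
so every change made through df4, including adding or removing columns, is visible through df. This goes further than a shallow copy, which is a separate object that only shares the underlying data.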
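To see how plain assignment differs from a shallow copy, the following sketch compares object identity and the effect of adding a column. The small dataframe and the column names x and y are made up for illustration:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})

df4 = df                    # second name for the same object
df3 = df.copy(deep=False)   # new object sharing the data

alias_is_same = df4 is df     # True
shallow_is_same = df3 is df   # False

df4["x"] = 0   # added through the alias: df gets the column too
df3["y"] = 0   # added to the shallow copy only

print(list(df.columns))   # ['a', 'x']
print(list(df3.columns))  # ['a', 'y']
```

Structural changes such as new columns stay local to a shallow copy, while with = there is only one dataframe to change.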
References
Links | Site |
---|---|
pandas.DataFrame.copy | pandas doc |
How to select one or multiple columns in a pandas DataFrame in python ? | science-emergence.com |