Example of how to copy a DataFrame with pandas in Python:
Create a dataframe
To start, let's create a simple DataFrame filled with random integers (the values will differ from run to run):
>>> import pandas as pd
>>> import numpy as np
>>> data = np.random.randint(100, size=(10,5))
>>> df = pd.DataFrame(data=data,columns=['a','b','c','d','e'])
>>> df
    a   b   c   d   e
0  42  94   3  22  28
1   0  85  93  43  18
2  70  10  98  19  26
3  54  72  89  51  61
4  13  44  94  28  34
5  79   4  89  33  81
6  69  37  84  89  59
7  17  82  84   2  60
8  79  78  44   0  60
9  84   2  82  27  27
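Because np.random.randint draws new values on every run, the table above cannot be reproduced exactly. The following is a minimal sketch of a reproducible variant, assuming NumPy's Generator API (np.random.default_rng). The helper name make_frame and the seed value 0 are illustrative choices, not part of the original example.

```python
import numpy as np
import pandas as pd


def make_frame(seed=0):
    """Build a 10x5 DataFrame of integers in [0, 100) from a fixed seed.

    make_frame and the default seed are hypothetical, chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    data = rng.integers(100, size=(10, 5))
    return pd.DataFrame(data=data, columns=["a", "b", "c", "d", "e"])


df = make_frame()
print(df)
```

Calling make_frame() twice with the same seed should give identical frames, which makes the copy experiments below easier to compare between runs.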
Create a copy of the dataframe
To create a copy of the DataFrame, one solution is to use the pandas method pandas.DataFrame.copy, which makes a deep copy by default (deep=True):
>>> df2 = df.copy()
>>> df2
    a   b   c   d   e
0  42  94   3  22  28
1   0  85  93  43  18
2  70  10  98  19  26
3  54  72  89  51  61
4  13  44  94  28  34
5  79   4  89  33  81
6  69  37  84  89  59
7  17  82  84   2  60
8  79  78  44   0  60
9  84   2  82  27  27
Here, if a row of the copy is changed:
>>> df2.iloc[3,:] = 0
>>> df2
    a   b   c   d   e
0  42  94   3  22  28
1   0  85  93  43  18
2  70  10  98  19  26
3   0   0   0   0   0
4  13  44  94  28  34
5  79   4  89  33  81
6  69  37  84  89  59
7  17  82  84   2  60
8  79  78  44   0  60
9  84   2  82  27  27
it will not impact the original dataframe:
>>> df
    a   b   c   d   e
0  42  94   3  22  28
1   0  85  93  43  18
2  70  10  98  19  26
3  54  72  89  51  61
4  13  44  94  28  34
5  79   4  89  33  81
6  69  37  84  89  59
7  17  82  84   2  60
8  79  78  44   0  60
9  84   2  82  27  27
Another example, this time editing two columns of the copy:
>>> df2.iloc[:,[2,4]] = 0
>>> df2
    a   b  c   d  e
0  42  94  0  22  0
1   0  85  0  43  0
2  70  10  0  19  0
3   0   0  0   0  0
4  13  44  0  28  0
5  79   4  0  33  0
6  69  37  0  89  0
7  17  82  0   2  0
8  79  78  0   0  0
9  84   2  0  27  0
the original data frame is not modified:
>>> df
    a   b   c   d   e
0  42  94   3  22  28
1   0  85  93  43  18
2  70  10  98  19  26
3  54  72  89  51  61
4  13  44  94  28  34
5  79   4  89  33  81
6  69  37  84  89  59
7  17  82  84   2  60
8  79  78  44   0  60
9  84   2  82  27  27
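Comparing the printed tables by eye works for ten rows, but the same check can be done programmatically. The sketch below assumes a seeded random frame (the seed 0 and the variable names snapshot, original_unchanged and copy_differs are illustrative) and repeats the two edits from this section on a deep copy.

```python
import numpy as np
import pandas as pd

# Seeded data so the script is repeatable; the seed is an arbitrary choice.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(100, size=(10, 5)), columns=list("abcde"))

# Independent NumPy snapshot of the original values, taken before any edit.
snapshot = df.to_numpy().copy()

# Deep copy (the default), then the same two edits as above.
df2 = df.copy()
df2.iloc[3, :] = 0
df2.iloc[:, [2, 4]] = 0

# The original still matches the snapshot, while the copy has diverged.
original_unchanged = np.array_equal(df.to_numpy(), snapshot)
copy_differs = not df2.equals(df)
print("original unchanged:", original_unchanged)
print("copy differs from original:", copy_differs)
```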
One dataframe with multiple names
If the deep parameter is set to False:
>>> df3 = df.copy(deep=False)
>>> df3.iloc[[0,1,2],:] = 0
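the result is a shallow copy: a new DataFrame object that shares its data and index with the original rather than duplicating them. As a result, in pandas versions where Copy-on-Write is not enabled (it is not the default before pandas 3.0), any change to the values of the copy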
>>> df3
    a   b   c   d   e
0   0   0   0   0   0
1   0   0   0   0   0
2   0   0   0   0   0
3  54  72  89  51  61
4  13  44  94  28  34
5  79   4  89  33  81
6  69  37  84  89  59
7  17  82  84   2  60
8  79  78  44   0  60
9  84   2  82  27  27
will also impact the original data frame:
>>> df
    a   b   c   d   e
0   0   0   0   0   0
1   0   0   0   0   0
2   0   0   0   0   0
3  54  72  89  51  61
4  13  44  94  28  34
5  79   4  89  33  81
6  69  37  84  89  59
7  17  82  84   2  60
8  79  78  44   0  60
9  84   2  82  27  27
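Whether a write through a shallow copy reaches the original depends on the pandas version: with Copy-on-Write enabled (the default from pandas 3.0), it does not. The sketch below is one way to check the behaviour of the installed version. The helper write_propagates and the small test frame are hypothetical, introduced only for this illustration.

```python
import pandas as pd


def write_propagates(make_duplicate):
    """Return True if a write through the duplicate also changes the original.

    make_duplicate is any callable that takes a DataFrame and returns
    the object to write through (hypothetical helper for illustration).
    """
    original = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})
    duplicate = make_duplicate(original)
    duplicate.iloc[0, 0] = -1  # modify a single value through the duplicate
    return bool(original.iloc[0, 0] == -1)


# A deep copy never propagates writes back to the original.
deep_result = write_propagates(lambda frame: frame.copy())

# A shallow copy may or may not, depending on Copy-on-Write.
shallow_result = write_propagates(lambda frame: frame.copy(deep=False))

print("pandas", pd.__version__)
print("deep copy write reaches original:", deep_result)
print("shallow copy write reaches original:", shallow_result)
```

No expected value is given for the shallow case on purpose, since it is exactly the part that varies between installations.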
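Note: the assignment operator = does not create a copy at all; it only binds a second name to the same DataFrame object:
>>> df4 = df
so every change made through df4, including adding or removing columns, is visible through df. A shallow copy, by contrast, is a separate object that only shares the underlying values.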
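Plain assignment and df.copy(deep=False) are not quite the same thing: assignment gives a second name to one object, while a shallow copy is a new object. A minimal sketch of the difference, using a small hand-written frame and the illustrative column names f and g:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

df4 = df                   # second name bound to the same object
df3 = df.copy(deep=False)  # new DataFrame object (shallow copy)

df4["f"] = 0  # adding a column through the alias changes df itself
df3["g"] = 0  # adding a column to the shallow copy leaves df alone

print("df4 is df:", df4 is df)
print("df3 is df:", df3 is df)
print("df columns:", list(df.columns))
print("df3 columns:", list(df3.columns))
```

Structural changes such as adding a column therefore reach the original only through the alias, regardless of how value writes behave in the installed pandas version.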
References
| Links | Site |
|---|---|
| pandas.DataFrame.copy | pandas doc |
| How to select one or multiple columns in a pandas DataFrame in python ? | science-emergence.com |
