Get the absolute difference between two pandas dataframe columns

Published: February 19, 2023

Tags: Python; Pandas; Dataframe;


The absolute difference between two pandas DataFrame columns can be calculated by subtracting one column from the other and calling abs() on the result. Examples:

Synthetic data

To start, let's generate a DataFrame using synthetic data:

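    import pandas as pd
    import numpy as np

    # Fix the seed so the output is reproducible
    np.random.seed(42)

    # Lower and upper bounds of the sampled values
    a = -100
    b = 100

    # 15 rows x 3 columns of uniform samples in [a, b)
    data = np.random.random_sample((15,3)) * (b-a) + a

    df = pd.DataFrame(data=data, columns=['A','B','C'])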

With the seed fixed at 42, the code above generates the following DataFrame:

                A          B          C
    0  -25.091976  90.142861  46.398788
    1   19.731697 -68.796272 -68.801096
    2  -88.383278  73.235229  20.223002
    3   41.614516 -95.883101  93.981970
    4   66.488528 -57.532178 -63.635007
    5  -63.319098 -39.151551   4.951286
    6  -13.610996 -41.754172  22.370579
    7  -72.101228 -41.571070 -26.727631
    8   -8.786003  57.035192 -60.065244
    9    2.846888  18.482914 -90.709917
    10  21.508970 -65.895175 -86.989681
    11  89.777107  93.126407  61.679470
    12 -39.077246 -80.465577  36.846605
    13 -11.969501 -75.592353  -0.964618
    14 -93.122296  81.864080 -48.244004
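
As a side note, the same kind of synthetic data can be generated with NumPy's Generator interface. This is only a sketch of an alternative, not the method used in this article: the names rng and df_alt are illustrative, and the values it produces differ from the table above even with the same seed, because a different random number generator is used.

```python
import numpy as np
import pandas as pd

# Seeded Generator; its values differ from those of np.random.seed(42)
rng = np.random.default_rng(42)

a, b = -100, 100

# uniform() samples directly from [a, b), so no manual rescaling is needed
data = rng.uniform(a, b, size=(15, 3))

df_alt = pd.DataFrame(data, columns=['A', 'B', 'C'])
print(df_alt.shape)  # (15, 3)
```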

Using the pandas abs() function

A first idea is to call abs() on each column and then subtract:

    df['A'].abs() - df['B'].abs()

Note that this returns the difference of the absolute values, |A| - |B|, not the absolute difference. The result can be negative:

    0    -65.050885
    1    -49.064575
    2     15.148048
    3    -54.268586
    4      8.956350
    5     24.167547
    6    -28.143176
    7     30.530158
    8    -48.249189
    9    -15.636026
    10   -44.386205
    11    -3.349299
    12   -41.388331
    13   -63.622852
    14    11.258215
    dtype: float64

To get the absolute difference |A - B|, subtract the columns first and then call abs() on the result:

    ( df['A'] - df['B'] ).abs()

This returns a new Series in which every value is non-negative:

    0     115.234838
    1      88.527969
    2     161.618507
    3     137.497617
    4     124.020706
    5      24.167547
    6      28.143176
    7      30.530158
    8      65.821195
    9      15.636026
    10     87.404146
    11      3.349299
    12     41.388331
    13     63.622852
    14    174.986376
    dtype: float64
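
The gap between |A| - |B| and |A - B| is easy to check by hand on a few values. The sketch below uses a small hypothetical DataFrame (the name small and its values are illustrative and not part of the data above). It also shows that applying np.abs() to the subtraction gives the same result as the abs() method:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, chosen so the results are easy to verify by hand
small = pd.DataFrame({'A': [-3.0, 2.0, 5.0], 'B': [5.0, -4.0, 1.0]})

# Difference of the absolute values: |A| - |B| (can be negative)
diff_of_abs = small['A'].abs() - small['B'].abs()

# Absolute value of the difference: |A - B| (never negative)
abs_of_diff = (small['A'] - small['B']).abs()

# np.abs() applied to the subtraction is an equivalent spelling
abs_of_diff_np = np.abs(small['A'] - small['B'])

print(diff_of_abs.tolist())  # [-2.0, -2.0, 4.0]
print(abs_of_diff.tolist())  # [8.0, 6.0, 4.0]
```

The two results agree only on rows where both values have the same sign, as in the last row here.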

Additional features

Save result in a new column

    df['Abs Diff A & B'] = ( df['A'] - df['B'] ).abs()

This adds a new column to the DataFrame df:

                A          B          C  Abs Diff A & B
    0  -25.091976  90.142861  46.398788      115.234838
    1   19.731697 -68.796272 -68.801096       88.527969
    2  -88.383278  73.235229  20.223002      161.618507
    3   41.614516 -95.883101  93.981970      137.497617
    4   66.488528 -57.532178 -63.635007      124.020706
    5  -63.319098 -39.151551   4.951286       24.167547
    6  -13.610996 -41.754172  22.370579       28.143176
    7  -72.101228 -41.571070 -26.727631       30.530158
    8   -8.786003  57.035192 -60.065244       65.821195
    9    2.846888  18.482914 -90.709917       15.636026
    10  21.508970 -65.895175 -86.989681       87.404146
    11  89.777107  93.126407  61.679470        3.349299
    12 -39.077246 -80.465577  36.846605       41.388331
    13 -11.969501 -75.592353  -0.964618       63.622852
    14 -93.122296  81.864080 -48.244004      174.986376

Round absolute values

The values in a DataFrame column can be rounded with the round() function:

    df['Abs Diff A & B'].round(2)

This gives:

    0     115.23
    1      88.53
    2     161.62
    3     137.50
    4     124.02
    5      24.17
    6      28.14
    7      30.53
    8      65.82
    9      15.64
    10     87.40
    11      3.35
    12     41.39
    13     63.62
    14    174.99
    Name: Abs Diff A & B, dtype: float64
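
Computing the absolute difference and rounding it can also be done in one step. The following is a minimal sketch, assuming a small hypothetical DataFrame named toy that holds the first two rows of columns A and B from above. It uses DataFrame.assign(), which returns a copy with the new column added and leaves the original unchanged. The column name abs_diff is an arbitrary choice:

```python
import pandas as pd

# Hypothetical two-row DataFrame (first two rows of columns A and B above)
toy = pd.DataFrame({'A': [-25.091976, 19.731697],
                    'B': [90.142861, -68.796272]})

# Subtract, take the absolute value, round to 2 decimals, store as a new column
result = toy.assign(abs_diff=(toy['A'] - toy['B']).abs().round(2))

print(result['abs_diff'].tolist())  # [115.23, 88.53]
```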
