# How to retrieve the minimum and maximum dates (i.e., start and end dates) from a datetime column in a pandas dataframe?

Published: January 06, 2024

## Introduction

In data analysis, one of the first steps is to extract summary statistics from a dataset. For a datetime column in a pandas dataframe, this includes finding the minimum and maximum values, i.e., the earliest and latest dates.

## Creating a Dataframe with Random Dates

To practice extracting the minimum and maximum values from a datetime column, we first need a dataframe containing random dates (see the previous article "How to generate a random date within a specified range using Python?").

First, let's begin by defining a start and end date in Python using the datetime module:

```python
from random import randrange
from datetime import timedelta

import datetime

year = 2019
month = 8
day = 1

star_date = datetime.datetime(year, month, day)

year = 2019
month = 10
day = 21

end_date = datetime.datetime(year, month, day)
```

Next, we can proceed to create a function that generates random dates within the specified range:

```python
def random_date(star_date, end_date):
    # Length of the range, in whole seconds
    delta = end_date - star_date
    delta_in_second = (delta.days * 24 * 60 * 60) + delta.seconds
    # Pick a random offset in [0, delta_in_second) and add it to the start
    random_second = randrange(delta_in_second)
    return star_date + timedelta(seconds=random_second)
```

Finally, we'll utilize this function to create a synthetic dataframe:

```python
import pandas as pd
import random

data = {'date': [random_date(star_date, end_date) for i in range(20)],
        'value': [random.randint(0, 100) for i in range(20)]}

df = pd.DataFrame(data)
```
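Because the dates are random, every run produces a different dataframe. If repeatable results are wanted, one option is to seed the generator. The following is a minimal sketch under that assumption; `make_random_dates`, `df_demo`, and the seed values are hypothetical names and choices introduced here for illustration, not part of the original code.

```python
import random
from datetime import datetime, timedelta

import pandas as pd


def make_random_dates(start, end, n, seed=0):
    """Return n random datetimes in [start, end), repeatable for a given seed."""
    rng = random.Random(seed)  # private generator; the global random state is untouched
    span_seconds = int((end - start).total_seconds())
    return [start + timedelta(seconds=rng.randrange(span_seconds)) for _ in range(n)]


start = datetime(2019, 8, 1)
end = datetime(2019, 10, 21)
value_rng = random.Random(42)  # separate seeded generator for the 'value' column

df_demo = pd.DataFrame({
    "date": make_random_dates(start, end, 20, seed=42),
    "value": [value_rng.randint(0, 100) for _ in range(20)],
})
print(df_demo.head())
```

Calling `make_random_dates` twice with the same seed returns the same list, which makes examples and tests easier to compare across runs.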

The code above generates a dataframe such as the following (the exact values differ on each run, since the dates are random):

```
                  date  value
0  2019-09-20 01:25:24     23
1  2019-10-10 10:47:01     28
2  2019-10-06 05:49:40     32
3  2019-09-22 05:14:28     20
4  2019-10-05 16:11:36      2
5  2019-09-08 02:39:19     41
6  2019-09-02 11:52:19     27
7  2019-10-17 08:15:55     12
8  2019-09-26 05:44:48     29
9  2019-09-05 22:47:18     38
10 2019-08-18 13:30:46     57
11 2019-08-08 18:34:48     47
12 2019-10-02 18:37:24     79
13 2019-08-22 05:41:41     57
14 2019-10-14 12:50:51     92
15 2019-09-06 20:24:33     44
16 2019-08-21 16:30:04     48
17 2019-08-28 16:13:33     77
18 2019-10-04 00:00:51     42
19 2019-10-12 20:21:25     73
```

Before manipulating a dataframe, it is a good idea to check the column types using the dtypes attribute. Here, it confirms that the date column is stored as a datetime type rather than as plain strings:

```python
df.dtypes
```

This returns:

```
date     datetime64[ns]
value             int64
dtype: object
```
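If the dates had instead been loaded as text (for example from a CSV file), the column would not have a datetime dtype, and min() and max() would compare strings rather than dates. A minimal sketch of the conversion with `pd.to_datetime` follows; the small `raw` dataframe is a hypothetical example built from three rows of the table above.

```python
import pandas as pd

# Hypothetical input: dates stored as strings, as often happens after reading a file
raw = pd.DataFrame({
    "date": ["2019-09-20 01:25:24", "2019-10-10 10:47:01", "2019-08-08 18:34:48"],
    "value": [23, 28, 47],
})
print(raw["date"].dtype)  # a string/object dtype, not a datetime dtype

# Convert the column to a datetime dtype so that date operations behave as expected
raw["date"] = pd.to_datetime(raw["date"])
print(raw["date"].dtype)
print(raw["date"].min())
```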

## Extracting Maximum and Minimum Values from a Datetime Column

In pandas, datetime columns are stored with a dedicated dtype, datetime64[ns], which supports date-aware comparisons and efficient extraction of information. To get the minimum and maximum values of such a column, we can use the min() and max() methods respectively.

To determine the start date, we can use the min() method, which returns a pandas Timestamp object:

```python
start_date = df['date'].min()

print(repr(start_date))
```

Output

```
Timestamp('2019-08-08 18:34:48')
```

From this Timestamp, we can then extract individual values such as the year, month, day, hour, or minute:

```python
print(start_date.year)
print(start_date.month)
print(start_date.day)

print(start_date.hour)
print(start_date.minute)
```

Output

```
2019
8
8
18
34
```
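Besides individual components, a Timestamp can also be reduced to its calendar date or formatted as a string. The sketch below assumes the same start date as in the example output above and hardcodes it so that the snippet runs on its own.

```python
import pandas as pd

# Same value as the start date shown above, hardcoded for a standalone example
start_date = pd.Timestamp("2019-08-08 18:34:48")

date_only = start_date.date()                 # a datetime.date, without the time part
formatted = start_date.strftime("%Y-%m-%d")   # a plain string

print(date_only)   # 2019-08-08
print(formatted)   # 2019-08-08
```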

Similarly, we can retrieve the end date with the max() method:

```python
df['date'].max()
```

Output

```
Timestamp('2019-10-17 08:15:55')
```
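Both bounds can also be obtained in a single call, and subtracting them gives the length of the period covered. This is a minimal sketch; `df_small` is a hypothetical three-row dataframe that reuses the earliest and latest dates from the example above.

```python
import pandas as pd

df_small = pd.DataFrame({
    "date": pd.to_datetime([
        "2019-09-20 01:25:24",
        "2019-10-17 08:15:55",
        "2019-08-08 18:34:48",
    ]),
    "value": [23, 12, 47],
})

# Both bounds in one call; the result is a Series indexed by "min" and "max"
bounds = df_small["date"].agg(["min", "max"])

# Subtracting two Timestamps gives a Timedelta
span = bounds["max"] - bounds["min"]

print(bounds)
print(span.days)  # number of whole days between the start and end dates
```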

## Sorting Rows by DateTime

Another approach is to sort the rows chronologically, which also makes the data easier to inspect. To sort rows by datetime, we can use the sort_values() method. For example, to sort our dataframe by the 'date' column in ascending order (the default):

```python
sorted_df = df.sort_values('date')
print(sorted_df)
```

The output will be:

```
                  date  value
11 2019-08-08 18:34:48     47
10 2019-08-18 13:30:46     57
16 2019-08-21 16:30:04     48
13 2019-08-22 05:41:41     57
17 2019-08-28 16:13:33     77
6  2019-09-02 11:52:19     27
9  2019-09-05 22:47:18     38
15 2019-09-06 20:24:33     44
5  2019-09-08 02:39:19     41
0  2019-09-20 01:25:24     23
3  2019-09-22 05:14:28     20
8  2019-09-26 05:44:48     29
12 2019-10-02 18:37:24     79
18 2019-10-04 00:00:51     42
4  2019-10-05 16:11:36      2
2  2019-10-06 05:49:40     32
1  2019-10-10 10:47:01     28
19 2019-10-12 20:21:25     73
14 2019-10-14 12:50:51     92
7  2019-10-17 08:15:55     12
```

Once the dataframe is sorted by date, the earliest and latest dates are in the first and last rows, which we can access using iloc:

```python
sorted_df.iloc[0]['date']
```

Output

```
Timestamp('2019-08-08 18:34:48')
```

And

```python
sorted_df.iloc[-1]['date']
```

Output

```
Timestamp('2019-10-17 08:15:55')
```
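If the goal is to retrieve the whole rows holding the earliest and latest dates, sorting is not strictly required. One alternative, sketched below, uses idxmin() and idxmax() to get the index labels of those rows and then selects them with loc. `df_small` is again a hypothetical three-row dataframe built from values in the table above.

```python
import pandas as pd

df_small = pd.DataFrame({
    "date": pd.to_datetime([
        "2019-09-20 01:25:24",
        "2019-10-17 08:15:55",
        "2019-08-08 18:34:48",
    ]),
    "value": [23, 12, 47],
})

# idxmin()/idxmax() return the index label of the earliest/latest date
earliest_row = df_small.loc[df_small["date"].idxmin()]
latest_row = df_small.loc[df_small["date"].idxmax()]

print(earliest_row)
print(latest_row)
```

This avoids sorting the full dataframe when only the two extreme rows are needed.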