Home > Backend Development > Python Tutorial > How to Efficiently Select DataFrame Rows Within a Specific Date Range in Python?

How to Efficiently Select DataFrame Rows Within a Specific Date Range in Python?

Patricia Arquette
Release: 2024-12-13 13:11:09
Original
299 people have browsed it

How to Efficiently Select DataFrame Rows Within a Specific Date Range in Python?

Selecting DataFrame Rows Within a Date Range

In Python, DataFrames are a powerful tool for storing and manipulating tabular data. One commonly encountered task is filtering rows based on a specified date range. To accomplish this, we have the following options:

Method 1: Using a Boolean Mask

If your DataFrame contains a date column with datetime64[ns] data type, you can create a boolean mask to select rows that fall within the desired range:

# Ensure date column is a datetime64 series
df['date'] = pd.to_datetime(df['date']) 

# Create a boolean mask
mask = (df['date'] > start_date) & (df['date'] <= end_date)

# Select the sub-DataFrame
sub_df = df.loc[mask]
Copy after login

Method 2: Setting a DatetimeIndex

Another efficient approach is to set the date column as the index of the DataFrame, creating a DatetimeIndex:

df = df.set_index(['date'])

# Select rows using index slicing
sub_df = df.loc[start_date:end_date]
Copy after login

This method is particularly useful for frequent date-based selections, as it provides faster performance compared to using a boolean mask.

Example:

Consider the following DataFrame:

>>> df
            value  date
0          0.2   2021-06-01
1          0.3   2021-06-05
2          0.4   2021-06-10
3          0.5   2021-06-15
Copy after login

To select rows for the month of June 2021, we can use either method:

Method 1: Boolean Mask

mask = (df['date'] > '2021-06-01') & (df['date'] <= '2021-06-30')
sub_df = df.loc[mask]
Copy after login

Method 2: DatetimeIndex

df = df.set_index(['date'])
sub_df = df.loc['2021-06-01':'2021-06-30']
Copy after login

Both methods will return the following sub-DataFrame:

>>> sub_df
            value  date
0          0.2   2021-06-01
1          0.3   2021-06-05
2          0.4   2021-06-10
3          0.5   2021-06-15
Copy after login

The above is the detailed content of How to Efficiently Select DataFrame Rows Within a Specific Date Range in Python?. For more information, please follow other related articles on the PHP Chinese website!

source:php.cn
Statement of this Website
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Latest Articles by Author
Popular Tutorials
More>
Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template