How to Efficiently Filter DataFrame Rows by Date Range?

Barbara Streisand
Release: 2024-12-12 16:30:11

Query DataFrame Rows Within a Specified Date Range

This article explains how to extract the rows of a DataFrame whose date column falls within a particular range. Two approaches are described.

Method 1: Utilizing a Boolean Mask

This method requires df['date'] to be a Series with dtype datetime64[ns]. Then follow these steps:

  1. Create a Boolean Mask: Choose start_date and end_date; each can be a datetime.datetime, np.datetime64, pd.Timestamp, or datetime string. Build a boolean mask that is True for the rows whose date falls within the range.
  2. Select Sub-DataFrame: Use df.loc[mask] to extract the rows where the mask is True. To overwrite the existing DataFrame instead, assign the result back: df = df.loc[mask].
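
The two steps above can be sketched as follows. This is a minimal illustration, not the only way to do it: the sample data and the column name value are made up, the dates are assumed to arrive as strings (hence the pd.to_datetime call), and both ends of the range are treated as inclusive.

```python
import pandas as pd

# Hypothetical sample data; the dates start out as plain strings.
df = pd.DataFrame({
    "date": ["2023-03-01", "2023-03-02", "2023-03-03", "2023-03-04", "2023-03-05"],
    "value": [10, 20, 30, 40, 50],
})

# Make sure the column has a datetime dtype before comparing.
df["date"] = pd.to_datetime(df["date"])

start_date = "2023-03-02"               # a datetime string
end_date = pd.Timestamp("2023-03-04")   # a pd.Timestamp works as well

# Step 1: boolean mask, True for rows inside the range (both ends inclusive here).
mask = (df["date"] >= start_date) & (df["date"] <= end_date)

# Step 2: select the rows where the mask is True.
subset = df.loc[mask]
print(subset)
```

The parentheses around each comparison are needed because & binds more tightly than >= and <= in Python.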

Method 2: Assigning a DatetimeIndex

If you select by date frequently, it can be more convenient to set the date column as the index:

  1. Set DatetimeIndex: Make the date column the index with df.set_index('date'). If the column has a datetime dtype, the resulting index is a DatetimeIndex.
  2. Select Rows by Date: Use df.loc[start_date:end_date] to filter rows by date range. Note that both start_date and end_date are inclusive in this selection.
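
A small sketch of these two steps, under a few assumptions: the sample data is made up and deliberately out of order, and the sort_index() call is an addition not mentioned above, included because label slicing on a DatetimeIndex is most predictable when the index is sorted.

```python
import pandas as pd

# Hypothetical sample data, intentionally unsorted.
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2023-03-05", "2023-03-01", "2023-03-03", "2023-03-02", "2023-03-04"]
    ),
    "value": [50, 10, 30, 20, 40],
})

# Step 1: move the date column into the index, then sort it
# (the sort is an assumption of this sketch, see the note above).
indexed = df.set_index("date").sort_index()

# Step 2: label-based slice; both endpoints are included.
subset = indexed.loc["2023-03-02":"2023-03-04"]
print(subset)
```

Unlike positional slicing such as df.iloc[1:4], a label slice with .loc includes the end label.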

Example:

The following example applies both approaches to the same data:

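import numpy as np
import pandas as pd

# Ten consecutive daily dates with random values
df = pd.DataFrame({'date': pd.date_range('2023-03-01', periods=10)})
df['value'] = np.random.randn(10)

# Boolean Mask Approach
start_date = '2023-03-03'
end_date = '2023-03-08'
# >= and <= make both ends inclusive, matching the label slice below
mask = (df['date'] >= start_date) & (df['date'] <= end_date)
df_subset = df.loc[mask]

# DatetimeIndex Approach
df = df.set_index('date')
# Label slicing with .loc includes both start_date and end_date
df_subset_by_index = df.loc[start_date:end_date]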

Each approach produces a DataFrame containing only the rows that fall within the specified date range.
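
Whether the two approaches return the same rows depends on the comparison operators used in the mask. The sketch below uses made-up integer values and hypothetical variable names to compare an inclusive mask, a mask with a strict > on the start bound, and the label slice.

```python
import pandas as pd

df = pd.DataFrame({"date": pd.date_range("2023-03-01", periods=10), "value": range(10)})
start_date, end_date = "2023-03-03", "2023-03-08"

# Mask with both ends inclusive.
inclusive_mask = (df["date"] >= start_date) & (df["date"] <= end_date)
# Mask with a strict lower bound: the row dated start_date is dropped.
exclusive_start_mask = (df["date"] > start_date) & (df["date"] <= end_date)

by_mask = df.loc[inclusive_mask]
by_mask_exclusive = df.loc[exclusive_start_mask]
by_index = df.set_index("date").loc[start_date:end_date]

print(len(by_mask), len(by_mask_exclusive), len(by_index))  # prints: 6 5 6
```

Only the inclusive mask selects the same dates as the label slice, since .loc slicing includes both endpoints.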

source:php.cn