How to Efficiently Select DataFrame Rows Within a Specific Date Range in Pandas?

Patricia Arquette
Release: 2024-12-14 08:36:16

Select DataFrame Rows Between Two Dates

Introduction

When working with time-series data, it is often necessary to select specific rows based on date ranges. This article explores two methods for achieving this in pandas DataFrames.

Method 1: Boolean Mask

  1. Ensure the date column is a Series with dtype datetime64[ns]:

    df['date'] = pd.to_datetime(df['date'])
  2. Create a boolean mask by comparing the column with the start and end dates. Here the start is exclusive (>) and the end is inclusive (<=); use >= to include the start date:

    mask = (df['date'] > start_date) & (df['date'] <= end_date)
  3. Select the sub-DataFrame using the mask:

    df.loc[mask]
  4. Optionally, reassign the result to df (df = df.loc[mask]) if the unfiltered rows are no longer needed.
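
The four steps above can be sketched end to end as follows. This is a minimal illustration, not a definitive implementation: the sample rows, the column names "date" and "value", and the chosen bounds are assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample data; the column names "date" and "value" are assumptions.
df = pd.DataFrame({
    "date": ["2000-06-01", "2000-06-05", "2000-06-10", "2000-06-15"],
    "value": [1, 2, 3, 4],
})

# Step 1: convert the string column to a datetime dtype.
df["date"] = pd.to_datetime(df["date"])

# Step 2: build the mask (lower bound exclusive, upper bound inclusive).
start_date = pd.Timestamp("2000-06-01")
end_date = pd.Timestamp("2000-06-10")
mask = (df["date"] > start_date) & (df["date"] <= end_date)

# Step 3: select the matching rows.
selected = df.loc[mask]
print(selected)
```

With these bounds the row dated 2000-06-01 is dropped and the row dated 2000-06-10 is kept, so only the 2000-06-05 and 2000-06-10 rows remain.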

Method 2: DatetimeIndex

  1. Set the date column as the index:

    df = df.set_index(['date'])
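  2. Slice the DataFrame by label with the start and end dates. Unlike positional slicing, label slicing on a sorted DatetimeIndex includes both endpoints: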

    df.loc[start_date:end_date]
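
A short sketch of the DatetimeIndex approach, assuming a hypothetical frame of ten daily rows. The call to sort_index() is an addition for safety, since label slicing behaves predictably only on a sorted index.

```python
import pandas as pd

# Hypothetical sample data: ten daily rows starting 2000-06-01.
df = pd.DataFrame({
    "date": pd.date_range("2000-06-01", periods=10, freq="D"),
    "value": range(10),
})

# Step 1: make the date column the index; sorting keeps slicing well-defined.
df = df.set_index("date").sort_index()

# Step 2: slice by label; both endpoints are included.
subset = df.loc["2000-06-03":"2000-06-05"]
print(subset)
```

The slice returns three rows (June 3rd, 4th, and 5th), which shows that the end label is part of the result.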

Example

Consider a DataFrame with a date column. The following code uses the boolean mask method to select rows between '2000-06-01' and '2000-06-10':

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'date': pd.date_range('2000-1-1', periods=200, freq='D'),
    'value': np.random.rand(200)
})

mask = (df['date'] > '2000-06-01') & (df['date'] <= '2000-06-10')
result_df = df[mask]

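The result includes rows from June 2nd to June 10th, 2000. June 1st is excluded because the lower bound uses a strict comparison (>); change it to >= to include June 1st.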
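
As a quick check of how the bounds behave, the sketch below rebuilds the same frame and compares the mask with Series.between, which includes both endpoints by default. Using between here is an alternative offered for illustration, not part of the original example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "date": pd.date_range("2000-01-01", periods=200, freq="D"),
    "value": np.random.rand(200),
})

# Mask from the example: lower bound exclusive, upper bound inclusive.
mask = (df["date"] > "2000-06-01") & (df["date"] <= "2000-06-10")
exclusive_start = df[mask]

# Series.between includes both endpoints by default.
inclusive = df[df["date"].between("2000-06-01", "2000-06-10")]

print(exclusive_start["date"].min())  # first date kept by the mask
print(inclusive["date"].min())        # first date kept by between()
```

The mask keeps nine rows starting on June 2nd, while between keeps ten rows starting on June 1st.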

Comparison

  • The boolean mask method is more flexible: conditions can be combined freely, and each bound can be made inclusive or exclusive.
  • The DatetimeIndex method is more concise and is generally faster for repeated date range selections, provided the index is sorted.
  • Passing parse_dates to pd.read_csv converts the date column while the file is read, so no separate pd.to_datetime call is needed.
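
The parse_dates point can be illustrated with a small sketch. The CSV text below is hypothetical and is read from memory with io.StringIO rather than from a real file. Passing index_col alongside parse_dates is an optional extra that makes the parsed column the index in the same call.

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a file on disk.
csv_text = "date,value\n2000-06-01,1\n2000-06-05,2\n2000-06-12,3\n"

# parse_dates converts the named column while reading;
# index_col makes it the index so label slicing works right away.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"], index_col="date")

print(df.index.dtype)
print(df.loc["2000-06-01":"2000-06-10"])
```

Because the index is already a DatetimeIndex after reading, the date range slice works without a separate conversion step.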
