Scenario:
Columns in a pandas DataFrame are often loaded as strings. Timestamps read this way must be converted to a datetime dtype before they can be reliably compared, sorted, or filtered as dates.
Conversion and Filtering Based on Date
To convert a string column to datetime in pandas, use the pd.to_datetime function. Its optional format argument takes a strftime-style pattern that describes how the strings in the column are laid out.
Example:
Consider the following DataFrame with a column (Mycol) containing strings in a custom format:
import pandas as pd
df = pd.DataFrame({'Mycol': ['05SEP2014:00:00:00.000']})
To convert this column to datetime, use the following code:
df['Mycol'] = pd.to_datetime(df['Mycol'], format='%d%b%Y:%H:%M:%S.%f')
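The format string matches the input: %d is the day of the month, %b the abbreviated month name, %Y the four-digit year, and %H:%M:%S.%f the time with fractional seconds. After conversion, the Mycol column holds datetime64 values rather than strings.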
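As a minimal sketch of the conversion step, the snippet below parses a small column and checks the result. The second and third sample values are hypothetical additions for illustration, and errors='coerce' is an optional setting not used in the example above: it turns strings that do not match the format into NaT instead of raising an error.

```python
import pandas as pd

# Hypothetical sample data; only the first value comes from the article.
df = pd.DataFrame({
    'Mycol': [
        '05SEP2014:00:00:00.000',
        '17SEP2014:13:45:10.250',
        'not a date',
    ]
})

# Parse with an explicit format; unparseable strings become NaT.
df['Mycol'] = pd.to_datetime(
    df['Mycol'], format='%d%b%Y:%H:%M:%S.%f', errors='coerce'
)

# Confirm the column now has a datetime dtype and count failed parses.
is_datetime = pd.api.types.is_datetime64_any_dtype(df['Mycol'])
n_failed = int(df['Mycol'].isna().sum())
print(is_datetime, n_failed)
```

Counting the NaT values after a coerced conversion is a quick way to see how many rows did not match the expected format.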
Date-Based Filtering
Once the column is converted to datetime, you can perform date-based filtering operations. For example, to select rows whose date falls within a specific range:
start_date = pd.to_datetime('01SEP2014', format='%d%b%Y')
end_date = pd.to_datetime('30SEP2014', format='%d%b%Y')
filtered_df = df[(df['Mycol'] >= start_date) & (df['Mycol'] <= end_date)]
The resulting filtered_df contains only the rows whose Mycol value lies between the two dates, inclusive. Because end_date parses to midnight, timestamps later in the day on 30 September are excluded.
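The sketch below illustrates the range filter on a few hypothetical rows (the sample values are assumptions, not data from the article). It shows Series.between as an equivalent to the two comparisons, and a half-open interval as one possible way to keep every timestamp on the last day.

```python
import pandas as pd

# Hypothetical sample rows in the same custom format as the article.
df = pd.DataFrame({
    'Mycol': [
        '05SEP2014:00:00:00.000',
        '30SEP2014:18:30:00.000',
        '02OCT2014:09:00:00.000',
    ]
})
df['Mycol'] = pd.to_datetime(df['Mycol'], format='%d%b%Y:%H:%M:%S.%f')

start = pd.to_datetime('01SEP2014', format='%d%b%Y')
end = pd.to_datetime('30SEP2014', format='%d%b%Y')

# Inclusive on both ends; `end` is midnight, so 18:30 on 30 Sep is dropped.
inclusive = df[df['Mycol'].between(start, end)]

# Half-open interval [start, end + 1 day) keeps all of 30 Sep.
whole_month = df[(df['Mycol'] >= start) & (df['Mycol'] < end + pd.Timedelta(days=1))]

print(len(inclusive), len(whole_month))
```

Which variant is appropriate depends on whether the column carries meaningful times of day or only dates at midnight.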