Joining Dataframes with Overlapping Ranges Using Interval Indexing
Given two dataframes, df_1 and df_2, with a common column that represents a datetime range, we aim to join them using a specific condition: df_1's datetime column values must fall within the ranges specified in df_2.
df_1 timestamp A B 0 2016-05-14 10:54:33 0.020228 0.026572 1 2016-05-14 10:54:34 0.057780 0.175499 2 2016-05-14 10:54:35 0.098808 0.620986 3 2016-05-14 10:54:36 0.158789 1.014819 4 2016-05-14 10:54:39 0.038129 2.384590 df_2 start end event 0 2016-05-14 10:54:31 2016-05-14 10:54:33 E1 1 2016-05-14 10:54:34 2016-05-14 10:54:37 E2 2 2016-05-14 10:54:38 2016-05-14 10:54:42 E3
Solution:
We can use interval indexing to achieve this. Interval indexing creates bins based on the ranges specified in df_2 and assigns labels to timestamps in df_1 that fall within those bins.
import pandas as pd # Convert start and end columns to IntervalIndex df_2.index = pd.IntervalIndex.from_arrays(df_2['start'], df_2['end'], closed='both') # Get the event associated with each timestamp in df_1 df_1['event'] = df_1['timestamp'].apply(lambda x: df_2.iloc[df_2.index.get_loc(x)]['event'])
Output:
timestamp A B event 0 2016-05-14 10:54:33 0.020228 0.026572 E1 1 2016-05-14 10:54:34 0.057780 0.175499 E2 2 2016-05-14 10:54:35 0.098808 0.620986 E2 3 2016-05-14 10:54:36 0.158789 1.014819 E2 4 2016-05-14 10:54:39 0.038129 2.384590 E3
The above is the detailed content of How to Join DataFrames with Overlapping Datetime Ranges Using Interval Indexing?. For more information, please follow other related articles on the PHP Chinese website!