將Pandas 資料幀與日期範圍條件合併
要合併兩個pandas 資料幀,其中一個值介於其他兩個值之間,一種常見的方法是合併資料幀無條件地基於標識符,然後根據日期條件進行篩選。但是,此方法可能會導致記憶體效率低下。
利用SQL 強大功能的替代解決方案是在SQL 查詢中執行合併和過濾,如下所示:
<code class="python">import pandas as pd import sqlite3 # Create dataframes presidents = pd.DataFrame({"name": ["Bush", "Obama", "Trump"], "president_id":[43, 44, 45]}) terms = pd.DataFrame({'start_date': pd.date_range('2001-01-20', periods=5, freq='48M'), 'end_date': pd.date_range('2005-01-21', periods=5, freq='48M'), 'president_id': [43, 43, 44, 44, 45]}) war_declarations = pd.DataFrame({"date": [datetime(2001, 9, 14), datetime(2003, 3, 3)], "name": ["War in Afghanistan", "Iraq War"]}) # Create an in-memory database conn = sqlite3.connect(':memory:') # Write dataframes to database terms.to_sql('terms', conn, index=False) presidents.to_sql('presidents', conn, index=False) war_declarations.to_sql('wars', conn, index=False) # Execute SQL query to merge and filter dataframes qry = ''' select start_date PresTermStart, end_date PresTermEnd, wars.date WarStart, presidents.name Pres from terms join wars on date between start_date and end_date join presidents on terms.president_id = presidents.president_id ''' df = pd.read_sql_query(qry, conn) print(df)</code>
輸出:
PresTermStart PresTermEnd WarStart Pres 0 2001-01-31 00:00:00 2005-01-31 00:00:00 2001-09-14 00:00:00 Bush 1 2001-01-31 00:00:00 2005-01-31 00:00:00 2003-03-03 00:00:00 Bush
透過在SQL 中執行合併和過濾,我們可以避免創建潛在的大型中間資料幀,從而提高記憶體效率。
以上是如何有效地將 Pandas DataFrame 與日期範圍條件合併?的詳細內容。更多資訊請關注PHP中文網其他相關文章!