How do you pivot a DataFrame?
What is pivoting?
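Pivoting is a data transformation technique that reshapes a DataFrame by turning the distinct values of one or more columns into row and column labels, typically converting long-format data into wide format. It is commonly used to organize data in a way that is easier to analyze or visualize.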
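As a minimal sketch of the idea, the hypothetical long-format table below is reshaped so that each distinct `year` becomes its own column. The names `long_df`, `city`, `year`, and `sales` and all values are made up for illustration and are not part of the examples that follow.

```python
import pandas as pd

# Hypothetical long-format data: one row per (city, year) observation.
long_df = pd.DataFrame({
    "city": ["Taipei", "Taipei", "Tainan", "Tainan"],
    "year": [2022, 2023, 2022, 2023],
    "sales": [10, 12, 7, 9],
})

# Distinct values of "year" become column labels; "city" becomes the row index.
wide_df = long_df.pivot(index="city", columns="year", values="sales")
print(wide_df)
```

The result has one row per city and one column per year, which is the wide layout that the methods in this article produce.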
How do you pivot data?
There are several ways to pivot a DataFrame with Python's Pandas library:
1. pd.DataFrame.pivot_table:
Example:

```python
import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame({
    "row": ["row0", "row1", "row2", "row3", "row4"],
    "col": ["col0", "col1", "col2", "col3", "col4"],
    "val0": [0.81, 0.44, 0.77, 0.15, 0.81],
    "val1": [0.04, 0.07, 0.01, 0.59, 0.64],
})

# Pivot the DataFrame using pivot_table
df_pivoted = df.pivot_table(
    index="row",
    columns="col",
    values="val0",
    aggfunc="mean",
)
print(df_pivoted)
# Each (row, col) pair occurs once in the sample data, so the result is a
# 5x5 table with the val0 values on the diagonal and NaN everywhere else.
```

2. pd.DataFrame.groupby + pd.DataFrame.unstack:
This method groups the DataFrame by the desired row and column keys, then uses unstack to pivot the grouped data. Example:

```python
# Group the DataFrame by row and col
df_grouped = df.groupby(["row", "col"])

# A grouped Series has no unstack method, so aggregate first
# (mean, to match the pivot_table example), then pivot with unstack
df_pivoted = df_grouped["val0"].mean().unstack(fill_value=0)
print(df_pivoted)
# Same diagonal layout as above, with missing combinations filled with 0.
```

3. pd.DataFrame.set_index + pd.DataFrame.unstack:
This method sets the desired row and column keys as the DataFrame's index, then uses unstack to pivot the data. Example:

```python
# Set row and col as the index (kept in a new variable so that df still
# has its row and col columns for the next example)
df_indexed = df.set_index(["row", "col"])

# Perform pivot using unstack
df_pivoted = df_indexed["val0"].unstack(fill_value=0)
print(df_pivoted)
```

4. pd.DataFrame.pivot:
This method offers a simpler syntax but limited functionality. It only lets you choose which columns become the row index, the column labels, and (optionally) the values, and it cannot perform aggregation. Example:

```python
# Perform pivot using pivot
df_pivoted = df.pivot(index="row", columns="col")
print(df_pivoted)
# No `values` argument is given, so both val0 and val1 are pivoted and the
# result has two-level column labels such as ("val0", "col0").
```

Several key columns can also be concatenated into a single key before pivoting. Example using pivot:

```python
# NOTE: columns "A" and "B" are not defined in the sample DataFrame above;
# this snippet assumes a DataFrame that also contains them.
df["Combined"] = df["row"] + "|" + df["col"]
df_pivoted = df.pivot(index="Combined", columns="A", values="B")
print(df_pivoted)
# Output:
# A             a     b    c
# Combined
# row0|col0   0.0  10.0  7.0
# row1|col1  11.0  10.0  NaN
# row2|col2   2.0  14.0  NaN
# row3|col3  11.0   NaN  NaN
# row4|col4   NaN   NaN  NaN
```

Example using groupby and unstack:

```python
# Same assumption as above: df also contains columns "A" and "B".
df["Combined"] = df["row"] + "|" + df["col"]
df_grouped = df.groupby(["Combined", "A"])

# Aggregate before unstacking; fill_value=0 replaces missing
# (Combined, A) combinations with 0 instead of NaN
df_pivoted = df_grouped["B"].mean().unstack(fill_value=0)
print(df_pivoted)
```
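The difference between pivot and pivot_table with respect to aggregation can be sketched with a small hypothetical DataFrame in which one (row, col) pair occurs twice. The name `dup_df` and its values are made up for illustration. On such input pivot raises a ValueError, while pivot_table combines the duplicates with `aggfunc`.

```python
import pandas as pd

# Hypothetical data: the pair ("row0", "col0") appears twice.
dup_df = pd.DataFrame({
    "row": ["row0", "row0", "row1"],
    "col": ["col0", "col0", "col1"],
    "val0": [1.0, 3.0, 5.0],
})

# pivot cannot aggregate, so duplicate (index, columns) pairs raise ValueError.
try:
    dup_df.pivot(index="row", columns="col", values="val0")
    pivot_failed = False
except ValueError as exc:
    pivot_failed = True
    print(f"pivot failed: {exc}")

# pivot_table aggregates the duplicates; here the mean of 1.0 and 3.0 is taken.
aggregated = dup_df.pivot_table(
    index="row", columns="col", values="val0", aggfunc="mean"
)
print(aggregated)
```

If the data is known to have unique key pairs, pivot is the lighter choice. Otherwise pivot_table (or groupby followed by an aggregation and unstack) is likely the safer option.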
Long format to wide format
```python
# Flatten two-level (MultiIndex) column labels into single strings,
# e.g. ("val0", "col0") becomes "val0|col0". This requires df_pivoted to
# have MultiIndex columns, as produced by df.pivot(index="row", columns="col").
df_pivoted.columns = df_pivoted.columns.map("|".join)
print(df_pivoted)
```
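For reference, here is a self-contained sketch of the flattening step that reuses the sample data from the start of this article. It assumes the pivot is done without `values`, so that the columns form a two-level MultiIndex. The names `wide` and `flat` and the final `reset_index` call are additions for illustration, not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({
    "row": ["row0", "row1", "row2", "row3", "row4"],
    "col": ["col0", "col1", "col2", "col3", "col4"],
    "val0": [0.81, 0.44, 0.77, 0.15, 0.81],
    "val1": [0.04, 0.07, 0.01, 0.59, 0.64],
})

# Without `values`, every remaining column is pivoted, giving
# two-level column labels such as ("val0", "col0").
wide = df.pivot(index="row", columns="col")

# Join each (value column, col) tuple into a single string label.
wide.columns = wide.columns.map("|".join)

# Optional: turn the "row" index back into an ordinary column.
flat = wide.reset_index()
print(flat.columns.tolist())
```

Flat string labels tend to be easier to export to CSV or to select by name than tuple labels, at the cost of losing the level structure.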