Splitting a Pandas Column of Lists into Multiple Columns
In data exploration, it's often necessary to restructure DataFrame columns into a more manageable format. One such scenario involves splitting a column containing lists into multiple columns.
Consider a DataFrame with a single column named "teams," which holds lists of team names:
import pandas as pd
df = pd.DataFrame({"teams": [["SF", "NYG"] for _ in range(7)]})
To split this "teams" column into two columns, "team1" and "team2," we can pass the list of lists returned by the tolist method to the DataFrame constructor.
Option 1: Modifying Existing DataFrame
Using the tolist method, we can convert the "teams" column into a plain list of lists, build a DataFrame from it, and assign the result to the new "team1" and "team2" columns:
df[['team1', 'team2']] = pd.DataFrame(df['teams'].tolist(), index=df.index)
This adds the new columns to the original DataFrame:
       teams team1 team2
0  [SF, NYG]    SF   NYG
1  [SF, NYG]    SF   NYG
2  [SF, NYG]    SF   NYG
3  [SF, NYG]    SF   NYG
4  [SF, NYG]    SF   NYG
5  [SF, NYG]    SF   NYG
6  [SF, NYG]    SF   NYG
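The index=df.index argument in Option 1 matters when the DataFrame does not use the default 0..n-1 index, because column assignment aligns rows by index label. The following is a minimal sketch of that behavior, assuming a hypothetical frame with string labels; the labels and the extra team names are illustrative and are not part of the original example.

```python
import pandas as pd

# Hypothetical frame with a non-default index (labels and teams are illustrative).
df = pd.DataFrame(
    {"teams": [["SF", "NYG"], ["DAL", "PHI"], ["GB", "CHI"]]},
    index=["a", "b", "c"],
)

# Without index=..., the new frame gets a default 0, 1, 2 index,
# which would not line up with the labels "a", "b", "c".
unaligned = pd.DataFrame(df["teams"].tolist())

# Passing index=df.index keeps each split row attached to its original row.
aligned = pd.DataFrame(df["teams"].tolist(), index=df.index)
df[["team1", "team2"]] = aligned

print(df)
```

With the default index the two variants behave the same, so the argument is mainly a safeguard for filtered, sorted, or custom-indexed frames.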
Option 2: Creating a New DataFrame
Alternatively, we can create a new DataFrame that holds only the split columns:
df3 = pd.DataFrame(df['teams'].tolist(), columns=['team1', 'team2'])
This operation creates a separate DataFrame:
  team1 team2
0    SF   NYG
1    SF   NYG
2    SF   NYG
3    SF   NYG
4    SF   NYG
5    SF   NYG
6    SF   NYG
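A separate frame like the one in Option 2 can later be attached to the other columns of the original data. The sketch below shows one way this might be done, assuming a hypothetical extra "week" column that is not in the original example; it passes index=df.index so that the join lines up row by row.

```python
import pandas as pd

# "week" is a hypothetical extra column added only for illustration.
df = pd.DataFrame({
    "week": list(range(1, 8)),
    "teams": [["SF", "NYG"] for _ in range(7)],
})

# Build the split columns as a separate frame; df itself is left unchanged.
df3 = pd.DataFrame(
    df["teams"].tolist(),
    columns=["team1", "team2"],
    index=df.index,
)

# Optionally combine: drop the list column from a copy and join the split columns.
combined = df.drop(columns="teams").join(df3)

print(combined)
```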
Note that using apply(pd.Series) on the column to achieve this split is significantly slower, because it builds a Series object for every row, and is not recommended for larger datasets.
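The difference between the two approaches can be checked directly. The sketch below is a rough comparison under assumed conditions (a hypothetical 2,000-row frame and a single timed run each); the helper function names are made up for illustration, and the actual timings depend on the machine and pandas version.

```python
import timeit

import pandas as pd

# Hypothetical larger frame so the difference is measurable.
df = pd.DataFrame({"teams": [["SF", "NYG"] for _ in range(2_000)]})


def split_with_constructor(frame):
    # Builds one DataFrame from a plain list of lists.
    return pd.DataFrame(frame["teams"].tolist(), index=frame.index)


def split_with_apply(frame):
    # Creates a Series object for every row, which adds per-row overhead.
    return frame["teams"].apply(pd.Series)


fast = split_with_constructor(df)
slow = split_with_apply(df)

# Both approaches should yield the same values.
print("same values:", (fast.to_numpy() == slow.to_numpy()).all())

# Time a single run of each approach (results vary by environment).
print("constructor:", timeit.timeit(lambda: split_with_constructor(df), number=1))
print("apply:", timeit.timeit(lambda: split_with_apply(df), number=1))
```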