Splitting a Column of Tuples in a Pandas DataFrame
In pandas, splitting a column of tuples into several separate columns is a common operation. The recommended approach is to build a new DataFrame from the tuples; a slower alternative is noted at the end.
Using pd.DataFrame(col.tolist())
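This method converts the tuple column into a list of tuples with tolist() and builds a new DataFrame from that list, with one column per tuple position. Passing index=df.index gives the new DataFrame the same index as the original, so the values line up row by row when they are assigned back.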
<code class="python">import pandas as pd

# Create a dataframe with a column containing tuples
df = pd.DataFrame({'a': [1, 2], 'b': [(1, 2), (3, 4)]})

# Split the 'b' column into 'b1' and 'b2'
df[['b1', 'b2']] = pd.DataFrame(df['b'].tolist(), index=df.index)

# Print the resulting dataframe
print(df)</code>
Output:
   a       b  b1  b2
0  1  (1, 2)   1   2
1  2  (3, 4)   3   4
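The index=df.index argument matters once the original DataFrame has anything other than the default 0..n-1 index. The sketch below illustrates this with a hypothetical string index ('x', 'y'), which is an assumption made for the example and not part of the original data. Because pandas aligns on index labels during assignment, the split without index=df.index ends up with missing values.

```python
import pandas as pd

# Hypothetical frame with a non-default (string) index, for illustration only
df = pd.DataFrame({'a': [1, 2], 'b': [(1, 2), (3, 4)]}, index=['x', 'y'])

# Without index=..., the new frame gets a fresh 0..n-1 index
parts_default = pd.DataFrame(df['b'].tolist())
# With index=df.index, the new frame reuses the original labels
parts_aligned = pd.DataFrame(df['b'].tolist(), index=df.index)

# Assignment aligns on index labels, so the mismatched frame yields NaN
misaligned = df.copy()
misaligned[['b1', 'b2']] = parts_default

aligned = df.copy()
aligned[['b1', 'b2']] = parts_aligned

print(misaligned)
print(aligned)
```

If the original index is not needed, calling reset_index(drop=True) on the DataFrame before splitting is another way to avoid the mismatch.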
Note: df['b'].apply(pd.Series) can be used in place of pd.DataFrame(df['b'].tolist(), index=df.index) and gives the same result. However, it builds a separate Series for every row, so it is slower and uses more memory.
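As a minimal sketch of the alternative mentioned in the note, the following applies both approaches to the same example data and checks that they produce the same values. The variable names via_tolist, via_apply, and same_values are introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2], 'b': [(1, 2), (3, 4)]})

# Recommended approach: build one DataFrame from the list of tuples
via_tolist = pd.DataFrame(df['b'].tolist(), index=df.index)

# Alternative: turn each tuple into a Series, one row at a time
via_apply = df['b'].apply(pd.Series)

# Compare the underlying values of the two results
same_values = via_tolist.to_numpy().tolist() == via_apply.to_numpy().tolist()
print(same_values)

# Either result can be assigned back in the same way
df[['b1', 'b2']] = via_apply
print(df)
```

On a two-row frame the speed difference is negligible; it becomes noticeable as the number of rows grows, since apply(pd.Series) does per-row work.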