Exploding List-Like Columns: A Guide to Expanding Dataframes
Problem:
In Pandas dataframes, some cells may contain lists of multiple values. The goal is to transform the dataframe so that each list element occupies a separate row, while preserving values in other columns.
Solution:
Method 1: repeat()
Prior to Pandas 0.25, the repeat() method was commonly used to explode list columns:
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {'trial_num': [1, 2, 3, 1, 2, 3],
     'subject': [1, 1, 1, 2, 2, 2],
     'samples': [list(np.random.randn(3).round(2)) for i in range(6)]
    }
)

# Number of elements in each list of the 'samples' column
lens = df['samples'].str.len()

# Expand 'samples' into separate rows using repeat():
# repeat each row label once per list element, then overwrite
# the list column with the flattened values
df_exploded = df.loc[df.index.repeat(lens)].assign(
    samples=np.concatenate(df['samples'].to_numpy())
)
df_exploded = df_exploded.reset_index(drop=True)

# Add sample_num column to track list element order within each original row
df_exploded['sample_num'] = df_exploded.groupby(['subject', 'trial_num']).cumcount()
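The repeat-based pattern can be wrapped for reuse on any list column. The following is a minimal sketch, not a pandas API: the helper name explode_with_repeat and the small demo frame are hypothetical, and it assumes every cell in the column holds a list and that the index is unique.

```python
from itertools import chain

import pandas as pd


def explode_with_repeat(df, column):
    """Hypothetical helper: one row per list element, without DataFrame.explode()."""
    # Number of elements in each list; assumes every cell is a list
    lens = df[column].str.len()
    # Repeat each row label once per element (rows with empty lists drop out)
    out = df.loc[df.index.repeat(lens)].copy()
    # Flatten all lists in row order and overwrite the list column
    out[column] = list(chain.from_iterable(df[column]))
    return out.reset_index(drop=True)


# Illustrative data only
demo = pd.DataFrame({'subject': [1, 2], 'samples': [[0.1, 0.2], [0.3]]})
result = explode_with_repeat(demo, 'samples')
```

Because a repeat count of zero removes the row, this sketch discards rows whose list is empty, which differs from how explode() treats them.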
Method 2: explode() (Pandas >= 0.25)
With the release of Pandas 0.25, the .explode() method provides an elegant solution:
df.explode('samples').reset_index(drop=True)
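explode() turns an empty list into a single row containing NaN and leaves scalar entries, including NaN, unchanged, so no rows are lost. Each output row keeps the index label of the row it came from, which is why reset_index(drop=True) is applied afterwards.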
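To illustrate how explode() treats the edge cases, here is a small sketch on made-up data (the frame edge and its values are assumptions for illustration). It also numbers the elements of each original row by grouping on the repeated index labels before the index is reset.

```python
import numpy as np
import pandas as pd

# Illustrative data: a normal list, an empty list, and a missing value
edge = pd.DataFrame({
    'subject': [1, 2, 3],
    'samples': [[0.5, 1.5], [], np.nan],
})

exploded = edge.explode('samples')

# explode() repeats the original index label for every element, so
# cumcount() per label numbers the elements within each source row
exploded['sample_num'] = exploded.groupby(level=0).cumcount()

exploded = exploded.reset_index(drop=True)
```

Grouping on the index level avoids having to name the columns that identify a row, but it must be done before reset_index(drop=True) discards the repeated labels.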