In pandas, you may encounter dataframes with cells containing lists of multiple values. Instead of storing multiple values in a single cell, it can be beneficial to expand the dataframe so that each item in the list occupies its own row.
pandas 0.25 and later provide the .explode() method for both Series and DataFrame. It turns each element of a list into its own row, repeating the values of the other columns (and the original index label) for every element.
To explode a column, simply use the following syntax:
df.explode('column_name')
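The same method exists on a Series. The following is a minimal sketch with made-up values (the Series `s` and its name are assumptions for illustration), showing that each list element becomes its own row while the original index label is repeated:

```python
import pandas as pd

# A Series whose cells each hold a list (illustrative data)
s = pd.Series([[1, 2], [3], [4, 5, 6]], name="values")

# Each list element becomes its own row; the index label of the
# source row is repeated for every element it contained
exploded = s.explode()

print(exploded.tolist())        # the flattened values
print(exploded.index.tolist())  # the repeated index labels
```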
For example, let's consider the following dataframe:
import pandas as pd
import numpy as np

df = pd.DataFrame(
    {'trial_num': [1, 2, 3, 1, 2, 3],
     'subject': [1, 1, 1, 2, 2, 2],
     'samples': [list(np.random.randn(3).round(2)) for i in range(6)]
    }
)
To explode the 'samples' column, we would use:
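df_exploded = df.explode('samples').reset_index(drop=True)
Because explode() repeats the original index label for every element of a list, reset_index(drop=True) is used here to renumber the rows. This produces output like the following (the sample values are random, so yours will differ):
   trial_num  subject  samples
0          1        1     0.57
1          1        1    -0.83
2          1        1     1.44
3          2        1    -0.01
4          2        1     1.13
5          2        1     0.36
6          3        1     1.18
# etc.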
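To make the index behaviour concrete, here is a small sketch with fixed values instead of random ones (the frame `df_small` is hypothetical and not part of the example above). It shows that explode() keeps the original index labels, that reset_index(drop=True) renumbers them, and that the exploded column may need an explicit cast afterwards:

```python
import pandas as pd

# Hypothetical frame with fixed values so the result is reproducible
df_small = pd.DataFrame({
    "trial_num": [1, 2],
    "subject": [1, 1],
    "samples": [[0.5, -0.8], [1.4]],
})

# explode() repeats the index label of each source row
exploded = df_small.explode("samples")
print(exploded.index.tolist())

# Renumber the rows 0..n-1 and drop the old index
renumbered = exploded.reset_index(drop=True)

# Cast the exploded column to a numeric dtype explicitly
renumbered["samples"] = renumbered["samples"].astype(float)
print(renumbered)
```

If the original row labels are meaningful (for example, to group the exploded rows back together later), skipping reset_index keeps that link.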
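The .explode() method handles columns that mix lists and scalars: scalars are left unchanged, while empty lists and NaN values each yield a single row containing NaN. Note that in pandas 0.25 it can explode only one column at a time; since pandas 1.3, a list of column names can be passed, provided the lists in each row have the same length across those columns.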
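The handling of mixed content can be checked with a short sketch. The frame `df_mixed` and its column names are assumptions chosen for illustration; it contains a list, a scalar, an empty list, and a NaN:

```python
import numpy as np
import pandas as pd

# Hypothetical frame mixing a list, a scalar, an empty list, and NaN
df_mixed = pd.DataFrame({
    "id": ["a", "b", "c", "d"],
    "samples": [[1, 2], 3, [], np.nan],
})

# The list expands to two rows, the scalar is kept as-is,
# and the empty list and the NaN each give one row holding NaN
out = df_mixed.explode("samples").reset_index(drop=True)
print(out)
```

If rows that came from empty lists are not wanted, calling out.dropna(subset=["samples"]) afterwards removes them, along with rows that were NaN to begin with.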