
How to Unnest a Pandas DataFrame Column (or Multiple Columns) into Multiple Rows?

DDD
Release: 2024-12-29 00:39:11


How to Unnest a Column in a Pandas DataFrame into Multiple Rows

A common challenge in data manipulation with pandas is dealing with columns whose cells contain lists. Splitting each list element into its own row, while repeating the values of the other columns, is known as "unnesting" or "exploding."

Pandas Unnesting Methods

Method 1: pandas.DataFrame.explode

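When a single column needs to be unnested, use pandas.DataFrame.explode (available since pandas 0.25). Pass the column name; each list element gets its own row, and the original index label and the other column values are repeated for every element.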

df.explode('B')  # dataframe with column 'B' containing lists
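As a minimal sketch of explode in action, the example below uses a small two-column DataFrame invented here for illustration (it is not from the original article). The ignore_index argument is assumed to be available, which should hold for pandas 1.1 and later.

```python
import pandas as pd

# Hypothetical sample data: column B holds lists of different lengths.
df = pd.DataFrame({"A": [1, 2], "B": [[1, 2], [3, 4, 5]]})

# One row per list element; the index labels 0 and 1 are repeated.
exploded = df.explode("B")

# Same result, but with a fresh 0..n-1 index (assumes pandas >= 1.1).
exploded_reset = df.explode("B", ignore_index=True)

print(exploded)
print(exploded_reset)
```

Note that the exploded column comes back with object dtype, so a cast such as astype(int) may be needed before doing arithmetic on it.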

Method 2: Using Repeat and DataFrame Constructor

This method combines Series.repeat with the DataFrame constructor. Each value in column A is repeated once per element of the corresponding list in B, and the lists themselves are flattened into a single array with np.concatenate.

df = pd.DataFrame({'A': df.A.repeat(df.B.str.len()), 'B': np.concatenate(df.B.values)})
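The one-liner above can be broken into its intermediate steps to show what each piece contributes. This is a sketch on a hypothetical sample DataFrame; the variable names lengths, repeated_a, and flat_b are introduced here purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: column B holds lists of different lengths.
df = pd.DataFrame({"A": [1, 2], "B": [[1, 2], [3, 4, 5]]})

# Number of elements in each list: 2 and 3.
lengths = df["B"].str.len()

# Repeat each A value once per list element (index labels repeat too).
repeated_a = df["A"].repeat(lengths)

# Flatten all lists into a single 1-D array.
flat_b = np.concatenate(df["B"].values)

# Combine the repeated values and the flattened elements.
out = pd.DataFrame({"A": repeated_a, "B": flat_b})
print(out)
```

Because a value is repeated zero times when its list is empty, rows with empty lists disappear from the result under this approach.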

Method 3: Recreate the List

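This method rebuilds the data row by row: for every row, it pairs the value in the first column with each element of the list in the second column, then passes the resulting list of rows to the DataFrame constructor. As written, it assumes the DataFrame has exactly two columns, with the list column second.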

pd.DataFrame([[x, z] for x, y in df.values for z in y], columns=df.columns)
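The row-rebuilding idea can be sketched with explicit column access instead of df.values, which may be easier to read. The sample DataFrame and the names rows and out are assumptions made for this illustration.

```python
import pandas as pd

# Hypothetical sample data: column B holds lists of different lengths.
df = pd.DataFrame({"A": [1, 2], "B": [[1, 2], [3, 4, 5]]})

# Pair each A value with every element of its list in B.
rows = [[a, item] for a, items in zip(df["A"], df["B"]) for item in items]

# Build a new DataFrame from the flat list of rows.
out = pd.DataFrame(rows, columns=["A", "B"])
print(out)
```

Unlike the other methods, this one produces a fresh 0..n-1 index, because the original index is never passed to the constructor.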

Method 4: Using Reindex

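reindex repeats each row once per element of its list by reindexing on the repeated index labels, which carries every other column along. The list column is then overwritten with the flattened elements. This requires the original index to be unique.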

df.reindex(df.index.repeat(df.B.str.len())).assign(B=np.concatenate(df.B.values))
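A short sketch of the reindex approach on a DataFrame with an extra column shows that the other columns come along without being listed. The sample data, including column C, is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with an extra column C to carry along.
df = pd.DataFrame({"A": [1, 2], "B": [[1, 2], [3, 4, 5]], "C": ["x", "y"]})

# Repeat each index label once per list element: 0, 0, 1, 1, 1.
repeated_index = df.index.repeat(df["B"].str.len())

# Reindex to duplicate the rows, then replace B with the flattened elements.
out = df.reindex(repeated_index).assign(B=np.concatenate(df["B"].values))
print(out)
```

Compared with the repeat-and-constructor approach, this one does not require naming each non-list column explicitly.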

Generalizing to Multiple Columns

To unnest multiple columns at once, a custom function can be defined. It takes the DataFrame and a list of the column names to explode; within each row, the lists in those columns are expected to have the same length.

def unnesting(df, explode):
    idx = df.index.repeat(df[explode[0]].str.len())
    df1 = pd.concat([
        pd.DataFrame({x: np.concatenate(df[x].values)}) for x in explode], axis=1)
    df1.index = idx
    return df1.join(df.drop(columns=explode), how='left')
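The following usage sketch repeats the function with the drop call written as drop(columns=...), since the positional axis argument is not accepted by recent pandas versions, and applies it to a hypothetical three-column DataFrame. For comparison, it also calls explode with a list of columns, which is assumed to require pandas 1.3 or later.

```python
import numpy as np
import pandas as pd

def unnesting(df, explode):
    # Repeat each index label once per element of the first exploded column.
    idx = df.index.repeat(df[explode[0]].str.len())
    # Flatten every exploded column and place the results side by side.
    df1 = pd.concat(
        [pd.DataFrame({x: np.concatenate(df[x].values)}) for x in explode],
        axis=1,
    )
    df1.index = idx
    # Bring back the remaining columns by joining on the repeated index.
    return df1.join(df.drop(columns=explode), how="left")

# Hypothetical sample data: B and C hold equal-length lists in each row.
df = pd.DataFrame({"A": [1, 2], "B": [[1, 2], [3, 4]], "C": [[5, 6], [7, 8]]})

out = unnesting(df, ["B", "C"])
print(out)

# Built-in alternative (assumes pandas >= 1.3).
builtin = df.explode(["B", "C"])
print(builtin)
```

The custom function places the exploded columns first, while the built-in call keeps the original column order and returns the exploded columns with object dtype.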

Horizontal Unnesting

To unnest horizontally, that is, to spread each list across new columns rather than new rows, build a DataFrame from the lists and use add_prefix to name the new columns B_0, B_1, and so on.

df.join(pd.DataFrame(df.B.tolist(), index=df.index).add_prefix('B_'))
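A brief sketch of horizontal unnesting on a hypothetical DataFrame shows the column names that add_prefix produces. The variable names expanded and wide are introduced here for illustration.

```python
import pandas as pd

# Hypothetical sample data: each list in B has two elements.
df = pd.DataFrame({"A": [1, 2], "B": [[1, 2], [3, 4]]})

# One column per list position, labelled 0, 1, ... then prefixed with "B_".
expanded = pd.DataFrame(df["B"].tolist(), index=df.index).add_prefix("B_")

# Attach the new columns next to the original ones.
wide = df.join(expanded)
print(wide)
```

The original list column B is kept in the result; it can be dropped afterwards with drop(columns="B") if it is no longer needed. When the lists have different lengths, the shorter ones are padded with missing values.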


source:php.cn