Melting a pandas DataFrame reshapes it from wide format to long format: selected columns are unpivoted into rows, typically leaving one column of variable names and one column of values. This is useful when preparing data for analysis or visualization. The questions below walk through common melting scenarios, each with its solution:
Question: How do I melt a DataFrame so that the following format is achieved?
    Name  Age  Subject Grade
0    Bob   13  English     C
1   John   16  English     B
2    Foo   16  English     B
3    Bar   15  English    A+
4   Alex   17  English     F
5    Tom   12  English     A
6    Bob   13     Math    A+
7   John   16     Math     B
8    Foo   16     Math     A
9    Bar   15     Math     F
10  Alex   17     Math     D
11   Tom   12     Math     C
Solution:
To melt the DataFrame, use pd.melt() (or the equivalent df.melt() method). Pass the identifier columns that should stay as they are in id_vars, and name the two new columns with var_name and value_name:
pd.melt(df, id_vars=['Name', 'Age'], var_name='Subject', value_name='Grade')
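As a minimal sketch of the call above: the wide-format input is not shown in this article, so the df below is an assumption reconstructed from the target table, with English placed before Math so the melted rows come out in the same order.

```python
import pandas as pd

# Assumed wide-format input, reconstructed from the target table.
df = pd.DataFrame({
    "Name": ["Bob", "John", "Foo", "Bar", "Alex", "Tom"],
    "Age": [13, 16, 16, 15, 17, 12],
    "English": ["C", "B", "B", "A+", "F", "A"],
    "Math": ["A+", "B", "A", "F", "D", "C"],
})

# Name and Age stay as identifier columns; English and Math are unpivoted
# into a Subject column (the old header) and a Grade column (the old cell).
melted_df = pd.melt(
    df, id_vars=["Name", "Age"], var_name="Subject", value_name="Grade"
)

print(melted_df.shape)  # (12, 4)
print(melted_df)
```

Each input row produces one output row per melted column, so six students and two subjects give twelve rows.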
Question: How do I melt the DataFrame from Problem 1 and filter out the 'English' column?
Solution:
Pass value_vars to choose which columns to melt. Any column listed in neither id_vars nor value_vars is dropped from the result, so selecting only Math leaves English out:
pd.melt(df, id_vars=['Name', 'Age'], value_vars='Math', var_name='Subject', value_name='Grade')
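The effect of value_vars can be checked on a small example. The three-row df here is a hypothetical subset of the sample data, used only for illustration.

```python
import pandas as pd

# Hypothetical three-row subset of the sample data.
df = pd.DataFrame({
    "Name": ["Bob", "John", "Foo"],
    "Age": [13, 16, 16],
    "English": ["C", "B", "B"],
    "Math": ["A+", "B", "A"],
})

# Only Math is melted; English appears in neither id_vars nor value_vars,
# so it is absent from the result.
math_only = pd.melt(
    df,
    id_vars=["Name", "Age"],
    value_vars=["Math"],
    var_name="Subject",
    value_name="Grade",
)

print(math_only)
```

Passing a list to value_vars, even for a single column, keeps the call easy to extend if more subjects are added later.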
Question: How do I group the melted data from Problem 1 by grade, in sorted order, and list the students and subjects for each grade?
Solution:
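Use .groupby() and .agg() on the melted DataFrame to group by 'Grade' and concatenate the names and subjects into comma-separated strings. groupby() sorts the group keys by default, and selecting only the string columns keeps the numeric Age column out of the join, where it would raise a TypeError:
melted_df.groupby('Grade', as_index=False)[['Name', 'Subject']].agg(", ".join)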
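A runnable sketch of the grouping step, assuming the same reconstructed sample data as before. Only the Name and Subject columns are joined, since the integer Age column cannot be passed to a string join.

```python
import pandas as pd

# Assumed wide-format input, reconstructed from the target table.
df = pd.DataFrame({
    "Name": ["Bob", "John", "Foo", "Bar", "Alex", "Tom"],
    "Age": [13, 16, 16, 15, 17, 12],
    "English": ["C", "B", "B", "A+", "F", "A"],
    "Math": ["A+", "B", "A", "F", "D", "C"],
})
melted_df = df.melt(id_vars=["Name", "Age"], var_name="Subject", value_name="Grade")

# Group by grade and join the names and subjects within each group.
# Group keys are sorted as plain strings, so "A" sorts before "A+".
by_grade = (
    melted_df.groupby("Grade", as_index=False)[["Name", "Subject"]]
    .agg(", ".join)
)

print(by_grade)
```

Because the sort is lexicographic, it does not reflect a true ranking of grades. If that matters, one option is to convert Grade to an ordered categorical before grouping.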
Question: How do I unmelt a DataFrame that has been melted?
Solution:
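Use DataFrame.pivot_table() to unmelt the DataFrame. Pass the value column, the index columns, and the column whose entries become the new column headers. aggfunc='first' is needed because the grades are strings, which the default mean cannot aggregate. reset_index() and rename_axis(columns=None) then restore ordinary flat columns:
melted_df.pivot_table('Grade', ['Name', 'Age'], 'Subject', aggfunc='first').reset_index().rename_axis(columns=None)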
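The round trip can be sketched as follows, again assuming the reconstructed sample data. Keyword arguments are used for readability, and rename_axis(columns=None) clears the leftover "Subject" label that pivot_table places on the column axis.

```python
import pandas as pd

# Assumed wide-format input, reconstructed from the target table.
df = pd.DataFrame({
    "Name": ["Bob", "John", "Foo", "Bar", "Alex", "Tom"],
    "Age": [13, 16, 16, 15, 17, 12],
    "English": ["C", "B", "B", "A+", "F", "A"],
    "Math": ["A+", "B", "A", "F", "D", "C"],
})
melted_df = df.melt(id_vars=["Name", "Age"], var_name="Subject", value_name="Grade")

# Pivot the long data back to one row per student.
wide = (
    melted_df.pivot_table(
        values="Grade", index=["Name", "Age"], columns="Subject", aggfunc="first"
    )
    .reset_index()               # turn Name and Age back into columns
    .rename_axis(columns=None)   # drop the "Subject" label on the column axis
)

print(wide)
```

The rows come back sorted by the index columns (Name, then Age), so the result matches the original data up to row order.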
Question: How do I group the DataFrame by name and separate the subjects and grades by comma?
Solution:
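Melt only the subject columns, then use .groupby() and .agg() to concatenate each student's subjects and grades. Listing the subjects in value_vars keeps the numeric Age column out of the melted values, where it would break the string join:
pd.melt(df, id_vars=['Name'], value_vars=['English', 'Math'], var_name='Subject', value_name='Grade').groupby('Name', as_index=False).agg(", ".join)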
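A short sketch of the per-student summary, using a hypothetical three-row subset of the sample data. The subject columns are listed explicitly in value_vars so that Age is not melted alongside the grades.

```python
import pandas as pd

# Hypothetical three-row subset of the sample data.
df = pd.DataFrame({
    "Name": ["Bob", "John", "Foo"],
    "Age": [13, 16, 16],
    "English": ["C", "B", "B"],
    "Math": ["A+", "B", "A"],
})

# Melt only the subject columns, then join subjects and grades per student.
by_name = (
    pd.melt(
        df,
        id_vars=["Name"],
        value_vars=["English", "Math"],
        var_name="Subject",
        value_name="Grade",
    )
    .groupby("Name", as_index=False)
    .agg(", ".join)
)

print(by_name)
```

The subjects and grades are joined in the same order within each row, so the nth grade in a row belongs to the nth subject.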
Question: How do I melt all columns in a DataFrame?
Solution:
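Call melt without id_vars so that every column is melted into a column-name/value pair. df.stack().reset_index() gives a similar long layout, ordered row by row instead of column by column: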
pd.melt(df, id_vars=None, var_name='Column', value_name='Value')
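To illustrate, here is a sketch that melts every column of a hypothetical three-row subset of the sample data. With no id_vars, nothing is kept as an identifier, so the original row grouping is not preserved in the output.

```python
import pandas as pd

# Hypothetical three-row subset of the sample data.
df = pd.DataFrame({
    "Name": ["Bob", "John", "Foo"],
    "Age": [13, 16, 16],
    "English": ["C", "B", "B"],
    "Math": ["A+", "B", "A"],
})

# Every cell becomes one row: the source column name plus its value.
all_melted = df.melt(var_name="Column", value_name="Value")

print(all_melted.shape)  # (12, 2)
print(all_melted)
```

Because the melted columns mix strings and integers, the Value column ends up with a general-purpose dtype. If the link back to each original row is needed, one option is to call df.reset_index() first and pass the resulting index column as id_vars.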
These solutions demonstrate the versatility of melting and unmelting pandas DataFrames to manipulate data for various purposes.