Restructure Tabular Data with Pandas: Converting Columns to Rows
Dealing with data organized in columns for different dates or categories can be cumbersome. Pandas provides a powerful solution to convert such columns into easily readable rows. This article addresses the specific challenge of transforming a CSV with location information and multiple date columns into a desired format.
Problem Statement:
Consider a dataset with information organized by location, each with multiple dates as column headers. The goal is to restructure the data into a format where each row represents a unique location, name, date, and its corresponding value.
Pandas Solution:
Pandas offers a highly efficient method to achieve this transformation using its melt function.
Code:
df.melt(id_vars=["location", "name"], var_name="Date", value_name="Value")
Explanation:
Result:
Applying the melt function to the provided dataset yields the desired output:
location | name | Date | Value |
---|---|---|---|
A | "test" | Jan-2010 | 12 |
B | "foo" | Jan-2010 | 18 |
A | "test" | Feb-2010 | 20 |
B | "foo" | Feb-2010 | 20 |
A | "test" | March-2010 | 30 |
B | "foo" | March-2010 | 25 |
This new arrangement makes it easier to analyze the data by location, name, and date.
Note:
For older versions of Pandas (0.20 or earlier), the following alternate approach can be used:
df2 = pd.melt(df, id_vars=["location", "name"], var_name="Date", value_name="Value") df2 = df2.sort(["location", "name"])
The above is the detailed content of How Can Pandas' `melt()` Function Restructure Tabular Data from Columns to Rows?. For more information, please follow other related articles on the PHP Chinese website!