Reshaping Data from Long to Wide in Pandas: A Comprehensive Guide
Many datasets are initially stored in long format, where each row holds a single observation and the subject's identifier is repeated across several rows. However, it often becomes necessary to reshape the data into wide format, where each subject occupies exactly one row and its repeated observations are spread across columns.
Issue: Transforming data from long to wide format can be a cumbersome task in Pandas, especially when using the stack/unstack methods (melt goes the other way, from wide to long). For instance, consider the following long-format dataframe:
<code class="python">import pandas as pd

data = pd.DataFrame({
    'Salesman': ['Knut', 'Knut', 'Knut', 'Steve'],
    'Height': [6, 6, 6, 5],
    'product': ['bat', 'ball', 'wand', 'pen'],
    'price': [5, 1, 3, 2]
})</code>
Reshaping to Wide Format:
To reshape data into wide format, we can use Chris Albon's solution, which is built on DataFrame.pivot and is demonstrated here on a separate patient dataset:
Create Long Dataframe:
<code class="python">raw_data = {
    'patient': [1, 1, 1, 2, 2],
    'obs': [1, 2, 3, 1, 2],
    'treatment': [0, 1, 0, 1, 0],
    'score': [6252, 24243, 2345, 2342, 23525]
}
df = pd.DataFrame(raw_data, columns=['patient', 'obs', 'treatment', 'score'])</code>
Reshape to Wide:
<code class="python">df.pivot(index='patient', columns='obs', values='score')</code>
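This will generate the desired wide-format dataframe, with one row per patient and one column per obs value. Patient 2 has no third observation, so that cell is NaN and the scores are displayed as floats: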
<code class="python">obs           1        2       3
patient
1        6252.0  24243.0  2345.0
2        2342.0  23525.0     NaN</code>
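The same idea can be carried over to the salesman dataframe from the start of the article. The following is only a sketch under a few assumptions: that dataframe has no counterpart to the obs column, so a helper column (named idx here, a name chosen purely for illustration) is built with groupby(...).cumcount(); the product_1, price_1, ... column names are likewise an illustrative choice; and passing a list as index to pivot assumes pandas 1.1 or newer.

```python
import pandas as pd

data = pd.DataFrame({
    'Salesman': ['Knut', 'Knut', 'Knut', 'Steve'],
    'Height': [6, 6, 6, 5],
    'product': ['bat', 'ball', 'wand', 'pen'],
    'price': [5, 1, 3, 2],
})

# Number each salesman's rows 1, 2, 3, ... so they can play the role of 'obs'.
# 'idx' is a hypothetical helper column, not part of the original data.
data['idx'] = data.groupby('Salesman').cumcount() + 1

# Pivot both value columns at once; the result has two-level columns
# such as ('product', 1) and ('price', 1).
wide = data.pivot(index=['Salesman', 'Height'], columns='idx',
                  values=['product', 'price'])

# Flatten the two-level columns into names like 'product_1', 'price_1'.
wide.columns = [f'{name}_{i}' for name, i in wide.columns]

# Turn Salesman and Height back into ordinary columns.
wide = wide.reset_index()

print(wide)
```

Each salesman ends up on a single row. Steve has only one product, so his second and third product and price cells are left as missing values.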