Pandas Long to Wide Reshaping with Multiple Variables
Converting data from long to wide format in Pandas can be challenging, especially when multiple variables are involved. This question explores a method for reshaping data using the pivot function.
The original data provided is:
Salesman  Height  product  price
Knut      6       bat      5
Knut      6       ball     1
Knut      6       wand     3
Steve     5       pen      2
The desired wide format is:
Salesman  Height  product_1  price_1  product_2  price_2  product_3  price_3
Knut      6       bat        5        ball       1        wand       3
Steve     5       pen        2        NA         NA       NA         NA
One approach, as suggested by Chris Albon, involves using the pivot function as follows:
df.pivot(index='Salesman', columns='product', values='price')
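This call returns a DataFrame indexed by Salesman, with one column per distinct product and the price as the cell value. Salesman/product combinations that do not occur in the data are filled with NaN, and the Height column is dropped because it is not named in the call.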
The resulting dataframe will be:
product   ball  bat  pen  wand
Salesman
Knut       1.0  5.0  NaN   3.0
Steve      NaN  NaN  2.0   NaN
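To make the pivot step reproducible, here is a minimal sketch that builds the sample data and applies the same call. The variable names `df` and `prices` are chosen here for illustration; the values come from the table in the question.

```python
import pandas as pd

# Sample data from the question, in long format.
df = pd.DataFrame({
    "Salesman": ["Knut", "Knut", "Knut", "Steve"],
    "Height": [6, 6, 6, 5],
    "product": ["bat", "ball", "wand", "pen"],
    "price": [5, 1, 3, 2],
})

# One row per Salesman, one column per distinct product, prices as cell values.
# Height is not part of the call, so it does not appear in the result.
prices = df.pivot(index="Salesman", columns="product", values="price")
print(prices)
```

The columns come out in sorted order, and the prices are shown as floats because the missing combinations are stored as NaN.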
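This is not yet the desired format: the columns are keyed by product name rather than by position, and Height is missing. The suggested next step stacks the columns back into rows with the stack and reset_index functions:
df.pivot(index='Salesman', columns='product', values='price') \
    .stack().reset_index() \
    .rename(columns={0: 'price'})
Note that this chain returns the data to long form, with columns Salesman, product and price, rather than producing the wide layout. Reaching the product_1, price_1, ... layout requires a per-salesman sequence number (for example from groupby('Salesman').cumcount() + 1) to use as the column key when reshaping.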
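For the numbered product_1, price_1, ... layout, a common approach is to number each salesman's rows with groupby().cumcount() and use that number as the column key. The following is a minimal sketch of that idea, not the method from the answer quoted above. The helper column `idx` and the variable names are assumptions made for illustration, and the sketch assumes Height is constant for each salesman.

```python
import pandas as pd

df = pd.DataFrame({
    "Salesman": ["Knut", "Knut", "Knut", "Steve"],
    "Height": [6, 6, 6, 5],
    "product": ["bat", "ball", "wand", "pen"],
    "price": [5, 1, 3, 2],
})

# Number each salesman's rows 1, 2, 3, ... in their original order.
# "idx" is a hypothetical helper column, not part of the source data.
numbered = df.assign(idx=df.groupby("Salesman").cumcount() + 1)

# Move idx from the rows to the columns; columns become (variable, idx) pairs.
wide = (
    numbered.set_index(["Salesman", "Height", "idx"])[["product", "price"]]
    .unstack("idx")
)

# Interleave the columns as product_1, price_1, product_2, price_2, ...
n = int(numbered["idx"].max())
order = [(var, i) for i in range(1, n + 1) for var in ("product", "price")]
wide = wide.reindex(columns=pd.MultiIndex.from_tuples(order))

# Flatten the two-level column labels and restore Salesman/Height as columns.
wide.columns = [f"{var}_{i}" for var, i in wide.columns]
wide = wide.reset_index()
print(wide)
```

Salesmen with fewer products than the maximum, such as Steve here, get NaN in the unused product and price slots.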