GroupBy and Convert DataFrame to Dictionary of Lists
Given an Excel file containing structured data, the goal is to read the data, group it by Column1, and store the Column3 values of each group as lists in a Python dictionary keyed by Column1. The data in the Excel sheet looks like this:
Column1 | Column2 | Column3 |
---|---|---|
0 | 23 | 1 |
1 | 5 | 2 |
1 | 2 | 3 |
1 | 19 | 5 |
2 | 56 | 1 |
2 | 22 | 2 |
3 | 2 | 4 |
3 | 14 | 5 |
4 | 59 | 1 |
5 | 44 | 1 |
5 | 1 | 2 |
5 | 87 | 3 |
Solution 1: groupby.apply().to_dict()
Read the two relevant columns from the sheet, group Column3 by Column1, collect each group into a list, and convert the result to a dictionary:
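<code class="python">import pandas as pd

# Read only columns A and C (Column1 and Column3) from the sheet.
# Recent pandas versions use sheet_name and usecols in place of the
# older sheetname and parse_cols keywords.
excel = pd.read_excel(r"e:\test_data.xlsx", sheet_name='mySheet', usecols='A,C')

# Group Column3 by Column1, turn each group into a list, then build a dict.
result = excel.groupby('Column1')['Column3'].apply(list).to_dict()
print(result)</code>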
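To make the chained call easier to follow, the sketch below runs each step separately on an in-memory copy of Column1 and Column3 from the table above, so it does not need the Excel file. The names df, grouped, and lists are illustrative only and do not come from the original snippet.

```python
import pandas as pd

# In-memory stand-in for columns A and C of the sheet (assumed to match the table).
df = pd.DataFrame({
    "Column1": [0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Column3": [1, 2, 3, 5, 1, 2, 4, 5, 1, 1, 2, 3],
})

# Step 1: select Column3 within each Column1 group.
grouped = df.groupby("Column1")["Column3"]

# Step 2: collect each group's values into a list.
# The result is a Series indexed by the Column1 keys.
lists = grouped.apply(list)

# Step 3: convert that Series into a plain dictionary.
result = lists.to_dict()
print(result)
```

The intermediate Series in step 2 is what makes to_dict() work here: its index supplies the dictionary keys and its values are already the lists.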
Solution 2: Dictionary Comprehension
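Alternatively, you can use a dictionary comprehension over the grouped column, reusing the excel DataFrame loaded in Solution 1:
<code class="python"># Iterating over the grouped column yields (key, Series) pairs.
result = {k: list(v) for k, v in excel.groupby('Column1')['Column3']}
print(result)</code>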
The output of both solutions will be:
{0: [1], 1: [2, 3, 5], 2: [1, 2], 3: [4, 5], 4: [1], 5: [1, 2, 3]}
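As a quick check, the following sketch applies both approaches to an in-memory DataFrame holding the Column1 and Column3 values from the table, and compares the resulting dictionaries. The DataFrame construction and the names via_apply and via_comprehension are assumptions added for illustration; the original reads the data from an Excel file instead.

```python
import pandas as pd

# Column1 and Column3 copied from the table, so no Excel file is required.
df = pd.DataFrame({
    "Column1": [0, 1, 1, 1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Column3": [1, 2, 3, 5, 1, 2, 4, 5, 1, 1, 2, 3],
})

# Solution 1: groupby, collect each group into a list, convert to dict.
via_apply = df.groupby("Column1")["Column3"].apply(list).to_dict()

# Solution 2: iterate over (key, Series) pairs in a dict comprehension.
via_comprehension = {k: list(v) for k, v in df.groupby("Column1")["Column3"]}

# Both dictionaries should hold the same keys and lists.
print(via_apply == via_comprehension)
```

Which form to use is largely a matter of taste: the chained call stays inside pandas, while the comprehension makes the key and value construction explicit.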