Get a Frequency Count Based on Multiple Dataframe Columns
To count how often each combination of values appears across several columns of a dataframe, we can group by those columns with pandas' groupby method and call size() on the result. Consider the following example:
import pandas as pd

data = {'Group': ['Short', 'Short', 'Moderate', 'Moderate', 'Tall'],
        'Size': ['Small', 'Small', 'Medium', 'Small', 'Large']}
df = pd.DataFrame(data)
We can calculate the frequency count in three ways:
Option 1:
dfg = df.groupby(by=["Group", "Size"]).size()
This produces a Series with the following output:
Group     Size
Moderate  Medium    1
          Small     1
Short     Small     2
Tall      Large     1
dtype: int64
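Because the result of Option 1 is a Series indexed by (Group, Size) pairs, individual counts can be looked up by label. The following is a minimal sketch that rebuilds the same example data; the variable names counts, short_small, and moderate_counts are illustrative and not part of the original example.

```python
import pandas as pd

data = {
    "Group": ["Short", "Short", "Moderate", "Moderate", "Tall"],
    "Size": ["Small", "Small", "Medium", "Small", "Large"],
}
df = pd.DataFrame(data)

# Option 1: a Series whose index is the (Group, Size) combination.
counts = df.groupby(by=["Group", "Size"]).size()

# Look up one combination by its (Group, Size) index tuple.
short_small = counts.loc[("Short", "Small")]
print(short_small)  # 2

# Select every Size counted within a single Group.
moderate_counts = counts.loc["Moderate"]
print(moderate_counts.to_dict())  # {'Medium': 1, 'Small': 1}
```

This form is convenient when only a few specific counts are needed and a flat table is not required.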
Option 2:
dfg = df.groupby(by=["Group", "Size"]).size().reset_index(name="Time")
This results in a DataFrame with an added "Time" column:
      Group    Size  Time
0  Moderate  Medium     1
1  Moderate   Small     1
2     Short   Small     2
3      Tall   Large     1
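Once the counts live in an ordinary column, they can be handled like any other DataFrame column. As a small sketch, assuming the same example data, the rows can be sorted so that the most frequent combination comes first; the name ranked is only illustrative.

```python
import pandas as pd

data = {
    "Group": ["Short", "Short", "Moderate", "Moderate", "Tall"],
    "Size": ["Small", "Small", "Medium", "Small", "Large"],
}
df = pd.DataFrame(data)

# Option 2: counts as a regular column named "Time".
dfg = df.groupby(by=["Group", "Size"]).size().reset_index(name="Time")

# Sort by the count column, highest first, and renumber the rows.
ranked = dfg.sort_values("Time", ascending=False).reset_index(drop=True)

# The first row now holds the most frequent (Group, Size) combination.
print(ranked.head(1))
```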
Option 3:
dfg = df.groupby(by=["Group", "Size"], as_index=False).size()
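This also produces a DataFrame with the same rows as Option 2. In pandas 1.1 and later, the count column is named "size" rather than "Time"; in earlier versions, this call returned a Series instead.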
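One way to check how Options 2 and 3 relate is to build both and compare them. This sketch assumes pandas 1.1 or later, where as_index=False combined with size() returns a DataFrame whose count column is named "size"; the names opt2, opt3, and opt3_renamed are illustrative.

```python
import pandas as pd

data = {
    "Group": ["Short", "Short", "Moderate", "Moderate", "Tall"],
    "Size": ["Small", "Small", "Medium", "Small", "Large"],
}
df = pd.DataFrame(data)

opt2 = df.groupby(by=["Group", "Size"]).size().reset_index(name="Time")
opt3 = df.groupby(by=["Group", "Size"], as_index=False).size()

# Assumption: pandas >= 1.1, so opt3 is a DataFrame with a "size" column.
print(list(opt3.columns))

# Rename the count column so both results share the same labels.
opt3_renamed = opt3.rename(columns={"size": "Time"})

# Compare the two results: same counts, in the same row order.
print(opt2["Time"].tolist() == opt3_renamed["Time"].tolist())
print(opt2.equals(opt3_renamed))
```

Option 2 is the more direct choice when a custom column name is wanted, since the name is set in the same call.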