Counting Unique Values per Group with Pandas
When working with tabular data, we often need to count how many distinct values occur within each group. In Python's Pandas library, this is done by combining the groupby() and nunique() methods.
Problem Explanation:
To illustrate the problem, consider the following dataset:
ID | domain |
---|---|
123 | vk.com |
123 | vk.com |
123 | twitter.com |
456 | vk.com' |
456 | facebook.com |
456 | vk.com |
456 | google.com |
789 | twitter.com |
789 | vk.com |
The task at hand is to count the unique ID values within each domain.
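To follow along, the table above can be loaded into a DataFrame. The following is a minimal sketch: the variable name df matches the snippets in this article, and the stray apostrophe in one of the vk.com entries is kept exactly as it appears in the table.

```python
import pandas as pd

# Sample data copied from the table above, including the stray
# apostrophe in one of the vk.com entries.
df = pd.DataFrame({
    "ID": [123, 123, 123, 456, 456, 456, 456, 789, 789],
    "domain": [
        "vk.com", "vk.com", "twitter.com",
        "vk.com'", "facebook.com", "vk.com", "google.com",
        "twitter.com", "vk.com",
    ],
})

print(df)
```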
Solution:
To count unique values per group, we can use the following code:
<code class="python">counts = df.groupby('domain')['ID'].nunique()</code>
The groupby() method groups the rows by the domain column, and the nunique() method counts the distinct ID values within each group. The result is a Series with the domain names as its index and the unique counts as its values. Group keys are sorted by default, so for the sample data (with the stray apostrophe in vk.com' stripped, as in the notes below) the output is:
domain
facebook.com    1
google.com      1
twitter.com     2
vk.com          3
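To see why nunique() is the right tool here, it can help to compare it with a plain row count. The sketch below assumes the sample data with the stray apostrophe already removed; the names counts and rows are illustrative only.

```python
import pandas as pd

# Sample data with the stray apostrophe already removed (an assumption
# made for this sketch).
df = pd.DataFrame({
    "ID": [123, 123, 123, 456, 456, 456, 456, 789, 789],
    "domain": [
        "vk.com", "vk.com", "twitter.com",
        "vk.com", "facebook.com", "vk.com", "google.com",
        "twitter.com", "vk.com",
    ],
})

# Distinct IDs per domain: repeated (ID, domain) pairs count once.
counts = df.groupby("domain")["ID"].nunique()

# Rows per domain: every occurrence counts.
rows = df.groupby("domain")["ID"].size()

print(counts.to_dict())
print(rows.to_dict())
```

For vk.com, the row count is 5 while the unique count is 3, because IDs 123 and 456 each appear twice for that domain.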
Additional Notes:
Example with String Manipulation (stripping stray quote characters, such as the one in vk.com' above, before grouping):
<code class="python">df['clean_domain'] = df['domain'].str.strip("'")
counts = df.groupby('clean_domain')['ID'].nunique()</code>
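The effect of the cleaning step can be checked by grouping before and after it. This is a rough sketch using the sample data with its stray apostrophe; the names raw and clean are illustrative.

```python
import pandas as pd

# Sample data as in the table, stray apostrophe included.
df = pd.DataFrame({
    "ID": [123, 123, 123, 456, 456, 456, 456, 789, 789],
    "domain": [
        "vk.com", "vk.com", "twitter.com",
        "vk.com'", "facebook.com", "vk.com", "google.com",
        "twitter.com", "vk.com",
    ],
})

# Without cleaning, "vk.com'" forms its own group.
raw = df.groupby("domain")["ID"].nunique()

# Strip leading and trailing apostrophes, then group on the cleaned column.
df["clean_domain"] = df["domain"].str.strip("'")
clean = df.groupby("clean_domain")["ID"].nunique()

print(len(raw), len(clean))
```

Before cleaning there are five groups, one of them the malformed vk.com' entry; after cleaning there are four.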
Example with agg(), which together with as_index=False returns a DataFrame that keeps domain as a regular column:
<code class="python">counts = df.groupby(by='domain', as_index=False).agg({'ID': 'nunique'})</code>
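The shape of the agg() result is easiest to see by inspecting its columns. The sketch below assumes the cleaned sample data; the second call uses pandas named aggregation to give the count column a clearer name, and unique_ids is a name chosen here for illustration.

```python
import pandas as pd

# Sample data with the stray apostrophe already removed (an assumption
# made for this sketch).
df = pd.DataFrame({
    "ID": [123, 123, 123, 456, 456, 456, 456, 789, 789],
    "domain": [
        "vk.com", "vk.com", "twitter.com",
        "vk.com", "facebook.com", "vk.com", "google.com",
        "twitter.com", "vk.com",
    ],
})

# as_index=False keeps "domain" as a column instead of the index.
result = df.groupby("domain", as_index=False).agg({"ID": "nunique"})

# Named aggregation: output column name = (input column, function).
named = df.groupby("domain", as_index=False).agg(unique_ids=("ID", "nunique"))

print(list(result.columns))
print(list(named.columns))
```

The DataFrame form is convenient when the counts need to be merged with other tables or written out, since the group key is an ordinary column.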