How to Count Unique Values per Group with Pandas?

Susan Sarandon
Published: 2024-10-18 15:49:03

Counting Unique Values per Group with Pandas

When working with tabular data, you often need to count the distinct values of one column within each group defined by another column. In Python, the Pandas library handles this with the groupby() and nunique() methods.

Problem Explanation:

To illustrate the problem, consider the following dataset:

ID   domain
123  vk.com
123  vk.com
123  twitter.com
456  vk.com
456  facebook.com
456  vk.com
456  google.com
789  twitter.com
789  vk.com

The task at hand is to count the unique ID values within each domain.

Solution:

To count unique values per group, we can use the following code:

<code class="python"># Count the distinct IDs per domain; keep df intact by using a new name
unique_ids = df.groupby('domain')['ID'].nunique()</code>

The groupby() method groups the data by the domain column, while the nunique() method counts the unique occurrences of ID within each group. The output is a Series with the domain names as index and the corresponding unique counts as values.

domain
facebook.com    1
google.com      1
twitter.com     2
vk.com          3
Name: ID, dtype: int64
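
For a runnable end-to-end version, the sketch below rebuilds the sample data as a DataFrame and applies the same groupby()/nunique() call. The variable names df and unique_ids are illustrative, and it assumes the IDs are integers and the domain values are clean strings.

```python
import pandas as pd

# Sample data from the problem statement (IDs assumed to be integers).
df = pd.DataFrame({
    "ID": [123, 123, 123, 456, 456, 456, 456, 789, 789],
    "domain": ["vk.com", "vk.com", "twitter.com", "vk.com", "facebook.com",
               "vk.com", "google.com", "twitter.com", "vk.com"],
})

# Group rows by domain, then count the distinct IDs in each group.
unique_ids = df.groupby("domain")["ID"].nunique()

print(unique_ids)
```

Because groupby() sorts the group keys by default, the index comes back in alphabetical order. Chaining .sort_values(ascending=False) onto the result would order the domains by count instead.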

Additional Notes:

  • If the domain column values contain single quotes ('), you can remove them before grouping using the str.strip("'") method.
  • To get a DataFrame that keeps domain as a regular column instead of the index, pass as_index=False to groupby() and use the agg() method with the pd.Series.nunique function.

Example with String Manipulation:

<code class="python"># Strip stray single quotes from both ends of each domain value
df['clean_domain'] = df['domain'].str.strip("'")
unique_ids = df.groupby('clean_domain')['ID'].nunique()</code>
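
To see what the stripping step changes, the sketch below uses a small made-up frame in which the same domain appears with and without stray single quotes. The data and the names raw and unique_ids are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical data: the same domain appears with and without stray quotes.
raw = pd.DataFrame({
    "ID": [123, 456, 456, 789],
    "domain": ["'vk.com'", "vk.com'", "google.com", "vk.com"],
})

# Without cleaning, each quoted variant forms its own group.
print(raw.groupby("domain")["ID"].nunique())

# Strip leading and trailing single quotes, then group on the cleaned column.
raw["clean_domain"] = raw["domain"].str.strip("'")
unique_ids = raw.groupby("clean_domain")["ID"].nunique()
print(unique_ids)
```

Note that str.strip("'") only removes quotes at the start and end of each string; a quote in the middle of a value would be left in place.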

Example with agg():

<code class="python">import pandas as pd
result = df.groupby(by='domain', as_index=False).agg({'ID': pd.Series.nunique})</code>
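
As a self-contained sketch of the agg() variant, the code below rebuilds the sample data and shows the shape of the result. The second call uses named aggregation to give the count column a clearer name; that renaming and the column name unique_ids are additions for illustration and not part of the approach described above.

```python
import pandas as pd

# Sample data from the problem statement (IDs assumed to be integers).
df = pd.DataFrame({
    "ID": [123, 123, 123, 456, 456, 456, 456, 789, 789],
    "domain": ["vk.com", "vk.com", "twitter.com", "vk.com", "facebook.com",
               "vk.com", "google.com", "twitter.com", "vk.com"],
})

# as_index=False keeps 'domain' as an ordinary column, so the result
# is a DataFrame with columns ['domain', 'ID'].
result = df.groupby("domain", as_index=False).agg({"ID": pd.Series.nunique})
print(result)

# Optional: named aggregation renames the count column (illustrative name).
named = df.groupby("domain", as_index=False).agg(unique_ids=("ID", "nunique"))
print(named)
```

In the first result the count column is still called ID, which can be misleading when reading the frame later, so the renamed variant may be easier to work with.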
