## How to Efficiently Calculate Frequency Counts for Distinct Values in NumPy Arrays?

Susan Sarandon
Release: 2024-10-27 10:55:30

Calculating Frequency Counts for Distinct Values in NumPy Arrays

Counting how often each distinct value occurs in a NumPy array is a common task in data analysis. This article shows an efficient way to obtain these counts with np.unique.

Method:

The primary way to obtain frequency counts in NumPy is the np.unique function with return_counts=True (available since NumPy 1.9). It returns the sorted unique values together with the number of times each one occurs. For instance, consider the following array:

<code class="python">import numpy as np

x = np.array([1, 1, 1, 2, 2, 2, 5, 25, 1, 1])</code>

To compute the frequency counts of these elements:

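<code class="python">import numpy as np

# unique: sorted distinct values; counts: number of occurrences of each
unique, counts = np.unique(x, return_counts=True)

# Stack the two arrays as rows, then transpose into (value, count) pairs
print(np.asarray((unique, counts)).T)</code>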

This will output:

[[ 1  5]
 [ 2  3]
 [ 5  1]
 [25  1]]

Each row of the resulting array pairs a unique value (first column) with its frequency (second column). The values appear in sorted order.
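When the counts are needed for lookups rather than display, the two arrays returned by np.unique can be zipped into a dictionary. The following is a minimal sketch using the same example array; the names `freq` and `most_common` are illustrative and not part of the original example.

```python
import numpy as np

x = np.array([1, 1, 1, 2, 2, 2, 5, 25, 1, 1])
unique, counts = np.unique(x, return_counts=True)

# Pair each distinct value with its count; .tolist() converts
# NumPy scalars to plain Python ints so the dict prints cleanly
freq = dict(zip(unique.tolist(), counts.tolist()))
print(freq)  # {1: 5, 2: 3, 5: 1, 25: 1}

# The most frequent value sits at the position of the largest count
most_common = unique[np.argmax(counts)]
print(most_common)  # 1
```

A dictionary is convenient for small result sets; for large arrays it is usually better to keep `unique` and `counts` as NumPy arrays.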

Comparison and Performance:

In the benchmark below, np.unique with return_counts=True is considerably faster than scipy.stats.itemfreq, an older alternative: about 31.5 ms versus 170 ms per loop on an array of one million integers. Note that scipy.stats.itemfreq was deprecated in SciPy 1.0 and removed in SciPy 1.3, so the second timing can only be reproduced with an older SciPy release.

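<code class="python">import numpy as np
import scipy.stats  # itemfreq exists only in SciPy releases before 1.3

# One million integers in [0, 100]; randint excludes the upper bound
x = np.random.randint(0, 101, size=10**6)

# %timeit is an IPython magic; timings are the ones originally reported
%timeit unique, counts = np.unique(x, return_counts=True) # 31.5 ms per loop
%timeit scipy.stats.itemfreq(x) # 170 ms per loop</code>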
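Because scipy.stats.itemfreq is not available in SciPy 1.3 and later, the comparison above cannot be rerun on a current installation. The sketch below is one way to time np.unique as a plain Python script with the standard timeit module. It swaps in np.bincount as the comparison point, which is an assumption of this example and not the method benchmarked above; np.bincount only works for non-negative integers. The helper names `count_with_unique` and `count_with_bincount` are hypothetical, and no timings are quoted because they depend on the machine and library versions.

```python
import timeit

import numpy as np

# Assumed test data: one million integers drawn uniformly from 0..100
rng = np.random.default_rng(0)
x = rng.integers(0, 101, size=1_000_000)


def count_with_unique(a):
    # General-purpose: works for any sortable dtype
    return np.unique(a, return_counts=True)


def count_with_bincount(a):
    # Only valid for non-negative integers; index i holds the count of value i
    counts = np.bincount(a)
    values = np.nonzero(counts)[0]  # drop values that never occur
    return values, counts[values]


# Both approaches should agree before their timings are compared
u1, c1 = count_with_unique(x)
u2, c2 = count_with_bincount(x)
assert np.array_equal(u1, u2) and np.array_equal(c1, c2)

for fn in (count_with_unique, count_with_bincount):
    # Best of 3 repeats, 5 calls each, reported per call
    seconds = min(timeit.repeat(lambda: fn(x), number=5, repeat=3)) / 5
    print(f"{fn.__name__}: {seconds * 1e3:.1f} ms per call")
```

Checking that both functions return identical results first keeps the timing comparison honest: a faster function that counts differently would not be a valid alternative.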

Conclusion:

The np.unique function in NumPy provides an efficient solution for obtaining the frequency counts of unique values in an array. Its performance advantage over alternative methods makes it a preferred choice for large datasets.
