A common task in data analysis is to determine the frequency of occurrence for each unique value in a given dataset. NumPy provides several efficient ways to achieve this for arrays of numeric data.
One approach is to call np.unique with the return_counts parameter set to True (available in NumPy 1.9 and later). With this parameter set, the function returns not only the sorted unique values but also an array of their corresponding counts.
<code class="python">import numpy as np

x = np.array([1, 1, 1, 2, 2, 2, 5, 25, 1, 1])
unique, counts = np.unique(x, return_counts=True)

# Stack the unique values and their counts as two columns
print(np.asarray((unique, counts)).T)
'''
Output:
[[ 1  5]
 [ 2  3]
 [ 5  1]
 [25  1]]
'''</code>
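If a lookup table is more convenient than a two-column array, the two arrays returned by np.unique can be zipped into a dictionary. This is a minimal sketch that reuses the sample array from above; the name `freq` is only illustrative.

```python
import numpy as np

x = np.array([1, 1, 1, 2, 2, 2, 5, 25, 1, 1])
unique, counts = np.unique(x, return_counts=True)

# Pair each unique value with its count; .tolist() converts
# NumPy scalars to plain Python ints so the dict prints cleanly
freq = dict(zip(unique.tolist(), counts.tolist()))
print(freq)  # {1: 5, 2: 3, 5: 1, 25: 1}
```

Looking up `freq[1]` then gives the number of times the value 1 occurs in the array.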
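This method is also considerably faster than scipy.stats.itemfreq, as the following IPython timing comparison on an array of one million integers shows. Note that scipy.stats.itemfreq was deprecated and later removed in newer SciPy releases, so the second measurement only runs on older SciPy versions.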
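<code class="python">import numpy as np
import scipy.stats

# One million random integers in [0, 100]; randint excludes the upper bound,
# and the size argument must be an integer rather than the float 1e6
x = np.random.randint(0, 101, 1000000)

%timeit unique, counts = np.unique(x, return_counts=True)
# 10 loops, best of 3: 31.5 ms per loop

%timeit scipy.stats.itemfreq(x)
# 10 loops, best of 3: 170 ms per loop</code>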
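The %timeit magic only works inside IPython or Jupyter. Outside of those, a similar measurement can be approximated with the standard library's timeit module. The sketch below times np.unique only and assumes NumPy 1.17 or later for np.random.default_rng; the helper name `count_unique` is hypothetical, and the timings will vary by machine.

```python
import timeit

import numpy as np

# Seeded generator so repeated runs use the same data
rng = np.random.default_rng(0)
x = rng.integers(0, 101, size=1_000_000)  # integers in [0, 100]


def count_unique():
    """Return the unique values of x and how often each occurs."""
    return np.unique(x, return_counts=True)


# Best of 3 repeats of 10 calls each, reported per call in milliseconds
best = min(timeit.repeat(count_unique, number=10, repeat=3)) / 10
print(f"np.unique with return_counts: {best * 1e3:.1f} ms per call")

unique, counts = count_unique()
```

Taking the minimum over the repeats mirrors the "best of 3" figure that %timeit reports in the output above.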