Python Counters: How to use collections.Counter?

Release: 2023-05-08 13:34:07

    1. Introduction

    Counter provides fast, convenient tallying. It is a subclass of dict for counting hashable objects: elements are stored as dictionary keys and their counts as values. Counts can be any integer, including 0 and negative numbers, so the Counter class resembles the bags or multisets found in other languages. Put simply, it tallies how often each item occurs. A few examples make this clear.

    from collections import Counter
    import re

    # Count the words in a piece of text
    text = 'remove an existing key one level down remove an existing key one level down'
    words = re.findall(r'\w+', text)
    print(Counter(words).most_common(10))
    # [('remove', 2), ('an', 2), ('existing', 2), ('key', 2), ('one', 2), ('level', 2), ('down', 2)]

    # Tally by hand, starting from an empty Counter
    cnt = Counter()
    for word in ['red', 'blue', 'red', 'green', 'blue', 'blue']:
        cnt[word] += 1
    print(cnt)  # Counter({'blue': 3, 'red': 2, 'green': 1})

    # Or pass the list straight to the constructor
    L = ['red', 'blue', 'red', 'green', 'blue', 'blue']
    print(Counter(L))  # Counter({'blue': 3, 'red': 2, 'green': 1})

    Elements are counted from an iterable or initialized from another mapping (or counter):

    from collections import Counter
    print(Counter('gallahad'))             # Counter({'a': 3, 'l': 2, 'g': 1, 'h': 1, 'd': 1})
    print(Counter({'red': 4, 'blue': 2}))  # Counter({'red': 4, 'blue': 2})
    print(Counter(cats=4, dogs=8))         # Counter({'dogs': 8, 'cats': 4})
    print(Counter(['red', 'blue', 'red', 'green', 'blue', 'blue']))
    # Counter({'blue': 3, 'red': 2, 'green': 1})
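    One practical consequence of counts being plain integers is worth seeing in code: a Counter reports 0 for a key it has never seen instead of raising KeyError, and a key whose count is zero or negative still exists. The following is a minimal sketch; the inventory data and variable names are made up for illustration.

```python
from collections import Counter

inventory = Counter(apples=3, pears=0)

# A missing key reads as 0 instead of raising KeyError.
missing = inventory['kiwis']

# Reading a missing key does not create an entry for it.
has_kiwis = 'kiwis' in inventory

# Counts may drop to zero or below; the key is kept either way.
inventory['apples'] -= 5

print(missing, has_kiwis, inventory['apples'], 'pears' in inventory)
# 0 False -2 True
```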

    2. Basic operations

    1. Counting the occurrences of each element in an iterable

    1.1 Counting a list or string

    There are two ways to use it: call Counter directly inside an expression, or create an instance and keep it in a variable. If you need the result more than once, the latter is more concise, because you can then call any Counter method on it. The same pattern works for other iterables.

    from collections import Counter
    list_01 = [1,9,9,5,0,8,0,9]  # birthday of Chen Ke (GNZ48)
    print(Counter(list_01))  #Counter({9: 3, 0: 2, 1: 1, 5: 1, 8: 1})
    temp = Counter('abcdeabcdabcaba')
    print(temp)  #Counter({'a': 5, 'b': 4, 'c': 3, 'd': 2, 'e': 1})

    1.2 Output results

    print(type(temp))  # <class 'collections.Counter'>
    print(dict(temp))  # {'a': 5, 'b': 4, 'c': 3, 'd': 2, 'e': 1}
    for pair in dict(temp).items():
        print(pair)
    # ('a', 5)
    # ('b', 4)
    # ('c', 3)
    # ('d', 2)
    # ('e', 1)

    1.3 Use the built-in items() method to output

    Obviously this method is more convenient than converting to a dictionary and then outputting it:

    print(temp.items())  # dict_items([('a', 5), ('b', 4), ('c', 3), ('d', 2), ('e', 1)])
    for item in temp.items():
        print(item)
    # ('a', 5)
    # ('b', 4)
    # ('c', 3)
    # ('d', 2)
    # ('e', 1)

    2. most_common(): the most frequent elements

    most_common() returns a list of the n most common elements and their counts, ordered from most to least common. If n is omitted or None, most_common() returns all elements in the counter. Elements with equal counts keep the order in which they were first encountered. A typical use is finding the most frequent words in a text:

    from collections import Counter
    list_01 = [1,9,9,5,0,8,0,9]
    temp = Counter(list_01)
    print(temp.most_common(1))  # [(9, 3)]  element 9 occurs 3 times
    print(temp.most_common(2))  # [(9, 3), (0, 2)]  the two most frequent elements
    print(temp.most_common())   # [(9, 3), (0, 2), (1, 1), (5, 1), (8, 1)]
    print(Counter('abracadabra').most_common(3))  # [('a', 5), ('b', 2), ('r', 2)]
    print(Counter('abracadabra').most_common(5))  # [('a', 5), ('b', 2), ('r', 2), ('c', 1), ('d', 1)]
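    The word-frequency use case mentioned above can be sketched as follows. The helper name top_words and the sample sentence are assumptions made for this illustration; lowercasing the text first is one possible normalization, not the only one.

```python
import re
from collections import Counter

def top_words(text, n):
    """Return the n most frequent lowercase words in text."""
    words = re.findall(r'\w+', text.lower())
    return Counter(words).most_common(n)

sample = 'the cat and the dog and the bird'
print(top_words(sample, 2))  # [('the', 3), ('and', 2)]
```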

    3. elements() and sorted()

    elements() returns an iterator in which each element is repeated as many times as its count. Elements are returned in order of first occurrence. If an element's count is less than 1, elements() ignores it.

    from collections import Counter
    c = Counter(a=4, b=2, c=0, d=-2)
    print(list(c.elements()))    # ['a', 'a', 'a', 'a', 'b', 'b']
    print(sorted(c.elements()))  # ['a', 'a', 'a', 'a', 'b', 'b']
    c = Counter(a=4, b=2, c=0, d=5)
    print(list(c.elements()))    # ['a', 'a', 'a', 'a', 'b', 'b', 'd', 'd', 'd', 'd', 'd']

    c = Counter('ABCABCCC')
    print(c.elements())          # <itertools.chain object at 0x0000027D94126860>
    print(list(c.elements()))    # ['A', 'A', 'B', 'B', 'C', 'C', 'C', 'C']
    print(sorted(c.elements()))  # ['A', 'A', 'B', 'B', 'C', 'C', 'C', 'C']
    # sorted(c) iterates over the keys only, so it lists all unique elements
    print(sorted(c))             # ['A', 'B', 'C']

    Example from the official documentation:

    # Knuth's example for prime factors of 1836:  2**2 * 3**3 * 17**1
    prime_factors = Counter({2: 2, 3: 3, 17: 1})
    product = 1
    for factor in prime_factors.elements():  # loop over factors
        product *= factor  # and multiply them
    print(product)  # 1836
    # 1836 = 2*2*3*3*3*17

    4. subtract(): subtraction that keeps zero and negative counts

    subtract() subtracts the counts of elements taken from an iterable or from another mapping (or counter). Both inputs and outputs may be zero or negative.

    c = Counter(a=4, b=2, c=0, d=-2)
    d = Counter(a=1, b=2, c=3, d=4)
    c.subtract(d)
    print(c)     # Counter({'a': 3, 'b': 0, 'c': -3, 'd': -6})
    str0 = Counter('aabbccdde')
    print(str0)  # Counter({'a': 2, 'b': 2, 'c': 2, 'd': 2, 'e': 1})
    str0.subtract('abcd')  # subtract one of each of a, b, c and d
    print(str0)  # Counter({'a': 1, 'b': 1, 'c': 1, 'd': 1, 'e': 1})

    subtract_test01 = Counter("AAB")
    subtract_test01.subtract("BCC")
    print(subtract_test01)  # Counter({'A': 2, 'B': 0, 'C': -2})

    Counts can drop to zero or below:

    subtract_test02 = Counter("which")
    subtract_test02.subtract("witch")  # subtract the elements of another iterable
    subtract_test02.subtract(Counter("watch"))  # or of another Counter
    print( subtract_test02["h"] )  # 0: "which" has two, minus one in "witch", minus one in "watch"
    print( subtract_test02["w"] )  # -1
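    To make the contrast with the minus operator concrete, here is a small side-by-side sketch. The stock and sold counters are invented for this illustration.

```python
from collections import Counter

stock = Counter(apple=3, pear=1)
sold = Counter(apple=1, pear=2)

# The - operator returns a new Counter and drops counts that are <= 0.
remaining = stock - sold

# subtract() modifies the Counter in place and keeps zero and negative counts.
ledger = Counter(stock)
ledger.subtract(sold)

print(remaining)  # Counter({'apple': 2})
print(ledger)     # Counter({'apple': 2, 'pear': -1})
```

    Note that subtract() returns None, so it cannot be chained the way the operator can.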

    5. Dictionary methods

    Counter objects support the usual dictionary methods, except for two that work differently than they do for dictionaries:

    • fromkeys(iterable): This class method is not implemented for Counter objects.

    • update([iterable-or-mapping]): Counts elements from an iterable, or adds counts from another mapping (or counter). Counts are added rather than replaced. The iterable is expected to be a sequence of elements, not a sequence of (key, value) pairs.
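    The difference between dict.update() and Counter.update() is easy to miss, so a short sketch may help. The variable names plain and counts are chosen for this example only.

```python
from collections import Counter

plain = {'a': 1, 'b': 2}
plain.update({'a': 10})   # dict.update replaces the stored value

counts = Counter(a=1, b=2)
counts.update({'a': 10})  # Counter.update adds to the existing count
counts.update('abb')      # an iterable of elements: each occurrence adds 1

print(plain)   # {'a': 10, 'b': 2}
print(counts)  # Counter({'a': 12, 'b': 4})
```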

    sum(c.values())                 # total of all counts
    c.clear()                       # reset all counts
    list(c)                         # list unique elements
    set(c)                          # convert to a set
    dict(c)                         # convert to a regular dictionary
    c.items()                       # view of (elem, cnt) pairs
    Counter(dict(list_of_pairs))    # convert from a list of (elem, cnt) pairs
    c.most_common()[:-n-1:-1]       # n least common elements
    +c                              # remove zero and negative counts
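    Getting the least common elements relies on slicing the full most_common() list backwards, which can be hard to read at first. A minimal sketch of that idiom, with an example string and n chosen for illustration:

```python
from collections import Counter

c = Counter('abracadabra')
n = 2

# most_common() is ordered from most to least frequent; the reversed
# slice walks that list from the end and stops after n items.
least_common = c.most_common()[:-n - 1:-1]

print(c.most_common())  # [('a', 5), ('b', 2), ('r', 2), ('c', 1), ('d', 1)]
print(least_common)     # [('d', 1), ('c', 1)]
```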

    6. Mathematical operations

    Counter objects support several mathematical operations for combining counters into multisets (counters whose counts are all greater than zero). Addition and subtraction combine counters by adding or subtracting the counts of corresponding elements. Intersection and union return the minimum and maximum of corresponding counts. Each operation accepts inputs with signed counts, but the output excludes any element whose resulting count is zero or less.

    c = Counter(a=3, b=1)
    d = Counter(a=1, b=2)
    c + d                       # add two counters together:  c[x] + d[x]
    # Counter({'a': 4, 'b': 3})
    c - d                       # subtract (keeping only positive counts)
    # Counter({'a': 2})
    c & d                       # intersection:  min(c[x], d[x])
    # Counter({'a': 1, 'b': 1})
    c | d                       # union:  max(c[x], d[x])
    # Counter({'a': 3, 'b': 2})

    print(Counter('AAB') + Counter('BCC'))
    # Counter({'A': 2, 'B': 2, 'C': 2})
    print(Counter('AAB') - Counter('BCC'))
    # Counter({'A': 2})
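    Because subtraction drops every count that is zero or negative, it gives a compact test for whether one multiset is contained in another. The sketch below uses a hypothetical helper, can_spell, to check whether a word can be built from a pool of letters.

```python
from collections import Counter

def can_spell(word, letters):
    """Return True if letters contains every character of word often enough."""
    missing = Counter(word) - Counter(letters)
    return not missing  # an empty Counter is falsy

print(can_spell('apple', 'pplaex'))  # True
print(can_spell('apple', 'aple'))    # False: only one 'p' available
```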

    "AND" and "OR" operations:

    print(Counter('AAB') & Counter('BBCC'))
    # Counter({'B': 1})
    print(Counter('AAB') | Counter('BBCC'))
    # Counter({'A': 2, 'B': 2, 'C': 2})

    Unary plus and minus are shortcuts for adding an empty counter or subtracting from an empty counter. As with the binary operators, the result drops every count that is zero or negative:

    c = Counter(a=2, b=-4)
    print(+c)  # Counter({'a': 2})
    print(-c)  # Counter({'b': 4})

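    These operations are enough to write a simple weighted text-similarity measure:

    from collections import Counter

    def str_sim(str_0, str_1, topn):
        """Weighted similarity over the topn most common characters of each string."""
        topn = int(topn)
        collect0 = Counter(dict(Counter(str_0).most_common(topn)))
        collect1 = Counter(dict(Counter(str_1).most_common(topn)))
        jiao = collect0 & collect1  # intersection: minimum of each count
        bing = collect0 | collect1  # union: maximum of each count
        sim = float(sum(jiao.values())) / float(sum(bing.values()))
        return sim

    str_0 = '定位手机定位汽车定位GPS定位人定位位置查询'
    str_1 = '导航定位手机定位汽车定位GPS定位人定位位置查询'
    print(str_sim(str_0, str_1, 5))  # topn=5 is an arbitrary value chosen for illustration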

    7. Total count, keys() and values()

    from collections import Counter
    c = Counter('ABCABCCC')
    print(sum(c.values()))  # 8  total of all counts
    print(c.keys())  # dict_keys(['A', 'B', 'C'])
    print(c.values())  # dict_values([2, 2, 4])

    8. Querying the count of a single element

    from collections import Counter
    c = Counter('ABBCC')
    print(c["A"])  # 1

    9. Adding elements

    for elem in 'ADD':  # update counts from an iterable
        c[elem] += 1
    print(c.most_common())  # [('A', 2), ('B', 2), ('C', 2), ('D', 2)]

    10. Deleting elements (del)

    del c["D"]
    print(c.most_common())  # [('A', 2), ('B', 2), ('C', 2)]
    del c["C"]
    print(c.most_common())  # [('A', 2), ('B', 2)]

    11. Updating with update()

    d = Counter("CCDD")
    c.update(d)  # add the counts from d to c
    print(c.most_common())  # [('A', 2), ('B', 2), ('C', 2), ('D', 2)]

    12. Clearing with clear()

    c.clear()  # remove all elements
    print(c)  # Counter()

    3. Summary

    Counter is a dict subclass, mainly used to count how often objects occur.

    Commonly used methods:

    • elements(): Returns an iterator that repeats each element as many times as its count. Elements whose count is less than 1 are ignored.

    • most_common([n]): Returns a list of the n most frequent elements and their counts.

    • subtract([iterable-or-mapping]): Subtracts counts taken from an iterable or mapping. Inputs and outputs can be zero or negative, unlike the minus operator -, which drops such counts.

    • update([iterable-or-mapping]): Counts elements from an iterable, or adds counts from another mapping (or counter).


    # Count how often each character occurs
    >>> import collections
    >>> collections.Counter('hello world')
    Counter({'l': 3, 'o': 2, 'h': 1, 'e': 1, ' ': 1, 'w': 1, 'r': 1, 'd': 1})
    # Count words
    >>> collections.Counter('hello world hello world hello nihao'.split())
    Counter({'hello': 3, 'world': 2, 'nihao': 1})

    Common usage:

    >>> c = collections.Counter('hello world hello world hello nihao'.split())
    >>> c
    Counter({'hello': 3, 'world': 2, 'nihao': 1})
    # Get the count of a given object; the get() method also works
    >>> c['hello']
    3
    # View the elements
    >>> list(c.elements())
    ['hello', 'hello', 'hello', 'world', 'world', 'nihao']
    # Add objects; c.update(d) does the same in place
    >>> d = collections.Counter('hello world'.split())
    >>> d
    Counter({'hello': 1, 'world': 1})
    >>> c + d
    Counter({'hello': 4, 'world': 3, 'nihao': 1})
    # Remove objects; c.subtract(d) works in place and keeps zero and negative counts
    >>> c - d
    Counter({'hello': 2, 'world': 1, 'nihao': 1})
    # Clear
    >>> c.clear()
    >>> c
    Counter()
