Detailed explanation of defaultdict in Python (code example)

This article explains how defaultdict works in Python, with code examples for each approach along the way.

Default values can be very convenient

In Python, accessing a key that does not exist in a dictionary raises a KeyError exception (in JavaScript, by contrast, reading a property that does not exist on an object returns undefined). Sometimes, though, it is convenient for every key in a dictionary to have a default value. Consider the following example:

strings = ('puppy', 'kitten', 'puppy', 'puppy',
           'weasel', 'puppy', 'kitten', 'puppy')
counts = {}
for kw in strings:
    counts[kw] += 1

This example is meant to count how many times each word appears in strings and record the result in the counts dictionary: every time a word appears, the value stored under its key in counts is incremented by 1. In practice, running this code raises a KeyError exception the first time any word is counted, because the += operation reads the current value first and a Python dict has no default value for missing keys. This can be verified in the Python command line:

>>> counts = dict()
>>> counts
{}
>>> counts['puppy'] += 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'puppy'

Use a conditional statement to check

The first approach that comes to mind is to store an initial value of 1 under the key when a word is counted for the first time. This requires adding a conditional statement to the loop:

strings = ('puppy', 'kitten', 'puppy', 'puppy',
           'weasel', 'puppy', 'kitten', 'puppy')
counts = {}
for kw in strings:
    if kw not in counts:
        counts[kw] = 1
    else:
        counts[kw] += 1
# counts:
# {'puppy': 5, 'weasel': 1, 'kitten': 2}
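
A related idiom, not covered in the original text, folds the check into a single expression with dict.get(), whose second argument is returned when the key is absent. The following is a minimal sketch of the same word count written that way:

```python
# Counting with dict.get(): the second argument is the fallback value
# returned when the key is not in the dictionary.
strings = ('puppy', 'kitten', 'puppy', 'puppy',
           'weasel', 'puppy', 'kitten', 'puppy')
counts = {}
for kw in strings:
    # get() never raises KeyError and does not insert the key by itself;
    # the assignment on the left is what stores the new count.
    counts[kw] = counts.get(kw, 0) + 1

print(counts)
```

This keeps the loop body to one line while still using a plain dict.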

Use the dict.setdefault() method

You can also set the default value through the dict.setdefault() method:

strings = ('puppy', 'kitten', 'puppy', 'puppy',
           'weasel', 'puppy', 'kitten', 'puppy')
counts = {}
for kw in strings:
    counts.setdefault(kw, 0)
    counts[kw] += 1

The dict.setdefault() method takes two parameters: the key and a default value. If the key is not in the dictionary, the default value is stored under that key and returned; otherwise, the value already saved in the dictionary is returned. Using this return value, the body of the for loop can be written more concisely:

strings = ('puppy', 'kitten', 'puppy', 'puppy',
           'weasel', 'puppy', 'kitten', 'puppy')
counts = {}
for kw in strings:
    counts[kw] = counts.setdefault(kw, 0) + 1
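
To make the behavior of dict.setdefault() concrete, here is a small sketch using hypothetical dictionaries (settings and groups are illustrative names, not part of the original example). It shows both branches of the method and one case where the returned object is modified in place:

```python
settings = {'color': 'red'}

# Key already present: the stored value is returned and the default is ignored.
existing = settings.setdefault('color', 'blue')

# Key absent: the default is stored under the key and then returned.
created = settings.setdefault('size', 'large')

# Because the stored object itself is returned, a mutable default such as a
# list can be updated in place, which is handy for grouping items.
groups = {}
for word in ('puppy', 'kitten', 'panda'):
    groups.setdefault(word[0], []).append(word)

print(settings)
print(groups)
```

Note that the default argument is evaluated on every call, even when the key already exists, so a new empty list is built each time around the loop.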

Use the collections.defaultdict class

The approaches above work around the lack of default values in dict to some extent, but is there a dictionary that provides default values by itself? There is: collections.defaultdict.

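The defaultdict class behaves like a dict, but it is initialized with a type. (The interactive sessions in this article were recorded under Python 2; Python 3 prints <class 'list'> where Python 2 prints <type 'list'>.)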

>>> from collections import defaultdict
>>> dd = defaultdict(list)
>>> dd
defaultdict(<type 'list'>, {})

The initialization function of the defaultdict class accepts a type as a parameter. When the key being accessed does not exist, the type is instantiated and the new instance is stored as the default value:

>>> dd['foo']
[]
>>> dd
defaultdict(<type 'list'>, {'foo': []})
>>> dd['bar'].append('quux')
>>> dd
defaultdict(<type 'list'>, {'foo': [], 'bar': ['quux']})
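
A common use of defaultdict(list) is grouping values under a shared key. The sketch below uses made-up sample data (the pets list and its contents are hypothetical) to show that no explicit initialization of each list is needed:

```python
from collections import defaultdict

# Hypothetical sample data: (animal, name) pairs.
pets = [('dog', 'Rex'), ('cat', 'Tom'), ('dog', 'Fido'), ('cat', 'Felix')]

names_by_animal = defaultdict(list)
for animal, name in pets:
    # The first access to a new animal creates an empty list automatically.
    names_by_animal[animal].append(name)

print(dict(names_by_animal))
```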

Note that this kind of default value only takes effect when the key is accessed through dict[key] or dict.__getitem__(key). The reason is explained later in this article.

>>> from collections import defaultdict
>>> dd = defaultdict(list)
>>> 'something' in dd
False
>>> dd.pop('something')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'pop(): dictionary is empty'
>>> dd.get('something')
>>> dd['something']
[]
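
The same point can be checked in script form by watching the size of the dictionary. This is only an illustrative sketch; the variable names are arbitrary:

```python
from collections import defaultdict

dd = defaultdict(list)

in_result = 'a' in dd        # membership test: no default is created
get_result = dd.get('a')     # get(): returns None, no default is created
size_before = len(dd)        # the dictionary is still empty

index_result = dd['a']       # indexing: calls the factory and stores the result
size_after = len(dd)         # the dictionary now holds one key

print(in_result, get_result, size_before, index_result, size_after)
```

One practical consequence is that merely reading dd[key] for a missing key grows the dictionary, so get() or the in operator is the safer choice when a lookup should have no side effects.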

Besides a type name, the initialization function also accepts any callable that takes no arguments. In that case the return value of the callable is used as the default value, which makes default values more flexible. The following example passes a custom zero-argument function zero() to the initialization function:

>>> from collections import defaultdict
>>> def zero():
...     return 0
...
>>> dd = defaultdict(zero)
>>> dd
defaultdict(<function zero at 0xb7ed2684>, {})
>>> dd['foo']
0
>>> dd
defaultdict(<function zero at 0xb7ed2684>, {'foo': 0})
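
Since any zero-argument callable works, a lambda can supply an arbitrary constant, and a factory can even return another defaultdict to build nested structures. The following sketch illustrates both; the names status and pair_counts are hypothetical:

```python
from collections import defaultdict

# A lambda can supply any constant as the default value.
status = defaultdict(lambda: 'unknown')
status['alice'] = 'active'

# A factory that returns another defaultdict gives a two-level mapping.
# int() called with no arguments returns 0, so the inner default is 0.
pair_counts = defaultdict(lambda: defaultdict(int))
pair_counts['puppy']['kitten'] += 1
pair_counts['puppy']['kitten'] += 1
pair_counts['puppy']['weasel'] += 1

print(status['bob'])
print(pair_counts['puppy']['kitten'])
```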

Using collections.defaultdict, the word-counting problem from the beginning of the article can be solved as follows:

from collections import defaultdict
strings = ('puppy', 'kitten', 'puppy', 'puppy',
           'weasel', 'puppy', 'kitten', 'puppy')
counts = defaultdict(lambda: 0)  # use a lambda to define a simple function
for s in strings:
    counts[s] += 1
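
As a side note that goes beyond the original text: because int() called with no arguments returns 0, the built-in int can serve as the factory directly, and the standard library's collections.Counter (available since Python 2.7 and 3.1) performs the same tally in a single call. A brief sketch of both:

```python
from collections import Counter, defaultdict

strings = ('puppy', 'kitten', 'puppy', 'puppy',
           'weasel', 'puppy', 'kitten', 'puppy')

# int() with no arguments returns 0, so int works as the default factory.
counts = defaultdict(int)
for s in strings:
    counts[s] += 1

# Counter builds the same tally directly from the iterable.
tally = Counter(strings)

print(dict(counts))
print(tally.most_common(1))
```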

How the defaultdict class is implemented

The sections above cover how to use the defaultdict class. So how does defaultdict implement its default values? The key is the __missing__() method:

>>> from collections import defaultdict
>>> print(defaultdict.__missing__.__doc__)
__missing__(key) # Called by __getitem__ for missing key; pseudo-code:
  if self.default_factory is None: raise KeyError(key)
  self[key] = value = self.default_factory()
  return value

The docstring of the __missing__() method shows that when a non-existent key is accessed through the __getitem__() method (the form dict[key] is simply shorthand for calling __getitem__()), the __missing__() method is called to obtain the default value and add the key to the dictionary. If default_factory is None, a KeyError is raised instead.
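
The first line of that pseudo-code can be observed directly: a defaultdict created without a factory has default_factory set to None and raises KeyError like a plain dict. The sketch below also assigns a factory afterwards, relying on default_factory being an ordinary writable attribute:

```python
from collections import defaultdict

plain = defaultdict()        # no factory given: default_factory is None
try:
    plain['foo']
    raised = False
except KeyError:
    raised = True            # behaves like a regular dict

# Assigning a factory afterwards switches default values on.
plain.default_factory = int
value = plain['foo']         # int() returns 0, which is stored under 'foo'

print(raised, value, dict(plain))
```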

For a detailed introduction to the __missing__() method, please refer to the "Mapping Types — dict" section in the official Python documentation.

According to the documentation, starting from version 2.5, if a subclass derived from dict defines the __missing__() method, dict[key] calls __missing__() when a non-existent key is accessed and returns whatever that method returns.

In other words, although dict supports the __missing__() protocol, dict itself does not define the method; it has to be implemented in a derived subclass. This is easy to verify:

>>> print(dict.__missing__.__doc__)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'dict' has no attribute '__missing__'

As a further experiment, define a subclass Missing that implements the __missing__() method:

>>> class Missing(dict):
...     def __missing__(self, key):
...         return 'missing'
...
>>> d = Missing()
>>> d
{}
>>> d['foo']
'missing'
>>> d
{}

The result shows that the __missing__() method does take effect, although the key itself is not stored. Building on this, we modify __missing__() slightly so that the subclass stores a default value for non-existent keys, just as the defaultdict class does:

>>> class Defaulting(dict):
...     def __missing__(self, key):
...         self[key] = 'default'
...         return 'default'
...
>>> d = Defaulting()
>>> d
{}
>>> d['foo']
'default'
>>> d
{'foo': 'default'}
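
One thing a custom __missing__() can do that a defaultdict factory cannot is compute the default from the key itself, since the factory is called without arguments. The class below is a hypothetical illustration (SquareCache is not a standard-library name) that computes and caches squares on demand:

```python
class SquareCache(dict):
    """Hypothetical dict subclass that computes and stores squares on demand."""

    def __missing__(self, key):
        # Unlike a defaultdict factory, __missing__ receives the key,
        # so the default value can depend on it.
        value = key * key
        self[key] = value
        return value


squares = SquareCache()
first = squares[4]           # computed by __missing__ and stored
cached = squares[4]          # found directly, __missing__ is not called

print(first, cached, squares)
```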

Implementing the function of defaultdict in older versions of Python

The defaultdict class was added in version 2.5, so it is not available in older versions, and a compatible defaultdict class has to be implemented for them. This is quite simple. Although its performance may not match the defaultdict class that ships with version 2.5, it is functionally the same.

First, the __getitem__() method needs to call the __missing__() method when the key lookup fails:

class defaultdict(dict):
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.__missing__(key)

Secondly, the __missing__() method needs to be implemented to set the default value:

class defaultdict(dict):
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.__missing__(key)
    def __missing__(self, key):
        self[key] = value = self.default_factory()
        return value

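Then, the initialization function __init__() of the defaultdict class needs to accept a type or a callable as its first parameter. Following the pseudo-code in the docstring shown earlier, __missing__() should also raise KeyError when no factory was given: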

class defaultdict(dict):
    def __init__(self, default_factory=None, *a, **kw):
        dict.__init__(self, *a, **kw)
        self.default_factory = default_factory

    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.__missing__(key)

    def __missing__(self, key):
        # No factory configured: behave like a plain dict
        if self.default_factory is None:
            raise KeyError(key)
        self[key] = value = self.default_factory()
        return value

Finally, putting everything together, the following code works on both old and new Python versions:

try:
    from collections import defaultdict
except ImportError:
    # Fallback for Python versions older than 2.5
    class defaultdict(dict):
        def __init__(self, default_factory=None, *a, **kw):
            dict.__init__(self, *a, **kw)
            self.default_factory = default_factory

        def __getitem__(self, key):
            try:
                return dict.__getitem__(self, key)
            except KeyError:
                return self.__missing__(key)

        def __missing__(self, key):
            # No factory configured: behave like a plain dict
            if self.default_factory is None:
                raise KeyError(key)
            self[key] = value = self.default_factory()
            return value
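
On any current Python the import succeeds, so the fallback branch never runs. To exercise the fallback logic anyway, the sketch below copies it into a class with a hypothetical name, CompatDefaultDict, and includes the default_factory is None check from the docstring pseudo-code shown earlier:

```python
class CompatDefaultDict(dict):
    """Hypothetical stand-in for the fallback defaultdict, for testing only."""

    def __init__(self, default_factory=None, *a, **kw):
        dict.__init__(self, *a, **kw)
        self.default_factory = default_factory

    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.__missing__(key)

    def __missing__(self, key):
        # Mirror the documented pseudo-code: no factory means KeyError.
        if self.default_factory is None:
            raise KeyError(key)
        self[key] = value = self.default_factory()
        return value


# Word counting works the same way as with collections.defaultdict.
counts = CompatDefaultDict(int)
for kw in ('puppy', 'kitten', 'puppy'):
    counts[kw] += 1

# Extra positional arguments are forwarded to dict, so initial data is kept.
seeded = CompatDefaultDict(list, {'a': [1]})
seeded['b'].append(2)

print(counts, seeded)
```

On Python 2.5 and later the explicit __getitem__() override is redundant, because dict itself already calls __missing__() for subclasses; it is only needed on the older interpreters this fallback targets.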
