This article explains defaultdict in Python in detail, with code examples.
Default values can be very convenient
In Python, accessing a key that does not exist in a dictionary raises a KeyError exception (in JavaScript, by contrast, accessing a property that does not exist on an object returns undefined). Sometimes, however, it is convenient for every key in a dictionary to have a default value. Consider the following example:
strings = ('puppy', 'kitten', 'puppy', 'puppy', 'weasel', 'puppy', 'kitten', 'puppy')
counts = {}
for kw in strings:
    counts[kw] += 1
This example is meant to count how many times each word appears in strings and record the result in the counts dictionary: each time a word appears, the value stored under its key is incremented by 1. In practice, however, running this code raises a KeyError the first time a word is counted, because Python's dict has no default value for missing keys. This can be verified at the Python command line:
>>> counts = dict()
>>> counts
{}
>>> counts['puppy'] += 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'puppy'
One way to avoid the error is to check whether the key already exists before incrementing it:
strings = ('puppy', 'kitten', 'puppy', 'puppy', 'weasel', 'puppy', 'kitten', 'puppy')
counts = {}
for kw in strings:
    if kw not in counts:
        counts[kw] = 1
    else:
        counts[kw] += 1
# counts:
# {'puppy': 5, 'kitten': 2, 'weasel': 1}
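As a minimal sketch of a closely related variant, the membership test can be replaced with dict.get(), which returns a fallback value for a missing key without modifying the dictionary. The variable names simply mirror the example above.

```python
strings = ('puppy', 'kitten', 'puppy', 'puppy',
           'weasel', 'puppy', 'kitten', 'puppy')

counts = {}
for kw in strings:
    # get() returns 0 when kw is absent; it never inserts the key itself.
    counts[kw] = counts.get(kw, 0) + 1

print(counts)  # {'puppy': 5, 'kitten': 2, 'weasel': 1}
```

The assignment on the left-hand side is what actually stores the key; get() only supplies the starting value.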
You can also set the default value through the dict.setdefault() method:
strings = ('puppy', 'kitten', 'puppy', 'puppy', 'weasel', 'puppy', 'kitten', 'puppy')
counts = {}
for kw in strings:
    counts.setdefault(kw, 0)
    counts[kw] += 1
The dict.setdefault() method takes two parameters: the first is the key, and the second is the default value. If the given key does not exist in the dictionary, it is inserted with the default value and that default value is returned; otherwise, the value already stored in the dictionary is returned. The body of the for loop can be rewritten to use the return value of dict.setdefault(), which makes it more concise:
strings = ('puppy', 'kitten', 'puppy', 'puppy', 'weasel', 'puppy', 'kitten', 'puppy')
counts = {}
for kw in strings:
    counts[kw] = counts.setdefault(kw, 0) + 1
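The return value of setdefault() is most useful when the default is a mutable object. The following sketch groups words by their first letter; the words tuple and the groups and probe names are made up for illustration.

```python
words = ('puppy', 'kitten', 'pony', 'koala', 'weasel')

groups = {}
for word in words:
    # setdefault() inserts an empty list the first time a letter is seen
    # and returns the stored list either way, so append() always has a target.
    groups.setdefault(word[0], []).append(word)

# The key is stored in the dictionary even when the returned value is unused.
probe = {}
returned = probe.setdefault('missing', 0)

print(groups)
print(probe, returned)
```

Note that the default argument (here a new empty list) is built on every call, even when the key already exists and the default is discarded.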
The defaultdict class in the collections module behaves like a dict, but it is initialized with a type:
>>> from collections import defaultdict
>>> dd = defaultdict(list)
>>> dd
defaultdict(<class 'list'>, {})
The initializer of the defaultdict class accepts a type as a parameter. When a key that does not exist is accessed, the type is instantiated and the new instance is stored as the default value for that key:
>>> dd['foo']
[]
>>> dd
defaultdict(<class 'list'>, {'foo': []})
>>> dd['bar'].append('quux')
>>> dd
defaultdict(<class 'list'>, {'foo': [], 'bar': ['quux']})
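A short sketch of the same behaviour in a typical grouping task follows. The pairs data and the by_kind name are illustrative assumptions, not part of the original example.

```python
from collections import defaultdict

pairs = [('fruit', 'apple'), ('veg', 'carrot'), ('fruit', 'pear')]

by_kind = defaultdict(list)
for kind, name in pairs:
    # The first access to a new kind creates an empty list for it.
    by_kind[kind].append(name)

# list() is called once per missing key, so the stored lists are independent.
independent = by_kind['fruit'] is not by_kind['veg']

print(dict(by_kind))  # {'fruit': ['apple', 'pear'], 'veg': ['carrot']}
```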
Note that the default value is only supplied when the key is accessed through dict[key] or dict.__getitem__(key). The reason for this is explained below.
>>> from collections import defaultdict
>>> dd = defaultdict(list)
>>> 'something' in dd
False
>>> dd.pop('something')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'something'
>>> dd.get('something')
>>> dd['something']
[]
Besides a type, the initializer also accepts any callable that takes no arguments. The callable's return value is then used as the default value, which makes defaults more flexible. The following example passes a custom zero-argument function, zero(), to the initializer:
>>> from collections import defaultdict
>>> def zero():
...     return 0
...
>>> dd = defaultdict(zero)
>>> dd
defaultdict(<function zero at 0xb7ed2684>, {})
>>> dd['foo']
0
>>> dd
defaultdict(<function zero at 0xb7ed2684>, {'foo': 0})
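Because any zero-argument callable works, the factory does not have to return the same value every time. As an illustrative sketch (the ids and next_id names are assumptions), the bound __next__ method of an itertools.count() iterator can hand out a fresh integer for each new key:

```python
from collections import defaultdict
from itertools import count

# Each call to next_id() returns the next integer: 0, 1, 2, ...
next_id = count().__next__
ids = defaultdict(next_id)

for name in ('puppy', 'kitten', 'puppy', 'weasel'):
    ids[name]  # first access assigns an ID; later accesses reuse it

print(dict(ids))  # {'puppy': 0, 'kitten': 1, 'weasel': 2}
```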
Using collections.defaultdict to solve the word-counting problem from the beginning of the article, the code is as follows:
from collections import defaultdict
strings = ('puppy', 'kitten', 'puppy', 'puppy', 'weasel', 'puppy', 'kitten', 'puppy')
counts = defaultdict(lambda: 0)  # use a lambda to define a simple function
for s in strings:
    counts[s] += 1
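Since a type can serve as the factory, the lambda can also be swapped for the built-in int, which returns 0 when called with no arguments. This is offered as an equivalent sketch of the same counting loop rather than a required change:

```python
from collections import defaultdict

strings = ('puppy', 'kitten', 'puppy', 'puppy',
           'weasel', 'puppy', 'kitten', 'puppy')

counts = defaultdict(int)  # int() takes no arguments here and returns 0
for s in strings:
    counts[s] += 1

print(dict(counts))  # {'puppy': 5, 'kitten': 2, 'weasel': 1}
```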
With the usage of the defaultdict class covered, how does the class actually implement its default values? The key is the __missing__() method:
>>> from collections import defaultdict
>>> print(defaultdict.__missing__.__doc__)
__missing__(key) # Called by __getitem__ for missing key; pseudo-code:
  if self.default_factory is None: raise KeyError((key,))
  self[key] = value = self.default_factory()
  return value
The docstring of __missing__() shows that when a non-existent key is accessed through __getitem__() (dict[key] is simply shorthand for calling __getitem__()), __getitem__() calls __missing__() to obtain the default value and add the key to the dictionary.
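Both branches of that pseudo-code can be observed directly. The sketch below, with illustrative variable names, creates one defaultdict without a factory and one with list as the factory:

```python
from collections import defaultdict

plain = defaultdict()  # no factory given, so default_factory is None
try:
    plain['foo']
    raised = False
except KeyError:
    # With default_factory set to None, a missing key behaves as in a dict.
    raised = True

dd = defaultdict(list)
value = dd.__missing__('bar')  # calling the hook directly also stores the key

print(raised, value, dict(dd))
```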
For a detailed introduction to the __missing__() method, please refer to the "Mapping Types — dict" section in the official Python documentation.
According to the documentation, starting from version 2.5, if a subclass derived from dict defines the __missing__() method, then dict[key] calls __missing__() when the key does not exist and returns whatever that method returns as the value.
It can be seen from this that although dict supports the __missing__() method, this method does not exist in dict itself. Instead, this method needs to be implemented in the derived subclass. This can be easily verified:
>>> print(dict.__missing__.__doc__)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'dict' has no attribute '__missing__'
We can experiment further by defining a subclass, Missing, that implements the __missing__() method:
>>> class Missing(dict):
...     def __missing__(self, key):
...         return 'missing'
...
>>> d = Missing()
>>> d
{}
>>> d['foo']
'missing'
>>> d
{}
The result shows that the __missing__() method does take effect. Building on this, we modify __missing__() slightly so that, like the defaultdict class, the subclass stores a default value for non-existent keys:
>>> class Defaulting(dict):
...     def __missing__(self, key):
...         self[key] = 'default'
...         return 'default'
...
>>> d = Defaulting()
>>> d
{}
>>> d['foo']
'default'
>>> d
{'foo': 'default'}
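Unlike a defaultdict factory, which is called without arguments, __missing__() receives the key, so a subclass can compute the default from it. The SquareCache class below is a hypothetical example of this idea, not something taken from the standard library:

```python
class SquareCache(dict):
    """Hypothetical dict subclass that derives the default from the key."""

    def __missing__(self, key):
        value = key * key
        self[key] = value  # store it so the next lookup is an ordinary hit
        return value


squares = SquareCache()
first = squares[4]       # computed by __missing__ and stored
other = squares.get(5)   # get() bypasses __missing__, so nothing is stored

print(first, other, squares)  # 16 None {4: 16}
```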
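With this in mind, defaultdict can be reimplemented by hand, for example for a Python version that does not provide it. First, the __getitem__() method needs to call the __missing__() method when the key lookup fails: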
class defaultdict(dict):
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.__missing__(key)
Second, the __missing__() method needs to be implemented to set the default value:
class defaultdict(dict):
    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.__missing__(key)

    def __missing__(self, key):
        self[key] = value = self.default_factory()
        return value
Then the initializer of the defaultdict class, __init__(), needs to accept a type or a callable as a parameter:
class defaultdict(dict):
    def __init__(self, default_factory=None, *a, **kw):
        dict.__init__(self, *a, **kw)
        self.default_factory = default_factory

    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.__missing__(key)

    def __missing__(self, key):
        # Without a factory, behave like a plain dict.
        if self.default_factory is None:
            raise KeyError(key)
        self[key] = value = self.default_factory()
        return value
Finally, putting all of the above together, the following code works on both old and new Python versions:
try:
    from collections import defaultdict
except ImportError:
    # Fallback for Python versions that do not provide collections.defaultdict.
    class defaultdict(dict):
        def __init__(self, default_factory=None, *a, **kw):
            dict.__init__(self, *a, **kw)
            self.default_factory = default_factory

        def __getitem__(self, key):
            try:
                return dict.__getitem__(self, key)
            except KeyError:
                return self.__missing__(key)

        def __missing__(self, key):
            # Without a factory, behave like a plain dict.
            if self.default_factory is None:
                raise KeyError(key)
            self[key] = value = self.default_factory()
            return value
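On a modern interpreter the import always succeeds, so the fallback branch never runs. To exercise the hand-written logic anyway, the sketch below copies it into a class under a different, hypothetical name (FallbackDefaultDict) and adds the default_factory is None check from the docstring's pseudo-code:

```python
class FallbackDefaultDict(dict):
    """Hypothetical stand-in for the hand-written fallback class."""

    def __init__(self, default_factory=None, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)
        self.default_factory = default_factory

    def __getitem__(self, key):
        try:
            return dict.__getitem__(self, key)
        except KeyError:
            return self.__missing__(key)

    def __missing__(self, key):
        # Without a factory, behave like a plain dict.
        if self.default_factory is None:
            raise KeyError(key)
        self[key] = value = self.default_factory()
        return value


strings = ('puppy', 'kitten', 'puppy', 'puppy',
           'weasel', 'puppy', 'kitten', 'puppy')

counts = FallbackDefaultDict(int)
for s in strings:
    counts[s] += 1

# Initial contents are passed through to dict, after the factory argument.
seeded = FallbackDefaultDict(list, {'a': [1]})
seeded['b'].append(2)

print(counts)
print(seeded)
```

On interpreters where dict itself already calls __missing__() for subclasses, the __getitem__() override is redundant but harmless.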