Introduction to Python’s built-in module collections
collections is a module in Python's standard library that provides many useful collection classes.
1. namedtuple
Python provides many useful basic types, such as the immutable tuple type, which we can easily use to represent a two-dimensional vector.
>>> v = (2,3)
Although (2,3) holds the two coordinates of a vector, without additional explanation it is hard to tell at a glance that this tuple represents a coordinate.
Defining a class just for this would be overkill. This is where namedtuple comes in handy.
>>> from collections import namedtuple
>>> Vector = namedtuple('Vector', ['x', 'y'])
>>> v = Vector(2,3)
>>> v.x
2
>>> v.y
3
namedtuple is a factory function that creates a custom tuple subclass with a fixed number of fields, whose elements can be referenced by attribute name as well as by index.
In this way, we can use namedtuple to easily define a data type that keeps the immutability of tuple and can be accessed by attribute, which makes it very convenient to use.
We can verify the type of the created Vector object.
>>> type(v)
<class '__main__.Vector'>
>>> isinstance(v, Vector)
True
>>> isinstance(v, tuple)
True
Similarly, if you want to use coordinates and radius to represent a circle, you can also use namedtuple to define:
>>> Circle = namedtuple('Circle', ['x', 'y', 'r']) # namedtuple('name', ['list of field names'])
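As a minimal sketch of what the Circle type defined above can do, the snippet below reads fields, shows that instances are immutable, and uses the documented helper methods _replace() and _asdict(). The variable names (c, bigger, as_dict, mutated) are illustrative only.

```python
from collections import namedtuple

Circle = namedtuple('Circle', ['x', 'y', 'r'])
c = Circle(0, 0, 5)

# Fields can be read by name, and ordinary tuple unpacking still works.
x, y, r = c

# Instances are immutable: assigning to a field raises AttributeError.
try:
    c.r = 10
    mutated = True
except AttributeError:
    mutated = False

# _replace() returns a new instance with some fields changed.
bigger = c._replace(r=10)

# _asdict() maps field names to values; dict() normalizes the result
# across Python versions.
as_dict = dict(c._asdict())

print(mutated)   # False
print(bigger)    # Circle(x=0, y=0, r=10)
print(as_dict)   # {'x': 0, 'y': 0, 'r': 5}
```

Because _replace() builds a new object, the original c is left untouched.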
2. deque
In data structures, queues and stacks are two very important types: one is first in, first out, and the other is last in, first out. In Python, a list gives fast access by index and fast appends and pops at the end, but inserting or deleting at the head is slow: the list is stored contiguously, so every later element has to be shifted, and the cost grows with the amount of data.
deque (double-ended queue) is a structure designed for efficient insertion and deletion at both ends. It is well suited to implementing queues and stacks.
>>> from collections import deque
>>> deq = deque([1, 2, 3])
>>> deq.append(4)
>>> deq
deque([1, 2, 3, 4])
>>> deq.appendleft(5)
>>> deq
deque([5, 1, 2, 3, 4])
>>> deq.pop()
4
>>> deq.popleft()
5
>>> deq
deque([1, 2, 3])
In addition to the append() and pop() methods of list, deque also supports appendleft() and popleft(), so you can add or remove elements at the head very efficiently.
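To make the queue and stack use cases concrete, here is a small sketch. It also uses the optional maxlen argument of deque, which the text above does not cover; the variable names are illustrative.

```python
from collections import deque

# FIFO queue: append on the right, pop from the left.
queue = deque()
queue.append('a')
queue.append('b')
queue.append('c')
first = queue.popleft()

# LIFO stack: append and pop on the same (right) end.
stack = deque()
stack.append(1)
stack.append(2)
top = stack.pop()

# A bounded deque keeps only the most recent maxlen items.
recent = deque(maxlen=3)
for n in range(5):
    recent.append(n)

print(first, list(queue))    # a ['b', 'c']
print(top, list(stack))      # 2 [1]
print(list(recent))          # [2, 3, 4]
```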
3. defaultdict
When using the dict type, a KeyError is raised if the referenced key does not exist. If you want a default value to be returned when the key does not exist, you can use defaultdict.
>>> from collections import defaultdict
>>> dd = defaultdict(lambda: 'defaultvalue')
>>> dd['key1'] = 'a'
>>> dd['key1']
'a'
>>> dd['key2'] # key2 is not defined, so the default value is returned
'defaultvalue'
Note that the default value is produced by calling a function, and that function is passed in when the defaultdict object is created.
Apart from supplying (and storing) the default value when a key does not exist, defaultdict behaves exactly like dict.
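A common use of defaultdict is grouping, with list as the factory function. The sketch below also shows that only a lookup with [] triggers the factory and inserts the key, while .get() and the in operator do not. The sample data and names are made up for illustration.

```python
from collections import defaultdict

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

# list is the factory, so each missing key starts as an empty list.
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

# Reading a missing key with [] inserts it; .get() and "in" do not.
probe = defaultdict(int)
got = probe.get('x')      # None, nothing inserted
present = 'y' in probe    # False, nothing inserted
zero = probe['z']         # 0, and 'z' is now a key

print(dict(by_letter))
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
print(list(probe))        # ['z']
```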
4. OrderedDict
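In Python versions before 3.7, the keys of a dict are not guaranteed to be in any particular order, so when iterating over a dict we cannot rely on the order of the keys. (Since Python 3.7, the built-in dict preserves insertion order as part of the language.)
If you want to keep the order of keys explicitly, or need order-related operations such as popitem(last=False), you can use OrderedDict.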
>>> from collections import OrderedDict
>>> d = dict([('a', 1), ('b', 2), ('c', 3)])
>>> d # on older Python versions, dict key order is not guaranteed
{'a': 1, 'c': 3, 'b': 2}
>>> od = OrderedDict([('a', 1), ('b', 2), ('c', 3)])
>>> od # OrderedDict keeps its keys in order
OrderedDict([('a', 1), ('b', 2), ('c', 3)])
Note that the keys of an OrderedDict are arranged in insertion order, not sorted by the keys themselves.
>>> od = OrderedDict()
>>> od['z'] = 1
>>> od['y'] = 2
>>> od['x'] = 3
>>> list(od.keys()) # returned in the order the keys were inserted
['z', 'y', 'x']
OrderedDict can be used to implement a FIFO (first in, first out) dict: when the capacity limit is exceeded, the earliest added key is deleted first.
from collections import OrderedDict

class LastUpdatedOrderedDict(OrderedDict):

    def __init__(self, capacity):
        super(LastUpdatedOrderedDict, self).__init__()
        self._capacity = capacity

    def __setitem__(self, key, value):
        containsKey = 1 if key in self else 0
        # Adding a new key to a full dict: evict the oldest entry first.
        if len(self) - containsKey >= self._capacity:
            last = self.popitem(last=False)
            print('remove:', last)
        if containsKey:
            # Existing key: delete it so it is re-inserted at the end.
            del self[key]
            print('set:', (key, value))
        else:
            print('add:', (key, value))
        OrderedDict.__setitem__(self, key, value)
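To see the eviction behavior in action, here is a compact, self-contained sketch of the same idea without the print calls. FIFODict is a hypothetical name chosen for this example, and it is a simplified variant rather than the class above.

```python
from collections import OrderedDict


class FIFODict(OrderedDict):
    """Hypothetical name: a dict that evicts its oldest key when full."""

    def __init__(self, capacity):
        super().__init__()
        self._capacity = capacity

    def __setitem__(self, key, value):
        if key in self:
            # Re-setting an existing key moves it to the newest position.
            del self[key]
        elif len(self) >= self._capacity:
            # Full: drop the oldest entry (the first one inserted).
            self.popitem(last=False)
        super().__setitem__(key, value)


cache = FIFODict(capacity=2)
cache['a'] = 1
cache['b'] = 2
cache['c'] = 3      # dict is full, so 'a' is evicted
cache['b'] = 20     # 'b' already exists: moved to the end, nothing evicted

print(list(cache.items()))   # [('c', 3), ('b', 20)]
```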
5. ChainMap
ChainMap can string together a group of dicts to form one logical dict. ChainMap itself behaves like a dict (it is a mapping, though not a dict subclass), but on lookup it searches the underlying dicts in order.
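As a minimal sketch of that lookup order, the example below chains two plain dicts. It also shows that writes go only to the first mapping and that the underlying dicts are referenced rather than copied. The names defaults, overrides and settings are illustrative.

```python
from collections import ChainMap

defaults = {'color': 'red', 'user': 'guest'}
overrides = {'user': 'bob'}

settings = ChainMap(overrides, defaults)

# Lookups try each mapping from left to right.
print(settings['user'])    # bob  (found in overrides)
print(settings['color'])   # red  (falls through to defaults)

# Writes only touch the first mapping.
settings['color'] = 'blue'
print(overrides)           # {'user': 'bob', 'color': 'blue'}
print(defaults)            # {'color': 'red', 'user': 'guest'}

# The underlying dicts are referenced, not copied,
# so later changes to them show through.
defaults['lang'] = 'en'
print(settings['lang'])    # en
```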
When is ChainMap most appropriate? For example, applications often need to accept parameters, which can come from the command line, from environment variables, or from built-in defaults. We can use ChainMap to look parameters up by priority: first check the command-line arguments; if a parameter was not passed there, check the environment variables; if it is not there either, use the default.
The following code demonstrates how to find the two parameters user and color.
from collections import ChainMap
import os, argparse

# Build the default parameters:
defaults = {
    'color': 'red',
    'user': 'guest'
}

# Build the command-line parameters:
parser = argparse.ArgumentParser()
parser.add_argument('-u', '--user')
parser.add_argument('-c', '--color')
namespace = parser.parse_args()
command_line_args = { k: v for k, v in vars(namespace).items() if v }

# Combine them into a ChainMap:
combined = ChainMap(command_line_args, os.environ, defaults)

# Print the parameters:
print('color=%s' % combined['color'])
print('user=%s' % combined['user'])
When no parameters are given, the defaults are printed:
$ python3 use_chainmap.py
color=red
user=guest
When the command line parameters are passed in, the command line parameters are used first:
$ python3 use_chainmap.py -u bob
color=red
user=bob
When both command-line parameters and environment variables are passed in, the command-line parameters take priority:
$ user=admin color=green python3 use_chainmap.py -u bob
color=green
user=bob
6. Counter
Counter is a simple counter, for example for counting the number of occurrences of each character:
>>> from collections import Counter
>>> s = 'abbcccdddd'
>>> Counter(s)
Counter({'d': 4, 'c': 3, 'b': 2, 'a': 1})
Counter is actually a subclass of dict.
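Because it is a dict subclass, Counter supports the usual dict operations and adds a few counting helpers of its own. The sketch below uses most_common(), update() and the + operator; the variable names are illustrative.

```python
from collections import Counter

counts = Counter('abbcccdddd')

# The n most frequent elements, most frequent first.
top_two = counts.most_common(2)

# A missing element counts as 0 and is not inserted.
missing = counts['z']
z_present = 'z' in counts

# update() adds more observations to the existing counts.
counts.update('aa')
total_a = counts['a']

# Counters can be added together element by element.
combined = Counter(a=1, b=2) + Counter(b=3, c=4)

print(top_two)              # [('d', 4), ('c', 3)]
print(missing, z_present)   # 0 False
print(total_a)              # 3
print(combined)             # Counter({'b': 5, 'c': 4, 'a': 1})
```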
7. Summary
The collections module provides some useful collection classes, which can be selected as needed.