In Python, how fast can the expression 1000000000000000 in range(1000000000000001) be executed?
The simplest and crudest way to determine whether an element x returns true, its time complexity is O(n), the theoretical time complexity of using the hash algorithm is O(1), and the time complexity of binary search is O(log n), then what will Python use? Which algorithm is used to achieve this?
Let’s do an experiment first:
#python2 timeit.timeit('1000000000 in range(0,1000000000,10)', number=1) 5.50357640805305 timeit.timeit('1000000000 in xrange(0,1000000000,10)', number=1) 2.3025200839183526 # python3 import timeit timeit.timeit('1000000000 in range(0,1000000000,10)', number=1) 4.490355838248402e-06
We all know that the range function in python2 returns a list object, loading all elements into memory at one time, so execute the first expression When executing, the system will suddenly feel very stuck, and it takes more than 5 seconds.
xrange is similar to the range function in python3, both return an iterator object, but their execution results are very different, which is surprising. The third expression takes nearly 0 seconds. Why is there such a big difference between xrange in python2 and range function in python3? In order to understand the mystery, we need to understand how the in operation is performed. According to the rules of Python documentation in:
If the class implements the contains() method, then as long as y.contains(x) returns true then x in y also returns true, and vice versa.
The contains() method is not implemented, but the iter() method is implemented. Then during the iteration process, if there is a certain value z==x, it will return true, otherwise it will be false.
If neither of the above two methods are implemented, look at the getitem() method. If there is an index i such that x==y[i], return true, otherwise return false.
After understanding the rules of in, let's first take a look at the methods provided by xrange:
dir(xrange) ['__class__','__getitem__', '__hash__', '__init__', '__iter__', '__len__', '__new__', ...]
Yes, the xrange function only implements getitem and iter to determine whether x is needed in y The comparison is performed iteratively value by value, which means that the time complexity of xrange is O(n).
Let’s take a look at the range methods of python3:
dir(range) ['__class__', '__contains__', '__getitem__', '__iter__', 'count', 'index', 'start', 'step', 'stop', ...]
Range provides many more attributes than xrange. It not only implements getitem and iter, but also implements contains, so it will be called first contains method, in addition, it also provides three attributes start, stop, step. So why exactly does it execute so fast? Let’s take a look at how the contains method is implemented.
In Python3, contains does not iterate and compare values one by one, but adopts such a logic:
First check whether x is between the start and stop ranges: start <= x < stop
If it is within this interval, then calculate according to step whether x happens to fall on a certain value in the xrange interval. Here we use the modulo method to determine: - start) % step == 0
The truth is revealed at this moment, the time complexity of xrange is O(1), that is to say, no matter how big the stop value in xrange(start, stop, step) is, the time complexity It's all a constant. Therefore, the range method in python3 can not only save memory, but also perform more efficiently, so don’t worry about whether to learn Python2 or Python3.
The above is the detailed content of Why is 1000000000000000 in range(1000000000000001) so fast in Python3. For more information, please follow other related articles on the PHP Chinese website!