Several ways to get a letter's position in the alphabet in Python, with a performance comparison
Sometimes we need to find a letter's position in the alphabet: A = 1, B = 2, C = 3, and so on. For example, in this problem, https://projecteuler.net/problem=42, one of the steps in the solution is to convert letters into their positions in the alphabet.
The most straightforward way to get a letter's position in the alphabet is to use the str.index or str.find method:
In [137]: "ABC".index('B')
Out[137]: 1

In [138]: "ABC".index('B')+1
Out[138]: 2

# Or pad one character at the front, so that index returns the letter's number directly:
In [139]: "_ABC".index("B")
Out[139]: 2
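The two methods differ only when the character is missing: str.index raises ValueError, while str.find returns -1. A minimal sketch of both, assuming single-character input; the helper names position_with_index and position_with_find are made up for this illustration:

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def position_with_index(letter):
    """Return the 1-based position; raises ValueError if the letter is absent."""
    return ALPHABET.index(letter) + 1

def position_with_find(letter):
    """Return the 1-based position, or 0 if the letter is absent (find gives -1)."""
    return ALPHABET.find(letter) + 1

print(position_with_index("B"))  # 2
print(position_with_find("B"))   # 2
print(position_with_find("?"))   # 0

try:
    position_with_index("?")
except ValueError as exc:
    print("index raised ValueError:", exc)
```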
I also wondered: would converting the alphabet into a list or tuple and then indexing improve performance? Or would it be better to store the letters and their numbers as key-value pairs in a dictionary?
I also came across another method two days ago:
In [140]: ord('B')-64
Out[140]: 2
ord and chr are both built-in Python functions. ord converts an ASCII character into its number in the ASCII table, and chr converts such a number back into a character.
Uppercase letters start at 65 in the ASCII table, so subtracting 64 gives exactly the letter's position in the alphabet. Lowercase letters start at 97, so subtracting 96 gives the corresponding position.
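The two offsets can be combined into one small helper. This is only a sketch: the name letter_position is hypothetical, and rejecting anything outside A-Z and a-z with a ValueError is my own assumption about what is wanted.

```python
def letter_position(ch):
    """Return the 1-based alphabet position of an ASCII letter."""
    if "A" <= ch <= "Z":
        return ord(ch) - 64  # 'A' is 65 in the ASCII table
    if "a" <= ch <= "z":
        return ord(ch) - 96  # 'a' is 97 in the ASCII table
    raise ValueError("not an ASCII letter: %r" % ch)

print(letter_position("B"))  # 2
print(letter_position("b"))  # 2
print(chr(2 + 64))           # B (chr reverses the conversion)
```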
Which approach might be better in terms of performance? I wrote the code to test it:
az = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
_az = "_ABCDEFGHIJKLMNOPQRSTUVWXYZ"
azlist = list(az)
azdict = dict(zip(az, range(1, 27)))
text = az*1000000  # this is the test data

# str.find and str.index perform the same, so only str.index is tested here.

def azindexstr(text):
    # str.index on the plain alphabet, plus 1
    for r in text:
        az.index(r)+1

def _azindexstr(text):
    # str.index on the padded alphabet, no addition needed
    for r in text:
        _az.index(r)

def azindexlist(text):
    # list.index
    for r in text:
        azlist.index(r)

def azindexdict(text):
    # dictionary lookup with dict.get
    for r in text:
        azdict.get(r)

def azindexdict2(text):
    # dictionary lookup with dict[key]
    for r in text:
        azdict[r]

def azord(text):
    # ord minus the offset 64
    for r in text:
        ord(r)-64

def azand64(text):
    # ord modulo 64
    for r in text:
        ord(r)%64
Copy and paste the code above into IPython, then use the %timeit magic function to measure the performance of each method. IPython is an interactive Python interpreter that comes with many practical features, such as the %timeit function used in this article. To install it, run pip install ipython.
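Without IPython, a similar measurement can be made with the standard-library timeit module. The sketch below assumes Python 3 and is not the setup used for the results in this article: the sample is far smaller, only two of the methods are shown, and the names with_ord, with_dict and best_time are made up for the example.

```python
import timeit

AZ = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
AZ_DICT = dict(zip(AZ, range(1, 27)))
SAMPLE = AZ * 1000  # much smaller than the article's test data

def with_ord(text):
    for r in text:
        ord(r) - 64

def with_dict(text):
    for r in text:
        AZ_DICT[r]

def best_time(func, repeat=3, number=10):
    """Best average time per call, in seconds, over `repeat` runs."""
    runs = timeit.repeat(lambda: func(SAMPLE), repeat=repeat, number=number)
    return min(runs) / number

for func in (with_ord, with_dict):
    print(f"{func.__name__}: {best_time(func):.6f} s per call")
```

Taking the minimum over several repeats mirrors the "best of 3" figure that %timeit reports.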
The following is the result data of my test:
In [147]: %timeit azindexstr(text)
1 loop, best of 3: 9.09 s per loop

In [148]: %timeit _azindexstr(text)
1 loop, best of 3: 8.1 s per loop

In [149]: %timeit azindexlist(text)
1 loop, best of 3: 17.1 s per loop

In [150]: %timeit azindexdict(text)
1 loop, best of 3: 4.54 s per loop

In [151]: %timeit azindexdict2(text)
1 loop, best of 3: 1.99 s per loop

In [152]: %timeit azord(text)
1 loop, best of 3: 2.94 s per loop

In [153]: %timeit azand64(text)
1 loop, best of 3: 4.56 s per loop
The results show that list.index is the slowest, which surprised me. Moreover, list.index searches the list from the start, so it becomes even slower when the list holds a lot of data. dict[r] is faster than dict.get(r), but for a key that does not exist dict[r] raises a KeyError, whereas dict.get raises no error (it returns None), so it is more fault tolerant.
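The difference between the two dictionary lookups shows up only for a missing key. A small sketch, where "?" stands in for any character that is not in the dictionary:

```python
AZ_DICT = dict(zip("ABCDEFGHIJKLMNOPQRSTUVWXYZ", range(1, 27)))

print(AZ_DICT["B"])         # 2
print(AZ_DICT.get("B"))     # 2
print(AZ_DICT.get("?"))     # None
print(AZ_DICT.get("?", 0))  # 0 (an explicit default instead of None)

try:
    AZ_DICT["?"]
except KeyError as exc:
    print("KeyError for missing key:", exc)
```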
The ord(r)-64 approach is also fast (second only to dict[r] in this test) and is probably the most convenient to use, since no lookup data has to be built first.
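Coming back to the Project Euler step mentioned at the start, here is a minimal sketch of turning a word into the sum of its letter positions with ord. It assumes the word contains only uppercase A-Z, and word_value is a name chosen for this example:

```python
def word_value(word):
    """Sum of the alphabet positions of the letters in an uppercase word."""
    return sum(ord(ch) - 64 for ch in word)

print(word_value("SKY"))  # 55, i.e. 19 + 11 + 25
```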