About the subscript problem of Sequence slicing and its solution

零下一度
Release: 2017-06-17 11:00:02

This article looks at how Python handles slice subscripts that fall outside a sequence, using example code and a walk through the interpreter source. It should be a useful reference for anyone curious about this behavior. Let's take a look.

Preface

In Python, slicing is a frequently used syntax. Whether the object is a tuple, a list, or a string, the general form is:

sequence[ilow:ihigh:step] # ilow, ihigh and step may each be omitted; for simplicity, step is left out of the discussion below

Let's briefly demonstrate the usage:


sequence = [1,2,3,4,5]
ilow, ihigh = 1, 3   # example bounds
sequence[ilow:ihigh] # from ilow up to ihigh-1
sequence[ilow:]      # from ilow to the end
sequence[:ihigh]     # from the start up to ihigh-1
sequence[:]          # copy the whole list

The syntax is concise and easy to understand, and it works well in everyday use. But when we slice, we tend to follow a couple of habitual rules:

  • ilow and ihigh are both smaller than the length of the sequence

  • ilow < ihigh

In most cases, following these rules is what gives us the result we expect. But what if we don't follow them? What happens to the slice?

Whether we are using a tuple, a list, or a string, we fetch a single element with the following syntax:


sequence = [1,2,3,4,5]
print sequence[1] # prints 2
print sequence[2] # prints 3

Let's call the 1 and 2 above subscripts. Whether the object is a tuple, a list, or a string, a subscript gives us the corresponding value, but if the subscript falls outside the length of the object, an IndexError is raised:


sequence = [1,2,3,4,5]
print sequence[15]

### Output ###
Traceback (most recent call last):
 File "test.py", line 2, in <module>
 print sequence[15]
IndexError: list index out of range

So what about slicing? The two syntaxes look very similar. Suppose ilow and ihigh are 10 and 20: what is the result?

Reproducing the scenario


# version: python2.7

a = [1, 2, 3, 5]
print a[10:20] # will this raise an exception?

Both 10 and 20 are far beyond the length of a. Given the previous code and prior experience, we would expect this to raise an IndexError, so let's open a terminal and test it:


>>> a = [1, 2, 3, 5]
>>> print a[10:20]
[]

The result is actually [], which is interesting. Is it only lists that behave this way? What about strings and tuples?


>>> s = '23123123123'
>>> s[400:2000]
''
>>> t = (1, 2, 3,4)
>>> print t[200: 1000]
()

The results match the list case: each type returns its own empty value.
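The contrast between indexing and slicing can be checked side by side. The following is a minimal sketch written so that it also runs on Python 3; the helper name try_index is an assumption introduced only for this illustration.

```python
def try_index(seq, i):
    """Return seq[i], or the string 'IndexError' if the subscript is out of range."""
    try:
        return seq[i]
    except IndexError:
        return "IndexError"


for seq in ([1, 2, 3, 5], "23123123123", (1, 2, 3, 4)):
    # Indexing far past the end raises; slicing far past the end does not.
    print(type(seq).__name__ + ": " + str(try_index(seq, 1000)) + " / " + repr(seq[400:2000]))
# list: IndexError / []
# str: IndexError / ''
# tuple: IndexError / ()
```

Each type gives back an empty value of its own kind from the slice, while the single subscript fails on all three.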

That was a surprise. Instead of raising an IndexError, the slice simply returns an empty value. So although the two syntaxes look alike, what happens behind them must be different. Let's try to explain the result.

Principle Analysis

Before we reveal the answer, we first need to figure out how Python handles this slice. The dis module can help:

############# Slicing ################
[root@iZ23pynfq19Z ~]# cat test.py
a = [11,2,3,4]
print a[20:30]

# Result:
[root@iZ23pynfq19Z ~]# python -m dis test.py 
 1   0 LOAD_CONST    0 (11)
    3 LOAD_CONST    1 (2)
    6 LOAD_CONST    2 (3)
    9 LOAD_CONST    3 (4)
    12 BUILD_LIST    4
    15 STORE_NAME    0 (a)

 2   18 LOAD_NAME    0 (a)
    21 LOAD_CONST    4 (20)
    24 LOAD_CONST    5 (30)
    27 SLICE+3    
    28 PRINT_ITEM   
    29 PRINT_NEWLINE  
    30 LOAD_CONST    6 (None)
    33 RETURN_VALUE 

############# Single-subscript access ################
[root@gitlab ~]# cat test2.py
a = [11,2,3,4]
print a[20]

# Result:
[root@gitlab ~]# python -m dis test2.py
 1   0 LOAD_CONST    0 (11)
    3 LOAD_CONST    1 (2)
    6 LOAD_CONST    2 (3)
    9 LOAD_CONST    3 (4)
    12 BUILD_LIST    4
    15 STORE_NAME    0 (a)

 2   18 LOAD_NAME    0 (a)
    21 LOAD_CONST    4 (20)
    24 BINARY_SUBSCR  
    25 PRINT_ITEM   
    26 PRINT_NEWLINE  
    27 LOAD_CONST    5 (None)
    30 RETURN_VALUE

Here is a brief introduction to the dis module. Experienced Python developers know that when Python runs a script there is also a compilation step. The result of that compilation is the pyc file we often see, which stores the bytecode of a code object, and dis displays this bytecode in a more readable form so that we can follow the execution. The columns of the dis output are as follows:

  • The first column is the line number in the original source code.
  • The second column is the offset of the bytecode: the first LOAD_CONST is at offset 0, and so on.
  • The third column is the human-readable name of the instruction, provided for programmers.
  • The fourth column is the argument of the instruction.
  • The fifth column, in parentheses, is the resolved value of that argument.

I won't go into the earlier instructions; they just load constants and store variables. The main difference is that the slice in test.py is implemented with the bytecode SLICE+3, while the single-subscript access in test2.py is implemented with BINARY_SUBSCR. As we guessed, similar syntax compiles to completely different code. Since our topic is slicing (SLICE+3), we will not expand on BINARY_SUBSCR; interested readers can look at the relevant source for the details. Location: Python/ceval.c. Now let's discuss SLICE+3:
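Before turning to the C source, the same comparison can be made programmatically rather than with python -m dis. This is a minimal sketch that assumes Python 3.4 or later, where dis.get_instructions is available; the helper name opnames is hypothetical. Python 3 has no SLICE+3 opcode, so the names it prints will differ from the 2.7 listings above and vary between interpreter versions.

```python
import dis


def opnames(expr):
    """Return the opcode names the running interpreter emits for an expression."""
    code = compile(expr, "<sketch>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]


# Compare single-subscript access with slicing on the current interpreter.
print(opnames("a[20]"))
print(opnames("a[20:30]"))
```

No expected output is shown because the exact opcode names depend on the Python version running the sketch.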

/* From: python2.7 Python/ceval.c */

// Step 1:
PyEval_EvalFrameEx(PyFrameObject *f, int throwflag)
{
  .... // n lines omitted
  TARGET_WITH_IMPL_NOARG(SLICE, _slice)
  TARGET_WITH_IMPL_NOARG(SLICE_1, _slice)
  TARGET_WITH_IMPL_NOARG(SLICE_2, _slice)
  TARGET_WITH_IMPL_NOARG(SLICE_3, _slice)
  _slice:
  {
   if ((opcode-SLICE) & 2)
    w = POP();
   else
    w = NULL;
   if ((opcode-SLICE) & 1)
    v = POP();
   else
    v = NULL;
   u = TOP();
   x = apply_slice(u, v, w); // v is ilow and w is ihigh; pass them to apply_slice
   Py_DECREF(u);
   Py_XDECREF(v);
   Py_XDECREF(w);
   SET_TOP(x);
   if (x != NULL) DISPATCH();
   break;
  }

 .... // n lines omitted
}

// Step 2:
apply_slice(PyObject *u, PyObject *v, PyObject *w) /* return u[v:w] */
{
 PyTypeObject *tp = u->ob_type;  
 PySequenceMethods *sq = tp->tp_as_sequence;

 if (sq && sq->sq_slice && ISINDEX(v) && ISINDEX(w)) { // type check: v and w must be absent or int/long objects
  Py_ssize_t ilow = 0, ihigh = PY_SSIZE_T_MAX;
  if (!_PyEval_SliceIndex(v, &ilow))    // validate v and store its converted value in ilow
   return NULL;
  if (!_PyEval_SliceIndex(w, &ihigh))    // same for w and ihigh
   return NULL;
  return PySequence_GetSlice(u, ilow, ihigh);  // slice u through its sequence slice function
 }
 else {
  PyObject *slice = PySlice_New(v, w, NULL);
  if (slice != NULL) {
   PyObject *res = PyObject_GetItem(u, slice);
   Py_DECREF(slice);
   return res;
  }
  else
   return NULL;
 }
}

// Step 3:
PySequence_GetSlice(PyObject *s, Py_ssize_t i1, Py_ssize_t i2)
{
 PySequenceMethods *m;
 PyMappingMethods *mp;

 if (!s) return null_error();

 m = s->ob_type->tp_as_sequence;
 if (m && m->sq_slice) {
  if (i1 < 0 || i2 < 0) {
   if (m->sq_length) {
    // Simple normalization: if the left or right subscript is negative, add the sequence length to it
    Py_ssize_t l = (*m->sq_length)(s);
    if (l < 0)
     return NULL;
    if (i1 < 0)
     i1 += l;
    if (i2 < 0)
     i2 += l;
   }
  }
  // Actually call the object's sq_slice function to perform the slice
  return m->sq_slice(s, i1, i2);
 } else if ((mp = s->ob_type->tp_as_mapping) && mp->mp_subscript) {
  PyObject *res;
  PyObject *slice = _PySlice_FromIndices(i1, i2);
  if (!slice)
   return NULL;
  res = mp->mp_subscript(s, slice);
  Py_DECREF(slice);
  return res;
 }

 return type_error("'%.200s' object is unsliceable", s);
}
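The bookkeeping done in these three steps can be modelled in a few lines of Python. This is a rough sketch, not the interpreter's code: the names slice_variant and normalize_bounds are hypothetical, and sys.maxsize stands in for the C constant PY_SSIZE_T_MAX. It covers only the default bounds and the single shift applied to negative subscripts; no clamping happens at this stage.

```python
import sys

PY_SSIZE_T_MAX = sys.maxsize  # Python-level stand-in for the C constant


def slice_variant(has_low, has_high):
    """Return n for the SLICE+n opcode: bit 0 = lower bound, bit 1 = upper bound."""
    return (1 if has_low else 0) | (2 if has_high else 0)


def normalize_bounds(length, low=None, high=None):
    """Model the (i1, i2) pair that reaches sq_slice.

    Missing bounds default to 0 and PY_SSIZE_T_MAX, as in apply_slice.
    A negative bound is shifted once by the length, as in PySequence_GetSlice.
    """
    i1 = 0 if low is None else low
    i2 = PY_SSIZE_T_MAX if high is None else high
    if i1 < 0:
        i1 += length
    if i2 < 0:
        i2 += length
    return i1, i2


print(slice_variant(True, True))     # a[i:j] -> 3, i.e. SLICE+3
print(normalize_bounds(4, 10, 20))   # (10, 20): still out of range here
print(normalize_bounds(4, -10, 2))   # (-6, 2): still negative after one shift
```

The out-of-range values survive this stage untouched, which is why the type-specific sq_slice function has to deal with them.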

Although the C code above is a bit long, the key places are commented, and those are the only parts we need to pay attention to. From it we can see that the call that is finally executed is m->sq_slice(s, i1, i2). This sq_slice is a bit special, though, because different object types supply different functions for it. The corresponding functions are:

// string object
stringobject.c: (ssizessizeargfunc)string_slice, /*sq_slice*/

// list object
listobject.c: (ssizessizeargfunc)list_slice,  /* sq_slice */

// tuple object
tupleobject.c: (ssizessizeargfunc)tupleslice,  /* sq_slice */

Because the string, list, and tuple implementations are roughly the same, we only need to analyze one of them. Here is the list's slicing function:

/* From listobject.c */
static PyObject *
list_slice(PyListObject *a, Py_ssize_t ilow, Py_ssize_t ihigh)
{
 PyListObject *np;
 PyObject **src, **dest;
 Py_ssize_t i, len;
 if (ilow < 0)
  ilow = 0;
 else if (ilow > Py_SIZE(a))    // if ilow is greater than the length of a, reset it to that length
  ilow = Py_SIZE(a);
 if (ihigh < ilow)  
  ihigh = ilow;
 else if (ihigh > Py_SIZE(a))    // if ihigh is greater than the length of a, reset it to that length
  ihigh = Py_SIZE(a);
 len = ihigh - ilow;
 np = (PyListObject *) PyList_New(len); // create a new list object of size ihigh - ilow
 if (np == NULL)
  return NULL;

 src = a->ob_item + ilow;
 dest = np->ob_item;
 for (i = 0; i < len; i++) {    // copy the members of a in that range into the new list
  PyObject *v = src[i];
  Py_INCREF(v);
  dest[i] = v;
 }
 return (PyObject *)np;
}
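The clamping at the top of list_slice is the part that matters for our question, and it is easy to mirror in Python. The following is a small sketch of that logic only, written for Python 3; the function name list_slice_bounds is an assumption made for illustration and is not part of CPython.

```python
def list_slice_bounds(length, ilow, ihigh):
    """Clamp slice bounds the way list_slice does and return (ilow, ihigh)."""
    if ilow < 0:
        ilow = 0
    elif ilow > length:
        ilow = length
    if ihigh < ilow:
        ihigh = ilow
    elif ihigh > length:
        ihigh = length
    return ilow, ihigh


a = [1, 2, 3, 5]
low, high = list_slice_bounds(len(a), 10, 20)
print(low, high)                          # 4 4
print([a[i] for i in range(low, high)])   # []
```

Because both bounds collapse to the length of the list, the copy loop runs zero times and no element is ever accessed out of range.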

Conclusion

As the sq_slice implementation above shows, when a slice's left or right subscript is greater than the length of the sequence, it is reset to the length of the sequence. So for our original slice, print a[10:20], what actually runs is print a[4:4], which yields an empty list. After this analysis, a slice whose subscripts exceed the object's length should no longer be confusing.
