This article looks at how local variables in Python functions are handled during execution, with a brief analysis of how to use function variables well. It may serve as a useful reference.
Preface
During code review over the past couple of days, I came across this code:

    # Pseudocode
    import somelib

    class A(object):
        def load_project(self):
            self.project_code_to_name = {}
            for project in somelib.get_all_projects():
                self.project_code_to_name[project] = project
            ...

The intent is simple: insert each project returned by somelib.get_all_projects into self.project_code_to_name.
However, I had the impression that there was room for optimization, so I proposed an adjustment:

    import somelib

    class A(object):
        def load_project(self):
            project_code_to_name = {}
            for project in somelib.get_all_projects():
                project_code_to_name[project] = project
            self.project_code_to_name = project_code_to_name
            ...

The plan is simple: first define a local variable project_code_to_name, fill it, and assign it to self.project_code_to_name once the loop is done.
In subsequent tests this did turn out to be faster. Now that the result is known, the next step is to explore the reason.
Local variables
A claim repeated in many places online, and even in many books, is that accessing local variables is much faster. It sounds plausible at first glance, and it usually comes with a pile of benchmark data, so people simply memorize it without knowing why it holds.
In fact, the claim has its limits and does not apply universally. So let's first understand the statement and why everyone likes to repeat it.
First look at the code to understand what local variables are:

    # coding: utf8
    a = 1

    def test(b):
        c = 'test'
        print a   # global variable
        print b   # local variable
        print c   # local variable

    test(3)

    # Output
    1
    3
    test

Simply put, a local variable is only valid within the scope of the function it lives in; once execution leaves that scope, it is reclaimed.
To understand why local variables are fast, we need to look at the relationship between Python functions and local variables; without it, it is hard to see where the speed comes from.
To keep things concrete, we'll explain it with the code above, starting with the dis output for the body of test:

    # CALL_FUNCTION
      5           0 LOAD_CONST               1 ('test')
                  3 STORE_FAST               1 (c)

      6           6 LOAD_GLOBAL              0 (a)
                  9 PRINT_ITEM
                 10 PRINT_NEWLINE

      7          11 LOAD_FAST                0 (b)
                 14 PRINT_ITEM
                 15 PRINT_NEWLINE

      8          16 LOAD_FAST                1 (c)
                 19 PRINT_ITEM
                 20 PRINT_NEWLINE
                 21 LOAD_CONST               0 (None)
                 24 RETURN_VALUE

The LOAD_XXX opcodes, as the names suggest, say where each variable is fetched from. LOAD_GLOBAL clearly reads a global, but what is LOAD_FAST? It seems like it should be called LOAD_LOCAL, right? It is called LOAD_FAST because local variables are read from an array called fastlocals, hence the name (I guess).
Python function execution
Constructing and running a Python function is neither trivially simple nor hopelessly complicated, because many cases have to be distinguished: functions versus methods, whether there are parameters, which kinds of parameters, whether there are variable-length parameters, and whether there are keyword parameters. It is impossible to cover everything in detail, but the general flow can be sketched (ignoring the details of parameter handling).
Following the call all the way down leads to fast_function, which is invoked like this:

    // ceval.c -> call_function
    x = fast_function(func, pp_stack, n, na, nk);

Here func is the test function being called, na is the number of positional arguments, nk is the number of keyword arguments, and n = na + 2 * nk.
fast_function then roughly does the following:
Initialize a few variables, including co (the function's code object) and globals (the function's global namespace).
Check whether the number of positional arguments passed in == the number of positional parameters in the function definition && no keyword arguments were passed in.
If so, use co and globals to create a new frame object f.
We know that this step is performed inside CALL_FUNCTION, so the arguments must have been pushed before it:

     12          27 LOAD_NAME                2 (test)
                 30 LOAD_CONST               4 (3)
                 33 CALL_FUNCTION            1
                 36 POP_TOP
                 37 LOAD_CONST               1 (None)
                 40 RETURN_VALUE

You can see 30 LOAD_CONST just above CALL_FUNCTION. If you are interested, try passing a few more arguments: you will find that the arguments are loaded one after another through LOAD_CONST, so the question of how the arguments are found answers itself:

    // fast_function
    fastlocals = f->f_localsplus;
    stack = (*pp_stack) - n;
    for (i = 0; i < ...

Remember where the n here comes from? Recall n = na + 2 * nk; from earlier.
This code simply moves back n slots from pp_stack to find the position where the arguments were originally pushed.
So if n is supposed to be the number of positional arguments plus the keyword arguments, what does 2 * nk mean? The answer is simple: each keyword argument occupies two stack slots, one for its name and one for its value.
At this point the f_localsplus of the frame object f enters the stage, though for now it has only just been set up and has yet to see real use.
With all this done, we finally arrive at the place where the function is actually executed: PyEval_EvalFrameEx. Note that there is a similarly named function, PyEval_EvalCodeEx; despite the resemblance, it does more work.
Look back at the check at the start of fast_function. What we described above is the case where the check passes, i.e. the simplest kind of function call. If keyword arguments are passed, or in other cases, things get much more complicated: PyEval_EvalCodeEx handles the arguments first and then runs PyEval_EvalFrameEx.
The main job of PyEval_EvalFrameEx is to interpret bytecode: CALL_FUNCTION, LOAD_FAST and so on are all decoded and handled by it. At its core it is an infinite loop containing a big switch-case, which is essentially how Python runs.
Storing to and loading from f_localsplus
After this long detour through the basics of how Python calls a function, we can finally get to the main topic.
To keep things simple, we'll use the name fastlocals directly, where fastlocals = f->f_localsplus.
We have just seen that Python puts the passed-in arguments into fastlocals one by one, so positional arguments are undoubtedly local variables. What about keyword arguments? They are local variables too, since they get the same special treatment.
Besides function parameters, there are also assignments inside the function. The bytecode for that was already given above:

    # CALL_FUNCTION
      5           0 LOAD_CONST               1 ('test')
                  3 STORE_FAST               1 (c)

A new opcode appears here, STORE_FAST. Let's look at how it is implemented:
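
    // One branch of the huge switch-case in PyEval_EvalFrameEx:
    PREDICTED_WITH_ARG(STORE_FAST);
    TARGET(STORE_FAST)
    {
        v = POP();
        SETLOCAL(oparg, v);
        FAST_DISPATCH();
    }

    // Macros are involved, so here they are as well:
    #define GETLOCAL(i)     (fastlocals[i])
    #define SETLOCAL(i, value)      do { PyObject *tmp = GETLOCAL(i); \
                                         GETLOCAL(i) = value; \
                                         Py_XDECREF(tmp); } while (0)

Put simply, the value v obtained from POP() is stored at index oparg of fastlocals. Here v is "test" and oparg is 1, so fastlocals[0] holds b and fastlocals[1] holds c.
Some readers may be puzzled: where did b suddenly come from? We need to go back and look at how test is defined:

    # Hardly anyone will scroll back, so here it is again
    def test(b):
        c = 'test'
        print b   # local variable
        print c   # local variable

The definition makes it obvious: b is the argument passed in, so it was put into fastlocals long ago.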
Now we know how values are stored. How are they loaded? Again, from the bytecode of the same code:

    22 LOAD_FAST                1 (c)

The principle is easy to guess, but for completeness here is the corresponding code:

    // One branch of the huge switch-case in PyEval_EvalFrameEx:
    TARGET(LOAD_FAST)
    {
        x = GETLOCAL(oparg);
        if (x != NULL) {
            Py_INCREF(x);
            PUSH(x);
            FAST_DISPATCH();
        }
        format_exc_check_arg(PyExc_UnboundLocalError,
            UNBOUNDLOCAL_ERROR_MSG,
            PyTuple_GetItem(co->co_varnames, oparg));
        break;
    }

It simply uses GETLOCAL to fetch the value from the array by index.
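The NULL branch in LOAD_FAST is what surfaces in Python as UnboundLocalError: the slot exists, but nothing has been stored in it yet. A minimal sketch (Python 3 syntax; the function names read_before_assign and try_read are made up for illustration):

```python
def read_before_assign(flag):
    if flag:
        value = 1   # the slot for value is filled only on this path
    return value    # loading fails if the slot is still empty


def try_read(flag):
    """Call read_before_assign and report an unbound local as a string."""
    try:
        return read_before_assign(flag)
    except UnboundLocalError as exc:
        return "UnboundLocalError: %s" % exc


print(try_read(True))
print(try_read(False))
```

Because value is assigned somewhere in the function, the compiler treats it as a local with its own slot, even on the path where the assignment never runs.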
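That should cover f_localsplus. None of this is hard, and it is rarely mentioned because it can usually be ignored. But if you care about performance, this small detail should not be overlooked.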
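How to use variables
Because we write object-oriented code, we are used to working through classes, and usage like the following comes naturally:

    class SS(object):
        def __init__(self):
            self.hehe = {}

        def test(self):
            print self.hehe

This is generally fine and perfectly conventional. But the following is problematic:

    class SS(object):
        def __init__(self):
            self.hehe = {}

        def test(self):
            num = 10
            for i in range(num):
                self.hehe[i] = i

The performance cost of this code grows with num. If the loop also reads or modifies more attributes, the impact is even greater.
The same problem exists if the attribute is replaced with a global variable; it is just that attributes tend to be accessed far more often than globals.
Let's look directly at how big the gap between the two is:

    import timeit

    class SS(object):
        def test(self):
            num = 100
            self.hehe = {}  # for fairness, create a new {} on every run
            for i in range(num):
                self.hehe[i] = i

        def test_local(self):
            num = 100
            hehe = {}  # for fairness, create a new {} on every run
            for i in range(num):
                hehe[i] = i
            self.hehe = hehe

    s = SS()
    print timeit.timeit(stmt=s.test_local)
    print timeit.timeit(stmt=s.test)

The results show that the larger num is, the more iterations the for loop runs, and the larger the gap between the two becomes.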
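To reproduce the comparison for several values of num, a sketch along these lines can be used (Python 3 syntax). The num parameter, the compare helper, the repeat count, and the attribute name hehe are assumptions added for illustration. Absolute timings depend on the machine and interpreter, so none are quoted here.

```python
import timeit


class SS(object):
    def test(self, num=100):
        self.hehe = {}              # fresh dict on every run
        for i in range(num):
            self.hehe[i] = i        # attribute lookup on every iteration

    def test_local(self, num=100):
        hehe = {}                   # fresh dict on every run
        for i in range(num):
            hehe[i] = i             # local variable access only
        self.hehe = hehe            # single attribute store at the end


def compare(num, repeat=2000):
    """Return (attribute_time, local_time) in seconds for repeat calls each."""
    s = SS()
    t_attr = timeit.timeit(lambda: s.test(num), number=repeat)
    t_local = timeit.timeit(lambda: s.test_local(num), number=repeat)
    return t_attr, t_local


if __name__ == "__main__":
    for num in (10, 100, 1000):
        t_attr, t_local = compare(num)
        print("num=%d attr=%.4fs local=%.4fs" % (num, t_attr, t_local))
```

Passing a callable to timeit.timeit keeps the measured statement identical for both variants apart from the method being called.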
Why is that? The bytecode offers a clue:

    // s.test
            >>   28 FOR_ITER                19 (to 50)
                 31 STORE_FAST               2 (i)

      8          34 LOAD_FAST                2 (i)
                 37 LOAD_FAST                0 (self)
                 40 LOAD_ATTR                0 (hehe)
                 43 LOAD_FAST                2 (i)
                 46 STORE_SUBSCR
                 47 JUMP_ABSOLUTE           28
            >>   50 POP_BLOCK

    // s.test_local
            >>   25 FOR_ITER                16 (to 44)
                 28 STORE_FAST               3 (i)

     14          31 LOAD_FAST                3 (i)
                 34 LOAD_FAST                2 (hehe)
                 37 LOAD_FAST                3 (i)
                 40 STORE_SUBSCR
                 41 JUMP_ABSOLUTE           25
            >>   44 POP_BLOCK

     15     >>   45 LOAD_FAST                2 (hehe)
                 48 LOAD_FAST                0 (self)
                 51 STORE_ATTR               1 (hehe)

The two listings above are the for blocks of the two methods. Comparing them shows that s.test has one more instruction than s.test_local, a LOAD_ATTR, sitting between FOR_ITER and POP_BLOCK.
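The same difference can be confirmed on a current interpreter by counting opcodes with dis instead of reading the listing by eye. This is a sketch in Python 3 syntax; the count_opcode helper is hypothetical, and the rest of the bytecode will not match the Python 2 listing above instruction for instruction.

```python
import dis


class SS(object):
    def test(self, num=100):
        self.hehe = {}
        for i in range(num):
            self.hehe[i] = i

    def test_local(self, num=100):
        hehe = {}
        for i in range(num):
            hehe[i] = i
        self.hehe = hehe


def count_opcode(func, opname):
    """Count the instructions in func whose opcode name equals opname."""
    return sum(1 for ins in dis.get_instructions(func) if ins.opname == opname)


attr_loads_test = count_opcode(SS.test, "LOAD_ATTR")
attr_loads_local = count_opcode(SS.test_local, "LOAD_ATTR")
print("test:", attr_loads_test)
print("test_local:", attr_loads_local)
```

These are static counts. The LOAD_ATTR in test sits inside the loop body, so it executes once per iteration.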
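What does this tell us? On every iteration, s.test has to execute LOAD_ATTR. Naturally, we want to see what that opcode does:

    TARGET(LOAD_ATTR)
    {
        w = GETITEM(names, oparg);
        v = TOP();
        x = PyObject_GetAttr(v, w);
        Py_DECREF(v);
        SET_TOP(x);
        if (x != NULL) DISPATCH();
        break;
    }

    // Related macro
    #define GETITEM(v, i) PyTuple_GetItem((v), (i))

An unfamiliar variable appears here: names. What is it? It is the tuple of names kept by each code object; essentially every name string used by a code block is stored in it, in a fixed order:

    // Member of the PyCodeObject struct
    PyObject *co_names;     /* list of strings (names used) */

So the job of LOAD_ATTR is clear: first take the string from the name tuple, which here is "hehe", then look it up with PyObject_GetAttr, in this case on the instance s.
Leaving aside how efficient the lookup is, this extra step alone adds up to a large difference when the operation is repeated many times.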
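Both halves of that lookup are visible from Python: co_names on the code object holds the name strings, and the built-in getattr performs the attribute lookup by name. A short sketch (Python 3 syntax; the class mirrors the article's SS with a small num default added for illustration):

```python
class SS(object):
    def __init__(self):
        self.hehe = {}

    def test(self, num=3):
        for i in range(num):
            self.hehe[i] = i


names = SS.test.__code__.co_names  # name strings used by the method body
s = SS()
s.test()
looked_up = getattr(s, "hehe")     # attribute lookup by name on the instance
print(names)
print(looked_up)
```

The tuple contains "hehe" alongside the other non-local names the method uses, and the lookup by that string returns the same dict object that self.hehe refers to.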
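So when class or instance attributes are accessed frequently, it is better to first read the attribute into a local variable, do the work on the local variable, and finally write the changes back to the attribute if needed.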
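As a sketch of that pattern (Python 3 syntax; the WordCounter class and its method names are hypothetical and not taken from the original code):

```python
class WordCounter(object):
    """Hypothetical class used only to illustrate the pattern."""

    def __init__(self):
        self.counts = {}

    def add_all(self, words):
        counts = self.counts                        # read the attribute once
        for word in words:
            counts[word] = counts.get(word, 0) + 1  # only locals inside the loop
        self.counts = counts                        # write back at the end


wc = WordCounter()
wc.add_all(["a", "b", "a"])
print(wc.counts)
```

The final write-back is redundant here because the dict is mutated in place, but it is required whenever the local name is rebound to a new object inside the method.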
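Finally
Compared with variables, there is even more to learn about how functions and methods are used, because the underlying mechanics differ even more from what appears on the surface. We will explore that next time. Paying attention to details like these in day-to-day work helps make our Python code a little bit faster.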