This content is based on "Extending and Embedding PHP", Chapter 3, Memory Management, together with my own understanding. It is a translation and explanation of how PHP variables handle reference counting, copy-on-write, change-on-write, and the case where copy-on-write and change-on-write meet.
Before reading on, make sure you understand the zval structure:
<code class="hljs thrift" style="font-family: 'Courier New', sans-serif !important; line-height: 1.5 !important; font-size: 12px !important; background-color: rgb(245, 245, 245) !important; border: 1px solid rgb(204, 204, 204) !important; padding: 5px !important; border-top-left-radius: 3px !important; border-top-right-radius: 3px !important; border-bottom-right-radius: 3px !important; border-bottom-left-radius: 3px !important; display: block; overflow-x: auto; color: rgb(0, 0, 0); background-position: initial initial; background-repeat: initial initial;"><span class="hljs-keyword" style="color: rgb(0, 0, 255);">typedef</span> <span class="hljs-class"><span class="hljs-keyword" style="color: rgb(0, 0, 255);">struct</span> _<span class="hljs-title" style="color: rgb(163, 21, 21);">zval_struct</span> </span>{ zvalue_value value; zend_uint refcount; zend_uchar type; zend_uchar is_ref; } zval;</code>
The zval structure has 4 members. value is a union that actually stores the zval's value. refcount counts how many variables point to the zval. type records the data type stored in the zval. is_ref marks whether the zval is shared by reference.
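To make the four members concrete, here is a minimal stand-alone sketch of the structure. It is a simplification, not the engine's header: the union is cut down to three members, the `TOY_IS_*` tags and the `make_hello` helper are hypothetical names invented for this example, and the string is not duplicated the way the engine would do it.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for the Zend typedefs, assumed here so the sketch compiles alone. */
typedef unsigned int  zend_uint;
typedef unsigned char zend_uchar;

/* Hypothetical type tags; the engine has its own IS_* constants. */
enum { TOY_IS_LONG = 1, TOY_IS_DOUBLE = 2, TOY_IS_STRING = 3 };

/* Cut-down value union: one member per kind of value, sharing storage. */
typedef union _zvalue_value {
    long   lval;                         /* integers */
    double dval;                         /* floating point numbers */
    struct { char *val; int len; } str;  /* strings: buffer plus length */
} zvalue_value;

typedef struct _zval_struct {
    zvalue_value value;     /* the stored value */
    zend_uint    refcount;  /* number of variables pointing at this zval */
    zend_uchar   type;      /* which union member is valid */
    zend_uchar   is_ref;    /* 1 if the zval is shared by reference */
} zval;

static char hello_buf[] = "Hello World";

/* Fill in a zval the way `$a = 'Hello World';` would leave it. */
zval make_hello(void)
{
    zval z;
    z.value.str.val = hello_buf;
    z.value.str.len = (int)strlen(hello_buf);
    z.refcount = 1;              /* only $a points here so far */
    z.type     = TOY_IS_STRING;
    z.is_ref   = 0;              /* not a reference */
    assert(z.value.str.len == 11);
    return z;
}
```

The type tag is what tells a reader of the zval which union member to look at; refcount and is_ref are the two fields the rest of this article is about.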
<code class="hljs xml" style="font-family: 'Courier New', sans-serif !important; line-height: 1.5 !important; font-size: 12px !important; background-color: rgb(245, 245, 245) !important; border: 1px solid rgb(204, 204, 204) !important; padding: 5px !important; border-top-left-radius: 3px !important; border-top-right-radius: 3px !important; border-bottom-right-radius: 3px !important; border-bottom-left-radius: 3px !important; display: block; overflow-x: auto; color: rgb(0, 0, 0); background-position: initial initial; background-repeat: initial initial;"><span class="php"><span class="hljs-preprocessor" style="color: rgb(43, 145, 175);"><?php</span> <span class="hljs-variable">$a</span> = <span class="hljs-string" style="color: rgb(163, 21, 21);">'Hello World'</span>; <span class="hljs-variable">$b</span> = <span class="hljs-variable">$a</span>; <span class="hljs-keyword" style="color: rgb(0, 0, 255);">unset</span>(<span class="hljs-variable">$a</span>); <span class="hljs-preprocessor" style="color: rgb(43, 145, 175);">?></span></span></code>
Let’s analyze the above code together:
$a = 'Hello World';
When this statement runs, the kernel creates a variable and allocates 12 bytes of memory to hold the string 'Hello World' plus its terminating NUL byte.
$b = $a;
What happens in the kernel when this statement runs? Two things:
It increments the refcount of the zval that $a points to.
It points the variable $b to the zval that $a points to.
Inside the kernel this looks roughly as follows, where active_symbol_table is the current variable symbol table:
<code class="hljs clojure" style="font-family: 'Courier New', sans-serif !important; line-height: 1.5 !important; font-size: 12px !important; background-color: rgb(245, 245, 245) !important; border: 1px solid rgb(204, 204, 204) !important; padding: 5px !important; border-top-left-radius: 3px !important; border-top-right-radius: 3px !important; border-bottom-right-radius: 3px !important; border-bottom-left-radius: 3px !important; display: block; overflow-x: auto; color: rgb(0, 0, 0); background-position: initial initial; background-repeat: initial initial;"> <span class="hljs-collection">{ zval *helloval; MAKE_STD_ZVAL<span class="hljs-list">(<span class="hljs-keyword" style="color: rgb(0, 0, 255);">helloval</span>)</span><span class="hljs-comment" style="color: green;">;</span> ZVAL_STRING<span class="hljs-list">(<span class="hljs-keyword" style="color: rgb(0, 0, 255);">helloval</span>, <span class="hljs-string" style="color: rgb(163, 21, 21);">"Hello World"</span>, <span class="hljs-number">1</span>)</span><span class="hljs-comment" style="color: green;">;</span> zend_hash_add<span class="hljs-list">(<span class="hljs-keyword" style="color: rgb(0, 0, 255);">EG</span><span class="hljs-list">(<span class="hljs-keyword" style="color: rgb(0, 0, 255);">active_symbol_table</span>)</span>, <span class="hljs-string" style="color: rgb(163, 21, 21);">"a"</span>, sizeof<span class="hljs-list">(<span class="hljs-string" style="color: rgb(163, 21, 21);">"a"</span>)</span>, &helloval, sizeof<span class="hljs-list">(<span class="hljs-keyword" style="color: rgb(0, 0, 255);">zval*</span>)</span>, NULL)</span><span class="hljs-comment" style="color: green;">;</span> ZVAL_ADDREF<span class="hljs-list">(<span class="hljs-keyword" style="color: rgb(0, 0, 255);">helloval</span>)</span><span class="hljs-comment" style="color: green;">;</span> zend_hash_add<span class="hljs-list">(<span class="hljs-keyword" style="color: rgb(0, 0, 255);">EG</span><span class="hljs-list">(<span 
class="hljs-keyword" style="color: rgb(0, 0, 255);">active_symbol_table</span>)</span>, <span class="hljs-string" style="color: rgb(163, 21, 21);">"b"</span>, sizeof<span class="hljs-list">(<span class="hljs-string" style="color: rgb(163, 21, 21);">"b"</span>)</span>, &helloval, sizeof<span class="hljs-list">(<span class="hljs-keyword" style="color: rgb(0, 0, 255);">zval*</span>)</span>, NULL)</span><span class="hljs-comment" style="color: green;">;</span> }</span></code>
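unset($a);
After this statement runs, the kernel removes $a from the symbol table and decrements the refcount of its zval by 1. The count is still 1 because $b points to the same zval, so the value is not freed and $b is unaffected.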
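The reference-counting steps above can be modeled in a few lines of plain C. This is only a sketch under simplifying assumptions: `toy_zval`, `toy_new_string`, `toy_assign`, `toy_unset`, and the `toy_live` counter are hypothetical names for this example, and a bare pointer stands in for a symbol-table entry.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    char         *str;       /* heap copy of the string value */
    unsigned int  refcount;  /* how many variables point here */
    unsigned char is_ref;
} toy_zval;

static int toy_live = 0;     /* number of toy_zvals currently allocated */

/* `$a = 'Hello World';` : allocate the zval and copy the string into it. */
static toy_zval *toy_new_string(const char *s)
{
    size_t n = strlen(s) + 1;        /* characters plus terminating NUL */
    toy_zval *z = malloc(sizeof *z);
    if (z == NULL) abort();
    z->str = malloc(n);
    if (z->str == NULL) abort();
    memcpy(z->str, s, n);
    z->refcount = 1;
    z->is_ref = 0;
    toy_live++;
    return z;
}

/* `$b = $a;` : no copy, just share the zval and bump the count. */
static toy_zval *toy_assign(toy_zval *src)
{
    src->refcount++;
    return src;
}

/* `unset($x);` : drop one reference and free the zval at zero. */
static void toy_unset(toy_zval **var)
{
    toy_zval *z = *var;
    *var = NULL;
    if (--z->refcount == 0) {
        free(z->str);
        free(z);
        toy_live--;
    }
}

/* Runs the three statements from the text; returns the live zval count. */
int refcount_demo(void)
{
    toy_zval *a = toy_new_string("Hello World");
    toy_zval *b = toy_assign(a);
    assert(a == b && b->refcount == 2);

    toy_unset(&a);                   /* value survives: $b still holds it */
    assert(b->refcount == 1 && toy_live == 1);
    assert(strcmp(b->str, "Hello World") == 0);

    toy_unset(&b);                   /* last reference gone: memory freed */
    return toy_live;
}
```

The string is allocated once and freed once, no matter how many variables shared it in between.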
<code class="hljs xml" style="font-family: 'Courier New', sans-serif !important; line-height: 1.5 !important; font-size: 12px !important; background-color: rgb(245, 245, 245) !important; border: 1px solid rgb(204, 204, 204) !important; padding: 5px !important; border-top-left-radius: 3px !important; border-top-right-radius: 3px !important; border-bottom-right-radius: 3px !important; border-bottom-left-radius: 3px !important; display: block; overflow-x: auto; color: rgb(0, 0, 0); background-position: initial initial; background-repeat: initial initial;"><span class="php"><span class="hljs-preprocessor" style="color: rgb(43, 145, 175);"><?php</span> <span class="hljs-variable">$a</span> = <span class="hljs-number">1</span>; <span class="hljs-variable">$b</span> = <span class="hljs-variable">$a</span>; <span class="hljs-variable">$b</span> += <span class="hljs-number">5</span>; <span class="hljs-preprocessor" style="color: rgb(43, 145, 175);">?></span></span></code>
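After the code above runs, we naturally expect $a = 1 and $b = 6. But if $a and $b point to the same zval, as reference counting arranges, wouldn't modifying $b change $a as well?
Let's look at how this is actually implemented:
$a = 1;
The kernel creates a zval that stores the integer 1 (a long, which is 4 bytes on 32-bit builds).
$b = $a;
This step is the same as the second step in reference counting: $b is pointed to the same zval as $a, and the refcount in that zval is incremented by 1.
$b += 5;
This step is the key one. What happens here, and how does the kernel ensure that the modification does not affect $a?
Before the zval is changed, a get_var_and_separate operation is performed. If refcount > 1, the variable has to be separated: a new zval is created and returned. Otherwise the zval the variable already points to is returned directly. Separation produces the new zval as follows:
Copy the zval that $b points to, producing an identical new zval.
Decrement the refcount of the zval that $b originally pointed to by 1.
Point $b to the newly created zval.
The write is then applied to the newly created zval. This is copy-on-write.
Here is the main code the kernel uses when separating: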
<code class="hljs lasso" style="font-family: 'Courier New', sans-serif !important; line-height: 1.5 !important; font-size: 12px !important; background-color: rgb(245, 245, 245) !important; border: 1px solid rgb(204, 204, 204) !important; padding: 5px !important; border-top-left-radius: 3px !important; border-top-right-radius: 3px !important; border-bottom-right-radius: 3px !important; border-bottom-left-radius: 3px !important; display: block; overflow-x: auto; color: rgb(0, 0, 0); background-position: initial initial; background-repeat: initial initial;"> zval <span class="hljs-subst">*</span>get_var_and_separate(char <span class="hljs-subst">*</span>varname, int varname_len TSRMLS_DC) { zval <span class="hljs-subst">**</span>varval, <span class="hljs-subst">*</span>varcopy; <span class="hljs-keyword" style="color: rgb(0, 0, 255);">if</span> (zend_hash_find(EG(active_symbol_table), varname, varname_len <span class="hljs-subst">+</span> <span class="hljs-number">1</span>, (<span class="hljs-literal">void</span><span class="hljs-subst">**</span>)<span class="hljs-subst">&</span>varval) <span class="hljs-subst">==</span> FAILURE) { <span class="hljs-comment" style="color: green;">/* Variable doesn't actually exist fail out */</span> <span class="hljs-keyword" style="color: rgb(0, 0, 255);">return</span> <span class="hljs-built_in" style="color: rgb(0, 0, 255);">NULL</span>; } <span class="hljs-keyword" style="color: rgb(0, 0, 255);">if</span> ((<span class="hljs-subst">*</span>varval)<span class="hljs-subst">-></span>is_ref <span class="hljs-subst">||</span> (<span class="hljs-subst">*</span>varval)<span class="hljs-subst">-></span>refcount <span class="hljs-subst"><</span> <span class="hljs-number">2</span>) { <span class="hljs-comment" style="color: green;">/* varname is the only actual reference, * or it's a full reference to other variables * either way: no separating to be done */</span> <span class="hljs-keyword" style="color: rgb(0, 0, 255);">return</span> 
<span class="hljs-subst">*</span>varval; } <span class="hljs-comment" style="color: green;">/* Otherwise, make a copy of the zval* value */</span> MAKE_STD_ZVAL(varcopy); <span class="hljs-subst">*</span>varcopy <span class="hljs-subst">=</span> <span class="hljs-subst">**</span>varval; <span class="hljs-comment" style="color: green;">/* Duplicate any allocated structures within the zval* */</span> zval_copy_ctor(varcopy); <span class="hljs-comment" style="color: green;">/* Remove the old version of varname * This will decrease the refcount of varval in the process */</span> zend_hash_del(EG(active_symbol_table), varname, varname_len <span class="hljs-subst">+</span> <span class="hljs-number">1</span>); <span class="hljs-comment" style="color: green;">/* Initialize the reference count of the * newly created value and attach it to * the varname variable */</span> varcopy<span class="hljs-subst">-></span>refcount <span class="hljs-subst">=</span> <span class="hljs-number">1</span>; varcopy<span class="hljs-subst">-></span>is_ref <span class="hljs-subst">=</span> <span class="hljs-number">0</span>; zend_hash_add(EG(active_symbol_table), varname, varname_len <span class="hljs-subst">+</span> <span class="hljs-number">1</span>, <span class="hljs-subst">&</span>varcopy, sizeof(zval<span class="hljs-subst">*</span>), <span class="hljs-built_in" style="color: rgb(0, 0, 255);">NULL</span>); <span class="hljs-comment" style="color: green;">/* Return the new zval* */</span> <span class="hljs-keyword" style="color: rgb(0, 0, 255);">return</span> varcopy; }</code>
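The separation logic can also be tried outside the engine. The following is a rough stand-alone model, not Zend code: `toy_zval`, `toy_var`, `toy_new_long`, `toy_separate`, and `cow_demo` are hypothetical names, a `toy_var` slot stands in for a symbol-table entry, and only integer values are handled. It keeps the same decision as get_var_and_separate: copy only when the zval is shared and is not a reference.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    long          lval;
    unsigned int  refcount;
    unsigned char is_ref;
} toy_zval;

/* A toy "variable" is just a slot holding a zval pointer. */
typedef struct { toy_zval *zv; } toy_var;

static toy_zval *toy_new_long(long v)
{
    toy_zval *z = malloc(sizeof *z);
    if (z == NULL) abort();
    z->lval = v;
    z->refcount = 1;
    z->is_ref = 0;
    return z;
}

/* Return a zval that is safe to write through for this variable. */
static toy_zval *toy_separate(toy_var *var)
{
    toy_zval *old = var->zv;
    toy_zval *copy;

    if (old->is_ref || old->refcount < 2) {
        return old;                  /* sole owner or full reference: no copy */
    }
    copy = toy_new_long(old->lval);  /* duplicate the value */
    old->refcount--;                 /* this variable leaves the old zval */
    var->zv = copy;                  /* and now owns the fresh one */
    return copy;
}

static void toy_release(toy_var *var)
{
    if (--var->zv->refcount == 0) free(var->zv);
    var->zv = NULL;
}

/* $a = 1; $b = $a; $b += 5; and report the final values. */
void cow_demo(long *a_out, long *b_out)
{
    toy_var a, b;

    a.zv = toy_new_long(1);          /* $a = 1;  */
    b.zv = a.zv;                     /* $b = $a; */
    b.zv->refcount++;
    assert(a.zv->refcount == 2);

    toy_separate(&b)->lval += 5;     /* $b += 5; triggers the copy */
    assert(a.zv != b.zv);
    assert(a.zv->refcount == 1 && b.zv->refcount == 1);

    *a_out = a.zv->lval;
    *b_out = b.zv->lval;
    toy_release(&a);
    toy_release(&b);
}
```

Calling `cow_demo` leaves 1 in `*a_out` and 6 in `*b_out`: the copy is made only at the moment of the write, and only because the zval was shared.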
<code class="hljs xml" style="font-family: 'Courier New', sans-serif !important; line-height: 1.5 !important; font-size: 12px !important; background-color: rgb(245, 245, 245) !important; border: 1px solid rgb(204, 204, 204) !important; padding: 5px !important; border-top-left-radius: 3px !important; border-top-right-radius: 3px !important; border-bottom-right-radius: 3px !important; border-bottom-left-radius: 3px !important; display: block; overflow-x: auto; color: rgb(0, 0, 0); background-position: initial initial; background-repeat: initial initial;"><span class="php"><span class="hljs-preprocessor" style="color: rgb(43, 145, 175);"><?php</span> <span class="hljs-variable">$a</span> = <span class="hljs-number">1</span>; <span class="hljs-variable">$b</span> = &<span class="hljs-variable">$a</span>; <span class="hljs-variable">$b</span> += <span class="hljs-number">5</span>; <span class="hljs-preprocessor" style="color: rgb(43, 145, 175);">?></span></span></code>
After the code above runs, we expect $a == $b == 6. How is this implemented?
$a = 1;
This step is the same as the first step in copy-on-write.
$b = &$a;
In this step, the kernel points $b to the zval that $a points to, increments the refcount in that zval by 1, and sets its is_ref to 1.
$b += 5;
This statement is the same as the third step in copy-on-write, but what happens in the kernel is different:
When the kernel sees that $b is about to change, it again runs the get_var_and_separate function to check whether separation is needed.
Because (*varval)->is_ref is set, the function returns the zval that $b points to directly, without separating or creating a new zval, regardless of whether the zval's refcount is greater than 1.
When the value of $b is then modified, the value of $a changes too, because both variables point to the same zval.
An attentive reader may have spotted a problem by now: what happens when a zval is both shared through refcount and marked with is_ref?
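Before turning to that question, the change-on-write walkthrough above can be sketched with the same kind of toy model. All names here (`toy_zval`, `toy_var`, `toy_separate`, `toy_bind_ref`, `ref_demo`) are hypothetical, and `toy_bind_ref` covers only the simple case where the source zval is not yet shared by another variable.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    long          lval;
    unsigned int  refcount;
    unsigned char is_ref;
} toy_zval;

typedef struct { toy_zval *zv; } toy_var;

static toy_zval *toy_new_long(long v)
{
    toy_zval *z = malloc(sizeof *z);
    if (z == NULL) abort();
    z->lval = v;
    z->refcount = 1;
    z->is_ref = 0;
    return z;
}

/* Same decision as get_var_and_separate: copy only when shared by value. */
static toy_zval *toy_separate(toy_var *var)
{
    toy_zval *old = var->zv;
    toy_zval *copy;

    if (old->is_ref || old->refcount < 2) {
        return old;
    }
    copy = toy_new_long(old->lval);
    old->refcount--;
    var->zv = copy;
    return copy;
}

/* `$dst = &$src;` for a source zval that no other variable shares yet. */
static void toy_bind_ref(toy_var *dst, toy_var *src)
{
    src->zv->is_ref = 1;
    src->zv->refcount++;
    dst->zv = src->zv;
}

/* $a = 1; $b = &$a; $b += 5; and report the final values. */
void ref_demo(long *a_out, long *b_out)
{
    toy_var a, b;

    a.zv = toy_new_long(1);          /* $a = 1;   */
    toy_bind_ref(&b, &a);            /* $b = &$a; */
    assert(a.zv->refcount == 2 && a.zv->is_ref == 1);

    toy_separate(&b)->lval += 5;     /* $b += 5;  no copy: is_ref is set */
    assert(a.zv == b.zv);

    *a_out = a.zv->lval;
    *b_out = b.zv->lval;
    free(a.zv);                      /* both variables share this one zval */
}
```

Here both outputs are 6. The only difference from the copy-on-write case is the is_ref flag, which makes the separation check return the shared zval instead of a copy.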
<code class="hljs xml" style="font-family: 'Courier New', sans-serif !important; line-height: 1.5 !important; font-size: 12px !important; background-color: rgb(245, 245, 245) !important; border: 1px solid rgb(204, 204, 204) !important; padding: 5px !important; border-top-left-radius: 3px !important; border-top-right-radius: 3px !important; border-bottom-right-radius: 3px !important; border-bottom-left-radius: 3px !important; display: block; overflow-x: auto; color: rgb(0, 0, 0); background-position: initial initial; background-repeat: initial initial;"><span class="php"><span class="hljs-preprocessor" style="color: rgb(43, 145, 175);"><?php</span> <span class="hljs-variable">$a</span> = <span class="hljs-number">1</span>; <span class="hljs-variable">$b</span> = <span class="hljs-variable">$a</span>; <span class="hljs-variable">$c</span> = &<span class="hljs-variable">$a</span>; <span class="hljs-preprocessor" style="color: rgb(43, 145, 175);">?></span></span></code>
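In the situation above, if $a, $b, and $c all pointed to the same zval, whose semantics would Zend follow when one of them is changed? In fact, they do not end up pointing to the same zval.
When a change-on-write style assignment (that is, a reference assignment) is applied to a zval with is_ref = 0 && refcount > 1, Zend separates the variable on the right-hand side of the assignment onto a new zval and initializes it, decrements the refcount of the old zval by 1, points the variable on the left-hand side to the new zval, increments the new zval's refcount by 1, and sets is_ref = 1. Here $b keeps the original zval (refcount = 1, is_ref = 0), while $a and $c share the new one (refcount = 2, is_ref = 1).
[Figure: the resulting zvals]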
<code class="hljs xml" style="font-family: 'Courier New', sans-serif !important; line-height: 1.5 !important; font-size: 12px !important; background-color: rgb(245, 245, 245) !important; border: 1px solid rgb(204, 204, 204) !important; padding: 5px !important; border-top-left-radius: 3px !important; border-top-right-radius: 3px !important; border-bottom-right-radius: 3px !important; border-bottom-left-radius: 3px !important; display: block; overflow-x: auto; color: rgb(0, 0, 0); background-position: initial initial; background-repeat: initial initial;"><span class="php"><span class="hljs-preprocessor" style="color: rgb(43, 145, 175);"><?php</span> <span class="hljs-variable">$a</span> = <span class="hljs-number">1</span>; <span class="hljs-variable">$b</span> = &<span class="hljs-variable">$a</span>; <span class="hljs-variable">$c</span> = <span class="hljs-variable">$a</span>; <span class="hljs-preprocessor" style="color: rgb(43, 145, 175);">?></span></span></code>
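The code above is yet another case: when is_ref = 1, an attempt to perform a plain refcount + 1 (an ordinary assignment) separates out a new zval for the variable on the left-hand side of the assignment and initializes it. Here $a and $b keep sharing the original zval (refcount = 2, is_ref = 1), and $c gets its own copy (refcount = 1, is_ref = 0).
[Figure: the resulting zvals]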
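Both mixed cases can be checked with a small model. As before this is a sketch under assumptions rather than engine code: `toy_assign_copy`, `toy_assign_ref`, `toy_release`, and `mixed_demo` are hypothetical names, values are integers only, and the left-hand variable is assumed to hold nothing before the assignment.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    long          lval;
    unsigned int  refcount;
    unsigned char is_ref;
} toy_zval;

typedef struct { toy_zval *zv; } toy_var;

static toy_zval *toy_new_long(long v)
{
    toy_zval *z = malloc(sizeof *z);
    if (z == NULL) abort();
    z->lval = v;
    z->refcount = 1;
    z->is_ref = 0;
    return z;
}

/* Plain assignment `$dst = $src;`. */
static void toy_assign_copy(toy_var *dst, toy_var *src)
{
    if (src->zv->is_ref) {
        /* A reference set cannot take a by-value member: copy for dst. */
        dst->zv = toy_new_long(src->zv->lval);
    } else {
        src->zv->refcount++;
        dst->zv = src->zv;
    }
}

/* Reference assignment `$dst = &$src;`. */
static void toy_assign_ref(toy_var *dst, toy_var *src)
{
    if (!src->zv->is_ref && src->zv->refcount > 1) {
        /* Shared by value: split src off onto its own zval first. */
        toy_zval *copy = toy_new_long(src->zv->lval);
        src->zv->refcount--;
        src->zv = copy;
    }
    src->zv->is_ref = 1;
    src->zv->refcount++;
    dst->zv = src->zv;
}

static void toy_release(toy_var *var)
{
    if (--var->zv->refcount == 0) free(var->zv);
    var->zv = NULL;
}

/* Sets each flag to 1 if the zvals end up laid out as the text describes. */
void mixed_demo(int *case1_ok, int *case2_ok)
{
    toy_var a, b, c;

    /* Case 1: $a = 1; $b = $a; $c = &$a; */
    a.zv = toy_new_long(1);
    toy_assign_copy(&b, &a);
    assert(a.zv == b.zv && a.zv->refcount == 2);
    toy_assign_ref(&c, &a);
    *case1_ok = a.zv == c.zv && a.zv != b.zv
             && a.zv->refcount == 2 && a.zv->is_ref == 1
             && b.zv->refcount == 1 && b.zv->is_ref == 0;
    toy_release(&a); toy_release(&b); toy_release(&c);

    /* Case 2: $a = 1; $b = &$a; $c = $a; */
    a.zv = toy_new_long(1);
    toy_assign_ref(&b, &a);
    toy_assign_copy(&c, &a);
    *case2_ok = a.zv == b.zv && c.zv != a.zv
             && a.zv->refcount == 2 && a.zv->is_ref == 1
             && c.zv->refcount == 1 && c.zv->is_ref == 0;
    toy_release(&a); toy_release(&b); toy_release(&c);
}
```

In both cases the rule is the same: a zval is either a by-value sharing group (is_ref = 0) or a reference set (is_ref = 1), never both, so whichever variable does not fit is moved to a zval of its own.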
1. "Extending and Embedding PHP", Chapter 3, Memory Management.