When it comes to performance optimization of PHP applications, the first thing many people think of is caching. Typical programs do not do heavy computation in PHP, so there is little room for algorithmic optimization; the bottleneck is rarely the CPU and far more likely to be IO. The IO operations that most need caching are slow database queries. The most common caching use case looks like this:
```php
function query_author_articles($author) {
    global $pdo, $memcache;
    $sql = 'SELECT * FROM `articles` WHERE author=:author ORDER BY `date` DESC';
    $cacheKey = "author_{$author}_articles";
    $results = $memcache->get($cacheKey);
    if (false === $results) {
        // Cache miss: run the query and store the result in memcache
        $sth = $pdo->prepare($sql);
        $sth->bindParam(':author', $author, PDO::PARAM_STR);
        $sth->execute();
        $results = $sth->fetchAll(PDO::FETCH_ASSOC);
        $memcache->set($cacheKey, $results);
    }
    return $results;
}
```
Besides SQL queries, what other IO needs caching? Requests to external network interfaces, of course! Take requesting Google search results: although Google's servers process the query very quickly, the request still takes a long time to complete. Establishing an HTTPS connection is slower than establishing an HTTP one, this URL produces a 302 redirect, and there is also (this part has been blocked; use your imagination). The sample code is as follows:
```php
function search_google($keywords) {
    global $memcache;
    $url = 'https://www.google.com/search?q=' . urlencode($keywords);
    $results = $memcache->get($url);
    if (false === $results) {
        // Cache miss: send the request
        $results = file_get_contents($url);
        $memcache->set($url, $results);
    }
    return $results;
}
```
Pause for a moment and look at the two typical caching examples above. What problems do you notice?
How do we solve these problems step by step? Code duplication means a lack of abstraction! Let's abstract the steps first. Using a cache generally follows this process:
```
if cache hit then
    return the cached content
else
    execute the operation
    store the result in the cache and return it
end
```
The "execute operation" step is the main logic of the program, while the rest is cache-handling logic. To avoid code duplication, the only way is to separate the program's main logic from the cache operation code. So the original function has to be split into two functions like this:
```php
function cache_operation() { /* cache operation code, reusable */ }
function program() { /* main program code */ }

# Take the search_google function as an example. Once the cache-checking code
# is extracted, its true form is revealed: surprisingly, the main logic
# turns out to be a single line of code.
function search_google($keywords) {
    return file_get_contents('https://www.google.com/search?q=' . urlencode($keywords));
}
```
As for how to implement this, let's set it aside for now. At this point we have a clear direction for solving the code duplication problem. Next question: how do we achieve unified management of cache keys? The obvious answer is to have the reusable cache operation function generate the cache key automatically.
What do we gain from abstracting the problem? Before, we were thinking divergently, trying to enumerate how many different caching situations there are. Now we find that every caching situation reduces to one: caching the return value of a function that performs a specific operation. We also get a unified solution: put the operation whose result needs caching into a function, and then cache only the return value of the function call. This technique of caching function calls has a name: Memoization. Alas~ it took me this long just to explain why Memoization is needed!
Well-designed functions usually try to follow a few simple principles: the same arguments always produce the same return value (a pure function); the return value depends only on the arguments and not on any global state; and there are no side effects. A simple addition function is an example: `function add($a,$b) { return $a+$b; }`. Such a function can safely be memoized because it always returns the same result for the same arguments, so the function name plus the argument values form an automatically generated cache key. In reality, however, many functions that interact with external data (IO) do not have this property. Take the `query_author_articles` function above: its result is determined by the contents of the database. If the author submits a new article to the database, the result will differ even though the `$author` argument is the same. Such functions need a workaround. We can treat a function containing IO as approximately consistent (the same arguments return the same result) within a given cache period. For example, if we allow database queries to be cached for 10 minutes, we can assume that the same query returns the same result within those 10 minutes. In this way, functions containing IO can also be memoized, as long as the cache has an expiration time, and this is the more common case in real environments.
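To make the "consistent within a cache period" idea concrete, here is a minimal sketch of a cache whose entries expire. `ExpiringCache` and its `$now` parameter are assumptions made for illustration (passing the clock in makes expiry easy to demonstrate); a real cache client would normally take the expiration time on `set` and track time itself.

```php
<?php
// Hypothetical in-memory cache with per-entry expiry (illustration only).
class ExpiringCache {
    private $store = array();

    // $ttl is the lifetime in seconds; $now is the current timestamp.
    public function set($key, $value, $ttl, $now) {
        $this->store[$key] = array($value, $now + $ttl);
    }

    // Returns the value while it is fresh, false on a miss or after expiry
    // (the same "false means miss" convention as the examples above).
    public function get($key, $now) {
        if (isset($this->store[$key]) && $now < $this->store[$key][1]) {
            return $this->store[$key][0];
        }
        return false;
    }
}

$cache = new ExpiringCache();
// Cached at t=1000 with a 10-minute (600 s) lifetime, so it is valid until t=1600.
$cache->set('author_alice_articles', array('Post A'), 600, 1000);

var_dump($cache->get('author_alice_articles', 1300)); // inside the window: the cached array
var_dump($cache->get('author_alice_articles', 1700)); // after expiry: bool(false)
```

Within the window the query result is treated as if the function were pure; once the entry expires, the next call goes back to the database.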
According to the previous analysis, the interface and behavior of the Memoization function (that is, the `cache_operation` function listed above that still has to be implemented) should look like this:
```php
/**
 * memoize_call caches the result of calling the function $fn
 * @param {String} $fn   name of the function that performs the operation
 * @param {Array}  $args arguments to call $fn with
 * @return the result returned by $fn, possibly taken from the cache
 */
function memoize_call($fn, $args) {
    global $cache;
    # The function name and the serialized argument values form the cache key
    $cacheKey = $fn . ':' . serialize($args);
    # Look up the cache by key
    $results = $cache->get($cacheKey);
    if (false === $results) {
        # Cache miss: call the function
        $results = call_user_func_array($fn, $args);
        $cache->set($cacheKey, $results);
    }
    return $results;
}

# A sample cache implementation so that this code can actually run
class DemoCache {
    private $store;
    public function __construct() { $this->store = array(); }
    public function set($key, $value) { $this->store[$key] = $value; }
    public function get($key) {
        return array_key_exists($key, $this->store) ? $this->store[$key] : false;
    }
}
$cache = new DemoCache();

# For the add function, for example, the call looks like this:
memoize_call('add', array(3, 6));
```
Note that the arguments must be converted to a string with a serialization function such as `serialize`. Serialization strictly preserves the type information of each value, so the string keys produced by serializing different arguments will not collide. Besides `serialize`, you could also consider `json_encode` or similar functions.
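A small sketch of why the type information matters: integer and string arguments that look alike still produce different keys. The helper name `make_cache_key` is hypothetical; it only mirrors the key format used in `memoize_call`.

```php
<?php
// Hypothetical helper mirroring the key format of memoize_call.
function make_cache_key($fn, array $args) {
    return $fn . ':' . serialize($args);
}

$intKey    = make_cache_key('add', array(1, 2));
$stringKey = make_cache_key('add', array('1', '2'));

echo $intKey, "\n";    // add:a:2:{i:0;i:1;i:1;i:2;}
echo $stringKey, "\n"; // add:a:2:{i:0;s:1:"1";i:1;s:1:"2";}
var_dump($intKey === $stringKey); // bool(false): the types are part of the key
```

Whether two such calls should share a cache entry depends on the function being cached; `serialize` simply takes the strict option and keeps them apart.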
At this point we have unified management of cache keys. But this implementation brings a new annoyance: what used to be a simple `add($a,$b)` now has to be written as `memoize_call('add',array($a,$b))`! This is simply inhumane!
...So how do we solve this problem?
...Maybe you could write it like this:
```php
function _search_google() { /*BA LA BA LA*/ }
function search_google() { return memoize_call('_search_google', func_get_args()); }

// The memoized call finally looks conventional again
echo search_google('Hacker'); // just call it directly
```
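At least that looks a bit more normal. But it is still cumbersome: where one function used to be enough, we now have to write two! This is where anonymous functions make their grand entrance! (True, closures are only available from PHP 5.3 onward, but at this point who would dare bring up the even more inhumane create_function?) What does it look like when rewritten with a closure?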
```php
function search_google() {
    return memoize_call(function () { /*BA LA BA LA*/ }, func_get_args());
}
```
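Isn't that exactly the same? Still a pile of repeated code, and the main logic of the program is wrapped in yet another thick coat!
Don't jump to conclusions yet. Take a look at this: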
```php
function memoized($fn) {
    return function () use ($fn) {
        return memoize_call($fn, func_get_args());
    };
}

function add($a, $b) { return $a + $b; }

# Create a new function without affecting the original one
$add = memoized('add');

# From here on, use the $add function everywhere
$add(1E4, 3E4);
```
Doesn't that feel much cleaner? ...Still not good enough?
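Indeed, there are still two functions! But this really cannot be helped; PHP is just a lousy language! You cannot create a new function with the same name to overwrite the old one. In JavaScript you could simply write `add=memoized(add)`, and in Python you could use decorators directly, which is so convenient!
There is no way around it. That's PHP!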
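...However, there is in fact a relatively better approach. Again, start by trimming redundant code! Look at this line: `$add=memoized('add');`. If we require by convention that the names of memoized functions follow a fixed pattern, then the step of generating the new cached function can be handled automatically by the program. For example, the convention could require every function that needs memoizing to be named with a `_memoizable` suffix, and new variable functions with the suffix removed are then generated automatically: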
```php
# Declare add with a suffix to mark it as cacheable;
# the automatically created variable function is then named $add
function add_memoizable($a, $b) { return $a + $b; }

# Automatically discover functions that carry the given suffix
# and create the corresponding variable functions without it
function auto_create_memoized_function() {
    $suffix = '_memoizable';
    $suffixLen = strlen($suffix);
    $fns = get_defined_functions();
    foreach ($fns['user'] as $f) {
        // function name ends with suffix
        if (substr_compare($f, $suffix, -$suffixLen) === 0) {
            $newFn = substr($f, 0, -$suffixLen);
            $GLOBALS[$newFn] = memoized($f);
        }
    }
}

# Just add this call after all cacheable functions have been declared
auto_create_memoized_function();

# and the variable functions without the suffix are created automatically
$add(3.1415, 2.141);
```
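Still not satisfied? Perfectly good functions have all turned into variable functions, which is still inconvenient to use. In fact, although there is nothing we can do about global functions, we can still pull off plenty of hacks on object method calls! PHP's magic methods offer many opportunities!
The class static method __callStatic can intercept calls to undefined static methods; this feature is also supported from PHP 5.3. Apply the naming-suffix scheme above to static methods and things become very simple: define the functions that need caching as static methods of a class, add the suffix when naming them, and call them without the suffix. The __callStatic overload then calls the cached method automatically, and everything just works.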
```php
define('MEMOIZABLE_SUFFIX', '_memoizable');

class Memoizable {
    public static function __callStatic($name, $args) {
        $realName = $name . MEMOIZABLE_SUFFIX;
        if (method_exists(__CLASS__, $realName)) {
            return memoize_call(__CLASS__ . "::$realName", $args);
        }
        throw new Exception("Undefined method " . __CLASS__ . "::$name();");
    }
    public static function search_memoizable($k) { return "Searching:$k"; }
}

# Call it without the suffix
echo Memoizable::search('Lisp');
```
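One possible variation, sketched under assumptions: because the example above uses `__CLASS__`, the interception logic is tied to a single class. Using `get_called_class()` (available since PHP 5.3) would let subclasses inherit it. The names `MemoizableBase`, `Search`, and `find_memoizable` are hypothetical, and the `memoize_call` here is a simplified stand-in with a static array instead of the article's `$cache` object, so that the sketch runs on its own.

```php
<?php
// Simplified stand-in for the article's memoize_call (static array as cache).
function memoize_call($fn, $args) {
    static $cache = array();
    $key = $fn . ':' . serialize($args);
    if (!array_key_exists($key, $cache)) {
        $cache[$key] = call_user_func_array($fn, $args);
    }
    return $cache[$key];
}

// Hypothetical base class: subclasses inherit the suffix-based interception.
abstract class MemoizableBase {
    const SUFFIX = '_memoizable';

    public static function __callStatic($name, $args) {
        $class = get_called_class(); // the class the static call was made on
        $realName = $name . self::SUFFIX;
        if (method_exists($class, $realName)) {
            return memoize_call($class . '::' . $realName, $args);
        }
        throw new Exception("Undefined method {$class}::{$name}()");
    }
}

class Search extends MemoizableBase {
    public static $calls = 0; // counts how often the real method runs

    public static function find_memoizable($k) {
        self::$calls++;
        return "Searching:$k";
    }
}

echo Search::find('Lisp'), "\n"; // runs find_memoizable
echo Search::find('Lisp'), "\n"; // same arguments: served from the cache
echo Search::$calls, "\n";       // 1
```

The call counter is only there to show that the second call never reaches the real method.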
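The same hack works for instance methods. When an inaccessible method is called on an object, __call is invoked. Following the __callStatic example above, only minor changes are needed to get a __call method: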
```php
class Memoizable {
    public function __call($name, $args) {
        $realName = $name . MEMOIZABLE_SUFFIX;
        if (method_exists($this, $realName)) {
            return memoize_call(array($this, $realName), $args);
        }
        throw new Exception("Undefined method " . get_class($this) . "->$name();");
    }
    public function add_memoizable($a, $b) { return $a + $b; }
}

# Call the instance method without the suffix
$m = new Memoizable;
$m->add(3E5, 7E3);
```
Run it and you will get an error. The first parameter of `memoize_call` only accepts a function name as a String (the key is built with `$fn.':'`, and an array cannot be concatenated into a meaningful string), whereas PHP's call_user_func_array needs an Array argument to represent an object method call, so an array is passed here: `memoize_call(array($this,$realName),$args);`. Once an array is passed as the `$fn` parameter, generating the cache key becomes a problem. For a class static method, a string in the `Class::staticMethod` format can be used, which is no different from an ordinary function name. For instance methods, the simplest approach is to change `memoize_call` so that the `$fn` parameter is also serialized to a string when generating the cache key:
```php
function memoize_call(callable $fn, $args) {
    global $cache;
    # Serialize both the function name and the argument values
    $cacheKey = serialize($fn) . ':' . serialize($args);
    $results = $cache->get($cacheKey);
    if (false === $results) {
        $results = call_user_func_array($fn, $args);
        $cache->set($cacheKey, $results);
    }
    return $results;
}
```
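From PHP 5.4 onward you can use the callable parameter type hint; see: Callable Type Hint. The specific formats a callable can take are shown in the examples for the is_callable function.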
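As a quick illustration of those formats, the sketch below lists the common callable forms and runs each one through `is_callable` and `call_user_func_array`. The `Calc` class and its methods are hypothetical names used only for this example.

```php
<?php
function add($a, $b) { return $a + $b; }

// Hypothetical class, used only to show the method-style callables.
class Calc {
    public static function sum($a, $b) { return $a + $b; }
    public function plus($a, $b) { return $a + $b; }
}

$forms = array(
    'function name'          => 'add',
    'static method (string)' => 'Calc::sum',
    'static method (array)'  => array('Calc', 'sum'),
    'instance method'        => array(new Calc, 'plus'),
    'closure'                => function ($a, $b) { return $a + $b; },
);

foreach ($forms as $label => $fn) {
    // Every form passes is_callable and can be invoked the same way.
    var_dump(is_callable($fn));
    echo $label, ': ', call_user_func_array($fn, array(3, 6)), "\n";
}
```

One caveat for the key generation above: PHP refuses to `serialize` a Closure, so the closure form can be called this way but cannot be turned into a cache key by `serialize($fn)`.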
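But this brings some unnecessary overhead. For a complex object, the cached method may access only one of its properties, yet serializing the object directly puts all of its property values into the key. Not only does the key become very large, but as soon as an unrelated property value changes, the cache is invalidated as well: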
```php
class Bar {
    public $tags = array('PHP', 'Python', 'Haskell');
    public $current = 'PHP';
    # ... the __call method implementing the Memoizable behavior is omitted here
    public function getCurrent_memoizable() { return $this->current; }
}

$b = new Bar;
$b->getCurrent();
$b->tags[0] = 'OCaml';
# Because the unrelated tags property is also serialized into the key,
# the cache for this method is invalidated once tags is modified
$b->getCurrent(); # will be executed again,
                  # although its cache should not have been invalidated
```
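The effect can be checked directly on the key material. This is a minimal sketch that builds the `serialize($fn)` part of the key before and after the unrelated change; the variable names `$keyBefore` and `$keyAfter` are illustrative, and the `__call` machinery is left out because only the key matters here.

```php
<?php
class Bar {
    public $tags = array('PHP', 'Python', 'Haskell');
    public $current = 'PHP';
    public function getCurrent_memoizable() { return $this->current; }
}

$b = new Bar;

// The $fn part of the cache key, as produced by serialize($fn).
$keyBefore = serialize(array($b, 'getCurrent_memoizable'));

$b->tags[0] = 'OCaml'; // getCurrent_memoizable never reads $tags

$keyAfter = serialize(array($b, 'getCurrent_memoizable'));

var_dump($keyBefore === $keyAfter); // bool(false): the key changed anyway
echo $keyBefore, "\n";              // the whole object, tags included, sits inside the key
```

Every property of the object ends up in the key, which is both the size problem and the invalidation problem described above.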
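The first reaction to this problem might be... to change the code so that only the property values used in the method are serialized. The obstacle that follows is that we have no way of analyzing at runtime which properties of `$this` a method M actually accesses. As a tentative solution, we can manually declare in the code which properties of the object method M accesses, storing that information in a static property of the class: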
<span class="k">class</span> <span class="nc">Foo</span> <span class="p">{</span> <span class="k">public</span> <span class="nv">$current</span><span class="o">=</span><span class="s1">'PHP'</span><span class="p">;</span> <span class="k">public</span> <span class="nv">$hack</span><span class="o">=</span><span class="s1">'HACK IT'</span><span class="p">;</span> <span class="c1"># maps each method to the list of properties it accesses</span> <span class="k">public</span> <span class="k">static</span> <span class="nv">$methodUsedMembers</span><span class="o">=</span><span class="k">array</span><span class="p">(</span> <span class="s1">'getCurrent_memoizable'</span><span class="o">=></span><span class="s1">'current,hack'</span> <span class="c1"># the two properties getCurrent accesses</span> <span class="p">);</span> <span class="k">public</span> <span class="k">function</span> <span class="nf">getCurrent_memoizable</span><span class="p">()</span> <span class="p">{</span> <span class="k">return</span> <span class="nv">$this</span><span class="o">-></span><span class="na">current</span><span class="o">.</span><span class="nv">$this</span><span class="o">-></span><span class="na">hack</span><span class="p">;</span> <span class="p">}</span> <span class="p">}</span> <span class="c1"># memoize_call can then use $methodUsedMembers</span> <span class="c1"># to get the list of properties to serialize for method M;</span> <span class="c1"># the matching cache-key logic in memoize_call is</span> <span class="k">if</span> <span class="p">(</span><span class="nb">is_array</span><span class="p">(</span><span class="nv">$fn</span><span class="p">)</span> <span class="o">&&</span> <span class="nb">is_object</span><span class="p">(</span><span class="nv">$fn</span><span class="p">[</span><span class="mi">0</span><span class="p">]))</span> <span class="p">{</span> <span class="k">list</span><span class="p">(</span><span class="nv">$o</span><span class="p">,</span><span class="nv">$m</span><span class="p">)</span><span class="o">=</span><span class="nv">$fn</span><span class="p">;</span>
<span class="nv">$class</span><span class="o">=</span><span class="nb">get_class</span><span class="p">(</span><span class="nv">$o</span><span class="p">);</span> <span class="c1"># $members=$class::$methodUsedMembers[$m]; # this syntax requires PHP 5.3+</span> <span class="c1"># before PHP 5.3, use the approach below</span> <span class="nv">$classVars</span><span class="o">=</span><span class="nb">get_class_vars</span><span class="p">(</span><span class="nv">$class</span><span class="p">);</span> <span class="nv">$members</span><span class="o">=</span><span class="nv">$classVars</span><span class="p">[</span><span class="s1">'methodUsedMembers'</span><span class="p">][</span><span class="nv">$m</span><span class="p">];</span> <span class="nv">$objVars</span><span class="o">=</span><span class="nb">get_object_vars</span><span class="p">(</span><span class="nv">$o</span><span class="p">);</span> <span class="nv">$objVars</span><span class="o">=</span><span class="nb">array_intersect_key</span><span class="p">(</span><span class="nv">$objVars</span><span class="p">,</span><span class="nb">array_flip</span><span class="p">(</span><span class="nb">explode</span><span class="p">(</span><span class="s1">','</span><span class="p">,</span><span class="nv">$members</span><span class="p">)));</span> <span class="c1"># remember to add a prefix made of the class and method names,</span> <span class="c1"># because the array from get_object_vars loses the class information</span> <span class="nv">$cacheKey</span><span class="o">=</span><span class="nv">$class</span><span class="o">.</span><span class="s1">'::'</span><span class="o">.</span><span class="nv">$m</span><span class="o">.</span><span class="s1">'::'</span><span class="p">;</span> <span class="nv">$cacheKey</span><span class="o">.=</span><span class="nb">serialize</span><span class="p">(</span><span class="nv">$objVars</span><span class="p">)</span><span class="o">.</span><span class="s1">':'</span><span class="o">.</span><span class="nb">serialize</span><span class="p">(</span><span class="nv">$args</span><span class="p">);</span>
<span class="p">}</span>
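The key-generation fragment above can be exercised end to end. The following is only a minimal sketch: the helper name `memoize_method_key` and the extra `$unrelated` property are assumptions made for this illustration, and it uses the PHP 5.3+ `$class::$methodUsedMembers` syntax. It shows that only the declared properties influence the key.

```php
<?php
class Foo {
    public $current = 'PHP';
    public $hack = 'HACK IT';
    public $unrelated = 'ignored'; // illustrative: not declared below, so it never reaches the key

    // maps each method to the comma-separated list of properties it reads
    public static $methodUsedMembers = array(
        'getCurrent_memoizable' => 'current,hack',
    );

    public function getCurrent_memoizable() {
        return $this->current . $this->hack;
    }
}

// Hypothetical helper: builds the cache key for an array($object, $method) callable.
function memoize_method_key(array $fn, array $args) {
    list($o, $m) = $fn;
    $class = get_class($o);
    $members = $class::$methodUsedMembers[$m];
    // keep only the properties the method is declared to read
    $objVars = array_intersect_key(get_object_vars($o), array_flip(explode(',', $members)));
    // prefix with class and method name, since the array no longer carries the class
    return $class . '::' . $m . '::' . serialize($objVars) . ':' . serialize($args);
}

$a = new Foo;
$b = new Foo;
$b->unrelated = 'changed'; // not a declared dependency
$c = new Foo;
$c->hack = 'changed';      // a declared dependency

$keyA = memoize_method_key(array($a, 'getCurrent_memoizable'), array());
$keyB = memoize_method_key(array($b, 'getCurrent_memoizable'), array());
$keyC = memoize_method_key(array($c, 'getCurrent_memoizable'), array());

var_dump($keyA === $keyB); // bool(true)
var_dump($keyA === $keyC); // bool(false)
```

Note that `get_object_vars` called from outside the class only sees public properties, so this sketch would miss private or protected state.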
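Declaring these by hand is still tedious, and it is still Repeating Yourself. If we wanted to hack all the way down (or rather, tinker all the way down) and detect automatically which properties a method reads, we could follow the same pattern as the Memoizable method-call interception above and build another automated scheme: define the properties with the _memoizable suffix as well, and access them without the suffix. The __get method then tells us, once a method has finished running, which properties that call accessed. (But doesn't Memoize need to know the cache key before the function is called, so that it can check for a hit and decide whether to run the method at all? Easy: cache the list of properties that method M accesses as well, so the method does not have to run every time):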
<span class="k">class</span> <span class="nc">Foo</span> <span class="p">{</span> <span class="k">public</span> <span class="nv">$propertyHack_memoizable</span><span class="o">=</span><span class="s1">'Hack'</span><span class="p">;</span> <span class="k">public</span> <span class="nv">$accessHistory</span><span class="o">=</span><span class="k">array</span><span class="p">();</span><span class="c1">// records the property access history</span> <span class="k">public</span> <span class="k">function</span> <span class="nf">__get</span><span class="p">(</span><span class="nv">$name</span><span class="p">)</span> <span class="p">{</span> <span class="nv">$realName</span><span class="o">=</span><span class="nv">$name</span><span class="o">.</span><span class="nx">MEMOIZABLE_SUFFIX</span><span class="p">;</span> <span class="k">if</span> <span class="p">(</span><span class="nb">property_exists</span><span class="p">(</span><span class="nv">$this</span><span class="p">,</span><span class="nv">$realName</span><span class="p">))</span> <span class="p">{</span> <span class="nv">$this</span><span class="o">-></span><span class="na">accessHistory</span><span class="p">[]</span><span class="o">=</span><span class="nv">$realName</span><span class="p">;</span> <span class="k">return</span> <span class="nv">$this</span><span class="o">-></span><span class="nv">$realName</span><span class="p">;</span> <span class="p">}</span> <span class="c1"># otherwise throw Exception</span> <span class="p">}</span> <span class="k">public</span> <span class="k">function</span> <span class="nf">hack</span><span class="p">()</span> <span class="p">{</span><span class="k">return</span> <span class="nv">$this</span><span class="o">-></span><span class="na">propertyHack</span><span class="p">;}</span> <span class="p">}</span> <span class="nv">$f</span><span class="o">=</span><span class="k">new</span> <span class="nx">Foo</span><span class="p">;</span> <span class="c1"># clear the history before calling the method</span>
<span class="nv">$f</span><span class="o">-></span><span class="na">accessHistory</span><span class="o">=</span><span class="k">array</span><span class="p">();</span> <span class="k">echo</span> <span class="nv">$f</span><span class="o">-></span><span class="na">hack</span><span class="p">();</span> <span class="nb">var_dump</span><span class="p">(</span><span class="nv">$f</span><span class="o">-></span><span class="na">accessHistory</span><span class="p">);</span> <span class="c1"># => the list of properties that hack() accessed</span>
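But we must not actually do this! It only makes things more and more complicated. It is far too hacky, and we should not go any further down the wrong road!
Enough is enough. For this problem, I think the reasonable compromise is to avoid caching instance methods. Instance methods are usually not pure functions: they depend on the state of $this, so Memoization does not suit them. In normal use, caching static methods is enough; if an instance method does need caching, consider refactoring the code to extract a cacheable static method.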
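As a rough illustration of that refactoring, the sketch below moves the computation into a static method that depends only on its arguments, and leaves the instance method as a thin wrapper that passes its state in explicitly. The class `PriceCalculator`, its members, and the in-process array cache are all assumptions made for this example; a real setup would go through `memoize_call` and Memcache instead.

```php
<?php
class PriceCalculator {
    public $taxPercent = 20;
    public static $computeCount = 0;  // only here to make cache hits observable
    private static $cache = array();  // stand-in for the real cache backend

    // Pure: the result depends only on the arguments, never on $this.
    public static function grossFor($netCents, $taxPercent) {
        self::$computeCount++;
        return intdiv($netCents * (100 + $taxPercent), 100);
    }

    // Memoized front end for the pure static method.
    public static function grossForMemoized($netCents, $taxPercent) {
        $key = __CLASS__ . '::grossFor::' . serialize(array($netCents, $taxPercent));
        if (!array_key_exists($key, self::$cache)) {
            self::$cache[$key] = self::grossFor($netCents, $taxPercent);
        }
        return self::$cache[$key];
    }

    // Instance method: passes the object state in as an ordinary argument.
    public function gross($netCents) {
        return self::grossForMemoized($netCents, $this->taxPercent);
    }
}

$calc = new PriceCalculator;
$g1 = $calc->gross(1000);  // computed
$g2 = $calc->gross(1000);  // served from the cache
$calc->taxPercent = 10;    // state change becomes a different argument, hence a different key
$g3 = $calc->gross(1000);  // computed again

echo $g1, ' ', $g2, ' ', $g3, ' ', PriceCalculator::$computeCount, "\n"; // 1200 1200 1100 2
```

Because the object state reaches the cached function as a plain argument, the key changes whenever the state does, and no property bookkeeping is needed.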
To reuse the __callStatic and __call code shown here, we could turn it into a base class and have every subclass that needs the Memoize feature inherit from it:
<span class="k">class</span> <span class="nc">ArticleModel</span> <span class="k">extends</span> <span class="nx">Memoizable</span> <span class="p">{</span> <span class="k">public</span> <span class="k">static</span> <span class="k">function</span> <span class="nf">getByAuthor_memoizable</span><span class="p">()</span> <span class="p">{</span><span class="cm">/*...*/</span><span class="p">}</span> <span class="p">}</span>
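Try it and you will find that it does not work. In __callStatic we used the magic constant __CLASS__ directly to get the current class name. But its value is the name of the class in which the code is written, not the name of the class the method was called on at runtime, so here __CLASS__ is always Memoizable. This is not hard to fix: upgrade to PHP 5.3 and replace __CLASS__ with get_called_class(). There is a second problem, though: PHP classes do not support multiple inheritance, so a class that already extends another class cannot easily reuse the Memoize code through inheritance. That is not hard to solve either: upgrade to PHP 5.4 and use Traits as mixins. What is more, with Traits the __CLASS__ constant can be used directly, with no need to switch to get_called_class(), which kills two birds with one stone: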
<span class="k">trait</span> <span class="nx">Memoizable</span> <span class="p">{</span> <span class="k">public</span> <span class="k">static</span> <span class="k">function</span> <span class="nf">__callStatic</span><span class="p">(</span><span class="nv">$name</span><span class="p">,</span> <span class="nv">$args</span><span class="p">)</span> <span class="p">{</span> <span class="k">echo</span> <span class="nx">__CLASS__</span><span class="p">;</span> <span class="c1"># => prints the name of the class that uses the trait</span> <span class="p">}</span> <span class="k">public</span> <span class="k">function</span> <span class="nf">__call</span><span class="p">(</span><span class="nv">$name</span><span class="p">,</span> <span class="nv">$args</span><span class="p">)</span> <span class="p">{</span><span class="cm">/*...*/</span><span class="p">}</span> <span class="p">}</span> <span class="k">class</span> <span class="nc">ArticleModel</span> <span class="p">{</span> <span class="k">use</span> <span class="nx">Memoizable</span><span class="p">;</span> <span class="k">public</span> <span class="k">static</span> <span class="k">function</span> <span class="nf">getByAuthor_memoizable</span><span class="p">()</span> <span class="p">{</span><span class="cm">/*...*/</span><span class="p">}</span> <span class="p">}</span>
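The only catch is that you need to upgrade to PHP 5.4. Perhaps one day a new PHP version will support Decorators like Python's, but by then I will probably no longer be following PHP, let alone coming back to update this article.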
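To make the difference between the two reuse strategies concrete, here is a small comparison sketch. The names `MemoizableBase`, `MemoizableTrait`, `ArticleModelA` and `ArticleModelB` are made up for this illustration, and the `__callStatic` bodies only report class names instead of doing any caching.

```php
<?php
// Inheritance-based reuse (illustrative names, not the original Memoizable class)
abstract class MemoizableBase {
    public static function __callStatic($name, $args) {
        // __CLASS__ is fixed where the code is written;
        // get_called_class() is resolved at runtime (PHP 5.3+)
        return __CLASS__ . ' / ' . get_called_class();
    }
}
class ArticleModelA extends MemoizableBase {}

// Trait-based reuse (PHP 5.4+)
trait MemoizableTrait {
    public static function __callStatic($name, $args) {
        // inside a trait, __CLASS__ is the class that uses the trait
        return __CLASS__;
    }
}
class ArticleModelB {
    use MemoizableTrait;
}

$viaBase  = ArticleModelA::getByAuthor('jex');
$viaTrait = ArticleModelB::getByAuthor('jex');

echo $viaBase, "\n";  // MemoizableBase / ArticleModelA
echo $viaTrait, "\n"; // ArticleModelB
```

One caveat: if another class extends `ArticleModelB`, `__CLASS__` inside the trait method still yields `ArticleModelB`, so `get_called_class()` remains the safer choice when the using class is itself subclassed.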
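As mentioned earlier, in the real world it is usually IO operations that get cached, and functions that perform IO are not pure functions. The cache of a pure function never needs to expire, whereas every IO operation needs a cache expiry time. The question here is not how long that expiry should be; that should be left to each function, because different operations warrant different cache lifetimes. The problem is that we have already factored the caching out so that the function's own code need not care about cache operations, yet now it has to specify its own expiry: an $expires argument must be passed to memoize_call, which then passes it on to the MemCache instance in $cache->set. The entry-level solution is to keep using the class static property configuration proposed earlier: the cache expiry of every method in a class can be configured with a Class::methodMemoizeExpires array. But we cannot stay at the entry level forever! The ideal solution is of course to keep the expiry together with the method's code, since writing them separately hurts maintainability. But how? We have already used nearly all of PHP's magic methods, so it is time for a different trick. Static variables and reflection, long forgotten in a corner, finally get their turn on stage!
Declare the cache expiry as a static variable of the function: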
<span class="k">function</span> <span class="nf">search_google_memoizable</span><span class="p">(</span><span class="nv">$keywords</span><span class="p">)</span> <span class="p">{</span> <span class="k">static</span> <span class="nv">$memoizeExpires</span><span class="o">=</span><span class="mi">600</span><span class="p">;</span><span class="c1">// unit: seconds</span> <span class="p">}</span>
The getStaticVariables method of ReflectionFunction then returns the $memoizeExpires value set by the function:
<span class="nv">$rf</span><span class="o">=</span><span class="k">new</span> <span class="nx">ReflectionFunction</span><span class="p">(</span><span class="s1">'search_google_memoizable'</span><span class="p">);</span> <span class="nv">$staticVars</span><span class="o">=</span><span class="nv">$rf</span><span class="o">-></span><span class="na">getStaticVariables</span><span class="p">();</span> <span class="nv">$expires</span><span class="o">=</span><span class="nv">$staticVars</span><span class="p">[</span><span class="s1">'memoizeExpires'</span><span class="p">];</span>
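By the same token, the static variables of class static methods and instance methods can be obtained through ReflectionClass and ReflectionMethod.
Most of the discussion so far has been about keeping the call syntax of a memoized function identical to the original. After all that effort it suddenly occurred to me that, although PHP code cannot override a function that has already been defined, a PHP C extension can! As it happens, PECL already has a Memoize module implemented in C, though it is still in beta. It can be installed with the following command: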
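For methods, a sketch along the following lines should work, since `ReflectionMethod` inherits `getStaticVariables()` from the same base class as `ReflectionFunction`. The helper `memoize_expires_of`, its `$default` parameter, and the example class are assumptions made for this illustration.

```php
<?php
class ArticleModel {
    public static function getByAuthor_memoizable($author) {
        static $memoizeExpires = 300; // unit: seconds
        return "articles by $author";
    }

    public function render_memoizable() {
        static $memoizeExpires = 60;
        return 'html';
    }
}

function search_google_memoizable($keywords) {
    static $memoizeExpires = 600;
    return "results for $keywords";
}

function no_expiry_declared() {
    return 1;
}

// Hypothetical helper: reads $memoizeExpires from a function or method callable.
function memoize_expires_of($callable, $default = 0) {
    if (is_array($callable)) {
        // array('Class', 'method') or array($object, 'method')
        $rf = new ReflectionMethod($callable[0], $callable[1]);
    } else {
        $rf = new ReflectionFunction($callable);
    }
    $vars = $rf->getStaticVariables();
    return isset($vars['memoizeExpires']) ? $vars['memoizeExpires'] : $default;
}

$e1 = memoize_expires_of(array('ArticleModel', 'getByAuthor_memoizable'));
$e2 = memoize_expires_of(array(new ArticleModel, 'render_memoizable'));
$e3 = memoize_expires_of('search_google_memoizable');
$e4 = memoize_expires_of('no_expiry_declared');

echo "$e1 $e2 $e3 $e4\n"; // 300 60 600 0
```

Falling back to a default keeps functions that never declare `$memoizeExpires` usable; whether 0 should mean "never expire" is a decision for the cache layer.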
sudo pecl install memoize-beta
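The module works exactly the way the PHP code above wanted to but could not. It provides a memoize function that turns a user-defined function into a memoized one. The main steps are no different from the PHP implementation above: in essence, memoize("fn") creates a new function (similar to memoized in the earlier PHP implementation) that runs memoize_call and calls the original function on a cache miss. The difference is that a C extension can modify the function table: it renames the old function to fn$memoizd, names the newly created function fn, and thereby overrides the user-defined function: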
<span class="k">function</span> <span class="nf">hack</span><span class="p">(</span><span class="nv">$x</span><span class="p">)</span> <span class="p">{</span><span class="nb">sleep</span><span class="p">(</span><span class="mi">3</span><span class="p">);</span><span class="k">return</span> <span class="s2">"Hack </span><span class="si">$x</span><span class="se">\n</span><span class="s2">"</span><span class="p">;}</span> <span class="nx">memoize</span><span class="p">(</span><span class="s1">'hack'</span><span class="p">);</span> <span class="k">echo</span> <span class="nx">hack</span><span class="p">(</span><span class="s1">'PHP'</span><span class="p">);</span> <span class="c1"># returns in 3s</span> <span class="k">echo</span> <span class="nx">hack</span><span class="p">(</span><span class="s1">'PHP'</span><span class="p">);</span> <span class="c1"># returns in 0.0001s</span> <span class="nv">$fns</span><span class="o">=</span><span class="nb">get_defined_functions</span><span class="p">();</span> <span class="k">echo</span> <span class="nb">implode</span><span class="p">(</span><span class="s1">' '</span><span class="p">,</span><span class="nv">$fns</span><span class="p">[</span><span class="s1">'user'</span><span class="p">]);</span> <span class="c1"># => hack$memoizd</span> <span class="c1"># the function hack is now internal</span> <span class="nb">var_dump</span><span class="p">(</span><span class="nb">in_array</span><span class="p">(</span><span class="s1">'hack'</span><span class="p">,</span><span class="nv">$fns</span><span class="p">[</span><span class="s1">'internal'</span><span class="p">]));</span> <span class="c1"># => bool(true)</span>
Because the new function is a memcpy of the built-in function memoize_call, it becomes internal, as part of the C code of the memoize function shows:
<span class="n">PHP_FUNCTION</span><span class="p">(</span><span class="n">memoize</span><span class="p">)</span> <span class="p">{</span> <span class="n">zval</span> <span class="o">*</span><span class="n">callable</span><span class="p">;</span> <span class="cm">/*...*/</span> <span class="n">zend_function</span> <span class="o">*</span><span class="n">fe</span><span class="p">,</span> <span class="o">*</span><span class="n">dfe</span><span class="p">,</span> <span class="n">func</span><span class="p">,</span> <span class="o">*</span><span class="n">new_dfe</span><span class="p">;</span> <span class="cm">/* defaults to the global function table; the EG macro accesses the current executor_globals */</span> <span class="n">HashTable</span> <span class="o">*</span><span class="n">function_table</span> <span class="o">=</span> <span class="n">EG</span><span class="p">(</span><span class="n">function_table</span><span class="p">);</span> <span class="cm">/*...*/</span> <span class="cm">/* check whether the first argument is_callable */</span> <span class="k">if</span> <span class="p">(</span><span class="n">Z_TYPE_P</span><span class="p">(</span><span class="n">callable</span><span class="p">)</span> <span class="o">==</span> <span class="n">IS_ARRAY</span><span class="p">)</span> <span class="p">{</span> <span class="cm">/* an array callable may be a class static method or an object instance method */</span> <span class="n">zval</span> <span class="o">**</span><span class="n">fname_zv</span><span class="p">,</span> <span class="o">**</span><span class="n">obj_zv</span><span class="p">;</span> <span class="cm">/* omitted: obj_zv=callable[0], fname_zv=callable[1] */</span> <span class="k">if</span> <span class="p">(</span><span class="n">obj_zv</span> <span class="o">&&</span> <span class="n">fname_zv</span> <span class="o">&&</span> <span class="p">(</span><span class="n">Z_TYPE_PP</span><span class="p">(</span><span class="n">obj_zv</span><span class="p">)</span><span class="o">==</span><span class="n">IS_OBJECT</span> <span class="o">||</span> <span class="n">Z_TYPE_PP</span><span class="p">(</span><span class="n">obj_zv</span><span class="p">)</span><span class="o">==</span><span class="n">IS_STRING</span><span class="p">)</span> <span class="o">&&</span> <span class="n">Z_TYPE_PP</span><span class="p">(</span><span class="n">fname_zv</span><span class="p">)</span><span class="o">==</span><span class="n">IS_STRING</span><span class="p">)</span> <span class="p">{</span> <span class="cm">/* looks like a valid callback */</span>
<span class="n">zend_class_entry</span> <span class="o">*</span><span class="n">ce</span><span class="p">,</span> <span class="o">**</span><span class="n">pce</span><span class="p">;</span> <span class="k">if</span> <span class="p">(</span><span class="n">Z_TYPE_PP</span><span class="p">(</span><span class="n">obj_zv</span><span class="p">)</span><span class="o">==</span><span class="n">IS_OBJECT</span><span class="p">)</span> <span class="p">{</span><span class="cm">/* obj_zv is an object */</span> <span class="cm">/* get the object's class entry, see zend_get_class_entry */</span> <span class="n">ce</span> <span class="o">=</span> <span class="n">Z_OBJCE_PP</span><span class="p">(</span><span class="n">obj_zv</span><span class="p">);</span> <span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">Z_TYPE_PP</span><span class="p">(</span><span class="n">obj_zv</span><span class="p">)</span><span class="o">==</span><span class="n">IS_STRING</span><span class="p">)</span> <span class="p">{</span><span class="cm">/* a string obj_zv is a class name */</span> <span class="k">if</span> <span class="p">(</span><span class="n">zend_lookup_class</span><span class="p">(</span><span class="n">Z_STRVAL_PP</span><span class="p">(</span><span class="n">obj_zv</span><span class="p">),</span> <span class="n">Z_STRLEN_PP</span><span class="p">(</span><span class="n">obj_zv</span><span class="p">),</span><span class="o">&</span><span class="n">pce</span> <span class="n">TSRMLS_CC</span><span class="p">)</span><span class="o">==</span><span class="n">FAILURE</span><span class="p">){</span><span class="cm">/*...*/</span><span class="p">}</span> <span class="n">ce</span> <span class="o">=</span> <span class="o">*</span><span class="n">pce</span><span class="p">;</span> <span class="p">}</span>
<span class="cm">/* when callable is an array, use that class's function table */</span> <span class="n">function_table</span> <span class="o">=</span> <span class="o">&</span><span class="n">ce</span><span class="o">-></span><span class="n">function_table</span><span class="p">;</span> <span class="cm">/* PHP function names are case-insensitive, so lowercase everything here */</span> <span class="n">fname</span> <span class="o">=</span> <span class="n">zend_str_tolower_dup</span><span class="p">(</span><span class="n">Z_STRVAL_PP</span><span class="p">(</span><span class="n">fname_zv</span><span class="p">),</span><span class="n">Z_STRLEN_PP</span><span class="p">(</span><span class="n">fname_zv</span><span class="p">));</span> <span class="n">fname_len</span> <span class="o">=</span> <span class="n">Z_STRLEN_PP</span><span class="p">(</span><span class="n">fname_zv</span><span class="p">);</span> <span class="cm">/* check that the method exists */</span> <span class="k">if</span> <span class="p">(</span><span class="n">zend_hash_exists</span><span class="p">(</span><span class="n">function_table</span><span class="p">,</span><span class="n">fname</span><span class="p">,</span><span class="n">fname_len</span><span class="o">+</span><span class="mi">1</span><span class="p">)</span><span class="o">==</span><span class="n">FAILURE</span><span class="p">)</span> <span class="p">{</span><span class="cm">/*RET FALSE*/</span><span class="p">}</span> <span class="p">}</span> <span class="k">else</span> <span class="p">{</span><span class="cm">/*RET FALSE*/</span><span class="p">}</span> <span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">Z_TYPE_P</span><span class="p">(</span><span class="n">callable</span><span class="p">)</span> <span class="o">==</span> <span class="n">IS_STRING</span><span class="p">)</span> <span class="p">{</span><span class="cm">/* ordinary global function, omitted */</span> <span class="p">}</span> <span class="k">else</span> <span class="p">{</span><span class="cm">/*RET FALSE*/</span><span class="p">}</span>
<span class="cm">/* find source function */</span> <span class="k">if</span> <span class="p">(</span><span class="n">zend_hash_find</span><span class="p">(</span><span class="n">function_table</span><span class="p">,</span><span class="n">fname</span><span class="p">,</span><span class="n">fname_len</span><span class="o">+</span><span class="mi">1</span><span class="p">,(</span><span class="kt">void</span><span class="o">**</span><span class="p">)</span><span class="o">&</span><span class="n">fe</span><span class="p">)</span><span class="o">==</span><span class="n">FAILURE</span><span class="p">){</span><span class="cm">/*..*/</span><span class="p">}</span> <span class="k">if</span> <span class="p">(</span><span class="n">MEMOIZE_IS_HANDLER</span><span class="p">(</span><span class="n">fe</span><span class="p">))</span> <span class="p">{</span><span class="cm">/* already memoized, RET FALSE */</span><span class="p">}</span> <span class="k">if</span> <span class="p">(</span><span class="n">MEMOIZE_RETURNS_REFERENCE</span><span class="p">(</span><span class="n">fe</span><span class="p">))</span> <span class="p">{</span><span class="cm">/* functions that return by reference are not accepted, RET FALSE */</span><span class="p">}</span> <span class="n">func</span> <span class="o">=</span> <span class="o">*</span><span class="n">fe</span><span class="p">;</span> <span class="n">function_add_ref</span><span class="p">(</span><span class="o">&</span><span class="n">func</span><span class="p">);</span> <span class="cm">/* find dest function,dfe=memoize_call */</span> <span class="cm">/* copy dest entry with source name */</span> <span class="n">new_dfe</span> <span class="o">=</span> <span class="n">emalloc</span><span class="p">(</span><span class="k">sizeof</span><span class="p">(</span><span class="n">zend_function</span><span class="p">));</span>
<span class="cm">/* copy a new function from memoize_call, which is itself internal */</span> <span class="cm">/* it could in fact be marked as a user function via new_dfe->type=ZEND_USER_FUNCTION */</span> <span class="n">memcpy</span><span class="p">(</span><span class="n">new_dfe</span><span class="p">,</span> <span class="n">dfe</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">zend_function</span><span class="p">));</span> <span class="cm">/* set the scope of the copied memoize_call to the original function's scope */</span> <span class="n">new_dfe</span><span class="o">-></span><span class="n">common</span><span class="p">.</span><span class="n">scope</span> <span class="o">=</span> <span class="n">fe</span><span class="o">-></span><span class="n">common</span><span class="p">.</span><span class="n">scope</span><span class="p">;</span> <span class="cm">/* give the new function the same name as the original */</span> <span class="n">new_dfe</span><span class="o">-></span><span class="n">common</span><span class="p">.</span><span class="n">function_name</span> <span class="o">=</span> <span class="n">fe</span><span class="o">-></span><span class="n">common</span><span class="p">.</span><span class="n">function_name</span><span class="p">;</span> <span class="cm">/* update function_table so the original name maps to the new function new_dfe */</span> <span class="k">if</span> <span class="p">(</span><span class="n">zend_hash_update</span><span class="p">(</span><span class="n">function_table</span><span class="p">,</span><span class="n">fname</span><span class="p">,</span> <span class="n">fname_len</span><span class="o">+</span><span class="mi">1</span><span class="p">,</span><span class="n">new_dfe</span><span class="p">,</span><span class="k">sizeof</span><span class="p">(</span><span class="n">zend_function</span><span class="p">),</span><span class="nb">NULL</span><span class="p">)</span><span class="o">==</span><span class="n">FAILURE</span><span class="p">){</span><span class="cm">/*..*/</span><span class="p">}</span>
<span class="k">if</span> <span class="p">(</span><span class="n">func</span><span class="p">.</span><span class="n">type</span> <span class="o">==</span> <span class="n">ZEND_INTERNAL_FUNCTION</span><span class="p">)</span> <span class="p">{</span><span class="cm">/* special handling of internal functions omitted */</span><span class="p">}</span> <span class="k">if</span> <span class="p">(</span><span class="n">ttl</span><span class="p">)</span> <span class="p">{</span><span class="cm">/* ttl setup omitted */</span><span class="p">}</span> <span class="cm">/* rename the original function to fname$memoizd and add it to the function table */</span> <span class="n">new_fname_len</span> <span class="o">=</span> <span class="n">spprintf</span><span class="p">(</span><span class="o">&</span><span class="n">new_fname</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="s">"%s%s"</span><span class="p">,</span> <span class="n">fname</span><span class="p">,</span> <span class="n">MEMOIZE_FUNC_SUFFIX</span><span class="p">);</span> <span class="k">if</span> <span class="p">(</span><span class="n">zend_hash_add</span><span class="p">(</span><span class="n">function_table</span><span class="p">,</span><span class="n">new_fname</span><span class="p">,</span> <span class="n">new_fname_len</span><span class="o">+</span><span class="mi">1</span><span class="p">,</span><span class="o">&</span><span class="n">func</span><span class="p">,</span><span class="k">sizeof</span><span class="p">(</span><span class="n">zend_function</span><span class="p">),</span><span class="nb">NULL</span><span class="p">)</span><span class="o">==</span><span class="n">FAILURE</span><span class="p">){</span><span class="cm">/*RET FALSE*/</span><span class="p">}</span> <span class="p">}</span>
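The memoize_call function cannot be called directly; it exists only to be copied when new functions are generated. When it runs, it uses its own function name to find the original function to execute, and it likewise serializes the arguments with serialize, taking the MD5 hash of the serialized string as the cache key.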
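The key scheme just described can be imitated in user-space PHP. This is only a sketch and not the extension's code: the function `memoize_call_sketch`, the in-process array standing in for the cache backend, and the decision to hash the function name together with the arguments are all my assumptions.

```php
<?php
class CallCounter {
    public static $n = 0; // only here to make cache hits observable
}

function slow_hack($x) {
    CallCounter::$n++;
    return "Hack $x";
}

// Hypothetical user-space imitation of the md5(serialize(...)) key scheme.
function memoize_call_sketch($fname, array $args) {
    static $store = array(); // stands in for the real cache backend
    // Assumption: include the lowercased function name so that
    // different functions with equal arguments do not share a key.
    $key = md5(serialize(array(strtolower($fname), $args)));
    if (!array_key_exists($key, $store)) {
        $store[$key] = call_user_func_array($fname, $args);
    }
    return $store[$key];
}

$first  = memoize_call_sketch('slow_hack', array('PHP')); // miss: runs slow_hack
$second = memoize_call_sketch('slow_hack', array('PHP')); // hit: served from $store
$third  = memoize_call_sketch('slow_hack', array(1));     // miss
$fourth = memoize_call_sketch('slow_hack', array('1'));   // miss: serialize() keeps the type

echo CallCounter::$n, "\n"; // 3
```

Since `serialize` records types, the integer `1` and the string `'1'` yield different keys, which is stricter than PHP's loose comparison would be.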
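For reference, some of the Zend API involved: zend_get_class_entry, EG (Executor Globals) and zend_function, all of which can be found by searching http://lxr.php.net/. See also: 深入理解PHP内核 (Understanding the PHP Kernel in Depth), the chapter on the internal implementation of PHP functions.
For the rest, see the source code and documentation on Github: https://github.com/arraypad/php-memoize
PECL Memoize Package: http://pecl.php.net/package/memoize
For a complete PHP implementation of Memoization, see: https://github.com/jex-im/anthology/tree/master/php/Memoize
That implementation covers many other edge cases. For example, it uses the Reflection API to serialize the default values of method parameters into the cache key as well. However, it only supports PHP 5.4 and later.
Original article: http://jex.im/programming/memoization-in-php.html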
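The linked implementation is not reproduced here, but the idea of folding default parameter values into the key can be sketched roughly as follows. The helper `normalized_args` and the sample function `list_articles` are hypothetical; only the Reflection calls (`getParameters`, `isDefaultValueAvailable`, `getDefaultValue`) are standard PHP.

```php
<?php
function list_articles($author, $limit = 10, $order = 'date') {
    return "$author/$limit/$order";
}

// Hypothetical helper: fills omitted trailing arguments with their declared
// defaults, so equivalent calls serialize to the same cache key.
function normalized_args($fn, array $args) {
    $rf = new ReflectionFunction($fn);
    foreach ($rf->getParameters() as $i => $param) {
        if (!array_key_exists($i, $args) && $param->isDefaultValueAvailable()) {
            $args[$i] = $param->getDefaultValue();
        }
    }
    ksort($args); // keep positional order stable
    return $args;
}

$implicit = serialize(normalized_args('list_articles', array('jex')));
$explicit = serialize(normalized_args('list_articles', array('jex', 10, 'date')));
$other    = serialize(normalized_args('list_articles', array('jex', 20)));

var_dump($implicit === $explicit); // bool(true)
var_dump($implicit === $other);    // bool(false)
```

Without this normalization, `list_articles('jex')` and `list_articles('jex', 10, 'date')` would be cached under two different keys even though they run the same query.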