A Cache callback and automatic triggering technology based on PHP

Background

When using Memcache or Redis in PHP, we usually encapsulate Memcache and Redis and write a Cache class separately as a proxy for Memcache or Redis, and Generally singleton mode. In the business code, when using the Cache class, the basic sample code of the operation is as follows

// cache 的 key
$key = 'this is key';
$expire = 60;// 超时时间

// cache 的实例
$cache = Wk_Cache::instance();
$data = $cache->fetch($key);

// 判断data
if(empty($data)){
    // 如果为空，调用db方法
    $db = new Wk_DB();
    $data = $db->getXXX();
    $cache->store($key, $data, $expire);
}
// 处理$data相关数据
return $data;

Copy after login

The basic process is
Step 1 , first assemble the query key and query the value in Cache. If it exists, continue processing and enter the third step; if it does not exist, enter the second step
The second step is to query the relevant data in the DB according to the request. If the data exists, put the data into the Cache
The third step is to process the data returned in the cache or db

Problem

The above process will basically appear in every call to the Cache. Query the cache first. If not, call the DB or third-party interface to obtain the data, store it in the cache again, and continue data processing. Since multiple calls are a problem, this query method should be encapsulated into a lower-level method. Instead of repeating such logic every time, in addition to encapsulation issues, there are other issues. Let’s list them together

First: from a design perspective, repeated code requires lower-level logic encapsulation.
Second: The assembly of the key is cumbersome and cumbersome. In actual situations, various parameters may be assembled. During maintenance, you dare not modify it.
Third: The set expire timeout will be scattered in various logical codes, making it difficult to calculate the Cache cache time in the end.
Fourth: Since the cache->store method needs to be executed after calling the db, if there are other logical processes after the db, it is possible to forget to put the data into the cache, resulting in data loss.
Fifth: In a high-concurrency system, when the cache fails, a large number of requests will directly penetrate to the rear, causing the pressure on the DB or third-party interface to rise sharply, slowing down the response, and further affecting the stability of the system. This phenomenon for "Dogpile".

Among the above problems, the simplest ones are 2 and 3. For the problem of scattered expire timeouts, we can solve it through a unified configuration file. For example, we can create such a configuration file.

“test"=>array( // namespace，方便分组
             "keys"=> array(
                 “good”=>array(		// 定义的key，此key非最终入cache的key，入key需要和params组装成唯一的key
                     "timeout"=>600,	// 此处定义超时时间
                     "params"=>array("epid"=>1,"num"=>1),	// 通过这种方法，描述需要传递参数，用于组装最终入cache的key
                     "desc"=>"描述"
                     ), 
                "top_test"=>array(	// 定义的key，此key非最终入cache的key，入key需要和params组装成唯一的key
                     "timeout"=>60,	// 此处定义超时时间
                     "ttl"=>10,	// 自动触发时间
                     "params"=>array('site_id'=>1,'boutique'=>1,'offset'=>1,'rows'=> 1,'uid'=>1,'tag_id'=>1,'type'=>1),	// 通过这种方法，描述需要传递参数，用于组装最终入cache的key
                     "desc"=>"描述",
                     "author"=>"ugg",
                     ), 

）
）

Copy after login

As shown above, through an algorithm, we can assemble site_top_feeds and params into a unique warehousing key. The assembled key , something like this site_top_feeds_site_id=12&boutique=1&offset=0&rows=20&uid=&tag_id=0&type=2 In this way, we prevent workers from assembling the key themselves, thereby eliminating the second problem. At the same time, in this configuration file, we also set a timeout , when calling the store, we can read directly from the configuration file, thus avoiding the third problem. After the above modifications, our cache method has also been appropriately adjusted. The calling example is as follows.

$siteid = 121;
$seminal = 1;
$tag_id = 12;
$tag_id = 22;

$data =  fetch(‘site_top_feeds’,array('site_id'=>$siteid,'boutique'=>$seminal, 'offset'=>"0", 'rows' => "20", 'uid' =>null,’tag_id’=>$tag_id,’type'=>$type),'feed');
if(empty($data)){
// db相关操作
$db = new Wk_DB();
    $data = $db->getTopFeeds($site_id,$seminal,0,20,null,$tag_id,$type);
//  $data数据其他处理逻辑 这里
……

$cache->store(‘site_top_feeds’,$data,array(‘site_id'=>$siteid,'boutique'=>$seminal, 'offset'=>"0", 'rows' => "20", 'uid' =>null,’tag_id’=>$tag_id,’type'=>$type),'feed');
}

Copy after login

Through the above solution, I did not see that the timeout timeout is gone and the key assembly is gone. For outer calls, It's transparent. But we can know the timeout of site_top_feeds through the configuration file, and what the assembled key looks like through the encapsulated algorithm.
This method does not solve the first and fourth problems, encapsulation; in order to achieve encapsulation, the first thing to do is the callback function. As a school-based language, PHP does not have a complete concept of function pointers. , of course, you don’t actually need a pointer to execute a function. PHP supports callback functions in two ways: call_user_func and call_user_func_array.
However, I made two examples and found that the execution efficiency of the above method is much different than the native method

native:0.0097959041595459s
call_user_func:0.028249025344849s
call_user_func_array:0.046605110168457s

Copy after login

The example code is as follows:

$s = microtime(true);
for($i=0; $i< 10000 ; ++$i){
    $a = new a();
    $data = $a->aaa($array, $array, $array);
    $data = a::bbb($array, $array, $array);
}
$e = microtime(true);
echo "native:".($e-$s)."s\n";

$s = microtime(true);
for($i=0; $i< 10000 ; ++$i){
    $a = new a();
    $data = call_user_func(array($a,'aaa'),$array,$array,$array);
    $data = call_user_func(array('a','bbb'),$array,$array,$array);
}
$e = microtime(true);
echo "call_user_func:".($e-$s)."s\n";

$s = microtime(true);
for($i=0; $i< 10000 ; ++$i){
    $a = new a();
    $data = call_user_func_array(array($a,'aaa'),array(&$array,&$array,&$array));
    $data = call_user_func_array(array('a','bbb'),array(&$array,&$array,&$array));
}
$e = microtime(true);
echo “call_user_func_array:".($e-$s)."s\n";

Copy after login

In PHP, if you know an object and a method, it is actually very simple to call the method. For example, in the example above

$a = new a();
$data = $a->aaa($array, $array, $array);
$obj = $a;
$func = ‘aaa’;
$params = array($array,$array,$array);
$obj->$func($params[0],$params[1],$params[2]);	// 通过这种方式可以直接执行

Copy after login

What is the execution performance of this method? In this way, after our comparative testing, we found that

native:0.0092940330505371s
call_user_func:0.028635025024414s
call_user_func_array:0.048038959503174s
my_callback:0.11308288574219s

Copy after login

added a large number of method strategies to verify that the performance loss was relatively low, and the time consumption was only about 1.25 times that of the native method, much less than more than 3 times that of call_user_func, and that of call_user_func_array More than 5 times, the specific encapsulated code

switch(count($params)){
                case 0: $result = $obj->{$func}();break;
                case 1: $result = $obj->{$func}($params[0]);break;
                case 2: $result = $obj->{$func}($params[0],$params[1]);break;
                case 3: $result = $obj->{$func}($params[0],$params[1],$params[2]);break;
                case 4: $result = $obj->{$func}($params[0],$params[1],$params[2],$params[3]);break;
                case 5: $result = $obj->{$func}($params[0],$params[1],$params[2],$params[3],$params[4]);break;
                case 6: $result = $obj->{$func}($params[0],$params[1],$params[2],$params[3],$params[4],$params[5]);break;
                case 7: $result = $obj->{$func}($params[0],$params[1],$params[2],$params[3],$params[4],$params[5],$params[6]);break;
                default: $result = call_user_func_array(array($obj, $func), $params);  break;
            }

Copy after login

Before using this method, I considered using create_function to create anonymous functions and execute function callbacks. After testing, create_function can only create Global functions cannot create class functions and object functions, so I gave up.

After completing the above preparations, you can use the callback mechanism and call the business code again

….
// 相关变量赋值
$db = new Wk_DB();
$callback['obj'] = $db;
            $callback['func'] = 'getTopFeeds';
            $callback['params'] = array('site_id'=>$siteid,'boutique'=>$seminal, 'offset'=>"0", 'rows' => "20", 'uid' =>null,'tag_id'=>$tag_id,'type'=>$type);

            $top_feed_list = $cache->smart_fetch('site_top_feeds',$callback,'feed');

Copy after login

使用以上方法实现对cache调用的封装，同时保证性能的高效，从而解决第一和第四个问题。

至此已经完成前四个问题，从而实现Cache的封装，并有效的避免了上面提到的第二，第三，第四个问题。但是对于第五个问题，dogpile问题，并没有解决，针对这种问题，最好的方式是在cache即将失效前，有一个进程主动触发db操作，获取DB数据放入Cache中，而其他进程正常从Cache中获取数据（因为此时cache并未失效）；好在有Redis缓存，我们可以使用Redis的两个特性很好解决这个问题，先介绍下这两个接口
TTL方法：以秒为单位，返回给定 key 的剩余生存时间 (TTL, time to live)，当 key 不存在时，返回 -2 。当 key 存在但没有设置剩余生存时间时，返回 -1 。否则，以秒为单位，返回 key 的剩余生存时间。很明显，通过这个方法，我们很容易知道key的还剩下的生存时间，通过这个方法，可以在key过期前做点事情，但是光有这个方法还不行，我们需要确保只有进程执行，而不是所有的进程都做，正好用到下面这个方法。
SETNX方法：将 key 的值设为 value ，当且仅当 key 不存在。若给定的 key 已经存在，则SETNX 不做任何动作。SETNX 是『SET if Not eXists』(如果不存在，则 SET) 的简写。返回值：设置成功，返回 1 。设置失败，返回 0 。通过这个方法，模拟分布式加锁，保证只有一个进程做执行，而其他的进程正常处理。结合以上Redis方法的特性，解决第五种的问题的，实例代码。

…
// 变量初始化
$key = “this is key”;
$expiration = 600; 
$recalculate_at = 100;
$lock_length = 20;
$data = $cache->fetch($key); 
$ttl = $cache->redis->ttl($key); 
if($recalculate_at>=$ttl&&$r->setnx("lock:".$key,true)){ 
$r->expire(“lock:”.$key, $lock_length);
$db = new Wk_DB();
  $data = $db->getXXX();
  $cache->store($key, $expiration, $value);
}

Copy after login

解决方案

好了，关键核心代码如下
1：function回调部分代码

public static function callback($callback){
        // 安全检查
        if(!isset($callback['obj']) || !isset($callback['func'])
            || !isset($callback['params']) || !is_array($callback['params'])){
            throw new Exception("CallBack Array Error");
        }   
        // 利用反射，判断对象和函数是否存在
        $obj = $callback['obj'];
        $func = $callback['func'];
        $params = $callback['params'];
        // 方法判断        
        $method = new ReflectionMethod($obj,$func);
        if(!$method){
            throw new Exception("CallBack Obj Not Find func");
        }   

        // 方法属性判断
        if (!($method->isPublic() || $method->isStatic())) {
            throw new Exception("CallBack Obj func Error");
        }   

        // 参数个数判断(不进行逐项检测)
        $paramsNum = $method->getNumberOfParameters();
        if($paramsNum < count($params)){
            throw new Exception("CallBack Obj Params Error");
        }   

        // 6个参数以内，逐个调用,超过6个，直接调用call_user_func_array
        $result = false;
        // 判断静态类方法
        if(!is_object($obj) && $method->isStatic()){
            switch(count($params)){
                case 0: $result = $obj::{$func}();break;
		case 1: $result = $obj::{$func}($params[0]);break;
                case 2: $result = $obj::{$func}($params[0],$params[1]);break;
                case 3: $result = $obj::{$func}($params[0],$params[1],$params[2]);break;
                case 4: $result = $obj::{$func}($params[0],$params[1],$params[2],$params[3]);break;
                case 5: $result = $obj::{$func}($params[0],$params[1],$params[2],$params[3],$params[4]);break;
                case 6: $result = $obj::{$func}($params[0],$params[1],$params[2],$params[3],$params[4],$params[5]);break;
                case 7: $result = $obj::{$func}($params[0],$params[1],$params[2],$params[3],$params[4],$params[5],$params[6]);break;
                default: $result = call_user_func_array(array($obj, $func), $params);  break;
            }
        }else{
            switch(count($params)){
                case 0: $result = $obj->{$func}();break;
                case 1: $result = $obj->{$func}($params[0]);break;
                case 2: $result = $obj->{$func}($params[0],$params[1]);break;
                case 3: $result = $obj->{$func}($params[0],$params[1],$params[2]);break;
                case 4: $result = $obj->{$func}($params[0],$params[1],$params[2],$params[3]);break;
                case 5: $result = $obj->{$func}($params[0],$params[1],$params[2],$params[3],$params[4]);break;
                case 6: $result = $obj->{$func}($params[0],$params[1],$params[2],$params[3],$params[4],$params[5]);break;
                case 7: $result = $obj->{$func}($params[0],$params[1],$params[2],$params[3],$params[4],$params[5],$params[6]);break;
                default: $result = call_user_func_array(array($obj, $func), $params);  break;
            }
        }

Copy after login

2：自动触发回调机制

public function smart_fetch($key,$callback,$namespace="wk") {
	key = $prefix.$key.$suffix;
        $result = $this->_redis->get($key);

        $bttl = false;
        // ttl状态判断(注意冷启动)
        if(!empty($ttl)){
            // 获得过期时间
            $rttl = $this->_redis->ttl($key);
            if($rttl > 0 && $ttl >= $rttl &&
                $this->_redis->setnx("lock".$key,true)){
                // 设置超时时间(超时时间3秒)
                $this->_redis->expire("lock".$key,3);
                $bttl = true;
            }
        }
	// 如何返回值不存在，调用回调函数，获取数值，并保持数据库
        if($bttl || !$result || (isset($CONFIG['FLUSH']) && !empty($CONFIG['FLUSH']))){
            // 重新调整参数
            $callbackparams = array();
            foreach($params as $k=>$value){
                $callbackparams[] = $value;
            }
            $callback['params'] = $callbackparams;
            $result = Wk_Common::callback($callback);
            $expire = $key_config["timeout"];
            // 存储数据
            $status = $this->_redis->setex($key, $expire, $result);
            $result=$this->_redis->get($key);
        }

        // 删除锁
        if($bttl){
            $this->_redis->delete("lock".$key);
        }
        return $result;
    }

Copy after login

至此，我们使用脚本语言特性，通过user_call_func_array方法补齐所有函数回调机制，从而实现对Cache的封装，通过配置文件定义组装key的规则和每个key的超时时间，再通过Redis的ttl和setnx特性，保证只有一个进程执行DB操作，从而很好避免dogpile问题，实现cache自动触发，保证cache持续存在数据，并且有效减少DB的访问次数，提高性能。

A Cache callback and automatic triggering technology based on PHP_PHP tutorial

A Cache callback and automatic triggering technology based on PHP