PHP has a set of process control functions (–enable-pcntl and posix extensions are required when compiling) , allowing PHP to achieve the same functions as C, such as creating child processes, using the exec function to execute programs, and processing signals.
<?php header('content-type:text/html;charset=utf-8' ); // 必须加载扩展 if (!function_exists("pcntl_fork")) { die("pcntl extention is must !"); } //总进程的数量 $totals = 3; // 执行的脚本数量 $cmdArr = array(); // 执行的脚本数量的数组 for ($i = 0; $i < $totals; $i++) { $cmdArr[] = array("path" => __DIR__ . "/run.php", 'pid' =>$i ,'total' =>$totals); } /* 展开:$cmdArr Array ( [0] => Array ( [path] => /var/www/html/company/pcntl/run.php [pid] => 0 [total] => 3 ) [1] => Array ( [path] => /var/www/html/company/pcntl/run.php [pid] => 1 [total] => 3 ) [2] => Array ( [path] => /var/www/html/company/pcntl/run.php [pid] => 2 [total] => 3 ) ) */ pcntl_signal(SIGCHLD, SIG_IGN); //如果父进程不关心子进程什么时候结束,子进程结束后,内核会回收。 foreach ($cmdArr as $cmd) { $pid = pcntl_fork(); //创建子进程 //父进程和子进程都会执行下面代码 if ($pid == -1) { //错误处理:创建子进程失败时返回-1. die('could not fork'); } else if ($pid) { //父进程会得到子进程号,所以这里是父进程执行的逻辑 //如果不需要阻塞进程,而又想得到子进程的退出状态,则可以注释掉pcntl_wait($status)语句,或写成: pcntl_wait($status,WNOHANG); //等待子进程中断,防止子进程成为僵尸进程。 } else { //子进程得到的$pid为0, 所以这里是子进程执行的逻辑。 $path = $cmd["path"]; $pid = $cmd['pid'] ; $total = $cmd['total'] ; echo exec("/usr/bin/php {$path} {$pid} {$total}")."\n"; exit(0) ; } } ?>
Using PHP’s true multi-process operating mode, it is suitable for data collection, mass mailing, data source update, tcp server, etc.
PHP has a set of process control functions (-enable-pcntl and posix extensions are required when compiling), which allows PHP to create child processes, use the exec function to execute programs, handle signals and other functions like C in *nix systems. PCNTL uses ticks as a signal handle callback mechanism, which can minimize the load when processing asynchronous events. What are ticks? Tick is an event that occurs every time the interpreter executes N low-level statements in a code segment. This code segment needs to be specified through declare.
Commonly used PCNTL functions
1. pcntl_alarm (int $seconds)
Set a counter to send SIGALRM signal after $seconds seconds
2. pcntl_signal ( int $signo , callback $handler [, bool $restart_syscalls ] )
Set a callback function for $signo to handle the signal. The following is an example of sending a SIGALRM signal every 5 seconds, getting it by the signal_handler function, and then printing a "Caught SIGALRM":
<?php declare(ticks = 1); function signal_handler($signal) { print "Caught SIGALRM\n"; pcntl_alarm(5); } pcntl_signal(SIGALRM, "signal_handler", true); pcntl_alarm(5); for(;;) { } ?>
3. pcntl_exec ( string $path [, array $args [, array $envs ]] )
Execute the specified program in the current process space, similar to the exec family function in c. The so-called current space is the space where the code of the specified program is loaded and overwrites the current process. The process ends after executing the program.
<?php $dir = '/home/shankka/'; $cmd = 'ls'; $option = '-l'; $pathtobin = '/bin/ls'; $arg = array($cmd, $option, $dir); pcntl_exec($pathtobin, $arg); echo '123'; //不会执行到该行 ?>
4. pcntl_fork (void)
Create a child process for the current process, and run the parent process first. What is returned is the PID of the child process, which must be greater than zero. In the code of the parent process, you can use pcntl_wait (&$status) to pause the parent process until its child process has a return value. Note: Blocking of the parent process will also block the child process. However, the end of the parent process does not affect the operation of the child process.
After the parent process has finished running, it will then run the child process. At this time, the child process will start executing from the statement that executes pcntl_fork() (including this function), but at this time it returns zero (meaning that this is a child process). It is best to have an exit statement in the code block of the child process, that is, it will end immediately after executing the child process. Otherwise it will start executing parts of the script over again.
Note two points:
<?php $pid = pcntl_fork(); //这里最好不要有其他的语句 if ($pid == -1) { die('could not fork'); } else if ($pid) { // we are the parent pcntl_wait($status); //Protect against Zombie children } else { // we are the child } ?>
5. pcntl_wait ( int &$status [, int $options ] )
Block the current process until a child process of the current process exits or receives a signal to end the current process. Use $status to return the status code of the child process, and you can specify a second parameter to indicate whether to call in a blocking state:
When called in blocking mode, the function return value is the pid of the child process. If there is no child process, the return value is -1;
Called in a non-blocking manner, the function can also return 0 when a child process is running but has not ended.
6. pcntl_waitpid ( int $pid , int &$status [, int $options ] )
The function is the same as pcntl_wait, the difference is that waitpid is the child process waiting for the specified pid. When pid is -1, pcntl_waitpid is the same as pcntl_wait. The status information of the child process is stored in $status in the pcntl_wait and pcntl_waitpid functions. This parameter can be used in functions such as pcntl_wifexited, pcntl_wifstopped, pcntl_wifsignaled, pcntl_wexitstatus, pcntl_wtermsig, pcntl_wstopsig, and pcntl_waitpid.
For example:
<?php $pid = pcntl_fork(); if($pid) { pcntl_wait($status); $id = getmypid(); echo "parent process,pid {$id}, child pid {$pid}\n"; }else{ $id = getmypid(); echo "child process,pid {$id}\n"; sleep(2); } ?>
The child process sleeps for 2 seconds before ending after outputting words such as child process, while the parent process blocks until the child process exits before continuing to run.
7. pcntl_getpriority ([ int $pid [, int $process_identifier ]] )
Get the priority of the process, that is, the nice value, which defaults to 0. In Linux in my test environment (CentOS release 5.2 (Final)), the priority is -20 to 19, -20 is the highest priority, 19 is the lowest. (-20 to 20 in the manual).
8. pcntl_setpriority ( int $priority [, int $pid [, int $process_identifier ]] )
Set the priority of the process.
9. posix_kill
Can send signals to processes
10. pcntl_singal
Callback function used to set the signal
When the parent process exits, how does the child process know the parent process's exit
When the parent process exits, the child process can generally learn that the parent process has exited through the following two relatively simple methods:
当父进程退出时,会有一个INIT进程来领养这个子进程。这个INIT进程的进程号为1,所以子进程可以通过使用getppid()来取得当前父进程的pid。如果返回的是1,表明父进程已经变为INIT进程,则原进程已经推出。
使用kill函数,向原有的父进程发送空信号(kill(pid, 0))。使用这个方法对某个进程的存在性进行检查,而不会真的发送信号。所以,如果这个函数返回-1表示父进程已经退出。
除了上面的这两个方法外,还有一些实现上比较复杂的方法,比如建立管道或socket来进行时时的监控等等。
PHP多进程采集数据的例子
<?php /** * Project: Signfork: php多线程库 * File: Signfork.class.php */ class Signfork{ /** * 设置子进程通信文件所在目录 * @var string */ private $tmp_path='/tmp/'; /** * Signfork引擎主启动方法 * 1、判断$arg类型,类型为数组时将值传递给每个子进程;类型为数值型时,代表要创建的进程数. * @param object $obj 执行对象 * @param string|array $arg 用于对象中的__fork方法所执行的参数 * 如:$arg,自动分解为:$obj->__fork($arg[0])、$obj->__fork($arg[1])... * @return array 返回 array(子进程序列=>子进程执行结果); */ public function run($obj,$arg=1){ if(!method_exists($obj,'__fork')){ exit("Method '__fork' not found!"); } if(is_array($arg)){ $i=0; foreach($arg as $key=>$val){ $spawns[$i]=$key; $i++; $this->spawn($obj,$key,$val); } $spawns['total']=$i; }elseif($spawns=intval($arg)){ for($i = 0; $i < $spawns; $i++){ $this->spawn($obj,$i); } }else{ exit('Bad argument!'); } if($i>1000) exit('Too many spawns!'); return $this->request($spawns); } /** * Signfork主进程控制方法 * 1、$tmpfile 判断子进程文件是否存在,存在则子进程执行完毕,并读取内容 * 2、$data收集子进程运行结果及数据,并用于最终返回 * 3、删除子进程文件 * 4、轮询一次0.03秒,直到所有子进程执行完毕,清理子进程资源 * @param string|array $arg 用于对应每个子进程的ID * @return array 返回 array([子进程序列]=>[子进程执行结果]); */ private function request($spawns){ $data=array(); $i=is_array($spawns)?$spawns['total']:$spawns; for($ids = 0; $ids<$i; $ids++){ while(!($cid=pcntl_waitpid(-1, $status, WNOHANG)))usleep(30000); $tmpfile=$this->tmp_path.'sfpid_'.$cid; $data[$spawns['total']?$spawns[$ids]:$ids]=file_get_contents($tmpfile); unlink($tmpfile); } return $data; } /** * Signfork子进程执行方法 * 1、pcntl_fork 生成子进程 * 2、file_put_contents 将'$obj->__fork($val)'的执行结果存入特定序列命名的文本 * 3、posix_kill杀死当前进程 * @param object $obj 待执行的对象 * @param object $i 子进程的序列ID,以便于返回对应每个子进程数据 * @param object $param 用于输入对象$obj方法'__fork'执行参数 */ private function spawn($obj,$i,$param=null){ if(pcntl_fork()===0){ $cid=getmypid(); file_put_contents($this->tmp_path.'sfpid_'.$cid,$obj->__fork($param)); posix_kill($cid, SIGTERM); exit; } } } ?>
php在pcntl_fork()后生成的子进程(通常为僵尸进程)必须由pcntl_waitpid()函数进行资源释放。但在 pcntl_waitpid()不一定释放的就是当前运行的进程,也可能是过去生成的僵尸进程(没有释放);也可能是并发时其它访问者的僵尸进程。但可以使用posix_kill($cid, SIGTERM)在子进程结束时杀掉它。
子进程会自动复制父进程空间里的变量。
PHP多进程编程示例2
<?php //..... //需要安装pcntl的php扩展,并加载它 if(function_exists("pcntl_fork")){ //生成子进程 $pid = pcntl_fork(); if($pid == -1){ die('could not fork'); }else{ if($pid){ $status = 0; //阻塞父进程,直到子进程结束,不适合需要长时间运行的脚本,可使用pcntl_wait($status, 0)实现非阻塞式 pcntl_wait($status); // parent proc code exit; }else{ // child proc code //结束当前子进程,以防止生成僵尸进程 if(function_exists("posix_kill")){ posix_kill(getmypid(), SIGTERM); }else{ system('kill -9'. getmypid()); } exit; } } }else{ // 不支持多进程处理时的代码在这里 } //..... ?> 如果不需要阻塞进程,而又想得到子进程的退出状态,则可以注释掉pcntl_wait($status)语句,或写成: <?php pcntl_wait($status, 1); //或 pcntl_wait($status, WNOHANG); ?>
在上面的代码中,如果父进程退出(使用exit函数退出或redirect),则会导致子进程成为僵尸进程(会交给init进程控制),子进程不再执行。
僵尸进程是指的父进程已经退出,而该进程dead之后没有进程接受,就成为僵尸进程.(zombie)进程。任何进程在退出前(使用exit退出) 都会变成僵尸进程(用于保存进程的状态等信息),然后由init进程接管。如果不及时回收僵尸进程,那么它在系统中就会占用一个进程表项,如果这种僵尸进程过多,最后系统就没有可以用的进程表项,于是也无法再运行其它的程序。
预防僵尸进程有以下几种方法:
1. 父进程通过wait和waitpid等函数使其等待子进程结束,然后再执行父进程中的代码,这会导致父进程挂起。上面的代码就是使用这种方式实现的,但在WEB环境下,它不适合子进程需要长时间运行的情况(会导致超时)。
使用wait和waitpid方法使父进程自动回收其僵尸子进程(根据子进程的返回状态),waitpid用于临控指定子进程,wait是对于所有子进程而言。
2. 如果父进程很忙,那么可以用signal函数为SIGCHLD安装handler,因为子进程结束后,父进程会收到该信号,可以在handler中调用wait回收
3. 如果父进程不关心子进程什么时候结束,那么可以用signal(SIGCHLD, SIG_IGN)通知内核,自己对子进程的结束不感兴趣,那么子进程结束后,内核会回收,并不再给父进程发送信号,例如:
<?php pcntl_signal(SIGCHLD, SIG_IGN); $pid = pcntl_fork(); //....code ?>
4. 还有一个技巧,就是fork两次,父进程fork一个子进程,然后继续工作,子进程再fork一个孙进程后退出,那么孙进程被init接管,孙进程结束后,init会回收。不过子进程的回收还要自己做。下面是一个例子:
#include "apue.h" #include <sys/wait.h> int main(void){ pid_t pid; if ((pid = fork()) < 0){ err_sys("fork error"); } else if (pid == 0){ /**//* first child */ if ((pid = fork()) < 0){ err_sys("fork error"); }elseif(pid > 0){ exit(0); /**//* parent from second fork == first child */ } /** * We're the second child; our parent becomes init as soon * as our real parent calls exit() in the statement above. * Here's where we'd continue executing, knowing that when * we're done, init will reap our status. */ sleep(2); printf("second child, parent pid = %d ", getppid()); exit(0); } if (waitpid(pid, NULL, 0) != pid) /**//* wait for first child */ err_sys("waitpid error"); /** * We're the parent (the original process); we continue executing, * knowing that we're not the parent of the second child. */ exit(0); }
在fork()/execve()过程中,假设子进程结束时父进程仍存在,而父进程fork()之前既没安装SIGCHLD信号处理函数调用 waitpid()等待子进程结束,又没有显式忽略该信号,则子进程成为僵尸进程,无法正常结束,此时即使是root身份kill-9也不能杀死僵尸进程。补救办法是杀死僵尸进程的父进程(僵尸进程的父进程必然存在),僵尸进程成为”孤儿进程”,过继给1号进程init,init会定期调用wait回收清理这些父进程已退出的僵尸子进程。
所以,上面的示例可以改成:
<?php //..... //需要安装pcntl的php扩展,并加载它 if(function_exists("pcntl_fork")){ //生成第一个子进程 $pid = pcntl_fork(); //$pid即所产生的子进程id if($pid == -1){ //子进程fork失败 die('could not fork'); }else{ if($pid){ //父进程code sleep(5); //等待5秒 exit(0); //或$this->_redirect('/'); }else{ //第一个子进程code //产生孙进程 if(($gpid = pcntl_fork()) < 0){ ////$gpid即所产生的孙进程id //孙进程产生失败 die('could not fork'); }elseif($gpid > 0){ //第一个子进程code,即孙进程的父进程 $status = 0; $status = pcntl_wait($status); //阻塞子进程,并返回孙进程的退出状态,用于检查是否正常退出 if($status ! = 0) file_put_content('filename', '孙进程异常退出'); //得到父进程id //$ppid = posix_getppid(); //如果$ppid为1则表示其父进程已变为init进程,原父进程已退出 //得到子进程id:posix_getpid()或getmypid()或是fork返回的变量$pid //kill掉子进程 //posix_kill(getmypid(), SIGTERM); exit(0); }else{ //即$gpid == 0 //孙进程code //.... //结束孙进程(即当前进程),以防止生成僵尸进程 if(function_exists('posix_kill')){ posix_kill(getmypid(), SIGTERM); }else{ system('kill -9'. getmypid()); } exit(0); } } } }else{ // 不支持多进程处理时的代码在这里 } //..... ?>
怎样产生僵尸进程的
一个进程在调用exit命令结束自己的生命的时候,其实它并没有真正的被销毁,而是留下一个称为僵尸进程(Zombie)的数据结构(系统调用exit,它的作用是使进程退出,但也仅仅限于将一个正常的进程变成一个僵尸进程,并不能将其完全销毁)。在Linux进程的状态中,僵尸进程是非常特殊的一种,它已经放弃了几乎所有内存空间,没有任何可执行代码,也不能被调度,仅仅在进程列表中保留一个位置,记载该进程的退出状态等信息供其他进程收集,除此之外,僵尸进程不再占有任何内存空间。它需要它的父进程来为它收尸,如果他的父进程没安装SIGCHLD信号处理函数调用wait或waitpid()等待子进程结束,又没有显式忽略该信号,那么它就一直保持僵尸状态,如果这时父进程结束了,那么init进程自动会接手这个子进程,为它收尸,它还是能被清除的。但是如果如果父进程是一个循环,不会结束,那么子进程就会一直保持僵尸状态,这就是为什么系统中有时会有很多的僵尸进程。
任何一个子进程(init除外)在exit()之后,并非马上就消失掉,而是留下一个称为僵尸进程(Zombie)的数据结构,等待父进程处理。这是每个子进程在结束时都要经过的阶段。如果子进程在exit()之后,父进程没有来得及处理,这时用ps命令就能看到子进程的状态是”Z”。如果父进程能及时 处理,可能用ps命令就来不及看到子进程的僵尸状态,但这并不等于子进程不经过僵尸状态。
如果父进程在子进程结束之前退出,则子进程将由init接管。init将会以父进程的身份对僵尸状态的子进程进行处理。
另外,还可以写一个php文件,然后在以后台形式来运行它,例如:
<?php //Action代码 public function createAction(){ //.... //将args替换成要传给insertLargeData.php的参数,参数间用空格间隔 system('php -f insertLargeData.php ' . ' args ' . '&'); $this->redirect('/'); } ?>
然后在insertLargeData.php文件中做数据库操作。也可以用cronjob + php的方式实现大数据量的处理。
如果是在终端运行php命令,当终端关闭后,刚刚执行的命令也会被强制关闭,如果你想让其不受终端关闭的影响,可以使用nohup命令实现:
<?php //Action代码 public function createAction(){ //.... //将args替换成要传给insertLargeData.php的参数,参数间用空格间隔 system('nohup php -f insertLargeData.php ' . ' args ' . '&'); $this->redirect('/'); } ?>
你还可以使用screen命令代替nohup命令。