A few days ago I had to produce a data list. It needed each user's name, whether the user has a mobile phone number, and whether the user has an email address. Getting the user list was easy, but it contains as many as 20 million users, and the only way to find out whether a user has a phone number or an email address is to query an externally exposed security interface one user at a time and parse the return value.
Here is the approach I took:
1. Save the list of 20 million users to a temporary data table.
2. Use a PHP program that fetches 500 users from the table at a time, checks each one, and generates the SQL that updates the corresponding records.
3. To guard against the PHP program stopping unexpectedly, use a shell script that checks on it every minute and restarts it if it has died.
I use a shell script as the daemon because the interface that checks phone numbers and email addresses is slow, so 20 million users cannot be checked within a day or two.
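To put the scale in numbers, here is a rough back-of-the-envelope sketch in shell. The user count, the batch size of 500, and the two-day window come from the text above. The actual speed of the interface is not stated, so this only shows the rate that would be needed, not anything that was measured.

```shell
#!/bin/sh
# Rough scale estimate using figures taken from the post.
users=20000000          # users to check
batch=500               # users fetched per pass
days=2                  # upper end of the "1 to 2 days" window
seconds=$(( days * 86400 ))

# Sustained requests per second needed to finish in that window
# (integer division, so the result is rounded down).
required_per_sec=$(( users / seconds ))

# Number of 500-user passes needed to cover the whole list.
batches=$(( users / batch ))

echo "required rate: ${required_per_sec} requests/s"
echo "passes of ${batch} users: ${batches}"
```

Finishing in two days would need well over 100 requests per second with no pauses, which is why the job is expected to run for many days and needs something watching over it.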
Details of the plan:
1. The temporary table users holds the user list. Its structure is as follows:
CREATE TABLE `users` (
  `account` varchar(50) NOT NULL COMMENT 'username',
  `has_phone` tinyint(3) unsigned NOT NULL default '0' COMMENT 'Whether the user has a mobile phone number',
  `has_email` tinyint(3) unsigned NOT NULL default '0' COMMENT 'Whether the user has an email address',
  `flag` tinyint(3) unsigned NOT NULL default '0' COMMENT 'Whether the user has been checked',
  PRIMARY KEY (`account`),
  KEY `flag` (`flag`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='List table';
I first imported the more than 20 million user names into this temporary table. The has_phone and has_email fields both default to 0 (no). The flag field indicates whether the user has already been checked.
The following is part of the table data:
9873aaa,0,0,0
adddwwwd876222,0,0,0
testalexlee,0,0,0
codejia.net,0,0,0
haohdouywaa21,0,0,0
2. PHP script check_users.php
After importing the user list into the table, I wrote a simple PHP script. The idea is as follows: on each pass, fetch 500 users with flag=0 from the table, call the interface for each one to determine whether the user has a mobile phone number and an email address, generate an SQL statement, and save it in the $sqls array. Once all 500 users have been checked, loop over the $sqls array, update those 500 rows in the table, and set flag to 1 to mark them as checked so that they are not fetched again.
Since the PHP script is long, here is a simplified outline of the code:
<?php
class Users{
    private $data;
    private $sqls;
    private $nums;       // Number of users fetched in this pass (fewer than 500 means the list is exhausted)
    private $total_nums; // Number of users checked so far in this run
    // Get 500 users each time
    private function getUsers(){...}
    // Check these 500 users and generate SQL
    private function checkUserInfo(){...}
    // Update these 500 users
    private function updateUserInfo(){...}
    // Run
    public function run(){
        $flag = true;
        while($flag){
            $this->getUsers();
            // A short batch means no users are left, so this is the last pass
            if($this->nums != 500){ $flag = false; }
            // Exit after 10,000 users; the shell script starts the next run
            if($this->total_nums == 10000){ exit(0); }
            $this->checkUserInfo();
            $this->updateUserInfo();
        }
    }
}
$user = new Users();
$user->run();
?>
The above is a simplified version of the PHP script; you probably get the idea. The initial version did not have the $total_nums variable. I added it because when I first ran the script, it stopped after a little more than 40,000 users. On closer inspection, the script was hanging because its connection to the database had failed. Adding this variable does not fix that problem, but it makes the PHP script exit after every 10,000 users so that the shell script below can restart it.
3. Shell script as a daemon
I added this shell script to crontab so that it runs every minute. The script is very simple: it checks whether a check_users.php process exists. If it does, the PHP script is still running and the shell script does nothing. If it does not, the PHP script has finished its 10,000 users and called exit(0), so the shell script starts it again to process the next 10,000 users.
As mentioned above, if the PHP script cannot connect to the database, it hangs indefinitely and never exits. So I added a time check to the shell script: when the PHP process exists, the script calculates how long it has been running, and if that exceeds the time I expect a run to take, it kills the PHP script and restarts it.
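A minimal sketch of such a watchdog might look like the following. This is not the original script: the 30-minute limit (MAX_AGE), the way PHP is launched, and the `run` argument are assumptions made for illustration, and it relies on `pgrep` and on `ps -o etimes=` as provided by procps on Linux.

```shell
#!/bin/sh
# Hypothetical watchdog for check_users.php, meant to be run from cron.
# SCRIPT, MAX_AGE and the launch command are assumptions, not the author's values.
SCRIPT="${SCRIPT:-check_users.php}"
MAX_AGE="${MAX_AGE:-1800}"   # longest a single run may live, in seconds (assumed)

# Decide what to do from the worker's pid (empty if not running), its age
# in seconds, and the allowed maximum. Prints: start, wait, or restart.
decide() {
    if [ -z "$1" ]; then
        echo start
    elif [ "$2" -gt "$3" ]; then
        echo restart
    else
        echo wait
    fi
}

# Start the PHP worker in the background, detached from the cron job.
start_worker() {
    nohup php "$SCRIPT" >/dev/null 2>&1 &
}

watchdog() {
    # Oldest process whose full command line looks like "php ... check_users.php".
    pid=$(pgrep -o -f "php .*$SCRIPT")
    age=0
    if [ -n "$pid" ]; then
        # etimes = elapsed time since the process started, in seconds.
        age=$(ps -o etimes= -p "$pid" | tr -d ' ')
    fi
    case "$(decide "$pid" "${age:-0}" "$MAX_AGE")" in
        start)   start_worker ;;
        restart) kill "$pid"; start_worker ;;
        wait)    : ;;
    esac
}

# Only act when called as "watchdog.sh run".
if [ "${1:-}" = "run" ]; then
    watchdog
fi
```

A crontab entry such as `* * * * * /path/to/watchdog.sh run` (the path is hypothetical) would run it every minute. Keeping the decision in its own small function makes the three cases from the text easy to see: not running, running normally, and running for too long.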
Taking the sample data from the beginning as an example, the results look like the following:
testalexlee,1,0,1
codejia.net,0,0,1
haohdouywaa21,1,1,1
9873aaa,0,1,1
adddwwwd876222,1,0,1
Finally: the user data above is only an example, so don't take it too seriously. With 20 million records, I expect the job to take a while, because the checking interface is relatively slow: for each request it has to connect to the database and look up the table before returning. In fact, the best approach would be to export a list directly from the table that the interface queries and then process it with shell commands, which would give the result quickly. But that's how it is in a company: some things are not open to you, you know~~~
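If such an export were available, the shell processing could be as small as the sketch below. The file layout is purely hypothetical: the export is assumed to have one `account,phone,email` line per account, with an empty field meaning the value is missing, and the user list is assumed to have one account per line. The real table is not described here.

```shell
#!/bin/sh
# Hypothetical offline version: join an exported contact list against the
# user list and print "account,has_phone,has_email" for every user.
# Usage: flag_users CONTACTS_CSV USERS_TXT
flag_users() {
    awk -F, '
        # First file (the export): remember whether each account has a
        # non-empty phone field and a non-empty email field.
        NR == FNR { phone[$1] = ($2 != ""); email[$1] = ($3 != ""); next }
        # Second file (the user list): accounts missing from the export
        # fall back to 0 for both columns.
        { print $1 "," (phone[$1] + 0) "," (email[$1] + 0) }
    ' "$1" "$2"
}
```

For example, `flag_users contacts.csv users.txt > result.csv` would write one result line per user. The lookup arrays are held in memory, so for tens of millions of accounts a sorted `join` might be the more frugal choice.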