Big chain stores have a big problem. Every day, thousands of transactions occur in every store, and company executives want to mine this data. Which products sell well? Which sell poorly? Where do organic products sell well? How are ice cream sales going?
To answer these questions, the organization must load all of its transactional data into a data model better suited to generating the reports the company requires. However, this takes time, and as the chain grows, it can take more than a day to process a day's worth of data. That is a big problem.
Now, your web application may not need to process this much data, but any site can run into work that takes longer to process than its customers are willing to wait. Generally speaking, a customer is willing to wait about 200 milliseconds. Beyond that, the customer feels that the process is "slow". This number is based on desktop applications, and the web has made us more patient. But no matter what, you shouldn't make your customers wait longer than a few seconds. So, here are some strategies for handling batch jobs in PHP.
The simple way: cron
On UNIX® machines, the core program that performs batch processing is the cron daemon. The daemon reads a configuration file that tells it which command lines to run and how often, and then executes them as configured. It can even mail whatever a job prints, including error output, to a specified email address to help debug problems.
I know some engineers who strongly advocate the use of threading technology. "Threads! Threads are the real way to do background processing. The cron daemon is so outdated."
I don’t think so.
I have used both methods, and I think cron has the advantage of following the "Keep It Simple, Stupid" (KISS) principle. It keeps background processing simple. Instead of writing a multi-threaded job processing application that runs all the time (and therefore must never leak memory), you let cron start a simple batch script. This script determines whether there is a job to process, executes the job, and then exits. There's no need to worry about memory leaks, and no need to worry about threads stalling or getting stuck in infinite loops.
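To make the "check, execute, exit" pattern concrete, here is a minimal sketch of such a batch script. The queue design is an assumption made purely for illustration: each pending job is a .job file in a directory, and the function names (findPendingJobs, processJob, runBatch) are hypothetical. A real application would more likely read its jobs from a database table.

```php
<?php
// Hypothetical queue layout: each pending job is a "*.job" file in one directory.

// Return the paths of pending job files, sorted by name.
function findPendingJobs(string $queueDir): array
{
    $jobs = glob($queueDir . '/*.job');
    if ($jobs === false) {
        return [];
    }
    sort($jobs);
    return $jobs;
}

// Process one job and remove it from the queue.
function processJob(string $jobFile): string
{
    $payload = trim((string) file_get_contents($jobFile));
    // The real work (sending mail, building a report, ...) would go here.
    unlink($jobFile);
    return $payload;
}

// Run every pending job once and return how many were handled.
function runBatch(string $queueDir): int
{
    $count = 0;
    foreach (findPendingJobs($queueDir) as $jobFile) {
        processJob($jobFile);
        $count++;
    }
    return $count;
}

// Entry point when cron runs the script with the queue directory as its
// first argument: do the work, then exit. Nothing stays resident in memory.
if (PHP_SAPI === 'cli' && isset($argv[1])) {
    runBatch($argv[1]);
    exit(0);
}
```

Because the process ends after each run, any memory it used is returned to the system, which is exactly the property that makes this approach simpler than a long-running worker.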
So, how does cron work? That depends on your system environment. I discuss only the plain old UNIX command-line version of cron here; ask your system administrator how to set it up for your own web application.
Here is a simple cron configuration that runs a PHP script at 11pm every night:
0 23 * * * jack /usr/bin/php /users/home/jack/myscript.php
The first five fields define when the script should be started: minute, hour, day of month, month, and day of week. Next comes the user name the script should run as, and the rest of the line is the command line to be executed. (The user name field belongs to the system-wide crontab format, such as /etc/crontab; per-user crontabs edited with crontab -e omit it.) Here are a few examples.
Command:
15 * * * * jack /usr/bin/php /users/home/jack/myscript.php
Run the script at the 15th minute of every hour.
Command:
15,45 * * * * jack /usr/bin/php /users/home/jack/myscript.php
Run the script at the 15th and 45th minutes of each hour.
Command:
*/1 3-23 * * * jack /usr/bin/php /users/home/jack/myscript.php
Run the script every minute from 3:00 AM through 11:59 PM (the hour range 3-23 includes the whole 11 PM hour, and */1 means the same as *).
Command:
30 23 * * 6 jack /usr/bin/php /users/home/jack/myscript.php
Run the script every Saturday at 11:30 PM (Saturday is specified by 6).
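To show how these fields are read, here is a rough PHP sketch that splits a system crontab line into its parts and checks whether a single time field matches a given value. It is not how cron itself is implemented. The function names (parseCronLine, cronFieldMatches) are made up for this illustration, and the matcher handles only plain numbers, lists, ranges, * and */n. Real cron also accepts names such as "sat", stepped ranges, and has special rules for combining day of month with day of week.

```php
<?php
// Split a system crontab line into five time fields, user, and command.
// Assumes the seven-part format used in the examples above.
function parseCronLine(string $line): array
{
    $parts = preg_split('/\s+/', trim($line), 7);
    if ($parts === false || count($parts) !== 7) {
        throw new InvalidArgumentException('Expected 5 time fields, a user, and a command');
    }
    return [
        'minute'     => $parts[0],
        'hour'       => $parts[1],
        'dayOfMonth' => $parts[2],
        'month'      => $parts[3],
        'dayOfWeek'  => $parts[4],
        'user'       => $parts[5],
        'command'    => $parts[6],
    ];
}

// Check one time field against a value. $min is the lowest legal value of
// the field (0 for minutes and hours, 1 for day of month and month), so
// that a step counts from the start of the field's range.
function cronFieldMatches(string $field, int $value, int $min = 0): bool
{
    foreach (explode(',', $field) as $part) {
        if ($part === '*') {
            return true;
        }
        if (preg_match('/^\*\/(\d+)$/', $part, $m)) {
            $step = (int) $m[1];
            if ($step > 0 && ($value - $min) % $step === 0) {
                return true;
            }
        } elseif (preg_match('/^(\d+)-(\d+)$/', $part, $m)) {
            if ($value >= (int) $m[1] && $value <= (int) $m[2]) {
                return true;
            }
        } elseif (ctype_digit($part) && (int) $part === $value) {
            return true;
        }
    }
    return false;
}
```

For example, parsing the "15,45 * * * *" line above yields '15,45' as the minute field, and cronFieldMatches('15,45', 45) returns true while cronFieldMatches('15,45', 30) returns false.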
As you can see, the number of combinations is virtually unlimited, so you can control exactly when a script runs. You can also schedule multiple scripts, so that some run every minute while others (such as backup scripts) run only once a day.
To specify the email address that receives a job's output, including any reported errors, set the MAILTO variable in the crontab as follows:
MAILTO=jherr@pobox.com
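Cron mails whatever a job prints, so a batch script that stays silent on success and writes only on failure produces mail only when something goes wrong. Here is a small sketch of that idea. The runJob helper and its message format are my own assumptions for illustration, not anything cron or PHP requires.

```php
<?php
// Run a job quietly; on failure, write one line to the given stream and
// return a non-zero status that the script can pass to exit().
function runJob(callable $job, $errorStream): int
{
    try {
        $job();
        return 0;
    } catch (Throwable $e) {
        fwrite($errorStream, date('c') . ' batch job failed: ' . $e->getMessage() . PHP_EOL);
        return 1;
    }
}
```

In a script started by cron, you would pass STDERR as the stream and end with exit(runJob($job, STDERR)), where $job is whatever callable does the actual work.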
Note: For Microsoft® Windows® users, there is an equivalent Scheduled Tasks system that can be used to launch command line processes (such as PHP scripts) at regular intervals.