Last revised 28 Feb 2005. If you want to see what has changed, search for this date in this article. If you like this article, visit my blog, PHP Everywhere for related articles. A HOWTO on Optimizing PHPPHP is a very fast programming language, but there is more to optimizing PHP than just speed of code execution. In this chapter, we explain why optimizing PHP involves many factors which are not code related, and why tuning PHP requires an understanding of how PHP performs in relation to all the other subsystems on your server, and then identifying bottlenecks caused by these subsystems and fixing them. We also cover how to tune and optimize your PHP scripts so they run even faster. Achieving High Performance When we talk about good performance, we are not talking about how fast your PHP scripts will run. Performance is a set of tradeoffs between scalability and speed. Scripts tuned to use fewer resources might be slower than scripts that perform caching, but more copies of the same script can be run at one time on a web server. In the example below, A.php is a sprinter that can run fast, and B.php is a marathon runner than can jog forever at the nearly the same speed. For light loads, A.php is substantially faster, but as the web traffic increases, the performance of B.php only drops a little bit while A.php just runs out of steam. Let us take a more realistic example to clarify matters further. Suppose we need to write a PHP script that reads a 250K file and generates a HTML summary of the file. We write 2 scripts that do the same thing: hare.php that reads the whole file into memory at once and processes it in one pass, and tortoise.php that reads the file, one line at time, never keeping more than the longest line in memory. Tortoise.php will be slower as multiple reads are issued, requiring more system calls. Hare.php requires 0.04 seconds of CPU and 10 Mb RAM and tortoise.php requires 0.06 seconds of CPU and 5 Mb RAM. The server has 100 Mb free actual RAM and its CPU is 99% idle. Assume no memory fragmentation occurs to simplify things. At 10 concurrent scripts running, hare.php will run out of memory (10 x 10 = 100). At that point, tortoise.php will still have 50 Mb of free memory. The 11th concurrent script to run will bring hare.php to its knees as it starts using virtual memory, slowing it down to maybe half its original speed; each invocation of hare.php now takes 0.08 seconds of CPU time. Meanwhile, tortoise.php will be still be running at its normal 0.06 seconds CPU time. In the table below, the faster php script for different loads is in bold:
As the above example shows, obtaining good performance is not merely writing fast PHP scripts. High performance PHP requires a good understanding of the underlying hardware, the operating system and supporting software such as the web server and database. Bottlenecks The hare and tortoise example has shown us that bottlenecks cause slowdowns. With infinite RAM, hare.php will always be faster than tortoise.php. Unfortunately, the above model is a bit simplistic and there are many other bottlenecks to performance apart from RAM: (a) Networking Your network is probably the biggest bottleneck. Let us say you have a 10 Mbit link to the Internet, over which you can pump 1 megabyte of data per second. If each web page is 30k, a mere 33 web pages per second will saturate the line. More subtle networking bottlenecks include frequent access to slow network services such as DNS, or allocating insufficient memory for networking software. (b) CPU If you monitor your CPU load, sending plain HTML pages over a network will not tax your CPU at all because as we mentioned earlier, the bottleneck will be the network. However for the complex dynamic web pages that PHP generates, your CPU speed will normally become the limiting factor. Having a server with multiple processors or having a server farm can alleviate this. (c) Shared Memory Shared memory is used for inter-process communication, and to store resources that are shared between multiple processes such as cached data and code. If insufficient shared memory is allocated any attempt to access resources that use shared memory such as database connections or executable code will perform poorly. (d) File System Accessing a hard disk can be 50 to 100 times slower than reading data from RAM. File caches using RAM can alleviate this. However low memory conditions will reduce the amount of memory available for the file-system cache, slowing things down. File systems can also become heavily fragmented, slowing down disk accesses. Heavy use of symbolic links on Unix systems can slow down disk accesses too. Default Linux installs are also notorious for setting hard disk default settings which are tuned for compatibility and not for speed. Use the command hdparm to tune your Linux hard disk settings. (e) Process Management On some operating systems such as Windows creating new processes is a slow operation. This means CGI applications that fork a new process on every invocation will run substantially slower on these operating systems. Running PHP in multi-threaded mode should improve response times (note: older versions of PHP are not stable in multi-threaded mode). Avoid overcrowding your web server with too many unneeded processes. For example, if your server is purely for web serving, avoid running (or even installing) X-Windows on the machine. On Windows, avoid running Microsoft Find Fast (part of Office) and 3-dimensional screen savers that result in 100% CPU utilization. Some of the programs that you can consider removing include unused networking protocols, mail servers, antivirus scanners, hardware drivers for mice, infrared ports and the like. On Unix, I assume you are accessing your server using SSH. Then you can consider removing: deamons such as telnetd, inetd, atd, ftpd, lpd, sambad You can also disable at startup various programs by modifying the startup files which are usually stored in the /etc/init* or /etc/rc*/init* directory. Also review your cron jobs to see if you can remove them or reschedule them for off-peak periods. (f) Connecting to Other Servers If your web server requires services running on other servers, it is possible that those servers become the bottleneck. The most common example of this is a slow database server that is servicing too many complicated SQL requests from multiple web servers.
Tuning Your Web Server for PHP We will cover how to get the best PHP performance for the two most common web servers in use today, Apache 1.3 and IIS. A lot of the advice here is relevant for serving HTML also. The authors of PHP have stated that there is no performance nor scalability advantage in using Apache 2.0 over Apache 1.3 with PHP, especially in multi-threaded mode. When running Apache 2.0 in pre-forking mode, the following discussion is still relevant (21 Oct 2003). (a) Apache 1.3/2.0 Apache is available on both Unix and Windows. It is the most popular web server in the world. Apache 1.3 uses a pre-forking model for web serving. When Apache starts up, it creates multiple child processes that handle HTTP requests. The initial parent process acts like a guardian angel, making sure that all the child processes are working properly and coordinating everything. As more HTTP requests come in, more child processes are spawned to process them. As the HTTP requests slow down, the parent will kill the idle child processes, freeing up resources for other processes. The beauty of this scheme is that it makes Apache extremely robust. Even if a child process crashes, the parent and the other child processes are insulated from the crashing child. The pre-forking model is not as fast as some other possible designs, but to me that it is "much ado about nothing" on a server serving PHP scripts because other bottlenecks will kick in long before Apache performance issues become significant. The robustness and reliability of Apache is more important. Apache 2.0 offers operation in multi-threaded mode. My benchmarks indicate there is little performance advantage in this mode. Also be warned that many PHP extensions are not compatible (e.g. GD and IMAP). Tested with Apache 2.0.47 (21 Oct 2003). Apache is configured using the httpd.conf file. The following parameters are particularly important in configuring child processes:
For large sites, values close to the following might be better: MinSpareServers 32 MaxSpareServers 64 Apache on Windows behaves differently. Instead of using child processes, Apache uses threads. The above parameters are not used. Instead we have one parameter: ThreadsPerChild which defaults to 50. This parameter sets the number of threads that can be spawned by Apache. As there is only one child process in the Windows version, the default setting of 50 means only 50 concurrent HTTP requests can be handled. For web servers experiencing higher traffic, increase this value to between 256 to 1024. Other useful performance parameters you can change include:
If you do not require DNS lookups and you are not using the htaccess file to configure Apache settings for individual directories you can set: # disable DNS lookups: PHP scripts only get the IP address HostnameLookups off # disable htaccess checks AllowOverride none
If you are not worried about the directory security when accessing symbolic links, turn on FollowSymLinks and turn off SymLinksIfOwnerMatch to prevent additional lstat() system calls from being made: Options FollowSymLinks #Options SymLinksIfOwnerMatch (b) IIS Tuning IIS is a multi-threaded web server available on Windows NT and 2000. From the Internet Services Manager, it is possible to tune the following parameters:
You can also configure the default isolation level of your web site. In the Home Directory tab under Application Protection, you can define your level of isolation. A highly isolated web site will run slower because it is running as a separate process from IIS, while running web site in the IIS process is the fastest but will bring down the server if there are serious bugs in the web site code. Currently I recommend running PHP web sites using CGI, or using ISAPI with Application Protection set to high. You can also use regedit.exe to modify following IIS 5 registry settings stored at the following location: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Inetinfo\Parameters\
If the settings are missing from this registry location, the defaults are being used. High Performance on Windows: IIS and FastCGI After much testing, I find that the best PHP performance on Windows is offered by using IIS with FastCGI. CGI is a protocol for calling external programs from a web server. It is not very fast because CGI programs are terminated after every page request. FastCGI modifies this protocol for high performance, by making the CGI program persist after a page request, and reusing the same CGI program when a new page request comes in. As the installation of FastCGI with IIS is complicated, you should use the EasyWindows PHP Installer. This will install PHP, FastCGI and Turck MMCache for the best performance possible. This installer can also install PHP for Apache 1.3/2.0. This section on FastCGI added 21 Oct 2003. PHP4's Zend Engine The Zend Engine is the internal compiler and runtime engine used by PHP4. Developed by Zeev Suraski and Andi Gutmans, the Zend Engine is an abbreviation of their names. In the early days of PHP4, it worked in the following fashion: The PHP script was loaded by the Zend Engine and compiled into Zend opcode. Opcodes, short for operation codes, are low level binary instructions. Then the opcode was executed and the HTML generated sent to the client. The opcode was flushed from memory after execution. Today, there are a multitude of products and techniques to help you speed up this process. In the following diagram, we show the how modern PHP scripts work; all the shaded boxes are optional. PHP Scripts are loaded into memory and compiled into Zend opcodes. These opcodes can now be optimized using an optional peephole optimizer called Zend Optimizer. Depending on the script, it can increase the speed of your PHP code by 0-50%. Formerly after execution, the opcodes were discarded. Now the opcodes can be optionally cached in memory using several alternative open source products and the Zend Accelerator (formerly Zend Cache), which is a commercial closed source product. The only opcode cache that is compatible with the Zend Optimizer is the Zend Accelerator. An opcode cache speeds execution by removing the script loading and compilation steps. Execution times can improve between 10-200% using an opcode cache.
One of the secrets of high performance is not to write faster PHP code, but to avoid executing PHP code by caching generated HTML in a file or in shared memory. The PHP script is only run once and the HTML is captured, and future invocations of the script will load the cached HTML. If the data needs to be updated regularly, an expiry value is set for the cached HTML. HTML caching is not part of the PHP language nor Zend Engine, but implemented using PHP code. There are many class libraries that do this. One of them is the PEAR Cache, which we will cover in the next section. Another is the Smarty template library. Finally, the HTML sent to a web client can be compressed. This is enabled by placing the following code at the beginning of your PHP script:
ob_start("ob_gzhandler"); : ?> If your HTML is highly compressible, it is possible to reduce the size of your HTML file by 50-80%, reducing network bandwidth requirements and latencies. The downside is that you need to have some CPU power to spare for compression. HTML Caching with PEAR Cache The PEAR Cache is a set of caching classes that allows you to cache multiple types of data, including HTML and images. The most common use of the PEAR Cache is to cache HTML text. To do this, we use the Output buffering class which caches all text printed or echoed between the start() and end() functions: require_once("Cache/Output.php"); $cache = new Cache_Output("file", array("cache_dir" => "cache/") ); if ($contents = $cache->start(md5("this is a unique key!"))) { # print $contents; Cache Hit ";} else { # print " Don't leave home without it… "; # place in cacheprint " Stand and deliver "; # place in cacheprint $cache->end(10); } Since I wrote these lines, a superior PEAR cache system has been developed: Cache Lite; and for more sophisticated distributed caching, see memcached (Added 28 Feb 2005). The Cache constructor takes the storage driver to use as the first parameter. File, database and shared memory storage drivers are available; see the pear/Cache/Container directory. Benchmarks by Ulf Wendel suggest that the "file" storage driver offers the best performance. The second parameter is the storage driver options. The options are "cache_dir", the location of the caching directory, and "filename_prefix", which is the prefix to use for all cached files. Strangely enough, cache expiry times are not set in the options parameter. To cache some data, you generate a unique id for the cached data using a key. In the above example, we used md5("this is a unique key!"). The start() function uses the key to find a cached copy of the contents. If the contents are not cached, an empty string is returned by start(), and all future echo() and print() statements will be buffered in the output cache, until end() is called. The end() function returns the contents of the buffer, and ends output buffering. The end() function takes as its first parameter the expiry time of the cache. This parameter can be the seconds to cache the data, or a Unix integer timestamp giving the date and time to expire the data, or zero to default to 24 hours. Another way to use the PEAR cache is to store variables or other data. To do so, you can use the base Cache class:
require_once("Cache.php"); $cache = new Cache("file", array("cache_dir" => "cache/") ); if ($data = $cache->get($id)) { print "Cache hit. } else { $data = "The quality of mercy is not strained..."; } ?> To save the data we use save(). If your unique key is already a legal file name, you can bypass the generateID() step. Objects and arrays can be saved because save() will serialize the data for you. The last parameter controls when the data expires; this can be the seconds to cache the data, or a Unix integer timestamp giving the date and time to expire the data, or zero to use the default of 24 hours. To retrieve the cached data we use get(). You can delete a cached data item using $cache->delete($id) and remove all cached items using $cache->flush(). New: A faster Caching class is Cache-Lite. Highly recommended. Using Benchmarks In earlier section we have covered many performance issues. Now we come to the meat and bones, how to go about measuring and benchmarking your code so you can obtain decent information on what to tune. If you want to perform realistic benchmarks on a web server, you will need a tool to send HTTP requests to the server. On Unix, common tools to perform benchmarks include ab (short for apachebench) which is part of the Apache release, and the newer flood (httpd.apache.org/test/flood). On Windows NT/2000 you can use Microsoft's free Web Application Stress Tool (webtool.rte.microsoft.com). These programs can make multiple concurrent HTTP requests, simulating multiple web clients, and present you with detailed statistics on completion of the tests. You can monitor how your server behaves as the benchmarks are conducted on Unix using "vmstat 1". This prints out a status report every second on the performance of your disk i/o, virtual memory and CPU load. Alternatively, you can use "top d 1" which gives you a full screen update on all processes running sorted by CPU load every 1 second. On Windows 2000, you can use the Performance Monitor or the Task Manager to view your system statistics. If you want to test a particular aspect of your code without having to worry about the HTTP overhead, you can benchmark using the microtime(), which returns the current time accurate to the microsecond as a string. The following function will convert it into a number suitable for calculations. function getmicrotime() $time = getmicrotime(); # echo " Time elapsed: ",getmicrotime() - $time, " seconds"; Alternatively, you can use a profiling tool such as APD or XDebug. Also see my article squeezing code with xdebug. Benchmarking Case Study This case study details a real benchmark we did for a client. In this instance, the customer wanted a guaranteed response time of 5 seconds for all PHP pages that did not involve running long SQL queries. The following server configuration was used: an Apache 1.3.20 server running PHP 4.0.6 on Red Hat 7.2 Linux. The hardware was a twin Pentium III 933 MHz beast with 1 Gb of RAM. The HTTP requests will be for the PHP script "testmysql.php". This script reads and processes about 20 records from a MySQL database running on another server. For the sake of simplicity, we assume that all graphics are downloaded from another web server. We used "ab" as the benchmarking tool. We set "ab" to perform 1000 requests (-n1000), using 10 simultaneous connections (-c10). Here are the results: # ab -n1000 -c10 http://192.168.0.99/php/testmysql.phpThis is ApacheBench, Version 1.3Copyright (c) 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/Copyright (c) 1998-1999 The Apache Group, http://www.apache.org/Server Software: Apache/1.3.20Server Hostname: 192.168.0.99Server Port: 80Document Path: /php/testmysql.phpDocument Length: 25970 bytesConcurrency Level: 10Time taken for tests: 128.672 secondsComplete requests: 1000Failed requests: 0Total transferred: 26382000 bytesHTML transferred: 25970000 bytesRequests per second: 7.77Transfer rate: 205.03 kb/s receivedConnnection Times (ms) min avg maxConnect: 0 9 114Processing: 698 1274 2071Total: 698 1283 2185 登入後複製 While running the benchmark, on the server side we monitored the resource utilization using the command "top d 1". The parameters "d 1" mean to delay 1 second between updates. The output is shown below. 10:58pm up 3:36, 2 users, load average: 9.07, 3.29, 1.7974 processes: 63 sleeping, 11 running, 0 zombie, 0 stoppedCPU0 states: 92.0% user, 7.0% system, 0.0% nice, 0.0% idleCPU1 states: 95.0% user, 4.0% system, 0.0% nice, 0.0% idleMem: 1028484K av, 230324K used, 798160K free, 64K shrd, 27196K buffSwap: 2040244K av, 0K used, 2040244K free 30360K cached PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND 1142 apache 20 0 7280 7280 3780 R 21.2 0.7 0:20 httpd 1154 apache 17 0 8044 8044 3788 S 19.3 0.7 0:20 httpd 1155 apache 20 0 8052 8052 3796 R 19.3 0.7 0:20 httpd 1141 apache 15 0 6764 6764 3780 S 14.7 0.6 0:20 httpd 1174 apache 14 0 6848 6848 3788 S 12.9 0.6 0:20 httpd 1178 apache 13 0 6864 6864 3804 S 12.9 0.6 0:19 httpd 1157 apache 15 0 7536 7536 3788 R 11.0 0.7 0:19 httpd 1159 apache 15 0 7540 7540 3788 R 11.0 0.7 0:19 httpd 1148 apache 11 0 6672 6672 3784 S 10.1 0.6 0:20 httpd 1158 apache 14 0 7400 7400 3788 R 10.1 0.7 0:19 httpd 1163 apache 20 0 7540 7540 3788 R 10.1 0.7 0:19 httpd 1169 apache 12 0 6856 6856 3796 S 10.1 0.6 0:20 httpd 1176 apache 16 0 8052 8052 3796 R 10.1 0.7 0:19 httpd 1171 apache 15 0 7984 7984 3780 S 9.2 0.7 0:18 httpd 1170 apache 16 0 7204 7204 3796 R 6.4 0.7 0:20 httpd 1168 apache 10 0 6856 6856 3796 S 4.6 0.6 0:20 httpd 1377 natsoft 11 0 1104 1104 856 R 2.7 0.1 0:02 top 1152 apache 9 0 6752 6752 3788 S 1.8 0.6 0:20 httpd 1167 apache 9 0 6848 6848 3788 S 0.9 0.6 0:19 httpd 1 root 8 0 520 520 452 S 0.0 0.0 0:04 init 2 root 9 0 0 0 0 SW 0.0 0.0 0:00 keventd 登入後複製 Looking at the output of "top", the twin CPU Apache server is running flat out with 0% idle time. What is worse is that the load average is 9.07 for the past minute (and 3.29 for the past 5 minutes, 1.79 for the past 15 minutes). The load average is the average number of processes that are ready to be run. For a twin processor server, any load above 2.0 means that the system is being overloaded. You might notice that there is a close relationship between load (9.07) and the number of simultaneous connections (10) that we have defined with ab. Luckily we have plenty of physical memory, with about 798,160 Mb free and no virtual memory used. Further down we can see the processes ordered by CPU utilization. The most active ones are the Apache httpd processes. The first httpd task is using 7280K of memory, and is taking an average of 21.2% of CPU and 0.7% of physical memory. The STAT column indicates the status: R is runnable, S is sleeping, and W means that the process is swapped out. Given the above figures, and assuming this a typical peak load, we can perform some planning. If the load average is 9.0 for a twin-CPU server and assuming each task takes about the same amount of time to complete, then a lightly loaded server should be 9.0 / 2 CPUs = 4.5 times faster. So a HTTP request that used to take 1.283 seconds to satisfy at peak load will take about 1.283 / 4.5 = 0.285 seconds to complete. To verify this, we benchmarked with 2 simultaneous client connections (instead of 10 in the previous benchmark) to give an average of 0.281 seconds, very close to the 0.285 seconds prediction! # ab -n100 -c2 http://192.168.0.99/php/testmysql.php [ some lines omitted for brevity ]Requests per second: 7.10Transfer rate: 187.37 kb/s receivedConnnection Times (ms) min avg maxConnect: 0 2 40Processing: 255 279 292Total: 255 281 332 登入後複製 Conversely, doubling the connections, we can predict that the average connection time should double from 1.283 to 2.566 seconds. In the benchmarks, the actual time was 2.570 seconds. Overload on 40 connections When we pushed the benchmark to use 40 connections, the server overloaded with 35% failed requests. On further investigation, it was because the MySQL server persistent connects were failing because of "Too Many Connections". The benchmark also demonstrates the lingering behavior of Apache child processes. Each PHP script uses 2 persistent connections, so at 40 connections, we should only be using at most 80 persistent connections, well below the default MySQL max_connections of 100. However Apache idle child processes are not assigned immediately to new requests due to latencies, keep-alives and other technical reasons; these lingering child processes held the remaining 20+ persistent connections that were "the straws that broke the Camel's back". The Fix By switching to non-persistent database connections, we were able to fix this problem and obtained a result of 5.340 seconds. An alternative solution would have been to increase the MySQL max_connections parameter from the default of 100. Conclusions The above case study once again shows us that optimizing your performance is extremely complex. It requires an understanding of multiple software subsystems including network routing, the TCP/IP stack, the amount of physical and virtual memory, the number of CPUs, the behavior of Apache child processes, your PHP scripts, and the database configuration. In this case the PHP code was quite well tuned, so the first bottleneck was the CPU, which caused a slowdown in response time. As the load increased, the system slowed down in a near linear fashion (which is a good sign) until we encountered the more serious bottleneck of MySQL client connections. This caused multiple errors in our PHP pages until we fixed it by switching to non-persistent connections. From the above figures, we can calculate for a given desired response time, how many simultaneous HTTP connections we can handle. Assuming two-way network latencies of 0.5 seconds on the Internet (0.25s one way), we can predict: As our client wanted a maximum response time of 5 seconds, the server can handle up to 34 simultaneous connections per second. This works out to a peak capacity of 34/5 = 6.8 page views per second. To get the maximum number of page views a day that the server can handle, multiply the peak capacity per second by 50,000 (this technique is suggested by the webmasters at pair.com, a large web hosting company), to give 340,000 page views a day. Code Optimizations The patient reader who is still wondering why so much emphasis is given to discussing non-PHP issues is reminded that PHP is a fast language, and many of the likely bottlenecks causing slow speeds lie outside PHP. Most PHP scripts are simple. They involve reading some session information, loading some data from a content management system or database, formatting the appropriate HTML and echoing the results to the HTTP client. Assuming that a typical PHP script completes in 0.1 seconds and the Internet latency is 0.2 seconds, only 33% of the 0.3 seconds response time that the HTTP client sees is actual PHP computation. So if you improve a script's speed by 20%, the HTTP client will see response times drop to 0.28 seconds, which is an insignificant improvement. Of course the server can probably handle 20% more requests for the same page, so scalability has improved. The above example does not mean we should throw our hands up and give up. It means that we should not feel proud tweaking the last 1% of speed from our code, but we should spend our time optimizing worthwhile areas of our code to get higher returns. High Return Code Optimizations The places where such high returns are achievable are in the while and for loops that litter our code, where each slowdown in the code is magnified by the number of times we iterate over them. The best way of understanding what can be optimized is to use a few examples: Example 1 Here is one simple example that prints an array: for ($j=0; $j This can be substantially speeded up by changing the code to: for ($j=0, $max = sizeof($arr), $s = ''; $j $s .= $arr[$j]." echo $s; First we need to understand that the expression $j The second issue is that in PHP 4, echoing multiple times is slower than storing everything in a string and echoing it in one call. This is because echo is an expensive operation that could involve sending TCP/IP packets to a HTTP client. Of course accumulating the string in $s has some scalability issues as it will use up more memory, so you can see a trade-off is involved here. An alternate way of speeding the above code would be to use output buffering. This will accumulate the output string internally, and send the output in one shot at the end of the script. This reduces networking overhead substantially at the cost of more memory and an increase in latency. In some of my code consisting entirely of echo statements, performance improvements of 15% have been observed. ob_start(); Note that output buffering with ob_start() can be used as a global optimization for all PHP scripts. In long-running scripts, you will also want to flush the output buffer periodically so that some feedback is sent to the HTTP client. This can be done with ob_end_flush(). This function also turns off output buffering, so you might want to call ob_start() again immediately after the flush. In summary, this example has shown us how to optimize loop invariants and how to use output buffering to speed up our code. Example 2 In the following code, we iterate through a PEAR DB recordset, using a special formatting function to format a row, and then we echo the results. This time, I benchmarked the execution time at 10.2 ms (this excludes the database connection and SQL execution time): function FormatRow(&$recordSet) for ($j = 0; $j numRows(); $j++) { From example 1, we learnt that we can optimize the code by changing the code to the following (execution time: 8.7 ms): function FormatRow(&$recordSet) ob_start(); for ($j = 0, $max = $rs->numRows(); $j print FormatRow($rs); My benchmarks showed me that the use of $max contributed 0.5 ms and ob_start contributed 1 ms to the 1.5 ms speedup. However by changing the looping algorithm we can simplify and speed up the code. In this case, execution time is reduced to 8.5 ms: function FormatRow($arr) ob_start(); while ($arr = $rs->fetchRow()) { One last optimization is possible here. We can remove the overhead of the function call (potentially sacrificing maintainability for speed) to shave off another 0.1 milliseconds (execution time: 8.4 ms): ob_start(); while ($arr = $rs->fetchRow()) { By switching to PEAR Cache, execution time dropped again to 3.5 ms for cached data: require_once("Cache/Output.php"); ob_start(); $cache = new Cache_Output("file", array("cache_dir" => "cache/") ); $t = getmicrotime(); if ($contents = $cache->start(md5("this is a unique kexy!"))) { Cache Hit Cache Miss ## while ($arr = $rs->fetchRow()) { print $cache->end(100); print (getmicrotime()-$t); We summarize the optimization methods below:
From the above figures, you can see that biggest speed improvements are derived not from tweaking the code, but by simple global optimizations such as ob_start(), or using radically different algorithms such as HTML caching.
|