PHP7 Innovation and Performance Optimization
I was fortunate to attend the 2015 PHP Technology Summit (PHPCON) and hear Brother Niao (Hui Xinchen) talk about the new features and performance optimizations of PHP7, which was exciting. Brother Niao is the most authoritative PHP expert in China, and his talk contained a great deal of valuable material. I have organized his slides (PPT) and collected related information into this explanatory technical article, hoping that it will be of some help to those doing PHP development.
PHP has a history of 20 years. As of this writing, PHP7 has reached the release candidate (RC) stage, and the official release is said to be due around November 2015. Compared with the PHP5.* series, PHP7 is a large-scale overhaul, especially in performance, where it makes a major leap.
PHP is a Web development language used widely around the world, and the changes in PHP7 will have a profound effect on the Web services built on it. Here is a chart from Brother Niao's PPT (82% of Web sites use PHP as a development language):
(Note: A Web site can use more than one language as its development language.)
(Note: This article contains many screenshots from Brother Niao's PPT. The copyright of the pictures belongs to Brother Niao.)
Let's first take a look at two exciting performance test result charts:
Benchmark comparison (picture from PPT):
In the benchmark stress test, PHP7's running time dropped from 2.991 to 1.186, a reduction of about 60%.
QPS stress test of WordPress (picture from PPT):
In the WordPress project, compared with PHP5.6, PHP7 has a QPS increase of 2.77 times.
After reading the exciting comparison of performance test results, let’s get to the point. There are many new features in PHP7, but we will focus more on the major changes.
1. New features and changes
1. Scalar Type Declarations & Return Type Declarations
A very important feature of the PHP language is "weak typing", which makes PHP programs very easy to write and lets novices get started quickly. It has also attracted some controversy. Support for declaring the types of scalar parameters and return values is therefore an innovative change: PHP now supports type declarations in an optional way. In addition, a switch directive, declare(strict_types=1);, is introduced. Once it is turned on, code in the current file is forced to follow strict parameter and return types for function calls.
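For example, an add function with type declarations can be written like this: function add(int $a, int $b): int { return $a + $b; }
Combined with the strict type switch, the same file puts declare(strict_types=1); as its first statement, followed by the function definition.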
If strict_types is not turned on, PHP tries to convert the value to the required type. After it is turned on, PHP no longer performs this conversion, and a type mismatch throws an error (a TypeError). This is great news for those who like "strongly typed" languages.
More detailed introduction:
PHP7 Scalar Type Declaration RFC [Translation]
2. More Errors become catchable Exceptions
PHP7 introduces a global Throwable interface. The original Exception class and a new family of Error classes both implement it, so the inheritance structure of exceptions is defined through an interface. As a result, more errors in PHP7 become catchable and are handed to the developer: if they are not caught they remain fatal errors, and if they are caught they can be handled inside the program like exceptions. These catchable errors are usually ones that do no fatal harm to the program, such as calling a function that does not exist. By default such an error interrupts the program; PHP7 makes it possible to catch and handle it so that the program can continue executing, which gives developers more control and more flexible choices.
For example, to call a function that may or may not exist, the PHP5-compatible approach is to check function_exists before the call, while PHP7 also supports catching the resulting error.
The example in the picture below (the screenshot is from the PPT):
3. AST (Abstract Syntax Tree)
AST serves as an intermediate structure in the PHP compilation process. It replaces the original approach of emitting opcodes directly from the parser and decouples the parser from the compiler, which removes some hacks and makes the implementation easier to understand and maintain.
PHP5:
PHP7:
More AST information:
https://wiki.php.net/rfc/abstract_syntax_tree
4. Native TLS (native thread-local storage)
When PHP runs in multi-threaded mode (for example, the worker and event modes of the Apache Web server are multi-threaded), it has to solve the problem of thread safety (TS, Thread Safe). Because threads share the memory space of the process, each thread needs some way to build a private space for its own data, to avoid contaminating other threads. PHP5 maintains a large global array and allocates an independent storage area to each thread; a thread accesses this global array through its own key.
In PHP5 this unique key has to be passed to every function that needs to use global variables. PHP7 considers this way of passing it unfriendly and problematic, and instead tries to keep the key in a global thread-specific (thread-local) variable.
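The difference between the two approaches can be sketched in C. This is only an illustration under stated assumptions: all names (sketch_globals, sketch_table, sketch_count_request_keyed and so on) are hypothetical and are not taken from the PHP source, and the C11 _Thread_local keyword stands in for whatever mechanism PHP7 actually uses.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_THREADS 4

/* Hypothetical per-thread "globals"; the real ones are far larger. */
typedef struct {
    long request_count;
} sketch_globals;

/* One process-wide table with a private entry per thread. */
static sketch_globals sketch_table[SKETCH_MAX_THREADS];

/* PHP5 style: every function that touches the globals receives the key. */
long sketch_count_request_keyed(size_t thread_key)
{
    assert(thread_key < SKETCH_MAX_THREADS);
    return ++sketch_table[thread_key].request_count;
}

/* Native TLS style: the key is stored once in a thread-local variable,
 * so each thread sees its own copy and no parameter is needed. */
static _Thread_local size_t sketch_current_key;

void sketch_thread_startup(size_t thread_key)
{
    assert(thread_key < SKETCH_MAX_THREADS);
    sketch_current_key = thread_key;
}

long sketch_count_request_tls(void)
{
    return ++sketch_table[sketch_current_key].request_count;
}
```

In the keyed version the extra argument has to be threaded through every call chain; in the thread-local version a thread calls sketch_thread_startup once and later calls need no argument.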
More on Native TLS:
https://wiki.php.net/rfc/native-tls
5. Other new features
There are many new features and changes in PHP7, so we won’t go into detail here.
(1) Int64 support: integer length is unified across platforms, and strings and file uploads larger than 2GB are supported.
(2) Uniform variable syntax.
(3) Consistent foreach behavior.
(4) New operators <=> and ??.
(5) Unicode codepoint escape syntax (\u{xxxxx}).
(6) Anonymous class support.
… …
2. A leap in performance: full speed ahead
1. JIT and performance
Just In Time (just-in-time compilation) is a software optimization technique in which bytecode is compiled into machine code at run time. Intuitively, machine code can be recognized and executed directly by the computer, so it should be more efficient than having the Zend engine read opcodes and execute them one by one. HHVM (HipHop Virtual Machine, Facebook's open source PHP virtual machine) uses JIT; it improved their PHP performance tests by an order of magnitude and produced striking published results, which also leads us to think intuitively that JIT is a powerful technique that can turn stone into gold.
In fact, in 2013 Brother Niao and Dmitry (one of the core developers of the PHP language) made a JIT attempt on PHP5.5 (it was never released). The original execution flow of PHP5.5 compiles PHP code into opcode bytecode through lexical and syntactic analysis (the format is somewhat similar to assembly). The Zend engine then reads these opcode instructions, and parses and executes them one by one.
They introduced type inference (TypeInf) after the opcode stage, generated ByteCodes through JIT, and then executed those.
The benchmark (test program) results were exciting: with JIT, performance improved 8 times over PHP5.5. However, when they applied the optimization to a real project, WordPress (an open source blogging project), they saw almost no performance improvement, a puzzling result.
So they used a profiling tool on Linux to analyze where the CPU time went during program execution.
Distribution of CPU consumption when executing WordPress 100 times (screenshot from PPT):
Note:
21% of CPU time is spent on Memory management.
12% of CPU time is spent on hash table operations, mainly inserting into, deleting from, updating and looking up PHP arrays.
30% of CPU time is spent in built-in functions, such as strlen.
25% of CPU time is spent in VM (Zend Engine).
After analysis, two conclusions were drawn:
(1) If the ByteCodes generated by JIT are too large, the CPU cache hit rate drops (CPU Cache Miss)
PHP5.5 code has no explicit type declarations, so JIT can only rely on type inference: it infers as many variable types as it can, removes the branch code for the other types, and generates directly executable machine code. However, type inference cannot infer every type. In WordPress, less than 30% of the type information can be inferred, so the amount of branch code that can be removed is limited. As a result, the ByteCodes generated by JIT are too large, which in the end causes a significant drop in the CPU cache hit rate (CPU Cache Miss).
A CPU cache miss means that, when the CPU reads and executes instructions, the required data cannot be found in the CPU's first-level cache (L1), so the CPU has to keep searching downwards through the second-level cache (L2) and the third-level cache (L3), and eventually tries to find the required instruction data in main memory. The difference in read time between main memory and the CPU cache can reach 100 times. So if the ByteCodes are too large and there are too many instructions to execute, the multi-level cache cannot hold them all, and some instructions have to stay in main memory.
The size of the cache at all levels of the CPU is also limited. The following picture is the configuration information of Intel i7 920:
A drop in the CPU cache hit rate therefore causes a serious increase in running time, which offsets the performance gain brought by JIT.
JIT can reduce the overhead of the VM, and instruction optimization can indirectly reduce the overhead of memory management because fewer memory allocations are needed. However, for the real WordPress project only 25% of the CPU time is spent in the VM, so the main bottleneck is not actually the VM. For this reason the JIT optimization was not included in this version of PHP7. It is likely to be implemented in a later version, which is worth looking forward to.
(2) The improvement effect of JIT performance depends on the actual bottleneck of the project
JIT shows a significant improvement in the benchmark because the amount of code is relatively small, the generated ByteCodes are also relatively small, and the main overhead is in the VM. In the real WordPress project there is no obvious improvement because WordPress has far more code than the benchmark. Although JIT reduces the VM overhead, the ByteCodes are so large that they lower the CPU cache hit rate and add memory overhead, so in the end there is no improvement.
Different types of projects will have different CPU overhead ratios and will get different results. Performance testing without actual projects is not very representative.
2. Changes in Zval
The actual storage carrier for every type of PHP variable is Zval, whose defining characteristic is that it can hold a value of any type. In essence it is a structure (struct) implemented in C. For those who write PHP, it can be roughly understood as something like an array.
PHP5's Zval, the memory occupies 24 bytes (screenshot from PPT):
PHP7's Zval, the memory occupies 16 bytes (screenshot from PPT):
Zval dropped from 24 bytes to 16 bytes. Why? A little C background helps those who are not familiar with C. A struct and a union differ slightly: each member of a struct occupies its own memory, while the members of a union share one piece of memory (that is, writing one member overwrites the shared space, and the previous values of the other members are lost). So although the PHP7 Zval appears to have many more members, the memory it actually occupies has decreased.
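The struct/union difference can be checked with a few lines of C. This is a minimal sketch with hypothetical type names, assuming a mainstream platform where int64_t and double are 8 bytes and double uses the IEEE 754 format.

```c
#include <assert.h>
#include <stdint.h>

/* Same three members, once as a struct and once as a union. */
struct sketch_as_struct { int64_t l; double d; void *p; };
union  sketch_as_union  { int64_t l; double d; void *p; };

/* A union is only as large as its largest member (8 bytes here),
 * while the struct needs room for every member. */
static_assert(sizeof(union sketch_as_union) == 8,
              "union members share one 8-byte slot");
static_assert(sizeof(struct sketch_as_struct) > sizeof(union sketch_as_union),
              "struct members each get their own storage");

/* Writing one union member overwrites the bytes shared with the others. */
int64_t sketch_union_overwrite(void)
{
    union sketch_as_union u;
    u.l = 42;
    u.d = 1.5;   /* reuses the same bytes, so the 42 is gone */
    return u.l;  /* now the bit pattern of 1.5 read as an integer */
}
```

After the second assignment the integer 42 can no longer be read back, which is exactly the "no record of the other members" behaviour described above.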
In addition, another feature has changed significantly: some simple types no longer use references.
Zval structure diagram (from PPT):
The Zval in the picture consists of two 64-bit words (1 byte = 8 bits). If the variable type is long or boolean, whose length does not exceed 64 bits, the value is stored directly in value and no further reference is needed. When the variable type is array, object, string or another type that exceeds 64 bits, the stored value is a pointer to the address of the real storage structure.
For simple variable types, Zval storage becomes very simple and efficient.
Types that do not require references: NULL, Boolean, Long, Double
Types that require references: String, Array, Object, Resource, Reference
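The two-word layout can be sketched as a tagged union in C. This is a simplified illustration, not the real definition from the PHP source: the type names, the reduced set of type tags and the helper functions are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Reduced, hypothetical set of type tags. */
typedef enum {
    SKETCH_NULL, SKETCH_BOOL, SKETCH_LONG, SKETCH_DOUBLE, SKETCH_STRING
} sketch_type;

/* First 64-bit word: either the value itself or a pointer to it. */
typedef union {
    int64_t lval;  /* long and boolean are stored inline */
    double  dval;  /* double is stored inline */
    void   *ptr;   /* string, array, object: pointer to the real structure */
} sketch_value;

/* Second 64-bit word: a type tag plus 32 bits of auxiliary data. */
typedef struct {
    sketch_value value;
    uint32_t     type;
    uint32_t     extra;
} sketch_zval;

static_assert(sizeof(sketch_zval) == 16, "two 64-bit words");

sketch_zval sketch_make_long(int64_t n)
{
    sketch_zval z;
    z.value.lval = n;  /* no separate allocation is needed */
    z.type = SKETCH_LONG;
    z.extra = 0;
    return z;
}

sketch_zval sketch_make_string(char *s)
{
    sketch_zval z;
    z.value.ptr = s;   /* the characters live elsewhere */
    z.type = SKETCH_STRING;
    z.extra = 0;
    return z;
}

/* Returns 1 when the value is held inside the zval itself. */
int sketch_is_inline(const sketch_zval *z)
{
    return z->type != SKETCH_STRING;
}
```

Copying a sketch_zval that holds a long copies the value itself; copying one that holds a string copies only the pointer.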
3. Internal type zend_string
zend_string is the structure that actually stores strings. The content itself is stored in val (of type char), and val is declared as a char array of length 1 (which serves as a placeholder member).
The last member of the structure is a char array rather than a char*. This is a small optimization trick that can reduce CPU cache misses.
With a char array, when malloc allocates memory for the structure above, the header and the characters are allocated in the same region, usually of length sizeof(_zend_string) plus the actual character storage space. With char*, however, only a pointer is stored at this position, and the actual characters live in a separate, independent memory region.
Comparison of memory allocation using char[1] and char*:
In terms of logic there is not much difference between the two, and the effect is very similar. Once these memory blocks are loaded by the CPU, however, they behave very differently. The former is a single piece of memory allocated contiguously, so the CPU can usually fetch it all together (because it will be in the same cache level). The latter consists of two separate pieces of memory: when the CPU reads the first piece, the second is quite likely not in the same cache level, so the CPU has to search from L2 (the second-level cache) downwards, or even go to main memory, to find the second piece. This causes a CPU Cache Miss, and the difference in time between the two cases can be up to 100 times.
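The two allocation strategies can be sketched as follows. The names are hypothetical, and the sketch uses the C99 flexible array member (char val[]) as the standard-conforming spelling of the char[1] trick described above, so it is not a copy of the real zend_string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Header and characters in ONE allocation. */
typedef struct {
    size_t len;
    char   val[];  /* characters follow the header directly */
} sketch_string;

sketch_string *sketch_string_new(const char *src)
{
    assert(src != NULL);
    size_t len = strlen(src);
    /* header size + characters + terminating NUL, in a single block */
    sketch_string *s = malloc(offsetof(sketch_string, val) + len + 1);
    if (s == NULL) return NULL;
    s->len = len;
    memcpy(s->val, src, len + 1);
    return s;
}

/* Header and characters in TWO allocations (the char* variant). */
typedef struct {
    size_t len;
    char  *val;    /* points to a separate block */
} sketch_string_ptr;

sketch_string_ptr *sketch_string_ptr_new(const char *src)
{
    assert(src != NULL);
    size_t len = strlen(src);
    sketch_string_ptr *s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->val = malloc(len + 1);  /* second, independent block */
    if (s->val == NULL) { free(s); return NULL; }
    s->len = len;
    memcpy(s->val, src, len + 1);
    return s;
}

void sketch_string_ptr_free(sketch_string_ptr *s)
{
    if (s != NULL) { free(s->val); free(s); }
}
```

A sketch_string is released with a single free, and reading s->val touches memory adjacent to s->len. The pointer variant needs two frees and one extra pointer dereference to reach the characters.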
In addition, when strings are copied, zend_string uses reference assignment and so avoids copying memory.
4. Changes in PHP arrays (HashTable and Zend Array)
Arrays are the most frequently used type when writing PHP programs. PHP5 arrays are implemented with HashTable. Roughly speaking, it is a HashTable that also maintains a doubly linked list: it supports hash lookup of elements by array key, and foreach traverses the elements by walking the doubly linked list.
PHP5 HashTable (screenshot from PPT):
This picture looks very complicated, with pointers jumping all over the place. When we access the content of an element by key, it sometimes takes three pointer jumps to reach it. The most important point is that these array elements are stored scattered across different memory regions. As before, when the CPU reads them they are likely not in the same cache level, so the CPU has to search lower-level caches or even main memory, which lowers the CPU cache hit rate and increases the time taken.
Zend Array of PHP7 (screenshot from PPT):
The new array structure is very concise and pleasing to the eye. Its biggest feature is that all the array elements and the hash mapping table are joined together and allocated in the same block of memory. Traversing a simple array of integers is very fast, because the array elements (Buckets) are allocated contiguously in the same memory and the zval of each element stores the integer internally, with no pointers linking to external memory: all the data is in the current memory region. Most importantly, this avoids CPU Cache Misses (drops in the CPU cache hit rate).
Changes in Zend Array:
(1) The value of the array defaults to zval.
(2) The size of HashTable dropped from 72 to 56 bytes, a reduction of 22%.
(3) The size of a Bucket dropped from 72 to 32 bytes, a reduction of about 56%.
(4) The memory space of the buckets of array elements is allocated together.
(5) The key of the array element (Bucket.key) points to zend_string.
(6) The value of the array element is embedded in the Bucket.
(7) Reduce CPU Cache Miss.
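The idea of keeping buckets and the hash mapping table in one block can be sketched in C. This is a heavily simplified illustration with hypothetical names: it supports integer keys only, has a fixed capacity, and places the hash slots after the buckets, so it does not reproduce the actual Zend Array layout.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_INVALID UINT32_MAX  /* empty slot / end of collision chain */

typedef struct {
    int64_t  key;    /* integer keys only, to keep the sketch short */
    int64_t  value;  /* value embedded in the bucket, no pointer */
    uint32_t next;   /* index of the next bucket in the same chain */
} sketch_bucket;

typedef struct {
    uint32_t       capacity;  /* must be a power of two */
    uint32_t       used;
    sketch_bucket *buckets;   /* start of the single block */
    uint32_t      *slots;     /* hash slots, same block, after the buckets */
} sketch_array;

int sketch_array_init(sketch_array *a, uint32_t capacity)
{
    assert(a != NULL);
    if (capacity == 0 || (capacity & (capacity - 1)) != 0) return -1;
    size_t bucket_bytes = (size_t)capacity * sizeof(sketch_bucket);
    char *block = malloc(bucket_bytes + (size_t)capacity * sizeof(uint32_t));
    if (block == NULL) return -1;
    a->capacity = capacity;
    a->used = 0;
    a->buckets = (sketch_bucket *)block;
    a->slots = (uint32_t *)(block + bucket_bytes);
    for (uint32_t i = 0; i < capacity; i++) a->slots[i] = SKETCH_INVALID;
    return 0;
}

int sketch_array_put(sketch_array *a, int64_t key, int64_t value)
{
    uint32_t slot = (uint32_t)((uint64_t)key & (a->capacity - 1));
    for (uint32_t i = a->slots[slot]; i != SKETCH_INVALID; i = a->buckets[i].next) {
        if (a->buckets[i].key == key) { a->buckets[i].value = value; return 0; }
    }
    if (a->used == a->capacity) return -1;  /* the sketch never resizes */
    sketch_bucket *b = &a->buckets[a->used];
    b->key = key;
    b->value = value;
    b->next = a->slots[slot];               /* chain by index, not pointer */
    a->slots[slot] = a->used++;
    return 0;
}

int sketch_array_get(const sketch_array *a, int64_t key, int64_t *out)
{
    uint32_t slot = (uint32_t)((uint64_t)key & (a->capacity - 1));
    for (uint32_t i = a->slots[slot]; i != SKETCH_INVALID; i = a->buckets[i].next) {
        if (a->buckets[i].key == key) { *out = a->buckets[i].value; return 0; }
    }
    return -1;
}

/* foreach-style traversal: a linear walk over contiguous memory. */
int64_t sketch_array_sum(const sketch_array *a)
{
    int64_t total = 0;
    for (uint32_t i = 0; i < a->used; i++) total += a->buckets[i].value;
    return total;
}

void sketch_array_free(sketch_array *a)
{
    free(a->buckets);  /* one free releases buckets and slots together */
}
```

Collision chains are linked by bucket index rather than by pointer, and traversal in insertion order is a plain loop over adjacent buckets, which is the cache-friendly access pattern the list above describes.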
5. Function Calling Convention
PHP7 has improved the function calling mechanism. By optimizing the parameter transfer process, it has reduced some instructions and improved execution efficiency.
PHP5’s function calling mechanism (screenshot from PPT):
In the figure, the arguments handled by the send_val and recv instructions on the VM stack are duplicates of each other. PHP7 optimizes the function calling mechanism at the low level by eliminating this duplication.
PHP7’s function calling mechanism (screenshot from PPT):
6. Through macro definitions and inline functions (inline), the compiler completes part of the work in advance
C macro definitions are expanded in the preprocessing stage (before compilation proper), so part of the work is done in advance and needs no memory allocated at run time. A macro can implement something similar to a function, but without the stack push and pop overhead of a function call, so it is more efficient. Inline functions are similar: at compile time the call is replaced with the function body, so when the running program reaches that point there is no function call overhead.
PHP7 has made many optimizations of this kind, moving a lot of work that used to be done at run time into the compilation stage. Parameter type checking (Parameters Parsing) is one example: everything involved is a fixed character constant, so it can be settled at compile time, which improves execution efficiency afterwards.
For example, in the figure below, the handling of passed parameter types is optimized from the form on the left to the macro form on the right.
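The general idea can be sketched in C. This is not the code from the slide: the types, the format characters and every function and macro name below are hypothetical, chosen only to contrast a format string interpreted on each call with checks that are expanded at compile time.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { SKETCH_LONG, SKETCH_DOUBLE } sketch_type;

typedef struct {
    sketch_type type;
    union { long l; double d; } v;
} sketch_arg;

/* Run-time approach: interpret a format string ("l" = long, "d" = double)
 * character by character on every call. */
int sketch_check_spec(const char *spec, const sketch_arg *args, int argc)
{
    assert(spec != NULL && args != NULL);
    for (int i = 0; i < argc; i++) {
        switch (spec[i]) {
        case 'l': if (args[i].type != SKETCH_LONG) return 0; break;
        case 'd': if (args[i].type != SKETCH_DOUBLE) return 0; break;
        default:  return 0;  /* spec too short or unknown character */
        }
    }
    return spec[argc] == '\0';  /* spec must not expect more arguments */
}

/* Compile-time approach: the macro is expanded by the preprocessor and the
 * inline function can be substituted by the compiler, so the expected types
 * are fixed in the generated code and no string is parsed at run time. */
#define SKETCH_IS_LONG(arg) ((arg).type == SKETCH_LONG)

static inline int sketch_is_double(const sketch_arg *arg)
{
    return arg->type == SKETCH_DOUBLE;
}

/* Same check as sketch_check_spec("ld", args, 2), without the string. */
int sketch_check_long_double(const sketch_arg *args)
{
    assert(args != NULL);
    return SKETCH_IS_LONG(args[0]) && sketch_is_double(&args[1]);
}
```

Both functions accept the same argument lists; the second simply has nothing left to decide at run time except the two comparisons.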
3. Summary
Brother Niao's PPT gives a set of comparative data: running WordPress 100 times executes 7 billion CPU instructions under PHP5.6 but only 2.5 billion under PHP7, a reduction of 64.2%. This is a striking figure.
Of everything in Brother Niao's talk, the point that impressed me most was this: pay attention to details, and many small optimizations, accumulated bit by bit, eventually add up to amazing results. I think it is the same principle as a towering mountain being built up one basket of earth at a time.
There is no doubt that PHP7 has made a leap in performance. If these results can be applied to PHP Web systems, we may need fewer machines to support services with a higher request volume. I am looking forward to the official release of PHP7 with great anticipation.