This article presents Baidu engineers' discussion of the implementation principles and performance analysis of PHP functions (part 1). Readers who need it can use it as a reference.
Foreword
In any language, functions are the most basic building blocks. What are the characteristics of PHP functions? How are function calls implemented? How well do PHP functions perform, and what usage recommendations follow from that? This article starts from the underlying principles and combines them with actual performance tests to try to answer these questions, so that understanding the implementation also helps you write better PHP programs. Some common PHP functions are introduced along the way.
Classification of PHP functions
In PHP, functions fall broadly into two categories: user functions (user-defined functions) and internal functions (built-in functions). The former are the functions and methods that users define in their own programs; the latter are the library functions provided by PHP itself (such as sprintf, array_push, etc.). Users can also write library functions of their own as extensions, which will be introduced later. User functions can be further subdivided into functions (plain functions) and methods (class methods). This article analyzes and tests these three kinds of functions (built-in functions, plain user functions, and class methods) in turn.
Implementation of PHP functions
How is a PHP function ultimately executed? What is the process like?
To answer this question, let’s first take a look at the process of executing the PHP code.
As Figure 1 shows, PHP follows a typical dynamic-language execution process: after a piece of code is received, it passes through lexical analysis, syntax analysis and other stages, and the source program is translated into instructions (opcodes). The Zend virtual machine then executes these instructions in sequence to complete the operation. PHP itself is implemented in C, so the functions ultimately called are all C functions; in effect, PHP can be regarded as a piece of software written in C. It follows from this description that function execution in PHP is likewise translated into opcodes, and each function call actually executes one or more instructions.
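The idea of "translate to instructions, then execute them in sequence" can be illustrated with a toy interpreter. This is only a minimal sketch under stated assumptions: the opcodes (TOY_PUSH, TOY_ADD, TOY_MUL, TOY_RETURN), the toy_op structure and the stack-machine design are all hypothetical and chosen for brevity; they are not Zend's real instruction set or data structures.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_STACK_SIZE 64

/* Hypothetical opcodes for illustration; NOT Zend's real instruction set. */
typedef enum { TOY_PUSH, TOY_ADD, TOY_MUL, TOY_RETURN } toy_opcode;

typedef struct {
    toy_opcode opcode;
    long operand;               /* only used by TOY_PUSH */
} toy_op;

/* Execute the instructions one after another, like a very small VM loop. */
long toy_execute(const toy_op *ops, size_t count)
{
    long stack[TOY_STACK_SIZE];
    size_t sp = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        switch (ops[i].opcode) {
        case TOY_PUSH:
            assert(sp < TOY_STACK_SIZE);
            stack[sp++] = ops[i].operand;
            break;
        case TOY_ADD:
            assert(sp >= 2);
            stack[sp - 2] += stack[sp - 1];
            sp--;
            break;
        case TOY_MUL:
            assert(sp >= 2);
            stack[sp - 2] *= stack[sp - 1];
            sp--;
            break;
        case TOY_RETURN:
            assert(sp >= 1);
            return stack[sp - 1];
        }
    }
    return 0; /* the program ended without TOY_RETURN */
}
```

For example, the instruction sequence PUSH 2, PUSH 3, ADD, PUSH 4, MUL, RETURN passed to toy_execute evaluates (2 + 3) * 4 and yields 20.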
Zend describes each function with the following data structure:
typedef union _zend_function {
    zend_uchar type; /* MUST be the first element of this struct! */
    struct {
        zend_uchar type; /* never used */
        char *function_name;
        zend_class_entry *scope;
        zend_uint fn_flags;
        union _zend_function *prototype;
        zend_uint num_args;
        zend_uint required_num_args;
        zend_arg_info *arg_info;
        zend_bool pass_rest_by_reference;
        unsigned char return_reference;
    } common;
    zend_op_array op_array;
    zend_internal_function internal_function;
} zend_function;

typedef struct _zend_function_state {
    HashTable *function_symbol_table;
    zend_function *function;
    void *reserved[ZEND_MAX_RESERVED_RESOURCES];
} zend_function_state;
Here, type indicates the kind of function: user function, built-in function, or overloaded function. common holds the basic information about the function, including its name, parameter information, and function flags (ordinary function, static method, abstract method, and so on). In addition, user functions have a function symbol table that records their internal variables, which is described in detail later. Zend maintains a global function_table, which is one large hash table. When a function is called, the corresponding zend_function is first looked up in this table by function name. The virtual machine then decides how to make the call based on type, since different types of functions are executed in different ways.
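The lookup-then-dispatch step can be sketched as follows. This is an illustration under stated assumptions, not Zend's code: toy_function, toy_function_table, toy_find_function and toy_call are hypothetical names, the table is searched linearly instead of being a hash table, and a user function's opcodes are stood in for by a single addend field.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical function kinds, mirroring the role of "type" in the text. */
typedef enum { TOY_INTERNAL_FUNCTION, TOY_USER_FUNCTION } toy_fn_type;

typedef struct {
    const char *name;
    toy_fn_type type;
    long (*handler)(long);  /* C function, used by internal functions */
    long addend;            /* stand-in for a user function's opcodes */
} toy_function;

static long toy_abs(long x) { return x < 0 ? -x : x; }

/* Assumption: a tiny array replaces the real global hash table. */
static const toy_function toy_function_table[] = {
    { "abs",     TOY_INTERNAL_FUNCTION, toy_abs, 0 },
    { "add_ten", TOY_USER_FUNCTION,     NULL,    10 },
};

/* Find a function descriptor by name, or return NULL if it is unknown. */
const toy_function *toy_find_function(const char *name)
{
    size_t n = sizeof toy_function_table / sizeof toy_function_table[0];
    size_t i;

    assert(name != NULL);
    for (i = 0; i < n; i++) {
        if (strcmp(toy_function_table[i].name, name) == 0) {
            return &toy_function_table[i];
        }
    }
    return NULL;
}

/* Look the function up by name, then choose the call path from its type. */
int toy_call(const char *name, long arg, long *result)
{
    const toy_function *fn = toy_find_function(name);

    if (fn == NULL) {
        return -1;                    /* undefined function */
    }
    switch (fn->type) {
    case TOY_INTERNAL_FUNCTION:
        *result = fn->handler(arg);   /* forward straight to the C function */
        return 0;
    case TOY_USER_FUNCTION:
        *result = arg + fn->addend;   /* "interpret" the function body */
        return 0;
    }
    return -1;
}
```

Calling toy_call("abs", -5, &r) takes the internal path and toy_call("add_ten", 1, &r) takes the user path; an unknown name returns -1 without touching the result.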
Built-in functions
Built-in functions are essentially real C functions. Each built-in function is expanded into a function named zif_xxxx when PHP is compiled; for example, the familiar sprintf corresponds to zif_sprintf underneath. When Zend encounters a built-in function during execution, it simply forwards the call.
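The zif_ naming scheme and the forwarding call can be mimicked with the C preprocessor. The sketch below is an assumption-laden illustration: TOY_FN, TOY_FUNCTION, toy_handler and toy_forward are hypothetical names, and the one-argument signature is far simpler than what real built-in functions receive.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macros: paste the zif_ prefix onto the given function name. */
#define TOY_FN(name) zif_##name
#define TOY_FUNCTION(name) long TOY_FN(name)(long arg)

/* After preprocessing, this defines: long zif_double_it(long arg) */
TOY_FUNCTION(double_it)
{
    return arg * 2;
}

typedef long (*toy_handler)(long);

/* The "forwarding operation": the caller just invokes the C handler. */
long toy_forward(toy_handler handler, long arg)
{
    assert(handler != NULL);
    return handler(arg);
}
```

With these definitions, toy_forward(TOY_FN(double_it), 21) calls zif_double_it directly, so the only overhead beyond the C function itself is one indirect call.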
Zend provides a set of APIs for these functions to call, covering parameter retrieval, array operations, memory allocation, and so on. A built-in function obtains its parameters through the zend_parse_parameters function. For parameters such as arrays and strings, Zend makes only a shallow copy, so this is very efficient. It is fair to say that a PHP built-in function is almost as efficient as the corresponding C function, the only extra cost being the forwarding call.
Built-in functions are loaded dynamically in PHP as shared objects (.so files). Users can also write their own .so modules as needed, which is what we usually call extensions. Zend provides a set of APIs for writing extensions.
User functions
Compared with built-in functions, user-defined functions written in PHP have a completely different execution process and implementation. As noted above, PHP code is translated into opcodes for execution, and user functions are no exception. Each function corresponds to a set of opcodes, and this set of instructions is stored in its zend_function. Calling a user function therefore ultimately comes down to executing that set of opcodes.
》》Saving local variables and implementing recursion
We know that function recursion is implemented with a stack, and PHP uses a similar approach. Zend assigns each PHP function an active symbol table (active_symbol_table) that records the state of all local variables in the current function. All symbol tables are maintained as a stack: whenever a function is called, a new symbol table is allocated and pushed onto the stack, and when the call ends, the current symbol table is popped off. This is what makes state preservation and recursion possible.
Zend optimizes the maintenance of this stack by pre-allocating a static array of length N to simulate it. Simulating a dynamic data structure with a static array is a technique we often use in our own programs as well; it avoids the memory allocation and destruction that each call would otherwise incur. At the end of a function call, Zend simply clears the symbol table data at the top of the current stack. Because the static array has length N, a call depth greater than N does not cause a stack overflow; instead, Zend falls back to allocating and destroying symbol tables, which degrades performance considerably. In Zend, the current value of N is 32, so when writing PHP programs it is best to keep the function call depth within 32 levels. Of course, in a web application the function call depth can itself be deep.
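The pre-allocated array with a heap fallback can be sketched like this. It is a minimal illustration, assuming hypothetical names (toy_symtable, toy_push_symtable, toy_pop_symtable, toy_sum_to) and a fixed-size array of longs in place of a real symbol table; only the value 32 for N comes from the text.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_POOL_SIZE 32    /* N from the text */
#define TOY_MAX_LOCALS 8

/* Hypothetical symbol table: just a few slots for local variables. */
typedef struct { long locals[TOY_MAX_LOCALS]; } toy_symtable;

static toy_symtable toy_pool[TOY_POOL_SIZE];  /* pre-allocated tables */
int toy_depth = 0;                            /* current call depth */
int toy_heap_allocations = 0;                 /* counts slow-path calls */

/* Enter a call: reuse a pooled table if possible, else allocate one. */
toy_symtable *toy_push_symtable(void)
{
    toy_symtable *table;

    if (toy_depth < TOY_POOL_SIZE) {
        table = &toy_pool[toy_depth];         /* fast path, no malloc */
    } else {
        table = calloc(1, sizeof *table);     /* slow path beyond depth N */
        if (table == NULL) {
            abort();
        }
        toy_heap_allocations++;
    }
    toy_depth++;
    return table;
}

/* Leave a call: free heap tables, merely clear pooled ones for reuse. */
void toy_pop_symtable(toy_symtable *table)
{
    assert(toy_depth > 0);
    toy_depth--;
    if (toy_depth >= TOY_POOL_SIZE) {
        free(table);
    } else {
        memset(table, 0, sizeof *table);
    }
}

/* Recursive example: each level keeps its own n in its own table. */
long toy_sum_to(long n)
{
    toy_symtable *table = toy_push_symtable();
    long result;

    table->locals[0] = n;
    result = (n <= 0) ? 0 : table->locals[0] + toy_sum_to(n - 1);
    toy_pop_symtable(table);
    return result;
}
```

In this sketch toy_sum_to(10) stays within the pool and never touches the heap, while toy_sum_to(40) recurses 41 levels deep and has to allocate and free a table for each of the 9 levels beyond the 32nd.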
》》Parameter passing
Unlike built-in functions, which call zend_parse_parameters to obtain their parameters, user functions obtain their parameters through instructions: a function with a given number of parameters has the same number of receiving instructions. In terms of implementation, each one is an ordinary variable assignment. From the analysis above it can be seen that, because a user function has to maintain the symbol table stack itself and every instruction it executes is in turn a C function call, its performance is considerably worse than that of a built-in function. A specific comparative analysis follows later. Therefore, if a PHP built-in function already does what you need, try not to reimplement it yourself.
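The "one instruction per parameter, each an ordinary assignment" idea can be sketched as below. The toy_recv_op structure and toy_receive_args function are hypothetical names used only for illustration, and plain long values stand in for PHP variables.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical receive instruction: copy one argument into a local slot. */
typedef struct {
    size_t arg_index;   /* which caller-supplied argument to read */
    size_t local_slot;  /* which local variable of the callee to fill */
} toy_recv_op;

/* Run one receive instruction per parameter; each is a plain assignment. */
void toy_receive_args(const toy_recv_op *ops, size_t op_count,
                      const long *args, long *locals)
{
    size_t i;

    assert(ops != NULL && args != NULL && locals != NULL);
    for (i = 0; i < op_count; i++) {
        locals[ops[i].local_slot] = args[ops[i].arg_index];
    }
}
```

A function with two parameters would carry two such instructions, so the cost of receiving arguments grows with the number of parameters, in contrast to a single parsing call in a built-in function.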