This article is part 3 of a Baidu engineer's discussion of the implementation principles and performance of PHP functions. It explains how commonly used PHP functions are implemented, and closes with a summary and recommendations.
Implementation and introduction of commonly used PHP functions
count
count is a function we use often. It returns the length of an array.
What is the complexity of count? A common claim is that count traverses the entire array to find the number of elements, so its complexity is O(n). Is that really the case? Let's look at the implementation. In the source code, the call path for counting an array is zif_count -> php_count_recursive -> zend_hash_num_elements, and zend_hash_num_elements simply does return ht->nNumOfElements. This is an O(1) operation, not O(n). Under the hood, a PHP array is a hash table, and the Zend hash table keeps a dedicated field, nNumOfElements, that records the current number of elements. An ordinary count call returns this value directly. The conclusion: count is O(1), regardless of the size of the array.
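The bookkeeping idea can be sketched in user-land PHP. This is only an illustrative model, not the Zend implementation: the class CountedTable and its property names are hypothetical, and the counter is updated on insert and removal so that count() never has to traverse the data.

```php
<?php
// Hypothetical model of a table that tracks its own size, playing the
// role that nNumOfElements plays in the Zend hash table.
final class CountedTable implements Countable
{
    private array $slots = [];
    private int $numOfElements = 0;

    public function set(string $key, $value): void
    {
        // Only a brand-new key changes the element count.
        if (!array_key_exists($key, $this->slots)) {
            $this->numOfElements++;
        }
        $this->slots[$key] = $value;
    }

    public function remove(string $key): void
    {
        if (array_key_exists($key, $this->slots)) {
            unset($this->slots[$key]);
            $this->numOfElements--;
        }
    }

    // O(1): returns the stored counter, no traversal.
    public function count(): int
    {
        return $this->numOfElements;
    }
}

$table = new CountedTable();
$table->set('a', 1);
$table->set('b', 2);
$table->set('a', 3); // overwrite, count unchanged
$sizeAfterInserts = count($table);
$table->remove('a');
$sizeAfterRemove = count($table);
```

The cost of keeping the counter is paid once per insert or removal, which is why reading the size later is constant time.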
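How does count behave for non-array variables? In the PHP versions discussed here, it returns 0 for an unset variable and 1 for int, double, string and similar values. (Newer versions complain about such arguments: PHP 7.2 emits a warning, and PHP 8 throws a TypeError.)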
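Because PHP 7.2 and later warn, and PHP 8 throws a TypeError, when count is given a value that is neither an array nor a Countable object, code that relied on the old behavior needs a guard. The helper below is a hypothetical sketch (legacy_count is not a built-in) that reproduces the behavior described above.

```php
<?php
// Hypothetical helper: mimics the historical count() behavior for
// non-array values without triggering warnings or TypeErrors.
function legacy_count($value): int
{
    if ($value === null) {
        return 0;                 // unset / null variables counted as 0
    }
    if (is_array($value) || $value instanceof Countable) {
        return count($value);     // the normal O(1) path
    }
    return 1;                     // int, float, string, bool, other objects
}

$countNull   = legacy_count(null);
$countInt    = legacy_count(42);
$countString = legacy_count('hello');
$countArray  = legacy_count([1, 2, 3]);
```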
strlen
strlen returns the length of a string. So how is it implemented? We all know that in C, strlen is an O(n) function that walks the string sequentially until it encounters the terminating \0.
isset and array_key_exists
The most common use of these two is to check whether a key exists in an array, although isset can also check whether a variable has been set. As mentioned earlier, isset is not a real function but a language construct, so it is considerably faster than array_key_exists. It is recommended to use isset where possible. Note one behavioral difference: isset returns false when the key exists but its value is null, while array_key_exists returns true.
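A short example makes the difference concrete. The array contents are made up for illustration; the point is that the two checks disagree only when a key is present with a null value.

```php
<?php
// Illustrative data: 'port' exists but holds null.
$config = ['host' => 'localhost', 'port' => null];

$hostIsSet    = isset($config['host']);              // key present, non-null
$portIsSet    = isset($config['port']);              // key present, value null
$portExists   = array_key_exists('port', $config);   // key present
$userExists   = array_key_exists('user', $config);   // key absent
```

If null is a legitimate value in the array, array_key_exists (or a combination of both checks) is the safer choice despite the extra call overhead.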
array_push and array[]
Both append an element to the end of an array. array_push can push several elements in one call, but the most important difference is that array_push is a function while $array[] is a language construct, so the latter is more efficient. For ordinary appends, $array[] is recommended.
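The two forms produce the same array, as this small example shows (variable names are illustrative):

```php
<?php
// Function form: can append several values in one call and
// returns the new number of elements.
$viaFunction = [1];
$newLength = array_push($viaFunction, 2, 3);

// Language-construct form: one value per statement, no function call.
$viaSyntax = [1];
$viaSyntax[] = 2;
$viaSyntax[] = 3;
```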
rand and mt_rand
Both generate random numbers. rand uses the standard libc rand, while mt_rand uses the Mersenne Twister as its generator and can produce random values on average four times faster than the libc rand(). If performance matters, consider using mt_rand instead of rand. (Since PHP 7.1, rand() is an alias of mt_rand(), so this distinction applies to older versions.) We all know that rand generates pseudo-random numbers. In C you must call srand to set the seed explicitly, but in PHP, rand calls srand for you by default, so under normal circumstances there is no need to call it yourself. If you do need to seed explicitly in special circumstances, the seeding function must match the generator: srand goes with rand, and mt_srand goes with mt_rand. They must not be mixed, or the seed will have no effect.
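As a sketch of matched seeding, the helper below (draw is a hypothetical name) seeds with mt_srand and then draws with mt_rand. Seeding twice with the same value should reproduce the same sequence; the concrete numbers depend on the PHP version, so none are shown.

```php
<?php
// Hypothetical helper: seed the Mersenne Twister, then draw $n values.
// mt_srand() is paired with mt_rand(), never with rand().
function draw(int $seed, int $n): array
{
    mt_srand($seed);
    $values = [];
    for ($i = 0; $i < $n; $i++) {
        $values[] = mt_rand(1, 100);
    }
    return $values;
}

$first  = draw(42, 5);
$second = draw(42, 5); // same seed, so the same sequence is expected
```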
sort and usort
Both are used for sorting. The difference is that usort lets you specify the sorting strategy through a comparison function, similar to qsort in C and sort in C++. Both are implemented with a standard quicksort. If you need sorting, simply call the functions PHP provides unless there are special circumstances. There is no need to implement sorting yourself, and a hand-written version will be much slower. The reason can be seen in the earlier analysis comparing user functions with built-in functions.
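A minimal example of the two functions, using made-up data. The comparison callback passed to usort uses the spaceship operator, which assumes PHP 7 or later.

```php
<?php
// sort(): default ascending order, no custom strategy.
$numbers = [5, 3, 8, 1];
sort($numbers);

// usort(): the caller supplies the ordering through a callback.
$users = [
    ['name' => 'bob',   'age' => 31],
    ['name' => 'alice', 'age' => 25],
    ['name' => 'carol', 'age' => 28],
];
usort($users, function (array $a, array $b): int {
    return $a['age'] <=> $b['age']; // negative, zero or positive
});
$namesByAge = array_column($users, 'name');
```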
urlencode and rawurlencode
Both are used for URL encoding. All non-alphanumeric characters in the string except -_. are replaced with a percent sign (%) followed by two hexadecimal digits. The main difference between the two is how they treat spaces: urlencode encodes a space as +, while rawurlencode encodes it as %20. In general, except for search engines, our strategy is to encode spaces as %20, so rawurlencode is used most of the time. Note that the encode and decode functions must be used as matching pairs.
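The space handling, and the reason the encode and decode functions need to be paired, can be seen in a short example (the input string is arbitrary):

```php
<?php
$raw = 'php functions';

$formStyle = urlencode($raw);     // space becomes +
$rawStyle  = rawurlencode($raw);  // space becomes %20

// Matching decoders restore the original string.
$formBack = urldecode($formStyle);
$rawBack  = rawurldecode($rawStyle);

// Mismatched pair: rawurldecode() does not turn + back into a space.
$mismatch = rawurldecode($formStyle);
```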
strcmp series functions
This family includes strcmp, strncmp, strcasecmp and strncasecmp, and they do the same job as the C functions of the same names. There is a difference, however: PHP strings are allowed to contain \0 bytes.
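PHP strings carry their own length and may contain NUL bytes, so the comparison functions look at the full string rather than stopping at the first "\0". The following sketch shows this along with the other members of the family; only the sign of a non-zero result is relied on, since the exact value varies between PHP versions.

```php
<?php
// Two strings that differ only after an embedded NUL byte.
$left  = "a\0b";
$right = "a\0c";

$lengthWithNul = strlen($left);            // the NUL byte is counted
$binarySafe    = strcmp($left, $right);    // negative: 'b' sorts before 'c'

$equal         = strcmp('abc', 'abc');                 // 0 when equal
$prefixEqual   = strncmp('abcdef', 'abcxyz', 3);       // first 3 bytes only
$caseBlind     = strcasecmp('Hello', 'hello');         // ignores case
$caseBlindPref = strncasecmp('Hello', 'hELP', 3);      // 'hel' vs 'hel'
```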
is_int and is_numeric
These two functions are similar but not identical, and you must pay attention to the difference when using them. is_int checks whether a variable's type is integer. A PHP variable carries a dedicated field that records its type, so this check reads that field directly and is strictly O(1). is_numeric checks whether a variable is a number or a numeric string. In other words, besides returning true for integer variables, it also returns true for strings of the form "1234", "1e4" and so on. In that case the string has to be traversed to make the decision.
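The difference shows up as soon as the value is a string, as in this small example:

```php
<?php
// is_int(): looks only at the variable's type.
$intIsInt       = is_int(1234);
$stringIsInt    = is_int('1234');   // a string, so not an int

// is_numeric(): also accepts numeric strings, which must be scanned.
$intIsNumeric   = is_numeric(1234);
$digitsNumeric  = is_numeric('1234');
$exponentNumeric = is_numeric('1e4');
$wordNumeric    = is_numeric('abc');
```

For validating user input that arrives as strings, is_numeric is the relevant check; for asserting the type of an internal value, is_int is cheaper and stricter.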
Summary and suggestions
Summary:
From the analysis of how functions are implemented and from performance testing, we draw the following conclusions:
1. PHP's function call overhead is relatively large.
2. Function information is stored in one large hash table. On each call, the function name is looked up in that table, so the length of the function name also has some impact on performance.
3. Returning a reference from a function has no practical benefit.
4. Built-in PHP functions perform much better than user functions, especially for string operations.
5. Class methods, ordinary functions and static methods are almost equally efficient.
6. Leaving aside the overhead of the call itself, built-in functions perform about the same as equivalent C functions.
7. All parameter passing is a shallow copy using reference counting, and its cost is very small.
8. The number of functions defined has an almost negligible impact on performance.
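Conclusion 4 can be checked with a rough micro-benchmark. The sketch below is an assumption-laden illustration, not the measurement used in the article: user_strrev and time_calls are hypothetical helpers, the input size and iteration count are arbitrary, and the timings will vary by machine and PHP version.

```php
<?php
// User-land reimplementation of strrev(), for comparison only.
function user_strrev(string $s): string
{
    $out = '';
    for ($i = strlen($s) - 1; $i >= 0; $i--) {
        $out .= $s[$i];
    }
    return $out;
}

// Calls $fn on $input $iterations times and returns elapsed seconds.
// Both candidates go through the same callable dispatch, so that
// overhead is shared rather than eliminated.
function time_calls(callable $fn, string $input, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn($input);
    }
    return microtime(true) - $start;
}

$input   = str_repeat('abcdefghij', 100);
$builtin = time_calls('strrev', $input, 2000);
$user    = time_calls('user_strrev', $input, 2000);

printf("strrev: %.4fs, user_strrev: %.4fs\n", $builtin, $user);
```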
Recommendations:
Based on the above, here are some suggestions for using PHP functions:
1. If a task can be done with a built-in function, use it instead of writing your own PHP function.
2. If a piece of functionality has high performance requirements, consider implementing it as an extension.
3. PHP function calls are expensive, so don't over-encapsulate. If some logic is called very frequently and takes only one or two lines of code, it is better not to wrap it in a function.
4. Don't be overly attached to design patterns. As described in the previous article, excessive encapsulation degrades performance, and the trade-off between the two needs to be weighed. PHP has its own characteristics, and you should not imitate the Java style too closely.
5. Don't nest function calls too deeply, and use recursion with caution.
6. Pseudo-functions perform very well and should be preferred when they offer the same functionality. For example, use isset instead of array_key_exists.
7. Returning a reference from a function has no practical effect, so it is not recommended.
8. Class member methods are no less efficient than ordinary functions, so there is no need to worry about a performance loss. Consider using static methods more often, as they offer better readability and safety.
9. Unless there is a special need, pass parameters by value rather than by reference. Of course, if the parameter is a large array that needs to be modified, passing it by reference is worth considering.
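Recommendation 9 can be illustrated with two small functions. The names append_by_value and append_by_reference are hypothetical; the example shows only the observable semantics, with the copy-on-write behavior noted in comments.

```php
<?php
// Pass-by-value: the callee receives a reference-counted shallow copy.
// A real copy is made only when the callee writes to it.
function append_by_value(array $items, int $x): array
{
    $items[] = $x; // write triggers the copy; the caller's array is untouched
    return $items;
}

// Pass-by-reference: the callee modifies the caller's array in place.
function append_by_reference(array &$items, int $x): void
{
    $items[] = $x;
}

$original = [1, 2, 3];

$extended    = append_by_value($original, 4);
$afterByValue = $original;            // snapshot: still the original contents

append_by_reference($original, 4);    // now the caller's array changes
```

For read-only access, passing by value costs very little, which is why by-reference passing is worth considering only when a large array must be modified by the callee.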