In-depth understanding of ini configuration in php (1)
This article does not explain what individual ini settings are for; the manual already covers that thoroughly. Instead, I want to dig into how PHP implements its ini configuration from one specific angle, which involves some knowledge of the PHP core :-)
Anyone who uses PHP knows that the php.ini configuration stays in effect for the entire SAPI life cycle. If you edit php.ini by hand while scripts are being served, the change does not take effect. If you cannot restart apache or nginx at that point, the only option is to call ini_set explicitly in the PHP code. ini_set is the function PHP provides for changing configuration at runtime. Note that a value set through ini_set and a value set in the ini file have different lifetimes: once the script finishes, the ini_set value is discarded immediately.
Therefore, this article is divided into two parts. The first part explains the principle of php.ini configuration, and the second part talks about dynamically modifying the php configuration.
The php.ini configuration roughly involves three kinds of data: configuration_hash, EG(ini_directives), and the module globals reached through PG, BG, PCRE_G, JSON_G, XXX_G, and so on. It doesn't matter if these names mean nothing to you yet; each is explained in detail below.
1, parse INI configuration file
Since php.ini must be in effect for the whole life of the SAPI process, parsing the ini file and building the PHP configuration from it has to happen at the very start of the SAPI, that is, during PHP startup. PHP requires these configurations to exist internally before any actual request arrives.
In the PHP core, this startup step is the php_module_startup function.
php_module_startup is mainly responsible for starting PHP and is usually called when the SAPI starts. Another common function, php_request_startup, initializes each request when it arrives. php_module_startup and php_request_startup are two landmark steps, but a full analysis of them is beyond the scope of this article.
For example, when PHP is loaded as a module under apache, apache activates all of its modules at startup, including the PHP module, and activating the PHP module calls php_module_startup. php_module_startup does a lot of work. Once it returns, PHP has started and can accept and respond to requests.
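The split between the two startup functions can be pictured with a toy model. This is only a sketch, not PHP source: toy_module_startup, toy_request_startup, and both counters are hypothetical names standing in for the real php_module_startup and php_request_startup.

```c
/* Hypothetical state standing in for "work done at each stage". */
int toy_config_loaded = 0;     /* set once, when the "SAPI" starts */
int toy_requests_served = 0;   /* bumped once per request */

/* Plays the role of php_module_startup: runs once per process. */
int toy_module_startup(void)
{
    if (toy_config_loaded) {
        return -1;             /* already started */
    }
    toy_config_loaded = 1;     /* this is where php.ini would be parsed */
    return 0;
}

/* Plays the role of php_request_startup: runs for every request. */
int toy_request_startup(void)
{
    if (!toy_config_loaded) {
        return -1;             /* no request may arrive before startup */
    }
    toy_requests_served++;
    return 0;
}
```

Calling toy_request_startup before toy_module_startup fails in this model, which mirrors the requirement that the configuration exists before any request arrives.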
In the php_module_startup function, the implementation related to parsing the ini file is:
/* this will read in php.ini, set up the configuration parameters,
   load zend extensions and register php function extensions
   to be loaded later */
if (php_init_config(TSRMLS_C) == FAILURE) {
    return FAILURE;
}
As you can see, php_init_config is what actually parses the ini file. Parsing mainly consists of lexical and grammar analysis, which extracts the key-value pairs from the ini file and saves them. The format of php.ini is very simple: the key is on the left of the equals sign and the value is on the right. Where does PHP store each key-value pair it extracts? The answer is the configuration_hash mentioned earlier.
static HashTable configuration_hash;
configuration_hash is declared in php_ini.c as a HashTable; as the name suggests, it is a hash table. As an aside, configuration_hash cannot be reached from outside in versions before PHP 5.3, because it is a static variable in php_ini.c. PHP 5.3 added the php_ini_get_configuration_hash interface, which simply returns &configuration_hash, so that every PHP extension can take a look inside configuration_hash... a real blessing...
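The key and value extraction described above can be imitated in a few lines. The sketch below is a deliberately naive stand-in: PHP's real parser is a generated scanner and grammar, and toy_parse_ini_line is a hypothetical helper that only handles the plain key = value form.

```c
#include <ctype.h>
#include <string.h>

/* Splits one "key = value" line in place. Returns 1 and sets *key and
 * *val on success; returns 0 for blank lines, comments, section headers,
 * and lines without a key or '='. */
int toy_parse_ini_line(char *line, char **key, char **val)
{
    char *eq, *end;

    while (isspace((unsigned char)*line)) line++;        /* trim left of key */
    if (*line == '\0' || *line == ';' || *line == '[') {
        return 0;
    }
    eq = strchr(line, '=');
    if (eq == NULL || eq == line) {
        return 0;
    }
    end = eq;                                            /* trim right of key */
    while (end > line && isspace((unsigned char)end[-1])) end--;
    *end = '\0';
    *key = line;

    eq++;                                                /* start of value */
    while (isspace((unsigned char)*eq)) eq++;            /* trim left of value */
    end = eq + strlen(eq);
    while (end > eq && isspace((unsigned char)end[-1])) end--;
    *end = '\0';                                         /* trim right of value */
    *val = eq;
    return 1;
}
```

Both results are plain C strings pointing into the caller's buffer, which matches the idea that a parsed setting is a pair of strings before any module interprets it.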
Note four points:
First, php_init_config performs no validation beyond lexical and syntax checks. In other words, if we add the line hello=world to the ini file, then as long as it is a correctly formatted configuration item, the final configuration_hash will contain an element with the key hello and the value world. configuration_hash mirrors the ini file as faithfully as possible.
Second, the ini file allows us to configure in the form of an array. For example, write the following three lines in the ini file:
drift.arr[]=1
drift.arr[]=2
drift.arr[]=3
Then the final configuration_hash will contain an element with the key drift.arr whose value is an array containing 1, 2, and 3. This form of configuration is extremely rare.
Third, PHP also lets us supply additional ini files besides the default php.ini (php-%s.ini, to be precise). These files live in an additional directory, which is specified by the environment variable PHP_INI_SCAN_DIR. After php_init_config has parsed php.ini, it scans this directory and parses every .ini file it finds there. The key-value pairs from these additional ini files are also added to configuration_hash.
This feature is occasionally useful. Suppose we develop a PHP extension ourselves but do not want to mix its configuration into php.ini. We can write a separate ini file and tell PHP where to find it through PHP_INI_SCAN_DIR. The drawback is obvious: it requires setting an extra environment variable. A better solution is for the extension itself to call php_parse_user_ini_file or zend_parse_ini_file to parse its own ini file.
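The "find all the .ini files in the directory" step boils down to a filename filter. Here is a minimal sketch of such a filter, assuming the check is made on the text after the last dot; toy_has_ini_extension and toy_select_ini_files are hypothetical helpers, and the real directory scan in PHP is not reproduced here.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 when the text starting at the last '.' in `name` is ".ini". */
int toy_has_ini_extension(const char *name)
{
    const char *dot = strrchr(name, '.');
    return dot != NULL && strcmp(dot, ".ini") == 0;
}

/* Copies pointers to the matching names into out[] (which must have room
 * for n pointers) and returns how many were selected. */
size_t toy_select_ini_files(const char *const names[], size_t n, const char *out[])
{
    size_t i, count = 0;

    for (i = 0; i < n; i++) {
        if (toy_has_ini_extension(names[i])) {
            out[count++] = names[i];
        }
    }
    return count;
}
```

With this filter a name such as php.ini.bak is skipped, because its last extension is .bak rather than .ini.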
Fourth, in configuration_hash, the key is a string, so what is the type of the value? The answer is also a string (except for the very special array mentioned above). Specifically, such as the following configuration:
display_errors = On
log_errors = Off
log_errors_max_len = 1024
Then the key-value pair actually stored in the final configuration_hash is:
key: "display_errors"
val : "1"
key: "log_errors"
val : ""
key: "log_errors_max_len"
val : "1024"
Pay attention to log_errors: the value stored is not even "0" but a genuinely empty string. Likewise, log_errors_max_len is stored not as a number but as the string "1024".
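The mapping from what is written in php.ini to what is stored can be sketched as follows. This is an illustration only: toy_normalize_ini_value is a hypothetical function, and the keyword lists beyond On and Off (yes, true, no, false, none) are my assumption based on the boolean keywords documented for php.ini, not something shown in the listing above.

```c
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Returns the string a toy parser would store for `raw`:
 * "1" for truthy keywords, "" for falsy ones, otherwise `raw` unchanged. */
const char *toy_normalize_ini_value(const char *raw)
{
    static const char *const truthy[] = { "on", "yes", "true" };
    static const char *const falsy[]  = { "off", "no", "false", "none" };
    size_t i;

    for (i = 0; i < sizeof(truthy) / sizeof(truthy[0]); i++) {
        if (strcasecmp(raw, truthy[i]) == 0) {
            return "1";
        }
    }
    for (i = 0; i < sizeof(falsy) / sizeof(falsy[0]); i++) {
        if (strcasecmp(raw, falsy[i]) == 0) {
            return "";         /* an empty string, not "0" */
        }
    }
    return raw;                /* e.g. "1024" stays the string "1024" */
}
```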
At this point in the analysis, basically everything related to parsing the ini file has been explained clearly. To summarize briefly:
1, parsing ini occurs in the php_module_startup stage
2. The parsing results are stored in configuration_hash.
2, the configuration applies to the module
PHP's overall structure can be seen as the zend engine at the bottom, which is responsible for interacting with the OS, compiling PHP code, managing memory, and so on, with many modules arranged on top of it. The central one is the Core module; others include Standard, PCRE, Date, Session, etc. These modules are also known as PHP extensions. Put simply, each module provides a set of functions for developers to call. For example, commonly used built-in functions such as explode, trim, and the array functions are provided by the Standard module.
The reason for mentioning this is that php.ini contains, in addition to settings for PHP itself, that is, for the Core module (such as safe_mode, display_errors, max_execution_time, etc.), quite a few settings for other modules.
For example, the date module, which provides common date, time, strtotime and other functions. In php.ini, its related configuration looks like:
[Date]
;date.timezone = 'Asia/Shanghai'
;date.default_latitude = 31.7667
;date.default_longitude = 35.2333
;date.sunrise_zenith = 90.583333
;date.sunset_zenith = 90.583333
In addition to the independent configuration of these modules, the zend engine is also configurable, but the zend engine has very few configurable items, only error_reporting, zend.enable_gc and detect_unicode.
As mentioned in the previous section, php_module_startup calls php_init_config to parse the ini file and generate configuration_hash. What does php_module_startup do next? Naturally, it applies the configuration in configuration_hash to the different modules such as Zend, Core, Standard, and SPL. This does not happen in one step, because PHP usually contains many modules, and they are started one after another during PHP startup. The configuration of module A therefore takes place while module A is starting.
Readers with experience in extension development will point out right away that module A is started in PHP_MINIT_FUNCTION(A), isn't it?
Yes. If module A needs configuration, it can call REGISTER_INI_ENTRIES() in PHP_MINIT_FUNCTION. REGISTER_INI_ENTRIES looks up in configuration_hash, by the name of each configuration item the module declares, the value the user set, and writes it into the module's own global space.
2.1, global space of module
To understand how the ini configuration is applied from configuration_hash to each module, we first need to understand a PHP module's global space. Each PHP module can set aside a storage area of its own that is globally visible within the module. It is generally used to store the ini configuration the module needs; in other words, the configuration items in configuration_hash eventually end up in this global space. While the module runs, it only needs to read this global space to obtain the user's settings for the module. The space is also often used to record intermediate data during the module's execution.
Let’s take the bcmath module as an example. bcmath is a php module that provides an interface for mathematical calculations. First, let’s take a look at its ini configuration:
PHP_INI_BEGIN()
    STD_PHP_INI_ENTRY("bcmath.scale", "0", PHP_INI_ALL, OnUpdateLongGEZero, bc_precision, zend_bcmath_globals, bcmath_globals)
PHP_INI_END()
bcmath has only one configuration item. We can use bcmath.scale in php.ini to configure the bcmath module.
Next, let's look at how the bcmath module's global space is defined. php_bcmath.h contains the following declaration:
ZEND_BEGIN_MODULE_GLOBALS(bcmath)
    bc_num _zero_;
    bc_num _one_;
    bc_num _two_;
    long bc_precision;
ZEND_END_MODULE_GLOBALS(bcmath)
After the macro is expanded, it is:
typedef struct _zend_bcmath_globals {
    bc_num _zero_;
    bc_num _one_;
    bc_num _two_;
    long bc_precision;
} zend_bcmath_globals;
In fact, the zend_bcmath_globals type is the global space type in the bcmath module. Only the zend_bcmath_globals structure is declared here, and there is a specific instantiation definition in bcmath.c:
// After expansion, it is zend_bcmath_globals bcmath_globals;
ZEND_DECLARE_MODULE_GLOBALS(bcmath)
It can be seen that the definition of variable bcmath_globals is completed with ZEND_DECLARE_MODULE_GLOBALS.
bcmath_globals is a real global space, which contains four fields. Its last field, bc_precision, corresponds to bcmath.scale in the ini configuration. We set the value of bcmath.scale in php.ini, and then when starting the bcmath module, the value of bcmath.scale is updated to bcmath_globals.bc_precision.
Copying the values in configuration_hash into the xxx_globals variable defined by each module is what it means to apply the ini configuration to the module. Once the module has started, these settings are in place. In the subsequent execution phase, therefore, the module never needs to consult configuration_hash again; it only needs to read its own XXX_globals to obtain the user's configuration.
Besides the one field for the ini configuration item, what are the other three fields of bcmath_globals for? They illustrate the second role of the module global space: in addition to holding ini configuration, it can store data the module needs while it runs.
Another example is the json module, which is also a very commonly used module in PHP:
ZEND_BEGIN_MODULE_GLOBALS(json)
    int error_code;
ZEND_END_MODULE_GLOBALS(json)
You can see that the json module does not require ini configuration, and its global space has only one field error_code. error_code records the errors that occurred in the last execution of json_decode or json_encode. The json_last_error function returns this error_code to help users locate the cause of the error.
To make module globals easy to access, PHP conventionally defines some macros. For example, to access error_code in json_globals we could of course write json_globals.error_code directly (this does not work in a multi-threaded build), but the more general approach is to define the JSON_G macro:
#define JSON_G(v) (json_globals.v)
We use JSON_G(error_code) to access json_globals.error_code. At the beginning of this article, I mentioned PG, BG, JSON_G, PCRE_G, XXX_G, etc. These macros are also very common in PHP source code. Now we can easily understand them. The PG macro can access the global variables of the Core module, BG can access the global variables of the Standard module, and PCRE_G can access the global variables of the PCRE module.
#define PG(v) (core_globals.v)
#define BG(v) (basic_globals.v)
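To see the accessor-macro convention in isolation, here is a minimal sketch for an imaginary module. Everything in it is hypothetical (the demo module, toy_demo_globals, DEMO_G, and toy_demo_divide), and it covers only the single-threaded form of the macro shown above.

```c
/* Global space of a hypothetical "demo" module. */
typedef struct _toy_demo_globals {
    long scale;        /* would be filled from an ini entry such as demo.scale */
    int  last_error;   /* runtime state that has nothing to do with ini */
} toy_demo_globals;

toy_demo_globals demo_globals = { 0, 0 };

/* Same shape as JSON_G, PG and BG in a non-threaded build. */
#define DEMO_G(v) (demo_globals.v)

/* A module function that records its outcome in the global space,
 * much like json_decode recording error_code. */
int toy_demo_divide(long a, long b, long *out)
{
    if (b == 0) {
        DEMO_G(last_error) = 1;
        return -1;
    }
    DEMO_G(last_error) = 0;
    *out = a / b;
    return 0;
}
```

The benefit of going through DEMO_G rather than naming demo_globals directly is that only the macro definition has to change if the globals are later stored somewhere else.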
2.2. How to determine what configuration a module requires?
What kind of INI configuration is required by the module is defined in each module. For example, for the Core module, there is the following configuration item definition:
PHP_INI_BEGIN()
    ...
    STD_PHP_INI_ENTRY_EX("display_errors", "1", PHP_INI_ALL, OnUpdateDisplayErrors, display_errors, php_core_globals, core_globals, display_errors_mode)
    STD_PHP_INI_BOOLEAN("enable_dl", "1", PHP_INI_SYSTEM, OnUpdateBool, enable_dl, php_core_globals, core_globals)
    STD_PHP_INI_BOOLEAN("expose_php", "1", PHP_INI_SYSTEM, OnUpdateBool, expose_php, php_core_globals, core_globals)
    STD_PHP_INI_BOOLEAN("safe_mode", "0", PHP_INI_SYSTEM, OnUpdateBool, safe_mode, php_core_globals, core_globals)
    ...
PHP_INI_END()
The above code can be found in php-src/main/main.c, at around line 450. Many macros are involved, including ZEND_INI_BEGIN, ZEND_INI_END, PHP_INI_ENTRY_EX, STD_PHP_INI_BOOLEAN, etc. This article will not go through them one by one; interested readers can analyze them on their own.
After macro expansion of the above code, we get:
static const zend_ini_entry ini_entries[] = {
    ...
    { 0, PHP_INI_ALL, "display_errors", sizeof("display_errors"), OnUpdateDisplayErrors, (void *)XtOffsetOf(php_core_globals, display_errors), (void *)&core_globals, NULL, "1", sizeof("1") -1, NULL, 0, 0, 0, display_errors_mode },
    { 0, PHP_INI_SYSTEM, "enable_dl", sizeof("enable_dl"), OnUpdateBool, (void *)XtOffsetOf(php_core_globals, enable_dl), (void *)&core_globals, NULL, "1", sizeof("1") -1, NULL, 0, 0, 0, zend_ini_boolean_displayer_cb },
    { 0, PHP_INI_SYSTEM, "expose_php", sizeof("expose_php"), OnUpdateBool, (void *)XtOffsetOf(php_core_globals, expose_php), (void *)&core_globals, NULL, "1", sizeof("1") -1, NULL, 0, 0, 0, zend_ini_boolean_displayer_cb },
    { 0, PHP_INI_SYSTEM, "safe_mode", sizeof("safe_mode"), OnUpdateBool, (void *)XtOffsetOf(php_core_globals, safe_mode), (void *)&core_globals, NULL, "0", sizeof("0") -1, NULL, 0, 0, 0, zend_ini_boolean_displayer_cb },
    ...
    { 0, 0, NULL, 0, NULL, NULL, NULL, NULL, NULL, 0, NULL, 0, 0, 0, NULL }
};
We see that the definition of a configuration item essentially defines an array of type zend_ini_entry. The specific meaning of the fields of the zend_ini_entry structure is:
struct _zend_ini_entry {
    int module_number; // module id
    int modifiable; // Range that can be modified, such as php.ini, ini_set
    char *name;
    uint name_length;
    ZEND_INI_MH((*on_modify)); // Callback function, which will be called when a configuration item is registered or modified
    void *mh_arg1; // Usually the offset of the configuration item field in XXX_G
    void *mh_arg2; // usually XXX_G
    void *mh_arg3; // Usually a reserved field, rarely used
    char *value;
    uint value_length;
    char *orig_value;
    uint orig_value_length;
    int orig_modifiable; // The original modifiable of the configuration item
    int modified;
    void (*displayer)(zend_ini_entry *ini_entry, int type);
};
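The table-plus-sentinel layout is easy to try out on a cut-down structure. The sketch below is an assumption-laden miniature: toy_table_entry keeps only two of the real fields, and toy_count_entries is a hypothetical helper that walks the table the way a loop testing p->name would.

```c
#include <stddef.h>

/* A two-field miniature of zend_ini_entry. */
typedef struct _toy_table_entry {
    const char *name;            /* NULL marks the end of the table */
    const char *default_value;   /* the default declared by the module */
} toy_table_entry;

const toy_table_entry toy_entries[] = {
    { "display_errors", "1" },
    { "enable_dl",      "1" },
    { "expose_php",     "1" },
    { "safe_mode",      "0" },
    { NULL,             NULL }   /* sentinel row, like the all-zero row above */
};

/* Counts entries by walking until the sentinel is reached. */
size_t toy_count_entries(const toy_table_entry *p)
{
    size_t n = 0;

    while (p->name) {
        n++;
        p++;
    }
    return n;
}
```

Because the end is marked by a NULL name, the table needs no separate length field, and a module can append entries without updating a count.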
2.3, apply the configuration to the module - REGISTER_INI_ENTRIES
REGISTER_INI_ENTRIES can often be seen in PHP_MINIT_FUNCTION of different extensions. REGISTER_INI_ENTRIES is mainly responsible for completing two things. First, filling the global space XXX_G of the module and synchronizing the value in configuration_hash to XXX_G. Secondly, it also generates EG(ini_directives).
REGISTER_INI_ENTRIES is also a macro. After expansion, it is actually the zend_register_ini_entries method. Let’s look specifically at the implementation of zend_register_ini_entries:
ZEND_API int zend_register_ini_entries(const zend_ini_entry *ini_entry, int module_number TSRMLS_DC) /* {{{ */
{
    // ini_entry is an array of zend_ini_entry type, and p is a pointer to each item in the array
    const zend_ini_entry *p = ini_entry;
    zend_ini_entry *hashed_ini_entry;
    zval default_value;
    // EG(ini_directives) is registered_zend_ini_directives
    HashTable *directives = registered_zend_ini_directives;
    zend_bool config_directive_success = 0;
    // Remember that the last item in ini_entry is fixed to {0, 0, NULL, ...}
    while (p->name) {
        config_directive_success = 0;
        // Add the zend_ini_entry pointed to by p to EG(ini_directives)
        if (zend_hash_add(directives, p->name, p->name_length, (void*)p, sizeof(zend_ini_entry), (void **) &hashed_ini_entry) == FAILURE) {
            zend_unregister_ini_entries(module_number TSRMLS_CC);
            return FAILURE;
        }
        hashed_ini_entry->module_number = module_number;
        // Query configuration_hash based on name, and put the result in default_value
        // Note that default_value holds the raw value as written in php.ini, normally a string (or an array)
        if ((zend_get_configuration_directive(p->name, p->name_length, &default_value)) == SUCCESS) {
            // Call on_modify to update to the module's global space XXX_G
            if (!hashed_ini_entry->on_modify || hashed_ini_entry->on_modify(hashed_ini_entry, Z_STRVAL(default_value), Z_STRLEN(default_value), hashed_ini_entry->mh_arg1, hashed_ini_entry->mh_arg2, hashed_ini_entry->mh_arg3, ZEND_INI_STAGE_STARTUP TSRMLS_CC) == SUCCESS) {
                hashed_ini_entry->value = Z_STRVAL(default_value);
                hashed_ini_entry->value_length = Z_STRLEN(default_value);
                config_directive_success = 1;
            }
        }
        // If not found in configuration_hash, use the default value
        if (!config_directive_success && hashed_ini_entry->on_modify) {
            hashed_ini_entry->on_modify(hashed_ini_entry, hashed_ini_entry->value, hashed_ini_entry->value_length, hashed_ini_entry->mh_arg1, hashed_ini_entry->mh_arg2, hashed_ini_entry->mh_arg3, ZEND_INI_STAGE_STARTUP TSRMLS_CC);
        }
        p++;
    }
    return SUCCESS;
}
To put it simply, the logic of the above code can be expressed as:
1. Add the ini configuration items declared by the module to EG (ini_directives). Note that the value of the ini configuration item may be modified later.
2. Try to find the ini required by each module in configuration_hash.
If it can be found, it means that the value is configured in the user's ini file, and the user's configuration is used.
If it is not found, OK, it doesn’t matter, because the module will bring the default value when declaring ini.
3. Synchronize the ini value into XXX_G. After all, it is these XXX_globals that are actually consulted while PHP executes. Concretely, this is done by calling the on_modify function of each ini configuration item; on_modify is specified by the module when it declares the ini entry.
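The three steps above can be sketched in standalone C. This is a simplified model under stated assumptions, not the real zend_register_ini_entries: the user configuration is a plain array instead of a hash table, the handler receives a direct pointer to its destination instead of an offset and a base address, and a failing handler is not modelled. All toy_ names are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* One parsed "key = value" pair, standing in for configuration_hash. */
typedef struct { const char *key; const char *val; } toy_kv;

/* A cut-down directive, standing in for an item of EG(ini_directives). */
typedef struct _toy_reg_entry {
    const char *name;
    const char *value;                              /* starts as the declared default */
    void (*on_modify)(const char *new_value, void *dest);
    void *dest;                                     /* simplified: points at the field */
} toy_reg_entry;

/* Looks a name up in the user configuration; NULL when it is absent. */
const char *toy_config_lookup(const toy_kv *config, size_t n, const char *name)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (strcmp(config[i].key, name) == 0) {
            return config[i].val;
        }
    }
    return NULL;
}

/* Example handler: store a decimal string as a long. */
void toy_update_long(const char *new_value, void *dest)
{
    *(long *)dest = strtol(new_value, NULL, 10);
}

/* Walks the sentinel-terminated table: prefer the user's value, fall
 * back to the declared default, then push the result into the globals. */
void toy_register_entries(toy_reg_entry *entries, const toy_kv *config, size_t n)
{
    toy_reg_entry *p;

    for (p = entries; p->name; p++) {
        const char *user = toy_config_lookup(config, n, p->name);

        if (user) {
            p->value = user;        /* the directive now records the user's value */
        }
        if (p->on_modify) {
            p->on_modify(p->value, p->dest);
        }
    }
}
```

In this model a key that no module declares, such as hello=world, simply stays in the user configuration and is never copied anywhere.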
Let’s take a closer look at on_modify, which is actually a function pointer. Let’s look at the configuration statements of two specific Core modules:
STD_PHP_INI_BOOLEAN("log_errors", "0", PHP_INI_ALL, OnUpdateBool, log_errors, php_core_globals, core_globals)
STD_PHP_INI_ENTRY("log_errors_max_len","1024", PHP_INI_ALL, OnUpdateLong, log_errors_max_len, php_core_globals, core_globals)
For log_errors, its on_modify is set to OnUpdateBool, and for log_errors_max_len, its on_modify is set to OnUpdateLong.
Further assume that our configuration in php.ini is:
log_errors = On
log_errors_max_len = 1024
Let’s take a closer look at the OnUpdateBool function:
ZEND_API ZEND_INI_MH(OnUpdateBool)
{
    zend_bool *p;
    // base represents the address of core_globals
    char *base = (char *) mh_arg2;
    // p represents the address of core_globals plus the offset of the log_errors field
    // The obtained address is the address of the log_errors field
    p = (zend_bool *) (base+(size_t) mh_arg1);
    if (new_value_length == 2 && strcasecmp("on", new_value) == 0) {
        *p = (zend_bool) 1;
    }
    else if (new_value_length == 3 && strcasecmp("yes", new_value) == 0) {
        *p = (zend_bool) 1;
    }
    else if (new_value_length == 4 && strcasecmp("true", new_value) == 0) {
        *p = (zend_bool) 1;
    }
    else {
        // The value stored in configuration_hash is the string "1", not "On"
        // So use atoi to convert it into the number 1
        *p = (zend_bool) atoi(new_value);
    }
    return SUCCESS;
}
The most puzzling parts are probably mh_arg1 and mh_arg2. Compared against the definition of zend_ini_entry above, however, they are easy to understand: mh_arg1 is the byte offset and mh_arg2 is the address of XXX_globals. The result of (char *)mh_arg2 + mh_arg1 is therefore the address of a field in XXX_globals; in this case, it is the address of log_errors in core_globals. So when OnUpdateBool finally executes
*p = (zend_bool) atoi(new_value);
Its function is equivalent to
core_globals.log_errors = (zend_bool) atoi("1");
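The offset-plus-base trick works on any struct, so it can be tried outside PHP. Below is a minimal sketch using the standard offsetof macro in place of XtOffsetOf; toy_core_globals and toy_on_update_bool are hypothetical, and the handler keeps only the atoi branch of OnUpdateBool.

```c
#include <stddef.h>   /* offsetof, size_t */
#include <stdlib.h>   /* atoi */

/* Stand-in for php_core_globals with just three fields. */
typedef struct _toy_core_globals {
    unsigned char display_errors;
    unsigned char log_errors;
    long          log_errors_max_len;
} toy_core_globals;

/* Generic handler: mh_arg1 carries a byte offset smuggled through a
 * pointer, mh_arg2 carries the address of the globals struct. */
int toy_on_update_bool(const char *new_value, void *mh_arg1, void *mh_arg2)
{
    char *base = (char *)mh_arg2;
    unsigned char *p = (unsigned char *)(base + (size_t)mh_arg1);

    *p = (unsigned char)atoi(new_value);   /* "1" becomes 1, "" becomes 0 */
    return 0;
}
```

A call such as toy_on_update_bool("1", (void *)offsetof(toy_core_globals, log_errors), &g) then sets g.log_errors without the handler ever knowing the field's name, which is what lets one handler serve every boolean setting.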
After analyzing OnUpdateBool, let’s look at OnUpdateLong and it will be clear at a glance:
ZEND_API ZEND_INI_MH(OnUpdateLong)
{
    long *p;
    char *base = (char *) mh_arg2;
    // Get the address of log_errors_max_len
    p = (long *) (base+(size_t) mh_arg1);
    // Convert "1024" into long type and assign it to core_globals.log_errors_max_len
    *p = zend_atol(new_value, new_value_length);
    return SUCCESS;
}
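The string-to-long step can be sketched as well. As far as I recall, PHP 5's zend_atol also understands a trailing K, M, or G size suffix; treat that as my assumption rather than something shown in the listing above. toy_atol is a hypothetical stand-in, not the real function.

```c
#include <stdlib.h>
#include <string.h>

/* Converts a setting such as "1024" or "8M" to a long. A trailing
 * k/m/g (either case) multiplies by 1024 once, twice, or three times. */
long toy_atol(const char *str)
{
    size_t len = strlen(str);
    long value = strtol(str, NULL, 0);   /* stops at the first non-digit */

    if (len > 0) {
        switch (str[len - 1]) {
            case 'g': case 'G': value *= 1024; /* fall through */
            case 'm': case 'M': value *= 1024; /* fall through */
            case 'k': case 'K': value *= 1024; break;
            default: break;
        }
    }
    return value;
}
```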
The last thing to note is that in zend_register_ini_entries, if the directive is present in configuration_hash, the value and value_length of hashed_ini_entry are updated once on_modify succeeds. In other words, if the user configured the item in php.ini, EG(ini_directives) stores the value actually configured; otherwise, EG(ini_directives) stores the default value given when the zend_ini_entry was declared.
The default_value variable in zend_register_ini_entries is poorly named and easily misunderstood. It does not hold the default value; it holds the value actually configured by the user.
3, Summary
At this point, the three kinds of data, configuration_hash, EG(ini_directives), and PG, BG, PCRE_G, JSON_G, XXX_G..., have all been explained.
To summarize:
1, configuration_hash stores the configuration read from the php.ini file. No validation is performed, and the values are strings.
2. EG(ini_directives) stores the zend_ini_entry items defined by each module. If the user configured an item in php.ini (so it exists in configuration_hash), its value is replaced by the one from configuration_hash; the type is still a string.
3, XXX_G is a macro for accessing a module's global space. This memory can hold ini configuration, which is updated through the function specified by on_modify. The data type is determined by the field declaration in XXX_G.