PHP is a weakly typed language, which means it has to perform implicit type conversions seamlessly and transparently. Internally, PHP uses a zval to hold a value of any type. The zval structure looks like this (taking PHP 5.2 as an example):
struct _zval_struct {
/* Variable information */
zvalue_value value; /* value */
zend_uint refcount;
zend_uchar type; /* active type */
zend_uchar is_ref;
};
In the above structure, what actually saves the value itself is the zvalue_value union:
typedef union _zvalue_value {
long lval; /* long value */
double dval; /* double value */
struct {
char *val;
int len;
} str;
HashTable *ht; /* hash table value */
zend_object_value obj;
} zvalue_value;
For today's topic we only care about two members, lval and dval. Note that the width of long lval depends on the compiler and the operating system's word size: it may be 32 bits or 64 bits. double dval (double precision), on the other hand, is defined by IEEE 754 and has a fixed width of 64 bits.
Keep this in mind, because it makes some PHP code platform-dependent. Unless stated otherwise, the discussion below assumes that long is 64 bits.
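To see which case applies on a given machine, the widths can simply be queried. The following is a minimal sketch in C; the helper names long_bits, double_bits and print_widths are made up for illustration and are not part of PHP.

```c
#include <float.h>
#include <limits.h>
#include <stdio.h>

/* Width of a long in bits on this platform (typically 32 or 64). */
int long_bits(void) { return (int)(sizeof(long) * CHAR_BIT); }

/* Width of a double in bits (64 on IEEE 754 platforms). */
int double_bits(void) { return (int)(sizeof(double) * CHAR_BIT); }

/* Print the figures that matter for this discussion. */
void print_widths(void) {
    printf("long:   %d bits, LONG_MAX = %ld\n", long_bits(), LONG_MAX);
    /* DBL_MANT_DIG counts the hidden leading bit as well. */
    printf("double: %d bits, %d bits of precision\n",
           double_bits(), DBL_MANT_DIG);
}
```

Calling print_widths() from main prints the sizes for the current compiler and OS; only the long line is expected to differ between platforms.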
I won't go through the IEEE 754 floating-point format here; look it up if you are interested. The key point is that a double stores its mantissa in 52 bits, and counting the hidden leading 1 bit it has 53 significant bits in total.
This raises an interesting question. Take the following C code as an example (assuming long is 64 bits, with x standing for some integer value):
long a = x;
assert(a == (long)(double)a);
The question is: for which range of values of a is the assertion above guaranteed to succeed? (The answer is at the end of the article.)
Now back to the topic. Before PHP executes a script, it has to read and parse it, and part of that process is turning the literals in the script into zvals. Take the following script, for example:
$a = 9223372036854775807; // Maximum value of a 64-bit signed integer
$b = 9223372036854775808; // Maximum value + 1
var_dump($a);
var_dump($b);
Output:
int(9223372036854775807)
float(9.22337203685E+18)
In other words, during lexical analysis PHP checks whether an integer literal falls outside the range that a long can represent on the current system. If it does not, the value is stored in lval and the zval's type is IS_LONG. Otherwise it is stored in dval and the zval's type is IS_DOUBLE (which PHP reports as float).
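The decision can be sketched in C roughly as follows. This is an illustration of the idea and not the actual Zend scanner code: the literal struct, the lit_type enum and parse_int_literal are hypothetical names, and only decimal literals are handled.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a zval: only the two members discussed here. */
typedef enum { LIT_LONG, LIT_DOUBLE } lit_type;

typedef struct {
    lit_type type;
    union {
        long lval;
        double dval;
    } value;
} literal;

/* Parse a decimal integer literal: keep it as a long when it fits,
 * otherwise fall back to a double. */
literal parse_int_literal(const char *text) {
    literal lit;
    long l;

    errno = 0;
    l = strtol(text, NULL, 10);
    if (errno == ERANGE) {
        /* Out of range for long on this platform: store it as a double. */
        lit.type = LIT_DOUBLE;
        lit.value.dval = strtod(text, NULL);
    } else {
        lit.type = LIT_LONG;
        lit.value.lval = l;
    }
    return lit;
}
```

With a 64-bit long, "9223372036854775807" stays a long while "9223372036854775808" becomes a double; with a 32-bit long the switch already happens above 2147483647.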
We must be careful with any value larger than the largest integer value, because it may cause loss of accuracy:
$a = 9223372036854775807;
$b = 9223372036854775808;
var_dump($a === ($b - 1));
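The output is false: $b has already become a float, and $b - 1 is still a float that cannot represent 9223372036854775807 exactly, so it matches $a in neither type nor value.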
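The value side of this can be reproduced in C, where the same IEEE 754 double is used. A small sketch, assuming IEEE 754 doubles and the default rounding mode; two_pow_63 and minus_one_is_lost are names invented for this example.

```c
#include <stdint.h>

/* 2^63, the first integer past INT64_MAX, as a double. */
double two_pow_63(void) { return 9223372036854775808.0; }

/* Returns 1 if subtracting 1 from 2^63 in double arithmetic changes nothing.
 * volatile keeps the compiler from using extra internal precision. */
int minus_one_is_lost(void) {
    volatile double b = two_pow_63();
    volatile double r = b - 1.0;
    return r == b;
}

/* Returns 1 if converting INT64_MAX to double rounds it up to 2^63. */
int int64_max_rounds_up(void) {
    volatile double d = (double)INT64_MAX;
    return d == two_pow_63();
}
```

Near 2^63 neighbouring doubles are 1024 apart, so subtracting 1 has no effect and 2^63 - 1 itself has no exact double representation.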
Now back to the point made at the beginning: PHP integers may be 32 or 64 bits wide, so code that runs correctly on a 64-bit system may misbehave on a 32-bit system, where an invisible type conversion causes a loss of precision.
So, we must be wary of this critical value. Fortunately, this critical value has been defined in PHP:
<?php
echo PHP_INT_MAX;
?>
Of course, to be on the safe side, we should use strings to store large integers, and use mathematical function libraries such as bcmath to perform calculations.
In addition, there is another setting that can be confusing: the precision directive in php.ini. It determines how many significant digits PHP prints when it outputs a float value.
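The effect of such a setting can be imitated in C with printf-style %G formatting. This is only an analogy, on the assumption that PHP's float output behaves like %.*G with the configured precision; format_float is a hypothetical helper.

```c
#include <stdio.h>

/* Format v with the given number of significant digits, the way a
 * "precision" setting would. Assumes printf-style %G formatting. */
int format_float(char *buf, size_t size, double v, int precision) {
    return snprintf(buf, size, "%.*G", precision, v);
}
```

Under that assumption, formatting 9223372036854775808.0 with 12 significant digits gives the 9.22337203685E+18 seen in the var_dump output earlier, while 17 digits gives 9.2233720368547758E+18. The stored value is the same in both cases; only the printed form differs.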
Finally, let’s look back at the question raised above, that is, what is the maximum value of a long integer to ensure that there will be no loss of precision after converting to float and then back to long?
Take the integer 5 as an example. Its binary representation is 101. Shifting it right by two bits normalizes it to 1.01 (times 2^2); dropping the implicit leading 1 leaves the fraction bits 01, so 5 is stored in a double as:
0/*Sign bit*/ 10000000001/*Exponent bits*/ 0100000000000000000000000000000000000000000000000000/*Mantissa, 52 bits*/
The binary representation of 5 is stored in the mantissa part without any loss. In this case, there will be no loss of precision when converting from double back to long.
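The three fields can be checked by pulling a double apart in C. A minimal sketch, assuming a 64-bit IEEE 754 double; double_parts and split_double are illustrative names only.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    unsigned sign;      /* 1 bit */
    unsigned exponent;  /* 11 bits, biased by 1023 */
    uint64_t mantissa;  /* 52 stored bits, hidden leading 1 not included */
} double_parts;

/* Split a double into its sign, biased exponent and stored mantissa. */
double_parts split_double(double d) {
    uint64_t bits;
    double_parts p;

    /* memcpy is the portable way to reinterpret the bytes of a double. */
    memcpy(&bits, &d, sizeof bits);
    p.sign = (unsigned)(bits >> 63);
    p.exponent = (unsigned)((bits >> 52) & 0x7FF);
    p.mantissa = bits & ((UINT64_C(1) << 52) - 1);
    return p;
}
```

For 5.0 this yields sign 0, exponent 1025 (binary 10000000001, that is 1023 + 2) and a mantissa with only the second-highest of its 52 bits set, matching the layout above.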
We know that a double stores 52 mantissa bits; counting the implicit leading 1, that gives 53 bits of precision. It follows that if the absolute value of a long integer is no greater than:
2^53 - 1 == 9007199254740991; //Keep in mind, we now assume that it is a 64bits long
then the integer will not lose precision in a long->double->long conversion. (2^53 itself also survives, because it is a power of two; 2^53 + 1 is the first positive integer that does not.)
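The boundary is easy to probe with a small C helper. This is a sketch that uses int64_t in place of long so that it behaves the same on 32-bit and 64-bit platforms; round_trips is a name made up for this example.

```c
#include <stdint.h>

/* Returns 1 if a survives the integer -> double -> integer round trip. */
int round_trips(int64_t a) {
    /* volatile keeps the compiler from using extra internal precision. */
    volatile double d = (double)a;

    /* Values that round up to 2^63 cannot be converted back safely. */
    if (d >= 9223372036854775808.0) {
        return 0;
    }
    return (int64_t)d == a;
}
```

round_trips returns 1 for 9007199254740991 and 9007199254740992, and 0 for 9007199254740993, which is rounded to 2^53 on its way through double.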