This article explains the JIT in PHP 8 and how it takes part in PHP's interpretation process. I hope you find it helpful!
PHP 8's JIT (Just In Time) compiler is implemented as part of the Opcache extension. It aims to compile certain opcodes directly into CPU instructions at runtime.
This means that with the JIT enabled, the Zend VM does not need to interpret those opcodes, and they are executed directly as CPU-level instructions.
The impact of the PHP 8 Just In Time (JIT) compiler is unquestionable. But so far, I've found that very little is known about what JIT is supposed to do.
After a lot of research, and after giving up more than once, I decided to read the PHP source code myself. Combining my limited knowledge of the C language with all the scattered information I've gathered so far, I came up with this article, which I hope will help you understand PHP's JIT better.
To put it simply: When the JIT works as expected, your code is not executed through the Zend VM, but directly as a set of CPU-level instructions.
That's the whole idea.
But to understand it better we need to consider how PHP works internally. It is not very complicated, but it needs some introduction.
I wrote another blog post that gives a general overview of how PHP works. If this post feels like too much, read that one first and come back later. Things will be easier to understand.
As we all know, PHP is an interpreted language, but what does this sentence itself mean?
Every time PHP code (a command line script or a web application) is executed, it must go through a PHP interpreter. The most commonly used ones are PHP-FPM and the CLI interpreter.
The interpreter's job is simple: receive PHP code, interpret it, and return the result.
Interpreted languages generally follow this process. Some languages may skip a few steps, but the general idea is the same. In PHP, the process is as follows:
The PHP code is read and turned into a set of keywords called Tokens. This lets the interpreter know which pieces of code were written in which part of the program. This step is called Lexing or Tokenizing.
With the collection of Tokens in hand, the PHP interpreter tries to make sense of them. An abstract syntax tree (AST) is generated through a process called Parsing. The AST is a set of nodes representing which operations should be performed. For example, "echo 1 + 1" actually means "print the result of 1 + 1" or, more precisely, "print an operation, and that operation is 1 + 1".
With the AST, it is much easier to understand operations and their precedence. Converting an abstract syntax tree into something that can be executed requires an intermediate representation (IR), which in PHP we call Opcodes. The process of converting the AST into Opcodes is called compilation.
With Opcodes, here comes the fun part: executing the code! PHP has an engine called the Zend VM, which receives a list of Opcodes and executes them. After all Opcodes are executed, the Zend VM terminates the program.
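To make the Lexing/Tokenizing step more concrete, here is a minimal sketch that lists the tokens PHP sees in "echo 1 + 1". It assumes the bundled tokenizer extension is available; the helper name describe_tokens is made up for this illustration.

```php
<?php
// Hypothetical helper: list the token names PHP's lexer produces for a snippet.
function describe_tokens(string $source): array
{
    $names = [];
    foreach (token_get_all($source) as $token) {
        // A token is either an [id, text, line] array or a single-character string.
        $names[] = is_array($token) ? token_name($token[0]) : $token;
    }
    return $names;
}

print_r(describe_tokens('<?php echo 1 + 1;'));
```

The list contains entries such as T_ECHO and T_LNUMBER, plus single characters like "+" and ";". The parser then turns this flat list into the AST.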
This picture can make it clearer for you:
A simplified version of the PHP interpretation process overview.
As you can see, there is a bottleneck here: why go through this whole process on every execution, even when the PHP code has not changed?
In the end, all we care about are the Opcodes, right? Right! This is why the Opcache extension exists.
The Opcache extension comes with PHP and there is usually no need to disable it. When using PHP it is best to turn on Opcache.
It adds a shared in-memory cache layer for Opcodes. Its job is to take the Opcodes freshly generated from the AST and cache them, so that later executions can skip the Lexing/Tokenizing and Parsing steps.
This is a process diagram that includes the Opcache extension:
PHP’s interpretation process using Opcache. If the file has already been parsed, PHP will get cached Opcodes for it instead of parsing it again.
It is just beautiful how it skips the Lexing/Tokenizing, Parsing and Compiling steps.
Side note: this is where the awesome PHP 7.4 preloading feature comes in! It allows you to tell PHP-FPM to parse your code base, convert it into Opcodes and cache them before anything is executed.
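If you want to check whether Opcache is actually caching your scripts, a small sketch like the following may help. It relies on opcache_get_status(), which returns false when the cache is disabled (for example on the CLI without opcache.enable_cli=1); the helper name opcache_summary is invented for this example.

```php
<?php
// Hypothetical helper: summarize whether Opcache is loaded, enabled and caching scripts.
function opcache_summary(): array
{
    $summary = ['loaded' => false, 'enabled' => false, 'cached_scripts' => 0];

    if (!function_exists('opcache_get_status')) {
        return $summary; // extension not loaded at all
    }
    $summary['loaded'] = true;

    // false = do not include per-script details; @ hides API restriction warnings.
    $status = @opcache_get_status(false);
    if ($status === false) {
        return $summary; // loaded, but the cache is disabled for this SAPI
    }

    $summary['enabled'] = (bool) ($status['opcache_enabled'] ?? false);
    $summary['cached_scripts'] = (int) ($status['opcache_statistics']['num_cached_scripts'] ?? 0);

    return $summary;
}

print_r(opcache_summary());
```

Under PHP-FPM, a growing cached_scripts number between requests suggests that files are being served from cached Opcodes rather than parsed again.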
Do you want to know how JIT participates in this interpretation process? This article will explain.
After listening to Zeev's PHP and JIT episode on the PHP Internals News podcast, I figured out what the JIT is actually supposed to do.
If Opcache makes it faster to obtain Opcodes so that they can go straight to the Zend VM, the JIT is meant to let them run without using the Zend VM at all.
Zend VM is a program written in C that acts as a layer between Opcodes and the CPU. JIT generates compiled code directly at runtime, so PHP can skip the Zend VM and be executed directly by the CPU. In theory, the performance will be better.
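As a rough mental model of what "a layer between Opcodes and the CPU" means, here is a toy interpreter loop. The instruction names ADD and ECHO, the tuple layout and the function run_toy_vm are all invented for this sketch; the real Zend VM is written in C and its opcodes look different.

```php
<?php
// Toy VM: a loop that reads one made-up opcode at a time and dispatches on it.
// This is NOT the Zend VM instruction set, only an illustration of the idea.
function run_toy_vm(array $opcodes): string
{
    $slots = [];   // temporary results, keyed by slot name
    $output = '';

    foreach ($opcodes as [$op, $a, $b, $result]) {
        switch ($op) {
            case 'ADD':
                $slots[$result] = $a + $b;
                break;
            case 'ECHO':
                $output .= $slots[$a];
                break;
            default:
                throw new InvalidArgumentException("Unknown opcode: $op");
        }
    }

    return $output;
}

// "echo 1 + 1" expressed in the toy instruction set.
$program = [
    ['ADD', 1, 1, 'tmp0'],
    ['ECHO', 'tmp0', null, null],
];

echo run_toy_vm($program), PHP_EOL;
```

Every instruction pays for the loop and the switch before any real work happens. Removing that dispatch overhead is what compiling Opcodes to machine code is after.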
This sounds odd, because compiling to machine code requires a very specific implementation for each type of architecture. But in fact it is quite reasonable.
PHP's JIT uses a library called DynASM (Dynamic Assembler), which maps a set of CPU instructions written in one specific format into assembly code for many different CPU types. Therefore, the compiler only needs to use DynASM to convert Opcodes into machine code for a specific architecture.
However, there is a problem that has troubled me for a long time.
If preloading can parse PHP code into Opcodes before execution, and DynASM can compile Opcodes into machine code (Just In Time compilation), why don't we compile PHP right away using Ahead of Time compilation?
One of the reasons I found by listening to Zeev's podcast episode is that PHP is a weakly typed language, which means that PHP often doesn't know the type of a variable until the Zend VM tries to execute a certain opcode.
You can check the zend_value union type, which holds many pointers to different type representations of a variable. Whenever the Zend VM tries to fetch a value from a zend_value, it uses macros like ZSTR_VAL, which try to access the string pointer in the union.
For example, this Zend VM handler handles the "less than or equal to" (<=) expression, and it has to branch into several code paths just to work out the operand types.
Duplicating this type inference logic in machine code is not feasible and could even make things slower.
Compiling everything after the types have been evaluated is also not a good choice, because compiling to machine code is a CPU-intensive task. So compiling everything at runtime is not good either.
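The type problem is easy to see from userland. In this small sketch (the function name smaller_or_equal is just for illustration), the very same "<=" expression takes different comparison paths depending on what the operands turn out to be at runtime:

```php
<?php
// One expression, several runtime behaviours depending on operand types.
function smaller_or_equal($a, $b): bool
{
    return $a <= $b;
}

var_dump(smaller_or_equal(1, 2));        // int vs int: numeric comparison
var_dump(smaller_or_equal(1.5, '2'));    // float vs numeric string: '2' is treated as a number
var_dump(smaller_or_equal('10', '9'));   // two numeric strings: compared as numbers
var_dump(smaller_or_equal('a10', 'a9')); // non-numeric strings: compared byte by byte
```

The third call yields false (10 is not <= 9) while the fourth yields true ("1" sorts before "9"), even though both pass two strings. A compiler that only sees "$a <= $b" cannot know ahead of time which of these paths it should emit machine code for.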
Now we know that types cannot be well inferred to compile ahead of time. We also know that compilation at runtime is computationally expensive. So what are the benefits of JIT for PHP?
To strike a balance, PHP's JIT tries to compile only the Opcodes it considers worth compiling. To do this, the JIT profiles the Opcodes being executed by the Zend VM and checks which ones might make sense to compile, based on your configuration.
When an Opcode is compiled, it will hand execution to the compiled code instead of to Zend VM. It looks like this:
PHP's interpretation process with JIT. If compiled, Opcodes are not executed by the Zend VM.
So, in the Opcache extension, there are a couple of instructions that detect whether a given Opcode should be compiled. If so, the compiler uses DynASM to convert this Opcode into machine code and executes the newly generated machine code.
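The idea of "only compile what is worth compiling" can be sketched with a simple hit counter. Everything here is an assumption made for illustration: the class ToyJit, the threshold, and the "fast path" closure standing in for generated machine code. The real JIT makes this decision inside Opcache in C, not in PHP.

```php
<?php
// Toy model of hot-code detection: count executions, and once a routine is
// "hot" enough, switch from the slow (interpreted) path to a fast path.
final class ToyJit
{
    /** @var array<string,int> how often each routine ran on the slow path */
    private array $hits = [];

    /** @var array<string,callable> routines that have been "compiled" */
    private array $fastPaths = [];

    public function __construct(private int $threshold = 3)
    {
    }

    /** Returns [mode, result], where mode is 'interpreted' or 'compiled'. */
    public function call(string $name, callable $slow, callable $fast, int $x): array
    {
        if (isset($this->fastPaths[$name])) {
            return ['compiled', ($this->fastPaths[$name])($x)];
        }

        $this->hits[$name] = ($this->hits[$name] ?? 0) + 1;
        if ($this->hits[$name] >= $this->threshold) {
            // Hot enough: later calls use the fast path.
            $this->fastPaths[$name] = $fast;
        }

        return ['interpreted', $slow($x)];
    }
}

$jit = new ToyJit(3);
for ($i = 1; $i <= 5; $i++) {
    [$mode, $value] = $jit->call('double', fn (int $x) => $x * 2, fn (int $x) => $x << 1, 21);
    echo "call $i: $mode => $value", PHP_EOL;
}
```

Code that runs once never pays the compilation cost, while code that runs often eventually leaves the interpreter.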
Interestingly, since the current implementation has a limit in megabytes for compiled code (also configurable), code execution must be able to switch seamlessly between JIT-compiled and interpreted code.
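To see what your own setup uses, you can read the relevant ini directives at runtime. This is a minimal sketch: the helper name jit_settings is made up, while the directive names (opcache.jit and opcache.jit_buffer_size in particular) are the ones PHP 8 exposes. A value of null means the directive is not registered, which usually indicates that Opcache is not loaded.

```php
<?php
// Hypothetical helper: report the ini directives that control Opcache and the JIT.
function jit_settings(): array
{
    $names = [
        'opcache.enable',
        'opcache.enable_cli',
        'opcache.jit',             // JIT mode, e.g. "tracing", "function" or "disable"
        'opcache.jit_buffer_size', // memory reserved for compiled code; 0 turns the JIT off
    ];

    $settings = [];
    foreach ($names as $name) {
        $value = ini_get($name); // false when the directive does not exist
        $settings[$name] = $value === false ? null : $value;
    }

    return $settings;
}

print_r(jit_settings());
```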
By the way, this talk by Benoit Jacquemont on JIT in PHP helped me understand this whole thing.
I'm still not sure when exactly the compilation part takes place, but for now I think I really don't want to know.
I hope it's now clear to everyone why most PHP applications won't get big performance gains from using a just-in-time compiler. This is why Zeev recommends profiling and experimenting with different JIT configurations for your application as the best approach.
If you're using PHP FPM, you'll typically share compiled opcodes across multiple requests, but that's still not a game changer.
This is because the JIT optimizes computationally intensive operations, and most PHP applications today are more I/O bound than anything else. If you are accessing disk or network anyway, it doesn't matter much whether the processing is compiled or not. The timing will be very similar.
Unless...
You are doing something that is not I/O bound, like image processing or machine learning. Anything that doesn't touch I/O will benefit from a JIT compiler.
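A quick way to experiment is a small CPU-bound script like the one below, run once with the JIT enabled and once without. The workload (counting primes by trial division) and the function name count_primes are only an example; the numbers you get depend entirely on your machine and configuration.

```php
<?php
// Pure CPU work with no I/O: count the primes below $limit by trial division.
function count_primes(int $limit): int
{
    $count = 0;
    for ($n = 2; $n < $limit; $n++) {
        $isPrime = true;
        for ($d = 2; $d * $d <= $n; $d++) {
            if ($n % $d === 0) {
                $isPrime = false;
                break;
            }
        }
        if ($isPrime) {
            $count++;
        }
    }
    return $count;
}

$start = hrtime(true); // monotonic clock, in nanoseconds
$primes = count_primes(200000);
$elapsedMs = (hrtime(true) - $start) / 1e6;

printf("%d primes in %.1f ms\n", $primes, $elapsedMs);
```

Assuming Opcache is loaded, you could compare a run with "php -d opcache.enable_cli=1 -d opcache.jit_buffer_size=64M -d opcache.jit=tracing script.php" against one with "-d opcache.jit=disable". Then try the same comparison on a script that mostly waits for a database or the network, and the difference should largely disappear.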
This is why people now say that we are getting closer to writing native functions in PHP instead of C. If such a function gets compiled anyway, the overhead would be negligible.
Fun times to be a PHP programmer...
I hope this article will be helpful to you and enable you to better understand the JIT of PHP8.
Original address: https://thephp.website/en/issue/php-8-jit/