In order to deeply understand how formal verification technology is applied to zkVM (zero-knowledge virtual machine), this article will focus on the verification of a single instruction. For the overall situation of advanced formal verification of ZKP (zero-knowledge proof), please refer to the article "Advanced Formal Verification of Zero-Knowledge Proof Blockchain" that we published at the same time.
zkVM (zero-knowledge virtual machine) can create short proof objects as evidence that a specific program can run on certain inputs and terminate successfully. In the Web3.0 field, the application of zkVM enables high throughput because the L1 node only needs to verify a short proof of the smart contract's transition from input state to output state, and the actual contract code execution can be completed off-chain.
The zkVM prover will first run the program to generate an execution record of each step, and then convert the execution record data into a set of numerical tables (this process is called "arithmeticization"). A set of constraints (i.e., circuits) must be satisfied between these numbers, which includes the connection equations between specific table cells, fixed constants, database search constraints between tables, and the constraints that need to be satisfied between each pair of adjacent table rows. Polynomial equations (aka "gates"). On-chain verification can confirm that there is indeed a table that satisfies all constraints, while ensuring that the specific numbers in the table will not be seen.
The execution of each VM instruction faces many such constraints. Here we refer to this set of constraints of the VM instruction as its "ZK instruction". Below is an example of a zkWasm memory load instruction constraint written in Rust.
#Formal verification of ZK instructions is done by performing formal reasoning on these codes, which we first translate into a formal language.
#If even a single constraint contains an error, it is possible for an attacker to submit a malicious ZK proof. The data table corresponding to the malicious proof does not correspond to the legal operation of the smart contract. Unlike non-ZK chains such as Ethereum, which have many nodes running different EVM (Ethereum Virtual Machine) implementations, making it unlikely that the same bug will occur in the same place at the same time, a zkVM chain only has a single VM implementation. From this perspective alone, the ZK chain is more fragile than the traditional Web3.0 system.
What’s worse is that unlike non-ZK chains, since the calculation details of zkVM transactions are not submitted and stored on the chain, after the attack occurs, it is not only very difficult to discover the specific details of the attack, but also Even identifying the attacks themselves can become challenging.
The zkVM system requires extremely strict inspection. Unfortunately, the correctness of the zkVM circuit is difficult to guarantee.
VM (Virtual Machine) is one of the most complex parts of the Web3.0 system architecture. The powerful functions of smart contracts are the core of supporting Web3.0 capabilities. Their power comes from the underlying VMs, which are both flexible and complex: in order to complete general computing and storage tasks, these VMs must be able to support numerous instructions and states. For example, the EVM's geth implementation requires more than 7,500 lines of Go language code. Similarly, the ZK circuitry that constrains the execution of these instructions is equally large and complex. Like in the zkWasm project, the implementation of the ZK circuit requires more than 6000 lines of Rust code.
zkWasm circuit architecture
with ZK systems designed for specific applications such as private payments Compared with the dedicated ZK circuits used in Thousands of digital cells.
We want to verify the correctness of the XOR instruction in zkWasm here. Technically speaking, zkWasm's execution table corresponds to a legal Wasm VM execution sequence. So from a macro perspective, what we want to verify is that each execution of this instruction will always generate a new legal zkVM state: each row in the table corresponds to a legal state of the VM, and the following row always It is generated by executing the corresponding VM instructions. The figure below shows the formal theorem for the correctness of the XOR instruction.
Here "state_rel i st" indicates that state "st" is the legal zkVM state of the smart contract in step "i". As you might guess, proving "state_rel (i 1) ..." is not trivial.
Although the calculation semantics of the XOR instruction are very simple, which is to calculate the bitwise exclusive OR (bitwise Take two integers from memory, perform XOR calculation, and then store the new integer calculated back to the same stack. In addition, the execution steps of this instruction should be integrated into the entire smart contract execution process, and the order and timing of its execution must be correct.
Therefore, the execution effect of the XOR instruction should be to pop two numbers from the data stack, push their XOR results, and at the same time increment the program counter to point to the smart contract Next instruction.
It is not difficult to see that the correctness attribute definition here is generally the same as when we verify traditional bytecode VM (such as the EVM interpreter in the Ethereum L1 node) The situations faced are very similar. It relies on a high-level abstract definition of the machine state (here stack memory and execution flow) and a definition of the expected behavior of each instruction (here arithmetic logic).
However, as we will see below, due to the special nature of ZKP and zkVM, the verification process for their correctness is often very different from that of regular VMs. Just to verify our single instruction here depends on the correctness of many tables in zkWASM: among them is a range table used to limit the size of the value, a bit table used for the intermediate results of binary bit calculations, An execution table that stores constant-sized VM state per row (similar to data in registers and latches in a physical CPU), and memory that represents dynamically variable-sized VM state (memory, data stack, and call stack) tables and jump tables.
Similar to traditional VM, we need to ensure that the two integer parameters of the instruction can be read from the stack, and that their XOR result values are correctly written back stack. The formal representation of the stack also looks fairly familiar (there's also global memory and heap memory, but the XOR instruction doesn't use them).
#zkVM uses a complex scheme to represent dynamic data because the ZK prover does not natively support data structures like stacks or arrays. On the contrary, for each value pushed onto the stack, the memory table records a separate row, and some of the columns are used to indicate the effective time of the entry. Of course, the data in these tables can be controlled by attackers, so some constraints must be imposed to ensure that the table entries actually correspond to the actual stack instructions in contract execution. This is achieved by carefully calculating the number of stack pushes during program execution. When verifying each instruction, we need to ensure that this count is always correct. In addition, we have a series of lemmas that relate the constraints generated by a single instruction to the table lookups and time range checks that implement stack operations. From the top level, the counting constraints for memory operations are defined as follows.
Similar to traditional VM, we want to ensure that the bitwise XOR operation is correct. This seems easy, after all our physical computer CPUs are capable of doing this in one go.
But for zkVM, this is actually not simple. The only arithmetic instructions natively supported by the ZK prover are addition and multiplication. In order to perform binary bit operations, VM uses a rather complex scheme, in which one table stores the fixed byte-level operation results, and the other table acts as a "scratch pad" to show how to perform operations on multiple table rows. The 64-bit number is broken down into 8 bytes and then reassembled to give the result.
Fragment of the zkWasm bittable specification
Very simple XOR in a traditional programming language Operations, here many lemmas are needed to verify the correctness of these auxiliary tables. For our directive we have:
Similar to a traditional VM, we need to ensure that the value of the program counter is updated correctly. For sequential instructions like XOR, the program counter needs to be incremented by one after each step.
Since zkWasm is designed to run Wasm code, it is also important to ensure that the immutable nature of Wasm memory is maintained throughout execution.
Traditional programming languages have native support for data types such as Boolean values, 8-bit integers, and 64-bit integers. However, in the ZK circuit, variables are always within the range of integers modulo large prime numbers (≈ 2254). Variety. Since VMs typically run at 64 bits, circuit developers need to use a system of constraints to ensure they have the correct numerical ranges, and formal verification engineers need to track invariant properties about these ranges throughout the verification process. Reasoning about execution flows and execution tables involves all other auxiliary tables, so we need to check that the ranges of all table data are correct.
Similar to the case of memory operation number management, zkVM requires a similar set of lemmas to verify control flow. Specifically, each call and return instruction requires the use of a call stack. The call stack is implemented using a table scheme similar to the data stack. For the XOR instruction, we do not need to modify the call stack; but when verifying the entire instruction, we still need to track and verify whether the control flow operation count is correct.
Verify this instruction
Let's put all the steps together and verify the end-to-end correctness theorem of the zkWasm XOR instruction. The following verifications are completed in an interactive proof environment, where every formal construction and logical reasoning step undergoes the most rigorous machine checks.
#As seen above, formal verification of zkVM circuits is feasible. But this is a difficult task that requires understanding and tracking many complex invariant properties. This reflects the complexity of the software being verified: every lemma involved in the verification needs to be handled correctly by the circuit developer. Given the high stakes involved, it would be valuable to have them machine-checked by a formal verification system, rather than relying solely on careful human review.
The above is the detailed content of Advanced Formal Verification of Zero-Knowledge Proofs: How to Verify a ZK Instruction. For more information, please follow other related articles on the PHP Chinese website!