For those who are not following POJ (Pascal on the JVM): it is a compiler that translates a subset of Pascal to JASM (Java Assembly) so that we can use the JVM as an execution environment.
In the last post we resolved some important bugs, especially in the generation of the assembly. In this post we will talk about how to correctly generate the assembly for nested sentences.
As we are compiling for the JVM, it is necessary to understand how various parts of this incredible virtual machine work. For that reason, throughout this series I detail the internal functioning of the JVM as well as some of its instructions (opcodes).
One of the features needed to deal correctly with nested sentences is support for multiple contexts in the parser. If the parser assumes a single context, nested control sentences that generate labels and jumps (like if, for, while and repeat) end up with incorrect jump targets, because an inner sentence overwrites the labels of the sentence that encloses it.
There are two ways to deal with contexts: a recursive parser, or a stack of contexts.
Usually I would use the recursive parser approach. However, in order to implement a recursive parser with ANTLR, due to the way the grammar was structured in POJ, it would be necessary to inject code directly into the grammar, an approach that is not recommended. As a result, we opted for the approach of stacking contexts.
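To make the problem concrete, here is a minimal sketch in Python (not POJ's actual code) that contrasts a parser with a single context against one that stacks contexts. The class names, the `enter_if`/`exit_if` callbacks and the `L1`, `L2` label format are all assumptions made for illustration.

```python
class SingleContextParser:
    """Keeps only one context, so a nested sentence overwrites the outer one."""

    def __init__(self):
        self.counter = 0
        self.current = None  # the only context available

    def enter_if(self):
        self.counter += 1
        self.current = f"L{self.counter}"

    def exit_if(self):
        # May return a label that belongs to an inner sentence.
        return self.current


class StackedContextParser:
    """Pushes a context on entry and pops it on exit."""

    def __init__(self):
        self.counter = 0
        self.contexts = []  # hypothetical stack of contexts

    def enter_if(self):
        self.counter += 1
        self.contexts.append(f"L{self.counter}")

    def exit_if(self):
        return self.contexts.pop()


def run(parser):
    """Simulate walking an outer if that contains an inner if."""
    parser.enter_if()         # outer if
    parser.enter_if()         # inner if
    inner = parser.exit_if()  # label seen when the inner if ends
    outer = parser.exit_if()  # label seen when the outer if ends
    return inner, outer


print(run(SingleContextParser()))   # both exits see the inner label
print(run(StackedContextParser()))  # each exit sees its own label
```

With a single context the outer sentence closes using the inner sentence's label, which is the kind of wrong jump target described above. With the stack, each exit gets back the context pushed by its matching entry.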
There was already a stack implementation that tracked the types the parser stacked/unstacked in the JVM. To avoid creating yet another stack for one specific type, we decided to create a generic stack in this PR. Besides being able to reuse this implementation later, I can also refactor the old code and remove the existing type-specific stack.
In this commit the parser was changed to correctly stack/unstack function contexts. Basically, the context is stacked when the parser starts processing a function and unstacked when it finishes.
Control sentences like if, for, while and repeat worked correctly on their own. However, when they were nested, POJ did not store the context and ended up generating incorrect labels and jumps. How the assembly is generated for these control sentences was discussed here and here.
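As an illustration of the idea, a generic stack could look like the sketch below. It is written in Python with type hints and is not the implementation from the PR; the `Stack` class, its method names, and the example element types are assumptions.

```python
from typing import Generic, List, TypeVar

T = TypeVar("T")


class Stack(Generic[T]):
    """A minimal generic stack: one implementation, any element type."""

    def __init__(self) -> None:
        self._items: List[T] = []

    def push(self, item: T) -> None:
        self._items.append(item)

    def pop(self) -> T:
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def top(self) -> T:
        if not self._items:
            raise IndexError("top of empty stack")
        return self._items[-1]

    def __len__(self) -> int:
        return len(self._items)


# The same class serves both purposes mentioned in the text.
type_stack: Stack[str] = Stack()       # types pushed/popped on the JVM stack
type_stack.push("integer")

context_stack: Stack[dict] = Stack()   # parser contexts (hypothetical shape)
context_stack.push({"function": "main"})
```

The point of the design is that the type-tracking stack and the context stack share one implementation, so the older type-specific stack can later be replaced by an instance of the generic one.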
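The push-on-entry, pop-on-exit discipline for functions can be sketched as follows. This is a Python illustration only: the `Parser` class, the `function` helper and the initial `"main"` context are assumptions, not names taken from POJ.

```python
from contextlib import contextmanager


class Parser:
    def __init__(self):
        # Assumption: the main program block is the bottom context.
        self.function_contexts = ["main"]

    @property
    def current_function(self):
        return self.function_contexts[-1]

    @contextmanager
    def function(self, name):
        """Stack the context when a function starts, unstack it when it ends."""
        self.function_contexts.append(name)
        try:
            yield
        finally:
            self.function_contexts.pop()


parser = Parser()
seen = [parser.current_function]
with parser.function("factorial"):
    # Code generated here belongs to the function's context.
    seen.append(parser.current_function)
seen.append(parser.current_function)
print(seen)
```

Once the function ends, the parser is back in the enclosing context without having to remember it explicitly.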
For the example below, which contains an if nested inside another, POJ erroneously generated the labels and necessary jumps:
    program NestedIfs;
    begin
      if (1 > 2) then
        if (2 > 3) then
          writeln('1 > 2 and 2 > 3')
        else
          writeln('1 > 2 and 2 <= 3')
      else
        writeln('1 <= 2');
    end.
This was a known bug, and I chose to fix it only once the parser supported contexts.
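The sketch below shows, in Python, how a stack of label contexts keeps the labels of the two ifs apart for this program. It is not POJ's generator: the tiny AST, the listener-style callbacks, the label names and the exact instruction sequence are assumptions, and the real JASM output will differ. Only the opcodes `sipush`, `if_icmple` and `goto` are real JVM instructions.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Write:
    text: str


@dataclass
class If:
    left: int            # the condition is always `left > right` in this sketch
    right: int
    then: "Node"
    otherwise: "Node"


Node = Union[Write, If]


class Generator:
    def __init__(self) -> None:
        self.lines: List[str] = []
        self.label_count = 0
        self.contexts: List[dict] = []  # stack of label contexts

    def new_label(self) -> str:
        self.label_count += 1
        return f"L{self.label_count}"

    def write(self, node: Write) -> None:
        self.lines.append(f"    ; writeln: {node.text}")

    def enter_if(self, node: If) -> None:
        ctx = {"else": self.new_label(), "end": self.new_label()}
        self.contexts.append(ctx)
        self.lines.append(f"    sipush {node.left}")
        self.lines.append(f"    sipush {node.right}")
        # Jump to the else branch when left <= right, i.e. the test fails.
        self.lines.append(f"    if_icmple {ctx['else']}")

    def enter_else(self) -> None:
        ctx = self.contexts[-1]  # labels of the innermost open if
        self.lines.append(f"    goto {ctx['end']}")
        self.lines.append(f"{ctx['else']}:")

    def exit_if(self) -> None:
        ctx = self.contexts.pop()
        self.lines.append(f"{ctx['end']}:")


def walk(node: Node, gen: Generator) -> None:
    """Drive the callbacks the way a tree walker would."""
    if isinstance(node, Write):
        gen.write(node)
        return
    gen.enter_if(node)
    walk(node.then, gen)
    gen.enter_else()
    walk(node.otherwise, gen)
    gen.exit_if()


program = If(1, 2,
             If(2, 3, Write("1 > 2 and 2 > 3"), Write("1 > 2 and 2 <= 3")),
             Write("1 <= 2"))
gen = Generator()
walk(program, gen)
lines = gen.lines
print("\n".join(lines))
```

The callbacks never receive their labels as arguments; they read them from the top of the stack. The inner if pushes its own pair of labels, and when it closes, the outer pair is on top again, so the outer else and end labels are emitted in the right place.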
In this commit, the LabelsContext structure was created containing the following labels:
To validate the correct generation of the assembly, tests were created for nested if's, nested repeat's, nested while's and nested for's. Tests were also created here to validate the generation of the assembly for recursive functions. Furthermore, it was necessary to update the expected assembly of all existing tests. Finally, in this PR, the parser was updated to use the new context structure.
Here is the complete PR of these changes. Here we have the commit containing the changes for the correct functioning of the if sentence, here the commit referring to the repeat, here the commit referring to while and here the commit referring to for.
In the next post we will talk about data entry. Now it's not long before we complete one of the objectives of this project: reading a number from standard input and calculating its factorial.
The repository with the project's complete code and documentation is here.