Parsing C with LR(1) Parsers: An Impossible Endeavor
LR parsing is a powerful technique for parsing context-free grammars. However, as noted by Wikipedia, C poses a significant challenge to LR parsers. This article delves into the specific features of C that make it unsuitable for parsing with an LR(1) parser.
Ambiguous Grammar Rules
LR parsers are not designed to handle ambiguous grammar rules. Yet, C contains an infamous statement that introduces ambiguity:
x * y ;
This statement can be interpreted as either a declaration of y as a pointer to type x or a multiplication of x and y, discarding the result.
LR Parsers' Inherent Limitations
LR(1) parsers can only look ahead at a limited number of tokens, typically one or two. This limitation prevents them from resolving the ambiguity in the aforementioned statement. They would need to see information that comes later in the code to make a definitive determination.
Alternative Approaches
Compilers that successfully parse C use various strategies to overcome the limitations of pure LR parsers. These techniques include:
Conclusion
The ambiguous nature of certain grammar rules in C makes it impossible to parse the language reliably using an LR(1) parser. Compilers that successfully parse C employ alternative techniques to address the specific challenges posed by the language.
The above is the detailed content of Why Can't LR(1) Parsers Handle C ?. For more information, please follow other related articles on the PHP Chinese website!