The C++ code in question, as provided by Bjarne Stroustrup in the 4th edition of "The C++ Programming Language," employs function chaining to modify a string:
<code class="cpp">void f2() { std::string s = "but I have heard it works even if you don't believe in it"; s.replace(0, 4, "").replace(s.find("even"), 4, "only").replace(s.find(" don't"), 6, ""); assert(s == "I have heard it works only if you believe in it"); }</code>
This code demonstrates the chaining of replace() operations to alter the string s. However, it has been observed that this code exhibits different behavior across various compilers, such as GCC, Visual Studio, and Clang.
While the code may appear straightforward, it involves unspecified order of evaluation, particularly for sub-expressions that involve function calls. Although it does not invoke undefined behavior (since all side effects occur within function calls), it does exhibit unspecified behavior.
The key issue is that the order of evaluation of sub-expressions, such as s.find("even") and s.find(" don't"), is not explicitly defined. These sub-expressions can be evaluated either before or after the initial s.replace(0, 4, "") call, which can impact the result.
If we examine the order of evaluation for the code snippet:
s.replace(0, 4, "").replace(s.find("even"), 4, "only").replace(s.find(" don't"), 6, "");
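We can number the sub-expressions as follows: (1) s.replace(0, 4, ""), (2) s.find("even"), (3) .replace(..., 4, "only"), (4) s.find(" don't"), and (5) .replace(..., 6, "").
Each argument is evaluated before the call that consumes it (2 precedes 3, and 4 precedes 5), and the replace() calls themselves run in chain order because each one is invoked on the result of the previous one. The argument expressions, however, are indeterminately sequenced relative to the earlier calls in the chain. The indeterminacy that matters lies between expressions 1 and 2, and between 1 and 4: either find() may run before or after the leading "but " is erased.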
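The sequencing constraints can be observed with a small tracing sketch. Everything in it is hypothetical and not from the book: `Chain`, `step`, `arg`, and `run_chain` are illustrative names, and the integer labels mirror the numbering used above (1, 3, and 5 are the chained calls; 2 and 4 are the argument expressions). No expected output is given, because before C++17 the recorded order may legitimately differ between compilers.

```cpp
#include <vector>

// Hypothetical global log of the order in which labelled expressions run.
std::vector<int> trace;

// Stand-in for std::string: each step() records its label and returns *this,
// so calls can be chained the same way replace() is chained.
struct Chain {
    Chain& step(int label, int /*unused*/) {
        trace.push_back(label);
        return *this;
    }
};

// Stand-in for s.find(...): records its label when it is evaluated.
int arg(int label) {
    trace.push_back(label);
    return 0;
}

// Same shape as s.replace(...).replace(s.find(...), ...).replace(s.find(...), ...).
std::vector<int> run_chain() {
    trace.clear();
    Chain c;
    c.step(1, 0).step(3, arg(2)).step(5, arg(4));
    return trace;
}
```

Whatever the compiler, the returned trace keeps 1 before 3 before 5, 2 before 3, and 4 before 5. Only the positions of 2 and 4 relative to the earlier calls are free to vary before C++17.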
The observed discrepancies in compiler behavior can be attributed to the different evaluation orders chosen by each compiler. In some cases, the replace() calls are evaluated in a way that results in the expected behavior, while in other cases, the evaluation order alters the string in an unexpected way.
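To illustrate, s.find("even") returns 26 and s.find(" don't") returns 37 on the original string, but 22 and 33 once the leading "but " has been erased. If each find() runs after the first replace(), the result is the expected "I have heard it works only if you believe in it". If both run before it, the stale indices are applied to the shortened string and the result is "I have heard it works evenonlyyou donieve in it".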
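The two evaluation orders can be reproduced deterministically by hoisting the find() calls into separate statements, which fixes their order regardless of compiler. This is a minimal sketch: the function names and the `kOriginal` constant are hypothetical, chosen for illustration only.

```cpp
#include <cstddef>
#include <string>

// The string from the book's example (the constant name is hypothetical).
const char* const kOriginal =
    "but I have heard it works even if you don't believe in it";

// Order A: each find() runs on the string as it is just before its replace().
std::string finds_after_first_replace() {
    std::string s = kOriginal;
    s.replace(0, 4, "");                    // erase "but "
    std::size_t even = s.find("even");      // 22 on the shortened string
    s.replace(even, 4, "only");
    std::size_t dont = s.find(" don't");    // 33 on the shortened string
    s.replace(dont, 6, "");
    return s;
}

// Order B: both find() calls run before any replace(), so the indices are stale.
std::string finds_before_first_replace() {
    std::string s = kOriginal;
    std::size_t even = s.find("even");      // 26 on the original string
    std::size_t dont = s.find(" don't");    // 37 on the original string
    s.replace(0, 4, "");                    // everything shifts left by 4
    s.replace(even, 4, "only");             // overwrites " if " instead of "even"
    s.replace(dont, 6, "");                 // erases "'t bel" instead of " don't"
    return s;
}
```

Order A yields the string the assert in f2() expects. Order B yields "I have heard it works evenonlyyou donieve in it", which makes the assert fail.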
It's important to note that this code does not invoke undefined behavior. Undefined behavior typically involves accessing uninitialized variables or attempting to access memory outside of its bounds. In this case, all side effects occur within function calls, and the code does not access invalid memory locations.
However, the code does exhibit unspecified behavior, which means that the exact order of evaluation of sub-expressions is not defined by the C++ standard. This can lead to different results across different compilers or even different runs of the same program.
The C++ standards committee recognized this issue in the proposal "Refining Expression Evaluation Order for Idiomatic C++" (P0145), which was adopted for C++17. The proposed changes to [expr.call]p5 specify that "the postfix-expression is sequenced before each expression in the expression-list and any default argument," which eliminates the unspecified behavior in this code: each s.find() argument is now evaluated only after the chain of calls to its left has completed.