Python Regular Expression Confusion: Substitutions with Numbered Group Backreferences
When attempting to replace "foobar" with "foo123bar" using a regular expression, you may encounter unexpected results. A replacement like re.sub(r'(foo)', r'\1123', 'foobar') fails to produce the desired output and instead returns "J3bar".
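To understand the issue, note how the replacement template distinguishes group backreferences from other numeric escapes. The intent of \1123 is "group 1 followed by the literal digits 123", but the template parser does not stop after the first digit. Because \112 is a backslash followed by three octal digits, it is read as an octal character escape (octal 112 is the letter "J"), and the trailing 3 is kept as a literal character. The first capture group is never referenced, which is why the result is "J3bar".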
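The octal interpretation can be checked directly. The short sketch below is only an illustration; the variable names are arbitrary and not part of the original question.

```python
import re

# In a replacement template, a backslash followed by three octal digits
# is an octal character escape: \112 is chr(0o112), i.e. "J".
# The remaining "3" is then appended as a literal character.
ambiguous = re.sub(r'(foo)', r'\1123', 'foobar')
print(ambiguous)      # J3bar

# Octal 112 is decimal 74, the code point of "J".
octal_char = chr(0o112)
print(octal_char)     # J
```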
To achieve the correct substitution, use the unambiguous \g<number> syntax:
re.sub(r'(foo)', r'\g<1>123', 'foobar')
In this case, \g<1> refers to the substring matched by the first group, which is the string "foo" from the input. Because the angle brackets delimit the group number, the digits 123 that follow are treated as literal text, and the replacement produces "foo123bar".
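As a minimal sketch, the corrected call can be run next to a named-group variant, which avoids numeric ambiguity in the same way. The group name "prefix" is a hypothetical choice made for this example.

```python
import re

# \g<1> delimits the group number, so "123" stays literal text.
fixed = re.sub(r'(foo)', r'\g<1>123', 'foobar')
print(fixed)   # foo123bar

# The same idea with a named group; "prefix" is an illustrative name.
named = re.sub(r'(?P<prefix>foo)', r'\g<prefix>123', 'foobar')
print(named)   # foo123bar
```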
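This behavior is explained in the Python documentation for re.sub, which describes \g<number> as equivalent to \number but unambiguous when the reference is followed by literal digits.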
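The documentation illustrates the ambiguity with \20 versus \g<2>0. The sketch below reproduces that case on a made-up two-group pattern; the pattern and input string are assumptions chosen for the example.

```python
import re

# \20 is read as a reference to group 20, which this pattern does not
# have, so re.sub raises re.error instead of using group 2.
message = None
try:
    re.sub(r'(a)(b)', r'\20', 'ab')
except re.error as exc:
    message = str(exc)
print(message)

# \g<2>0 means group 2 followed by a literal "0".
result = re.sub(r'(a)(b)', r'\g<2>0', 'ab')
print(result)   # b0
```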