Overlapping Regex Matches: Finding Every Series of Numbers
Introduction
The question arises of how to locate every overlapping 10-digit number series within a larger collection of numbers using the 're' module in Python 2.6. While non-overlapping matches are straightforward, extracting all occurrences proves challenging.
Solution
To address this problem, a capturing group can be employed within a lookahead. The lookahead identifies the desired sequences, but the actual match corresponds to the zero-width substring preceding it, ensuring non-overlapping matches.
Implementation
import re s = "123456789123456789" matches = re.finditer(r'(?=(\d{10}))', s) results = [int(match.group(1)) for match in matches]
Output
[1234567891, 2345678912, 3456789123, 4567891234, 5678912345, 6789123456, 7891234567, 8912345678, 9123456789]
Explanation
The regular expression (?=(d{10})) asserts that immediately to its right is a capturing group representing a 10-digit numerical series (d{10}). The lookahead does not consume any characters, so the actual matches are the zero-length substrings preceding the lookaheads, which are the individual 10-digit series.
The above is the detailed content of How Can I Find All Overlapping 10-Digit Number Series in a String Using Python's `re` Module?. For more information, please follow other related articles on the PHP Chinese website!