How to Match Multi-Line Text Blocks with Regular Expressions in Python?

Mary-Kate Olsen

Release: 2024-10-25 10:25:17

Matching Multi-Line Text Blocks with Regular Expressions in Python

In Python, regex matching can be challenging when the target spans multiple lines. For example, consider text of the following form, where each line ends with a newline character (\n):

some Varying TEXT

DSJFKDAFJKDAFJDSAKFJADSFLKDLAFKDSAF
[more of the above, ending with a newline]
[yep, there is a variable number of lines here]
[repeat the above a few hundred times].

The goal is to capture two elements:

"some Varying TEXT"
All lines of uppercase text starting two lines below the first element, as a single capture group (line breaks can be stripped out later).

Previous attempts using variations of the following regular expressions have been unsuccessful:

re.compile(r"^>(\w+)$$([.$]+)^$", re.MULTILINE)
re.compile(r"(^[^>][\w\s]+)$", re.MULTILINE|re.DOTALL)

Solution:

To match the multi-line text correctly, use the following regular expression:

re.compile(r"^(.+)\n((?:\n.+)+)", re.MULTILINE)

This pattern matches the following:

Group 1: "some Varying TEXT"
Group 2: All lines of uppercase text starting two lines below "some Varying TEXT"
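
As a minimal sketch of how the pattern might be applied, the snippet below runs it over a small made-up sample (the second record and the uppercase lines are hypothetical, chosen only to mirror the layout described above) and strips the line breaks from group 2 afterwards. The names sample, block_re, and records are illustrative.

```python
import re

# Hypothetical sample in the shape described above: a header line, a blank
# line, then a variable number of uppercase lines, repeated.
sample = (
    "some Varying TEXT\n"
    "\n"
    "DSJFKDAFJKDAFJDSAKFJADSFLKDLAFKDSAF\n"
    "ASDFGHJKL\n"
    "\n"
    "another Header\n"
    "\n"
    "QWERTY\n"
)

block_re = re.compile(r"^(.+)\n((?:\n.+)+)", re.MULTILINE)

records = []
for match in block_re.finditer(sample):
    header = match.group(1)
    # Group 2 starts with a newline and keeps the newlines between lines,
    # so remove them to get one continuous string.
    body = match.group(2).replace("\n", "")
    records.append((header, body))

print(records)
```

Using finditer rather than search is one way to handle the "repeat a few hundred times" case, since each header and its block come back as a separate match.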

Key Points:

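With re.MULTILINE, the ^ and $ anchors also match immediately after and immediately before each newline, not only at the start and end of the whole string.
The (?:...) syntax makes the inner group, \n.+, non-capturing, so only the two intended capture groups exist.
The + after that non-capturing group repeats it, so group 2 captures one or more consecutive non-empty lines. The . does not match a newline because re.DOTALL is not set, and the pattern does not itself require the lines to be uppercase.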
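
The behaviour of the anchors, the dot, and the non-capturing group can be checked in isolation. The following is a small sketch on a made-up three-line string; the variable names are illustrative only.

```python
import re

text = "first\nsecond\nthird"

# Without re.MULTILINE, ^ matches only at the very start of the string.
default_starts = re.findall(r"^\w+", text)

# With re.MULTILINE, ^ also matches right after each newline.
multiline_starts = re.findall(r"^\w+", text, re.MULTILINE)

# Without re.DOTALL, "." never matches "\n", so ".+" stays on one line.
single_lines = re.findall(r".+", text)

# The (?:...) group is not counted, so the pattern has two capture groups.
group_count = re.compile(r"^(.+)\n((?:\n.+)+)", re.MULTILINE).groups

print(default_starts)    # ['first']
print(multiline_starts)  # ['first', 'second', 'third']
print(single_lines)      # ['first', 'second', 'third']
print(group_count)       # 2
```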

Alternative Solution:

If the target text may use other line endings besides linefeeds (\n), such as \r\n or \r, use the following more inclusive version:

re.compile(r"^(.+)(?:\n|\r\n?)((?:(?:\n|\r\n?).+)+)", re.MULTILINE)
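
A short sketch of this version on a hypothetical sample with Windows-style CRLF line endings follows. One caveat worth knowing: in Python, . matches \r, so the captured groups can still carry carriage returns, and the example strips them after matching. The names newline_re and crlf_sample are illustrative.

```python
import re

newline_re = re.compile(
    r"^(.+)(?:\n|\r\n?)((?:(?:\n|\r\n?).+)+)", re.MULTILINE
)

# Hypothetical sample using CRLF ("\r\n") line endings.
crlf_sample = "some Varying TEXT\r\n\r\nABCDEF\r\nGHIJKL\r\n"

match = newline_re.search(crlf_sample)
raw_header, raw_body = match.group(1), match.group(2)

# "." matches "\r", so remove leftover carriage returns and newlines.
header = raw_header.rstrip("\r")
body = raw_body.replace("\r", "").replace("\n", "")

print(repr(header), repr(body))
```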
