How to Extract Matches Between Two Strings in Logs with a Regex?

Mary-Kate Olsen
Release: 2024-10-23 22:17:02

Regex to Extract Matches Between Two Strings

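Given a large log file containing multi-line strings enclosed by specific start and end markers, the goal is to extract and print only the shortest such strings. However, the start marker also appears elsewhere in the file without a matching end marker, so a simple lazy match between the two markers will not suffice: it would begin at the first start marker it meets and drag the stray ones into the result.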
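A minimal sketch of the problem, assuming a made-up two-marker sample (the name `log` and its contents are illustrative only, not taken from a real log):

```python
import re

# Hypothetical sample: the first "start" is a stray marker with no "end" of its own.
log = "start noise\nstart payload\nend"

# A lazy match still begins at the first "start" it meets,
# so the stray marker ends up inside the result.
naive = re.findall(r"start.*?end", log, re.S)
print(naive)  # ['start noise\nstart payload\nend']
```

Laziness only shortens a match from the right; it does nothing about where the match begins.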

To address this, we can employ the following regular expression:

(start((?!start).)*?end)

This regex matches strings that:

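  • Begin with "start", followed by as few characters as possible; the negative lookahead (?!start) is checked before each character, so the match can never run across another "start".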
  • End with "end".
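One way to read the pattern piece by piece is to spell it out in verbose mode. The sketch below is an equivalent formulation rather than the exact expression above: it swaps the capturing groups for a non-capturing one, and the `sample` string is invented for illustration.

```python
import re

# The same idea written with re.VERBOSE so each part can carry a comment.
pattern = re.compile(r"""
    start            # opening marker
    (?:              # repeat, as few times as possible:
        (?!start)    #   refuse to step onto another opening marker
        .            #   otherwise consume one character
    )*?
    end              # closing marker
""", re.VERBOSE | re.DOTALL)

sample = "start a start b end"
result = pattern.findall(sample)
print(result)  # ['start b end']
```

The attempt that begins at the first "start" fails as soon as the lookahead meets the second one, so the engine moves on and the match begins at the last "start" before "end".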

Using Python's re.findall with the re.S flag (re.DOTALL), which lets . match newlines so that a match can span several lines, we can retrieve all such strings from the input text:

<code class="python">import re

text = """
start spam
start rubbish
start wait for it...
    profit!
here end
start garbage
start second match
win. end
"""

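# Non-capturing group (?:...): with capturing groups, re.findall would return
# tuples of the groups instead of the whole matches.
matches = re.findall(r'start(?:(?!start).)*?end', text, re.S)
print(matches)</code>

This prints the two shortest matches:

['start wait for it...\n    profit!\nhere end', 'start second match\nwin. end']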
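Real log markers are rarely the literal words "start" and "end". A hedged sketch of a reusable helper follows; the function name `extract_between` and the `[BEGIN]`/`[END]` markers are hypothetical and only illustrate the idea.

```python
import re

def extract_between(text, start, end):
    """Return the shortest start...end spans in text (hypothetical helper)."""
    # re.escape keeps markers with regex metacharacters (such as "[") literal.
    s, e = re.escape(start), re.escape(end)
    pattern = re.compile(f"{s}(?:(?!{s}).)*?{e}", re.DOTALL)
    # finditer yields match objects; group(0) is the whole matched span.
    return [m.group(0) for m in pattern.finditer(text)]

# Invented sample: the first "[BEGIN]" is a stray marker.
log = "[BEGIN] x\n[BEGIN] y\nz [END]\n[BEGIN] w [END]"
spans = extract_between(log, "[BEGIN]", "[END]")
print(spans)  # ['[BEGIN] y\nz [END]', '[BEGIN] w [END]']
```

Note that the markers match anywhere, including inside longer words (for example "end" inside "pending"), so distinctive markers give more reliable results.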
