Keep specific HTML tags when splitting a string

Question

I need to split a string by a specific number of tags ([…], ...). I came up with the regex pattern = […]|[…]

P粉787806024 · Answer

To answer your specific questions:

[^

And match instead of split.

The backreference \1 refers to whatever the capturing group matched in the opening tag, so the closing tag must carry the same tag name.

Similar to:

for match in re.finditer(r"[^", subject, re.DOTALL):
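As a minimal sketch of matching instead of splitting, assume the tags to keep whole are `<b>` and `<i>`. The tag names, the pattern, and the `tokenize` helper are illustrative assumptions, not the original pattern:

```python
import re

# Hypothetical pattern: either a whole <b>...</b> or <i>...</i> element
# (\1 forces the closing tag to repeat the captured tag name),
# or a run of text that contains no "<".
TOKEN_RE = re.compile(r"<(b|i)>.*?</\1>|[^<]+", re.DOTALL)

def tokenize(subject):
    """Return the matched pieces in order, keeping the listed tags intact."""
    return [match.group(0) for match in TOKEN_RE.finditer(subject)]

print(tokenize("one <b>two</b> three <i>four\nfive</i>"))
# ['one ', '<b>two</b>', ' three ', '<i>four\nfive</i>']
```

`re.DOTALL` lets `.*?` cross line breaks inside a kept element. A `<` that does not start one of the listed elements matches neither alternative, so it is silently skipped.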

However, in most real cases this is not sufficient to handle HTML and you should consider a DOM parser.
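A parser-based version of the same idea could look like the sketch below. It uses the standard library's `html.parser`, which is event-based rather than a true DOM parser, and is chosen here only to avoid extra dependencies. The `KEEP` set and the `KeepTagsSplitter` class are hypothetical names for illustration.

```python
from html.parser import HTMLParser

KEEP = {"b", "i"}  # hypothetical set of tags to keep intact

class KeepTagsSplitter(HTMLParser):
    """Collect text runs and whole KEEP elements as separate pieces."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.pieces = []  # finished pieces, in document order
        self.depth = 0    # nesting depth inside KEEP elements
        self.buf = []     # parts of the KEEP element being collected

    def handle_starttag(self, tag, attrs):
        if tag in KEEP:
            self.depth += 1
        if self.depth:
            # Record the start tag exactly as written in the source.
            self.buf.append(self.get_starttag_text())
        # Tags outside KEEP elements are dropped in this sketch.

    def handle_endtag(self, tag):
        if not self.depth:
            return
        self.buf.append(f"</{tag}>")
        if tag in KEEP:
            self.depth -= 1
            if self.depth == 0:
                self.pieces.append("".join(self.buf))
                self.buf = []

    def handle_data(self, data):
        if self.depth:
            self.buf.append(data)
        else:
            self.pieces.append(data)

def split_keep_tags(subject):
    parser = KeepTagsSplitter()
    parser.feed(subject)
    parser.close()  # flush any text still buffered by the parser
    return parser.pieces

print(split_keep_tags("one <b>two</b> three"))
```

Unlike the regex approach, this copes with attributes on the kept tags and with kept tags nested inside each other, at the cost of more code.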