<!DOCTYPE html>
<html>
<head>
<title>内容</title>
</head>
<body>
中文
<p>内容<i>内容</i></p>
</body>
</html>
Replace the content in the tag with
<!DOCTYPE html>
<html>
<head>
<title>{{#内容#}}</title>
</head>
<body>
{{#中文#}}
<p>{{#内容#}}<i>{{#内容#}}</i></p>
</body>
</html>
How to write the regular expression?
First, if you have learned the principles of compilation, you will know that regular expressions are not capable of processing nested data structures. In other words, if you want to achieve the requirement of [selecting the first i tag of nested p in body] through regular expression, it is not feasible in principle.
Second, you are dealing with a structured DOM text, so you can use jQuery’s parseHTML method to achieve this. The object obtained after passing jQuery parse can use $ to further select nodes such as p or i. This can solve your text replacement problem simply and effectively.
If you are on the Node server, just replace jQuery with cheerio.
If you only want to replace in the current test text you gave, because the situation is relatively simple, you only need to write a regular match like ([u4e00-u9fa5]+) and replace it with {{#$1#}}