Python 中用正規表示式取代字串
問題:
輸入:
所需輸出:
this is a paragraph with<[1]> in between</[1]> and then there are cases ... where the<[99]> number ranges from 1-100</[99]>. and there are many other lines in the txt files with<[3]> such tags </[3]>
解決方案>
this is a paragraph with in between and then there are cases ... where the number ranges from 1-100. and there are many other lines in the txt files with such tags
使用正規表示式取代多個標籤Python,依照下列步驟操作:
說明:
import re line = re.sub(r"<\/?\[\d+>]", "", line)
正規表示式r" ?[d >"] 匹配以任何開頭的標籤 結尾。問號字元? / 後面表示斜線是可選的。 sub 函數將每個匹配項替換為空字串。
註解版本:附加註解:
line = re.sub(r""" (?x) # Use free-spacing mode. < # Match a literal '<' /? # Optionally match a '/' \[ # Match a literal '[' \d+ # Match one or more digits > # Match a literal '>' """, "", line)
正則表達式建議使用類似的工具www.regular-expressions.info 了解語法並測試您的表達式。
避免硬編碼要替換的數字範圍(從 1 到 99)。以上是如何使用 Python 正規表示式從字串中刪除 HTML 標籤?的詳細內容。更多資訊請關注PHP中文網其他相關文章!