How Can We Efficiently Count Overlapping Substring Occurrences in Python?

Patricia Arquette

Published: 2024-12-15 11:27:16

Efficiently Counting Overlapping Substring Occurrences

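Counting how many times a substring occurs in a string can be tricky, especially when occurrences are allowed to overlap. Python's built-in string method `str.count` exists for this purpose, but it counts only non-overlapping occurrences, so overlapping matches are missed.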
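As a quick illustration of that gap (a minimal sketch; the sample string is made up for this example), `str.count` can be compared with a count of every start position at which the substring matches:

```python
text = "aaaa"

# str.count scans left to right and resumes after the end of each match,
# so it finds "aa" at index 0 and index 2 only.
non_overlapping = text.count("aa")

# Counting every start position where "aa" matches gives the overlapping
# total (indices 0, 1, and 2).
overlapping = sum(text.startswith("aa", i) for i in range(len(text)))

print(non_overlapping, overlapping)  # prints "2 3"
```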

Counting Overlapping Matches

Consider the following approach:

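def overlapping_count(string, substring):
    """Count occurrences of substring in string, including overlapping ones."""
    n = len(substring)
    count = 0
    # Compare the slice starting at every possible position with the substring.
    for i in range(len(string) - n + 1):
        if string[i:i + n] == substring:
            count += 1
    return count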

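Here, the function walks through the string, compares a slice of the substring's length at each position, and increments the count whenever the slice matches. The approach is simple, but it builds and compares a new slice at every position, so it can be relatively slow on large strings.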

Potential Optimization

For performance reasons, it is worth exploring a different approach that uses Cython:

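# Cython source: place in a .pyx file and compile it before importing.
def faster_occurrences(string, substring):
    # Py_ssize_t matches the index type returned by str.find.
    cdef Py_ssize_t count = 0
    cdef Py_ssize_t start = 0
    while True:
        # str.find returns -1 when no match remains, which makes start 0.
        # Otherwise start is one past the match start, so overlaps are counted.
        start = string.find(substring, start) + 1
        if start > 0:
            count += 1
        else:
            return count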

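With Cython, static type declarations and ahead-of-time compilation to C (Cython is not a just-in-time compiler) let the loop counters skip Python's dynamic type handling. Most of the work, however, happens inside `str.find`, which already runs in C and jumps straight to the next match instead of slicing at every position. For larger inputs this function should be noticeably faster than the slicing version, although the actual gain is worth measuring on your own data.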
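Because `str.find` is an ordinary Python string method, the same search loop can also be tried without Cython. The following is a minimal sketch under that assumption; the name `find_based_count` and the empty-substring guard are illustrative additions, not part of the original code.

```python
def find_based_count(string, substring):
    """Count occurrences of substring in string, including overlapping ones."""
    if not substring:
        # Assumption for this sketch: an empty pattern has no matches.
        return 0
    count = 0
    start = string.find(substring)
    while start != -1:
        count += 1
        # Resume one character after the match start so overlaps are found.
        start = string.find(substring, start + 1)
    return count


print(find_based_count("abababa", "aba"))  # prints 3
```

Advancing by one character rather than by the substring's length is what distinguishes this loop from `str.count`.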
