python - 求优化一个查找有序list的相同子串的算法
PHP中文网
PHP中文网 2017-04-18 09:12:51
0
2
423

输入:有序的list
要求:查找相同的子串,且相同子串必须是连续的,例如[1,2,1,2]就是1,2子串为相同的,但[1,2,3,1,2]就不是。
输出:[(字串,次数), ...]

现在的问题是这两重循环导致了算法为O(n^2),效率过于低下了,求优化~~~

from collections import defaultdict

def get_patterns(values):
    start = 0
    total_length = len(values)
    patterns = defaultdict(lambda : 1)
    while start < total_length:
        next_start = start + 2
        while next_start < total_length:
            ori_values = values[start:next_start]
            if ori_values == values[next_start:(next_start+len(ori_values))]:
                patterns[tuple(ori_values)] += 1
                next_start += len(ori_values)
            else:
                next_start += 1
        start += 1
    return list(patterns.iteritems())

if __name__ == '__main__':
    print get_patterns([1,2,3,5,1,2])
    print get_patterns([1,2,1,2,3,5])
    print get_patterns([1,2,1,2,1,3,5])

运行结果:

[]
[((1, 2), 2)]
[((1, 2), 2), ((2, 1), 2)]
PHP中文网
PHP中文网

认证0级讲师

reply all(2)
迷茫

It’s done. After using numpy, the performance is so good. Now there are no performance problems...

巴扎黑

If you want to output all specific substrings, the efficiency is very slow.

Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template