Python Efficient Programming Tips in Practice (3)
yeyiluLAMP

Development environment: mainly Python 2.x, with IPython


Day 3: 03day

How to count how often each element occurs in a sequence

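Solution: use a collections.Counter object.
Pass the sequence to the Counter constructor; the resulting Counter object is a dictionary that maps each element to its frequency.
The Counter.most_common(n) method returns a list of (element, count) pairs for the n most frequent elements.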

1. In a random sequence such as [12,5,6,6,5,5,7...], find the 3 elements that occur most often. How many times does each occur?
In [41]: from random import randint

In [42]: data = [randint(0,20) for _ in xrange(30)]

In [43]: data
Out[43]:
[18,
 15,
 2,
 2,
 15,
 7,
 6,
 0,
 1,
 8,
 15,
 9,
 15,
 8,
 19,
 14,
 6,
 17,
 8,
 1,
 8,
 15,
 2,
 3,
 2,
 13,
 0,
 19,
 6,
 4]

In [44]: c = dict.fromkeys(data,0)

In [45]: c
Out[45]:
{0: 0,
 1: 0,
 2: 0,
 3: 0,
 4: 0,
 6: 0,
 7: 0,
 8: 0,
 9: 0,
 13: 0,
 14: 0,
 15: 0,
 17: 0,
 18: 0,
 19: 0}

In [46]: for x in data:
   ....:     c[x] += 1
   ....:

In [47]: c
Out[47]:
{0: 2,
 1: 2,
 2: 4,
 3: 1,
 4: 1,
 6: 3,
 7: 1,
 8: 4,
 9: 1,
 13: 1,
 14: 1,
 15: 5,
 17: 1,
 18: 1,
 19: 2}
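The manual counting loop above can also be written without the dict.fromkeys step. The following is a minimal sketch, assuming a small fixed sample list (a hypothetical stand-in for the random data in the session) so the counts are reproducible. It uses collections.defaultdict and sorted, which are not part of the original session.

```python
from collections import defaultdict

# Hypothetical fixed sample; the session above uses random data instead
sample = [12, 5, 6, 6, 5, 5, 7]

# defaultdict(int) supplies 0 for a missing key, so no pre-filling is needed
counts = defaultdict(int)
for x in sample:
    counts[x] += 1

# Sort the (element, count) pairs by count, highest first, and keep the top 3
top3 = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:3]
print(top3)
```

Here 5 occurs three times and 6 twice, so they take the first two places; the third place is a tie between the elements that occur once. Sorting the whole dictionary works, but Counter, shown next, wraps the same idea in a single call.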


In [48]: from collections import Counter

In [49]: c2 = Counter(data)

In [50]: c2
Out[50]: Counter({15: 5, 2: 4, 8: 4, 6: 3, 0: 2, 1: 2, 19: 2, 3: 1, 4: 1, 7: 1, 9: 1, 13: 1, 14: 1, 17: 1, 18: 1})

In [51]: c2[15]
Out[51]: 5

In [52]: c2[2]
Out[52]: 4


In [53]: c2.most_common(3)
Out[53]: [(15, 5), (2, 4), (8, 4)]
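Two Counter behaviours are worth seeing side by side: a missing key yields 0 instead of raising KeyError, and most_common(n) gives the same top entries as selecting the n largest items by count. A minimal sketch, again assuming a small fixed sample list (hypothetical, not from the session); the heapq.nlargest call is only an illustrative comparison.

```python
import heapq
from collections import Counter

# Hypothetical fixed sample so the counts are reproducible
sample = [12, 5, 6, 6, 5, 5, 7]
c = Counter(sample)

# Unlike a plain dict, a Counter returns 0 for an element it has never seen
missing = c[99]

# The two most frequent elements as (element, count) pairs
top2 = c.most_common(2)

# The same selection done by hand: pick the 2 items with the largest count
via_heap = heapq.nlargest(2, c.items(), key=lambda kv: kv[1])

print(missing)
print(top2)
print(via_heap)
```

When several elements share the same count, as 2 and 8 do in the session above, the order among them should not be relied on.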

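2. Count word frequencies in an English article and find the 10 most frequent words. How many times does each occur?
import re
from collections import Counter

# /etc/passwd is used here as sample input; substitute the path of the article
with open('/etc/passwd') as f:
    txt = f.read()
# Split on runs of non-word characters and drop the empty strings left at the ends
c3 = [w for w in re.split(r'\W+', txt) if w]
c4 = Counter(c3)
print(c4.most_common(10))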
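
The word-count snippet above treats "The" and "the" as different words. A minimal sketch of a case-insensitive variant, assuming a short in-memory string (hypothetical sample text, not from the original post) in place of a file, and using re.findall instead of re.split:

```python
import re
from collections import Counter

# Hypothetical sample text standing in for the article
text = "The quick brown fox jumps over the lazy dog. The dog barks; the fox runs."

# Lower-case first so that "The" and "the" are counted as the same word,
# then extract runs of letters (apostrophes kept for words like "don't")
words = re.findall(r"[a-z']+", text.lower())

word_counts = Counter(words)
print(word_counts.most_common(3))
```

In this sample "the" occurs four times and comes first; "fox" and "dog" tie at two occurrences each.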






