Tutorial on how to generate a word cloud using Python

巴扎黑
Release: 2017-06-23 15:33:12

I have been busy with final exam review and have spent some time working with the Scrapy framework. Today I will show how to generate a word cloud with Python. There are plenty of word cloud generators online, but writing one yourself in Python is more satisfying.

Today we are going to generate a word cloud of inspirational songs. We collected about 20 songs from Baidu Library, including familiar ones such as "Stubborn" and "The Sea and the Sky".

The Python libraries used are jieba (a Chinese word segmentation library), wordcloud, matplotlib, PIL, and numpy.

The first thing we need to do is read the lyrics. I saved them in a text file, 励志歌曲歌词.txt ("inspirational song lyrics"), in the same directory as the script.

Now let's read it:

# encoding=gbk
# Read the whole lyrics file into one string
with open('./励志歌曲歌词.txt', 'r') as f:
    lyric = f.read()

The #encoding=gbk line is there to prevent a SyntaxError: Non-UTF-8 code starting with '\xc0', which Python raises when a source file contains non-UTF-8 bytes and no encoding declaration.
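
Note that the #encoding=gbk declaration only describes the script file itself; how the lyrics file is decoded is decided by open(). As a minimal sketch, assuming the lyrics file is also GBK-encoded (the article does not say), the encoding can be passed explicitly. The helper name read_lyrics and the temporary demo file are hypothetical and stand in for the real lyrics file.

```python
import os
import tempfile


def read_lyrics(path, encoding="gbk"):
    """Return the whole file as one string, decoded with the given encoding."""
    with open(path, "r", encoding=encoding) as f:
        return f.read()


# Demo with a throwaway GBK-encoded file (a stand-in for the lyrics file)
sample = "梦想 坚持 努力\n勇敢 希望 未来\n"
with tempfile.NamedTemporaryFile("w", encoding="gbk", suffix=".txt", delete=False) as tmp:
    tmp.write(sample)
    demo_path = tmp.name

lyric = read_lyrics(demo_path)
os.remove(demo_path)
print(lyric)
```

Passing the encoding explicitly avoids depending on the platform default, which differs between systems.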
Next, we use jieba to segment the lyrics and extract the 50 highest-weighted keywords with TextRank:

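import jieba.analyse

# Extract the top 50 keywords together with their TextRank weights
result = jieba.analyse.textrank(lyric, topK=50, withWeight=True)
# Build a {word: weight} dictionary for the word cloud
keywords = dict()
for i in result:
    keywords[i[0]] = i[1]
print(keywords)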

This prints a dictionary mapping each keyword to its weight.
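
TextRank weights are not raw word counts. If plain word frequency is preferred, a minimal sketch with collections.Counter could look like the following. The token list is hypothetical sample data; in practice it could come from jieba.lcut(lyric). Filtering out single-character tokens is an assumption made here to drop most particles and pronouns, not something the original code does.

```python
from collections import Counter

# Hypothetical token list; in practice this could come from jieba.lcut(lyric)
tokens = ["梦想", "世界", "梦想", "的", "坚持", "梦想", "世界", "我"]

# Drop single-character tokens, which are mostly particles and pronouns
counts = Counter(t for t in tokens if len(t) > 1)

# Keep the 50 most frequent words as a {word: count} dictionary
keywords = dict(counts.most_common(50))
print(keywords)
```

A dictionary of counts built this way has the same word-to-number shape as the TextRank dictionary, so it can be handed to generate_from_frequencies in the same way.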

Then we can generate the word cloud with the wordcloud library.

First, find a picture to use as the shape of the word cloud:

from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
from wordcloud import WordCloud, ImageColorGenerator

# Load the shape image as an array to use as the mask
image = Image.open('./tim.jpg')
graph = np.array(image)
# A font with Chinese glyphs is needed to render the words
wc = WordCloud(font_path='./fonts/simhei.ttf', background_color='White', max_words=50, mask=graph)
wc.generate_from_frequencies(keywords)
# Recolor the words using the colors of the shape image
image_color = ImageColorGenerator(graph)
plt.imshow(wc.recolor(color_func=image_color))
plt.axis("off")
plt.show()
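
For reference, WordCloud treats pure white (255) pixels in the mask as masked out and draws words only on the remaining area, so the shape picture needs a white background. If no suitable picture is at hand, a mask can also be built directly with numpy. The sketch below is only an illustration and not part of the original code; circle_mask and its parameters are hypothetical names.

```python
import numpy as np


def circle_mask(size=400, margin=20):
    """Build a square uint8 mask: 0 inside a centered circle, 255 (white) outside."""
    y, x = np.ogrid[:size, :size]
    center = (size - 1) / 2
    radius = size / 2 - margin
    # True for every pixel that lies outside the circle
    outside = (x - center) ** 2 + (y - center) ** 2 > radius ** 2
    return (255 * outside).astype(np.uint8)


graph = circle_mask()
print(graph.shape, graph.dtype)
```

An array like this could be passed as mask=graph in place of the one loaded from tim.jpg. Since it carries no colors to borrow, the recolor step can be skipped in that case.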

Save the generated image:

wc.to_file('dream.png')


Full code:

# encoding=gbk
import jieba.analyse
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
from wordcloud import WordCloud, ImageColorGenerator

# Read the whole lyrics file into one string
with open('./励志歌曲歌词.txt', 'r') as f:
    lyric = f.read()

# Extract the top 50 keywords together with their TextRank weights
result = jieba.analyse.textrank(lyric, topK=50, withWeight=True)
keywords = dict()
for i in result:
    keywords[i[0]] = i[1]
print(keywords)

# Load the shape image as an array to use as the mask
image = Image.open('./tim.jpg')
graph = np.array(image)
wc = WordCloud(font_path='./fonts/simhei.ttf', background_color='White', max_words=50, mask=graph)
wc.generate_from_frequencies(keywords)
# Recolor the words using the colors of the shape image
image_color = ImageColorGenerator(graph)
plt.imshow(wc.recolor(color_func=image_color))
plt.axis("off")
plt.show()
# Save the generated image
wc.to_file('dream.png')

source:php.cn