Natural Language Processing Examples in Python: Sentiment Analysis
As artificial intelligence advances, natural language processing (NLP) is attracting growing attention across many fields. Sentiment analysis is one of its most important applications: it identifies users' emotional attitudes toward products, services, or events, which helps companies understand consumer needs and shape marketing strategies. This article walks through an example of sentiment analysis in Python.
To perform sentiment analysis in Python, this example uses two third-party libraries: the Natural Language Toolkit (NLTK) and TwitterAPI. Install both with pip:
    pip install nltk
    pip install TwitterAPI
Before performing sentiment analysis, the text needs to be preprocessed: convert it to lowercase and remove noise such as punctuation, digits, and stop words. The preprocessing code is as follows:
    import re

    import nltk
    from nltk.corpus import stopwords

    nltk.download('stopwords')  # one-time download of the stop-word lists

    def clean_text(text):
        text = text.lower()                    # convert to lowercase
        text = re.sub(r'[^\w\s]', '', text)    # remove punctuation
        text = re.sub(r'\d+', '', text)        # remove digits
        stop_words = set(stopwords.words('english'))
        words = text.split()
        words = [w for w in words if w not in stop_words]  # remove stop words
        text = ' '.join(words)
        return text
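To see what each preprocessing step does, the following minimal sketch applies the same steps to a sample sentence using only the standard library. The function name `clean_text_demo` and the tiny `DEMO_STOP_WORDS` set are illustrative assumptions that stand in for NLTK's much larger English stop-word list.

```python
import re

# Hypothetical, deliberately tiny stop-word set used only for this demo;
# the article's clean_text uses NLTK's full English list instead.
DEMO_STOP_WORDS = {"the", "a", "an", "was", "were", "is", "and", "of"}


def clean_text_demo(text):
    """Lowercase, strip punctuation and digits, then drop stop words."""
    text = text.lower()
    text = re.sub(r"[^\w\s]", "", text)  # remove punctuation
    text = re.sub(r"\d+", "", text)      # remove digits
    words = [w for w in text.split() if w not in DEMO_STOP_WORDS]
    return " ".join(words)


sample = "The 2 sequels were GREAT, and the ending was a surprise!"
print(clean_text_demo(sample))  # prints: sequels great ending surprise
```

Because punctuation is stripped before stop words are filtered, a contraction such as "don't" becomes "dont" and no longer matches a stop-word entry; whether that matters depends on the stop-word list in use.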
Next, build the sentiment analysis model. Sentiment classification is treated here as a supervised learning task, so it requires labeled training data. This example uses the movie review corpus that ships with NLTK, which contains 2,000 reviews: 1,000 labeled positive and 1,000 labeled negative.
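    import random

    import nltk
    from nltk.corpus import movie_reviews

    nltk.download('movie_reviews')  # one-time download of the corpus

    # Build (word_list, label) pairs, where label is 'pos' or 'neg'
    documents = [(list(movie_reviews.words(fileid)), category)
                 for category in movie_reviews.categories()
                 for fileid in movie_reviews.fileids(category)]
    random.shuffle(documents)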
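Each entry in `documents` is a `(word_list, label)` pair. The sketch below mimics that shape with a few hand-written toy reviews (an assumption, so the example runs without downloading the corpus). It shuffles them with a fixed seed for reproducibility, then counts the labels to check class balance. The seed value and the toy data are illustrative only.

```python
import random
from collections import Counter

# Toy stand-in for the (word_list, label) pairs built from movie_reviews.
toy_documents = [
    (["a", "moving", "and", "brilliant", "film"], "pos"),
    (["dull", "plot", "and", "weak", "acting"], "neg"),
    (["great", "cast", "great", "script"], "pos"),
    (["boring", "from", "start", "to", "finish"], "neg"),
]

# A dedicated Random instance with a fixed seed makes the shuffle repeatable
# without touching the global random state.
rng = random.Random(42)
shuffled = list(toy_documents)
rng.shuffle(shuffled)

# Count how many documents carry each label.
label_counts = Counter(label for _, label in shuffled)
print(label_counts["pos"], label_counts["neg"])  # prints: 2 2
```

A balanced label count matters here because a heavily skewed training set would bias the classifier's prior toward the majority class.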
After obtaining the training data, you can build a Naive Bayes classifier with NLTK's NaiveBayesClassifier. The code is as follows:
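    # Use the 2,000 most frequent words in the corpus as features
    all_words = nltk.FreqDist(w.lower() for w in movie_reviews.words())
    word_features = [w for w, _ in all_words.most_common(2000)]

    def document_features(document):
        document_words = set(document)  # set membership tests are fast
        features = {}
        for word in word_features:
            features['contains({})'.format(word)] = (word in document_words)
        return features

    featuresets = [(document_features(d), c) for (d, c) in documents]
    # Hold out the first 200 reviews for testing; train on the rest
    train_set, test_set = featuresets[200:], featuresets[:200]
    classifier = nltk.NaiveBayesClassifier.train(train_set)
    print(nltk.classify.accuracy(classifier, test_set))  # accuracy on held-out reviews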
This classifier uses the Naive Bayes algorithm, which assumes that features are independent given the label and predicts the label with the highest resulting probability. The features here are boolean "contains(word)" entries: for each of the 2,000 feature words, the feature records whether the document contains that word.
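To make the idea concrete, here is a minimal from-scratch sketch of Naive Bayes over boolean "contains" features, using add-one (Laplace) smoothing. It is not NLTK's implementation, which relies on its own probability estimators. The function names, the toy vocabulary and training data, and the smoothing choice are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def contains_features(words, vocabulary):
    """Build boolean 'contains(word)' features for a list of words."""
    word_set = set(words)
    return {"contains({})".format(w): (w in word_set) for w in vocabulary}


def train_naive_bayes(labeled_featuresets):
    """Count labels and (label, feature, value) occurrences."""
    label_counts = defaultdict(int)
    feature_counts = defaultdict(int)  # key: (label, feature name, value)
    feature_names = set()
    for features, label in labeled_featuresets:
        label_counts[label] += 1
        for name, value in features.items():
            feature_counts[(label, name, value)] += 1
            feature_names.add(name)
    return label_counts, feature_counts, feature_names


def classify(model, features):
    """Return the label with the highest log posterior."""
    label_counts, feature_counts, feature_names = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        for name in feature_names:
            value = features.get(name, False)
            seen = feature_counts[(label, name, value)]
            # Add-one smoothing over the two possible values (True/False).
            score += math.log((seen + 1) / (count + 2))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy vocabulary and training data (illustrative assumptions).
vocabulary = ["great", "boring", "film"]
train = [
    (contains_features(["great", "film"], vocabulary), "pos"),
    (contains_features(["great", "cast"], vocabulary), "pos"),
    (contains_features(["boring", "film"], vocabulary), "neg"),
    (contains_features(["boring", "plot"], vocabulary), "neg"),
]
model = train_naive_bayes(train)
print(classify(model, contains_features(["a", "great", "story"], vocabulary)))  # prints: pos
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, which becomes relevant with thousands of features.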
Once the model is trained, you can use it to perform sentiment analysis. In this example, TwitterAPI retrieves tweets from Twitter, and the classifier then labels each tweet.
    from TwitterAPI import TwitterAPI

    # Replace the placeholders with your own credentials
    consumer_key = 'your consumer key'
    consumer_secret = 'your consumer secret'
    access_token_key = 'your access token key'
    access_token_secret = 'your access token secret'

    api = TwitterAPI(consumer_key, consumer_secret,
                     access_token_key, access_token_secret)

    def analyze_tweet(tweet):
        tweet_text = tweet['text']
        tweet_clean = clean_text(tweet_text)
        tweet_features = document_features(tweet_clean.split())
        sentiment = classifier.classify(tweet_features)
        return sentiment

    keywords = 'Trump'
    # Request up to 10 English-language tweets matching the keyword
    for tweet in api.request('search/tweets', {'q': keywords, 'lang': 'en', 'count': 10}):
        sentiment = analyze_tweet(tweet)
        print(tweet['text'])
        print(sentiment)
        print(' ')
This snippet uses TwitterAPI to fetch up to 10 recent English-language tweets matching the keyword "Trump". It then classifies each tweet and prints the tweet text followed by its predicted sentiment ('pos' or 'neg').
In addition to Twitter, this model can also be used to perform sentiment analysis on other text data.
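Applying the model to other text follows the same pattern: classify each string and tally the results. The sketch below shows that loop with a hypothetical `summarize_sentiment` helper. The `keyword_classifier` passed to it is a trivial stand-in (an assumption, so the example runs on its own) and is not the trained model.

```python
from collections import Counter


def summarize_sentiment(texts, classify_text):
    """Classify each text and return a Counter of predicted labels."""
    return Counter(classify_text(text) for text in texts)


def keyword_classifier(text):
    """Trivial stand-in for the trained model, used only for this demo."""
    return "neg" if "bad" in text.lower() else "pos"


reviews = [
    "Fast delivery and friendly support.",
    "Bad packaging, the box arrived crushed.",
    "Works exactly as described.",
]
summary = summarize_sentiment(reviews, keyword_classifier)
print(summary["pos"], summary["neg"])  # prints: 2 1
```

To use the trained classifier instead, you could pass a function such as `lambda t: classifier.classify(document_features(clean_text(t).split()))` in place of `keyword_classifier`.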
Conclusion
This article walked through an example of sentiment analysis in Python. The example trains a Naive Bayes classifier on labeled movie reviews and uses it to determine the emotional tendency of text. Sentiment analysis is widely applicable in areas such as marketing and social media monitoring.