Counting Word Frequency and Sorting by Frequency
When working with large text datasets, you often need to know how often each word occurs. These counts feed many natural language processing (NLP) tasks. In Python, the Counter class makes this straightforward.
Implementing the Design
Your design outlines the following steps:
Using Counter in Python
Python's collections module provides a specialized class called Counter, which is designed for counting and aggregating elements in iterables. Counter allows us to perform steps 3-6 in a single line of code. Here's how you can implement your design using Counter:
<code class="python">from collections import Counter

# Create a Counter from the list of words
counts = Counter(original_list)

# Sort the keys (unique words) based on their frequencies
sorted_words = sorted(counts.keys(), key=lambda x: counts[x], reverse=True)</code>
This code generates a sorted list of unique words, where the word with the highest frequency appears first.
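Counter also has a built-in most_common() method that returns the (word, count) pairs already ordered from highest to lowest count. The following is a minimal sketch of that alternative, not a replacement for the approach above; the sample list and variable names are made up for illustration.

```python
from collections import Counter

# Hypothetical sample input; any list of word strings works.
words = ['red', 'blue', 'red', 'green', 'blue', 'red']

counts = Counter(words)

# most_common() returns (word, count) pairs, highest count first.
ranked_pairs = counts.most_common()

# Keep only the words to get the same kind of list as the sorted() call.
ranked_words = [word for word, _ in ranked_pairs]
print(ranked_words)  # ['red', 'blue', 'green']

# most_common(n) limits the result to the n most frequent entries.
top_two = counts.most_common(2)
print(top_two)  # [('red', 3), ('blue', 2)]
```

This form is convenient when the counts are needed alongside the words, or when only the top few entries matter.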
Example
<code class="python">from collections import Counter

list1 = ['the', 'car', 'apple', 'banana', 'car', 'apple']
counts = Counter(list1)
print(counts)
# Counter({'car': 2, 'apple': 2, 'the': 1, 'banana': 1})

# Highest counts first; ties keep first-seen order (Python 3.7+)
sorted_words = sorted(counts.keys(), key=lambda x: counts[x], reverse=True)
print(sorted_words)
# ['car', 'apple', 'the', 'banana']</code>
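Words with the same count can come out in an order that depends on where they first appear in the input. If a reproducible order is preferred, one option is to sort on a tuple key so that ties are broken alphabetically. This is a sketch of that idea under the assumption that alphabetical order is an acceptable tie-breaker; the variable names are illustrative.

```python
from collections import Counter

words = ['the', 'car', 'apple', 'banana', 'car', 'apple']
counts = Counter(words)

# Negating the count sorts the most frequent words first;
# the word itself is the second key, so ties sort alphabetically.
sorted_words = sorted(counts, key=lambda w: (-counts[w], w))
print(sorted_words)  # ['apple', 'car', 'banana', 'the']
```

Negating the count avoids reverse=True, which would also reverse the alphabetical tie-break.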