How to Count Word Frequency and Sort by Frequency in Python?

Barbara Streisand
Release: 2024-10-21 21:39:03

Counting Word Frequency and Sorting by Frequency

When working with text data, you often need to know how many times each word occurs, for example as input to natural language processing (NLP) tasks. In Python, the Counter class from the standard library's collections module handles the counting in a single call.

Implementing the Design

The design in question outlines the following steps:

  1. Create an empty list to store unique words (newlst).
  2. Create an empty list to store corresponding word frequencies (frequency).
  3. Iterate through the original list of words.
  4. For each word, check if it's already in newlst.
  5. If the word is not in newlst, add it and set the frequency to 1.
  6. If the word is already in newlst, increment its frequency.
  7. Sort newlst based on the frequency list.
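
The steps above can be sketched directly with two parallel lists. This is only an illustration of the manual design: newlst and frequency are the names from the steps, while the function name count_and_sort and the sample input are assumptions made for the example.

```python
def count_and_sort(words):
    """Count words with two parallel lists, then sort by frequency."""
    newlst = []      # steps 1: unique words, in first-seen order
    frequency = []   # step 2: frequency[i] is the count for newlst[i]

    # Steps 3-6: walk the input and update the two lists in lockstep
    for word in words:
        if word in newlst:
            frequency[newlst.index(word)] += 1
        else:
            newlst.append(word)
            frequency.append(1)

    # Step 7: pair each word with its count and sort by count, highest first
    pairs = sorted(zip(newlst, frequency), key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in pairs]


result = count_and_sort(['the', 'car', 'apple', 'banana', 'car', 'apple'])
print(result)  # ['car', 'apple', 'the', 'banana']
```

Both the membership test and newlst.index scan the list, so this approach slows down as the number of unique words grows. A dictionary-based counter avoids that repeated scanning.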

Using Counter in Python

Python's collections module provides a dict subclass called Counter, which is designed for counting hashable elements in an iterable. Counter replaces steps 1-6 with a single line of code, leaving only the sort in step 7. Here's how you can implement the design using Counter:

<code class="python">from collections import Counter

# Create a Counter from the list of words
counts = Counter(original_list)

# Sort the keys (unique words) based on their frequencies
sorted_words = sorted(counts.keys(), key=lambda x: counts[x], reverse=True)</code>

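This code generates a sorted list of unique words, where the word with the highest frequency appears first. Because sorted is stable and Counter preserves insertion order (Python 3.7+), words with equal counts stay in the order in which they were first seen.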
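
As an alternative to sorting the keys by hand, Counter also offers a most_common method that returns (word, count) pairs ordered from most to least frequent. The following is a minimal sketch; the variable names words, ranked, and top_two are chosen for illustration only.

```python
from collections import Counter

words = ['the', 'car', 'apple', 'banana', 'car', 'apple']
counts = Counter(words)

# All (word, count) pairs, highest count first
ranked = counts.most_common()
print(ranked)  # [('car', 2), ('apple', 2), ('the', 1), ('banana', 1)]

# Only the two most frequent words
top_two = counts.most_common(2)
print(top_two)  # [('car', 2), ('apple', 2)]

# Drop the counts to get just the words in frequency order
sorted_words = [word for word, _ in ranked]
print(sorted_words)  # ['car', 'apple', 'the', 'banana']
```

This variant is convenient when the counts are needed alongside the words, or when only the top few entries matter.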

Example

<code class="python">from collections import Counter

list1 = ['the', 'car', 'apple', 'banana', 'car', 'apple']
counts = Counter(list1)
# Output shown for Python 3.7+, where ties keep first-seen order
print(counts)  # Counter({'car': 2, 'apple': 2, 'the': 1, 'banana': 1})
sorted_words = sorted(counts.keys(), key=lambda x: counts[x], reverse=True)
print(sorted_words)  # ['car', 'apple', 'the', 'banana']</code>
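
If words with the same count should appear alphabetically rather than in first-seen order, one option is to sort on a composite key. This is a sketch of that idea, not part of the original design; the negated count puts higher frequencies first, and the word itself breaks ties.

```python
from collections import Counter

list1 = ['the', 'car', 'apple', 'banana', 'car', 'apple']
counts = Counter(list1)

# Sort by count descending (via the negated count), then by word ascending
sorted_words = sorted(counts, key=lambda word: (-counts[word], word))
print(sorted_words)  # ['apple', 'car', 'banana', 'the']
```

Using a composite key makes the result independent of the order in which the words appear in the input.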
