How to Find the Most Repeated Word in a Text File using Python?

In this article, we will show you how to find the most repeated word in a given text file using Python. We'll use the Counter class from the collections module to efficiently count word frequencies.

Assume we have a text file named sample.txt containing the following text:

Good Morning TutorialsPoint
This is TutorialsPoint sample File
Consisting of Specific
source codes in Python,Seaborn,Scala
Summary and Explanation
Welcome TutorialsPoint
Learn with a joy

Using Counter from Collections Module

The Counter class is a dictionary subclass that counts hashable objects, which makes it well suited to counting word frequencies:

from collections import Counter

# Create sample text content (simulating file reading)
text_content = """Good Morning TutorialsPoint
This is TutorialsPoint sample File
Consisting of Specific
source codes in Python,Seaborn,Scala
Summary and Explanation
Welcome TutorialsPoint
Learn with a joy"""

# Split text into words and store in a list
words = []
for line in text_content.split('\n'):
    # Split each line into words
    line_words = line.split()
    words.extend(line_words)

# Count frequency of each word
word_frequency = Counter(words)

# Find the word with maximum frequency
most_repeated_word = word_frequency.most_common(1)[0]

print(f"Word frequencies: {dict(word_frequency)}")
print(f"'{most_repeated_word[0]}' is the most repeated word with {most_repeated_word[1]} occurrences")

Output:

Word frequencies: {'Good': 1, 'Morning': 1, 'TutorialsPoint': 3, 'This': 1, 'is': 1, 'sample': 1, 'File': 1, 'Consisting': 1, 'of': 1, 'Specific': 1, 'source': 1, 'codes': 1, 'in': 1, 'Python,Seaborn,Scala': 1, 'Summary': 1, 'and': 1, 'Explanation': 1, 'Welcome': 1, 'Learn': 1, 'with': 1, 'a': 1, 'joy': 1}
'TutorialsPoint' is the most repeated word with 3 occurrences

Note that split() separates on whitespace only, so 'Python,Seaborn,Scala' is counted as a single word.
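
Keep in mind that most_common(1) returns a single entry even when several words share the highest count; among equal counts, the word encountered first is returned. The following is a minimal sketch, assuming you want every word tied for first place. The helper name all_most_repeated is hypothetical and not part of the collections module.

```python
from collections import Counter

def all_most_repeated(words):
    """Return (list of words tied for the highest count, that count)."""
    counts = Counter(words)
    if not counts:
        return [], 0
    top = max(counts.values())
    # Counter keeps insertion order, so ties come back in first-seen order
    return [word for word, count in counts.items() if count == top], top

print(all_most_repeated("a b a b c".split()))  # (['a', 'b'], 2)
```

Returning a list keeps the result unambiguous when the text has no single winner.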

Reading from an Actual File

The following function reads a text file line by line and returns the most repeated word together with its count:

from collections import Counter

def find_most_repeated_word(filename):
    words = []
    
    # Open and read the file
    try:
        with open(filename, 'r', encoding='utf-8') as file:
            for line in file:
                # Split each line on whitespace and add the words to the list
                line_words = line.split()
                words.extend(line_words)
        
        # Count word frequencies
        word_frequency = Counter(words)
        
        # Find most common word
        if word_frequency:
            most_repeated = word_frequency.most_common(1)[0]
            return most_repeated[0], most_repeated[1]
        else:
            return None, 0
            
    except FileNotFoundError:
        print(f"File '{filename}' not found")
        return None, 0

# Usage
filename = "sample.txt"
word, frequency = find_most_repeated_word(filename)

if word:
    print(f"'{word}' is the most repeated word with {frequency} occurrences")
else:
    print("No words found in the file")

Case-Insensitive Word Counting

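To treat words that differ only in case as the same word, convert the text to lowercase before counting. A regular expression that matches runs of letters also strips punctuation: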

from collections import Counter
import re

text_content = """Good Morning TutorialsPoint
This is TUTORIALSPOINT sample File
tutorialspoint is great
Welcome TutorialsPoint
Learn with a joy"""

# Extract words and convert to lowercase, remove punctuation
words = []
for line in text_content.split('\n'):
    # Use regex to find words (letters only)
    line_words = re.findall(r'\b[a-zA-Z]+\b', line.lower())
    words.extend(line_words)

# Count word frequencies
word_frequency = Counter(words)

# Find most repeated word
most_repeated = word_frequency.most_common(1)[0]

print(f"All words (lowercase): {words}")
print(f"'{most_repeated[0]}' appears {most_repeated[1]} times")

Output:

All words (lowercase): ['good', 'morning', 'tutorialspoint', 'this', 'is', 'tutorialspoint', 'sample', 'file', 'tutorialspoint', 'is', 'great', 'welcome', 'tutorialspoint', 'learn', 'with', 'a', 'joy']
'tutorialspoint' appears 4 times
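
The same approach extends to the top few words, because most_common() accepts the number of entries to return. The short sketch below uses a made-up sentence (an assumption for illustration only) to show the two most frequent words after lowercasing and punctuation removal.

```python
from collections import Counter
import re

# Hypothetical sample text with mixed case and punctuation
text = "TutorialsPoint is great. Is it? TUTORIALSPOINT is!"

# Lowercase first, then keep only runs of letters
words = re.findall(r"[a-z]+", text.lower())

top_two = Counter(words).most_common(2)
print(top_two)  # [('is', 3), ('tutorialspoint', 2)]
```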

Comparison of Methods

| Method | Advantages | Use Case |
|---|---|---|
| Manual counting with loop | Full control over logic | Custom counting requirements |
| Counter.most_common() | Built-in, efficient, simple | Standard word frequency analysis |
| Case-insensitive with regex | Handles punctuation and case | Real-world text processing |
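
The manual approach listed in the first row can be sketched with a plain dictionary and two loops. This is one possible implementation, offered as an illustration; the function name most_repeated_manual is hypothetical.

```python
def most_repeated_manual(words):
    """Find the most frequent word without using collections.Counter."""
    counts = {}
    for word in words:
        # get() supplies 0 the first time a word is seen
        counts[word] = counts.get(word, 0) + 1

    best_word, best_count = None, 0
    for word, count in counts.items():
        # Strict comparison keeps the first-seen word when counts tie
        if count > best_count:
            best_word, best_count = word, count
    return best_word, best_count

sample = "Good Morning TutorialsPoint Welcome TutorialsPoint".split()
print(most_repeated_manual(sample))  # ('TutorialsPoint', 2)
```

Writing the loop by hand makes it easy to add custom rules, such as skipping short words, at the cost of more code than Counter needs.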

Conclusion

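Use the Counter class from the collections module for efficient word counting. Its most_common() method returns the most frequent words together with their counts, ordered from most to least frequent. For real-world text, lowercase the input and remove punctuation with a regular expression before counting, so that variants of the same word are counted together.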

Updated on: 2026-03-26T21:27:05+05:30
