Program to find the largest grouping of anagrams from a word list in Python
Suppose we have a list of strings called words. We have to group all anagrams together and return the size of the largest group. Two words are anagrams when one can be formed by rearranging the letters of the other, like "xyz" and "zyx".
For example, if the input is words = ["xy", "yx", "xyz", "zyx", "yzx", "wwwww"], then the output will be 3, because ["xyz", "zyx", "yzx"] is the largest grouping.
Algorithm
To solve this, we will follow these steps:
- Create a dictionary to store anagram groups
- Initialize result as 0
- For each word in the list:
  - Sort the characters to create a canonical form
  - Use sorted form as dictionary key and count occurrences
  - Update maximum group size
- Return the maximum group size
Implementation
Let us see the following implementation to get a better understanding:
class Solution:
    def solve(self, words):
        lookup = {}
        res = 0
        for word in words:
            # Sort characters to get canonical form
            sorted_word = "".join(sorted(word))
            lookup[sorted_word] = lookup.get(sorted_word, 0) + 1
            res = max(res, lookup[sorted_word])
        return res

# Test the solution
ob = Solution()
words = ["xy", "yx", "xyz", "zyx", "yzx", "wwwww"]
print(ob.solve(words))
Output:
3
How It Works
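The algorithm works by using the sorted characters of each word as a key, so that all anagrams share the same key:
- "xy" and "yx" both become "xy" when sorted
- "xyz", "zyx", and "yzx" all become "xyz" when sorted
- "wwwww" remains "wwwww" when sorted
The three keys therefore have counts of 2, 3, and 1, and the largest count, 3, is returned.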
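To make the grouping visible, the following minimal sketch stores the words themselves under each sorted key instead of only counting them. The helper name group_anagrams is hypothetical and is not part of the solution above; it is shown only for illustration.

```python
from collections import defaultdict

def group_anagrams(words):
    # Hypothetical helper: maps each sorted-letter key to the words that share it
    groups = defaultdict(list)
    for word in words:
        key = "".join(sorted(word))
        groups[key].append(word)
    return dict(groups)

words = ["xy", "yx", "xyz", "zyx", "yzx", "wwwww"]
groups = group_anagrams(words)

# Show every key with its members
for key, members in groups.items():
    print(key, members)

# The answer to the original problem is the length of the longest member list
largest = max(len(members) for members in groups.values())
print(largest)
```

Keeping the full lists costs more memory than keeping counts, so the counting version is preferable when only the size is needed.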
Alternative Approach Using Collections
We can simplify the code using Python's defaultdict:
from collections import defaultdict

def largest_anagram_group(words):
    anagram_groups = defaultdict(int)
    max_size = 0
    for word in words:
        sorted_word = "".join(sorted(word))
        anagram_groups[sorted_word] += 1
        max_size = max(max_size, anagram_groups[sorted_word])
    return max_size

# Test the solution
words = ["xy", "yx", "xyz", "zyx", "yzx", "wwwww"]
print(largest_anagram_group(words))
Output:
3
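The collections module also provides Counter, which can shorten the counting step further. The sketch below is one possible variant, not the article's own method; the function name largest_anagram_group_counter is hypothetical, and returning 0 for an empty list is an assumed convention that matches what the loop-based versions return.

```python
from collections import Counter

def largest_anagram_group_counter(words):
    # Count how many words share each sorted-letter key
    counts = Counter("".join(sorted(word)) for word in words)
    # default=0 covers an empty word list (assumed convention)
    return max(counts.values(), default=0)

words = ["xy", "yx", "xyz", "zyx", "yzx", "wwwww"]
print(largest_anagram_group_counter(words))
print(largest_anagram_group_counter([]))
```

This version builds all counts first and takes the maximum once at the end, rather than updating a running maximum inside the loop.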
Conclusion
The key insight is that sorting a word's characters yields a canonical representation shared by all of its anagrams. This approach has O(n * m log m) time complexity, where n is the number of words and m is the maximum word length, and it uses O(n * m) extra space for the dictionary keys.
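The log m factor comes from sorting each word. If the words are known to contain only lowercase letters 'a' to 'z' (an assumption not stated in the problem), a tuple of letter counts can serve as the key instead, which avoids the sort. The sketch below illustrates that different technique; the function name largest_anagram_group_by_counts is hypothetical.

```python
from collections import Counter

def largest_anagram_group_by_counts(words):
    # Assumption: every word contains only lowercase letters 'a' to 'z'
    groups = Counter()
    for word in words:
        letter_counts = [0] * 26
        for ch in word:
            letter_counts[ord(ch) - ord("a")] += 1
        # A tuple is hashable, so it can be used as a dictionary key
        groups[tuple(letter_counts)] += 1
    return max(groups.values(), default=0)

words = ["xy", "yx", "xyz", "zyx", "yzx", "wwwww"]
print(largest_anagram_group_by_counts(words))
```

Under that assumption each word is processed in time proportional to its length plus the alphabet size, so the total is O(n * (m + 26)). For short words the sorted-key version is usually just as fast in practice and works for any characters.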
