Program to find length of longest substring with character count of at least k in Python
Finding the longest substring where each character appears at least k times is a classic divide and conquer problem. We need to recursively split the string at characters that don't meet the frequency requirement.
For example, if the input is s = "aabccddeeffghij" and k = 2, the output is 8, because the longest valid substring is "ccddeeff", in which every character occurs at least 2 times.
Algorithm
To solve this, we will follow these steps:
- Count frequency of all characters in the string
- If all characters occur at least k times, return the length of the string
- Otherwise, split the string at characters that occur less than k times
- Recursively find the longest valid substring from each split
- Return the maximum length found
Implementation
from collections import Counter

class Solution:
    def solve(self, s, k):
        def find_longest(chars):
            # Count frequency of each character
            char_count = Counter(chars)
            acc = []
            ans = 0
            valid = True
            for char in chars:
                if char_count[char] < k:
                    # Character doesn't meet frequency requirement
                    valid = False
                    # Recursively check the accumulated substring
                    ans = max(ans, find_longest(acc))
                    acc = []
                else:
                    # Add valid character to current substring
                    acc.append(char)
            if valid:
                # All characters meet the requirement
                return len(acc)
            else:
                # Check the last accumulated substring
                ans = max(ans, find_longest(acc))
                return ans
        return find_longest(list(s))

# Test the solution
ob = Solution()
s = "aabccddeeffghij"
k = 2
print(ob.solve(s, k))
The output of the above code is:
8
How It Works
The algorithm uses a divide and conquer approach:
- First, it counts the frequency of each character using Counter
- It iterates through the string, building valid substrings
- When it encounters a character with frequency less than k, it splits the string
- It recursively processes each valid substring and returns the maximum length
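The same idea can also be written more compactly by working on strings directly. The following is a minimal sketch, not the implementation above: the function name longest_substring is made up for illustration, and it splits on one infrequent character at a time with str.split instead of accumulating a list.

```python
from collections import Counter

def longest_substring(s: str, k: int) -> int:
    """Hypothetical compact variant of the divide and conquer idea."""
    # A string shorter than k cannot contain any character k times.
    if len(s) < k:
        return 0
    counts = Counter(s)
    for ch, n in counts.items():
        if n < k:
            # ch cannot be part of any valid substring, so split on it
            # and solve each remaining piece independently.
            return max(longest_substring(part, k) for part in s.split(ch))
    # Every character occurs at least k times: the whole string is valid.
    return len(s)

print(longest_substring("aabccddeeffghij", 2))  # 8
```

Splitting on a single character per call keeps the code short, at the cost of possibly more recursion levels than splitting on all infrequent characters at once.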
Example Walkthrough
For string "aabccddeeffghij" with k=2:
- Characters 'b', 'g', 'h', 'i', 'j' appear only once (less than k=2)
- Characters 'a', 'c', 'd', 'e', 'f' each appear exactly twice, so they meet the requirement
- Splitting at the infrequent characters leaves the pieces "aa" and "ccddeeff"
- The longest valid substring is "ccddeeff" with length 8
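The walkthrough can be reproduced in a few lines. This sketch only shows the first split for the example input; it uses itertools.groupby, which is a choice made here for brevity and is not part of the implementation above.

```python
from collections import Counter
from itertools import groupby

s = "aabccddeeffghij"
k = 2

counts = Counter(s)
# Characters that occur fewer than k times act as split points.
rare = sorted(c for c, n in counts.items() if n < k)
# Group consecutive characters by whether they meet the requirement,
# and keep only the runs of frequent characters.
pieces = ["".join(g) for ok, g in groupby(s, key=lambda c: counts[c] >= k) if ok]

print(rare)    # ['b', 'g', 'h', 'i', 'j']
print(pieces)  # ['aa', 'ccddeeff']
```

In this example both pieces are already valid after one split, so the answer is the length of the longer piece.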
Time Complexity
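Every level of recursion removes at least one distinct character, so the recursion depth is bounded by the number of distinct characters in the string. Assuming the input consists of lowercase English letters, that bound is 26, and since each level does O(n) work in total, the worst-case time complexity is O(n * 26), where n is the length of the string. The space complexity is O(n) for the accumulated substrings and the recursion stack.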
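To see the depth bound in practice, the recursion can be instrumented to report how deep it goes. This is an illustrative sketch under the same splitting rule; the function name longest_with_depth and its depth parameter are hypothetical and not part of the solution above.

```python
from collections import Counter
from itertools import groupby

def longest_with_depth(s, k, depth=1):
    """Return (best_length, deepest_recursion_level) for s."""
    counts = Counter(s)
    if all(n >= k for n in counts.values()):
        # No infrequent characters: the whole string is valid.
        return len(s), depth
    best, deepest = 0, depth
    # Recurse into each run of characters that meet the requirement.
    for ok, group in groupby(s, key=lambda c: counts[c] >= k):
        if ok:
            length, d = longest_with_depth("".join(group), k, depth + 1)
            best, deepest = max(best, length), max(deepest, d)
    return best, deepest

s = "aabccddeeffghij"
length, depth = longest_with_depth(s, 2)
print(length, depth)  # 8 2
```

For this input the recursion stops at the second level, well within the bound of one level per distinct character plus the initial call.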
Conclusion
This divide and conquer approach efficiently finds the longest substring where each character appears at least k times. The key insight is to split the string at invalid characters and recursively solve smaller subproblems.
