Program to find length of longest substring with character count of at least k in Python

Finding the longest substring where each character appears at least k times is a classic divide and conquer problem. We need to recursively split the string at characters that don't meet the frequency requirement.

So, if the input is s = "aabccddeeffghij" and k = 2, then the output will be 8, because the longest such substring is "ccddeeff", in which every character occurs at least 2 times.
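To make the problem statement concrete, here is a minimal brute-force sketch that checks every substring directly. The function name longest_valid_brute_force is a hypothetical helper, not part of the solution in this article, and it is only practical for short inputs.

```python
from collections import Counter

def longest_valid_brute_force(s, k):
    """Hypothetical reference checker: try every substring of s."""
    best = 0
    n = len(s)
    for start in range(n):
        counts = Counter()
        for end in range(start, n):
            counts[s[end]] += 1
            # s[start:end + 1] is valid if every character in it occurs >= k times
            if all(c >= k for c in counts.values()):
                best = max(best, end - start + 1)
    return best

print(longest_valid_brute_force("aabccddeeffghij", 2))  # 8
```

A checker like this is mainly useful for cross-checking a faster solution on small random strings.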

Algorithm

To solve this, we will follow these steps:

  • Count frequency of all characters in the string
  • If all characters occur at least k times, return the length of the string
  • Otherwise, split the string at characters that occur less than k times
  • Recursively find the longest valid substring from each split
  • Return the maximum length found

Implementation

from collections import Counter

class Solution:
    def solve(self, s, k):
        def find_longest(chars):
            # Count frequency of each character
            char_count = Counter(chars)
            acc = []
            ans = 0
            valid = True
            
            for char in chars:
                if char_count[char] < k:
                    # Character doesn't meet frequency requirement
                    valid = False
                    # Recursively check the accumulated substring
                    ans = max(ans, find_longest(acc))
                    acc = []
                else:
                    # Add valid character to current substring
                    acc.append(char)
            
            if valid:
                # All characters meet the requirement
                return len(acc)
            else:
                # Check the last accumulated substring
                ans = max(ans, find_longest(acc))
                return ans
        
        return find_longest(list(s))

# Test the solution
ob = Solution()
s = "aabccddeeffghij"
k = 2
print(ob.solve(s, k))

The output of the above code is:

8

How It Works

The algorithm uses a divide and conquer approach:

  • First, it counts the frequency of each character using Counter
  • It iterates through the string, building valid substrings
  • When it encounters a character with frequency less than k, it splits the string
  • It recursively processes each valid substring and returns the maximum length
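The same idea can be sketched more compactly with str.split. This is an alternative formulation, not the implementation shown above: it splits on one too-rare character at a time instead of all of them in a single pass, and the name longest_valid_split is hypothetical.

```python
from collections import Counter

def longest_valid_split(s, k):
    """Hypothetical compact variant: split on the first too-rare character."""
    if len(s) < k:
        # A string shorter than k cannot contain any character k times
        return 0
    counts = Counter(s)
    for char, count in counts.items():
        if count < k:
            # No valid substring can contain char, so solve each piece separately
            return max(longest_valid_split(piece, k) for piece in s.split(char))
    # Every character occurs at least k times
    return len(s)

print(longest_valid_split("aabccddeeffghij", 2))  # 8
```

Working on plain strings avoids building intermediate character lists, at the cost of one split per rare character.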

Example Walkthrough

For string "aabccddeeffghij" with k=2:

  • Characters 'b', 'g', 'h', 'i', 'j' appear only once (less than k=2), so the string is split at each of them
  • Characters 'a', 'c', 'd', 'e', 'f' each appear twice, which leaves the candidate substrings "aa" and "ccddeeff"
  • Both candidates are valid, and the longest is "ccddeeff" with length 8
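The first level of this walkthrough can be reproduced with a short sketch. It uses itertools.groupby to collect runs of sufficiently frequent characters; the variable names rare and pieces are illustrative only. In general each piece would need to be re-checked recursively, although in this example no further splitting is needed.

```python
from collections import Counter
from itertools import groupby

s = "aabccddeeffghij"
k = 2

counts = Counter(s)

# Characters that occur fewer than k times in the whole string
rare = sorted(ch for ch, n in counts.items() if n < k)
print(rare)  # ['b', 'g', 'h', 'i', 'j']

# Consecutive runs of characters that occur at least k times
pieces = ["".join(run) for ok, run in groupby(s, key=lambda ch: counts[ch] >= k) if ok]
print(pieces)  # ['aa', 'ccddeeff']

print(max(len(p) for p in pieces))  # 8
```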

Time Complexity

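Each recursive call works on a piece that has at least one fewer distinct character than its parent, so the recursion is at most as deep as the number of distinct characters, and the calls at any one depth together do O(n) work, where n is the length of the string. Assuming the string contains only lowercase English letters (the source of the factor 26), the time complexity is O(n * 26) in the worst case. The space complexity is O(n) for a fixed alphabet, covering the recursion stack and the accumulated character lists.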
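The depth bound can be observed with a small instrumented sketch. It follows the same split-at-all-rare-characters idea but is not the implementation above: longest_with_depth is a hypothetical helper that also returns the deepest call level reached.

```python
from collections import Counter
from itertools import groupby

def longest_with_depth(s, k, depth=1):
    """Hypothetical instrumented variant: return (best_length, deepest_call_level)."""
    counts = Counter(s)
    if all(n >= k for n in counts.values()):
        return len(s), depth
    best, deepest = 0, depth
    # Recurse only into runs of characters that are frequent enough in this piece
    for ok, run in groupby(s, key=lambda ch: counts[ch] >= k):
        if ok:
            length, d = longest_with_depth("".join(run), k, depth + 1)
            best = max(best, length)
            deepest = max(deepest, d)
    return best, deepest

s = "aabccddeeffghij"
length, deepest = longest_with_depth(s, 2)
print(length, deepest, len(set(s)))  # 8 2 10
```

Here the recursion reaches only level 2, well below the 10 distinct characters in the string.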

Conclusion

This divide and conquer approach efficiently finds the longest substring where each character appears at least k times. The key insight is to split the string at invalid characters and recursively solve smaller subproblems.

Updated on: 2026-03-25T12:37:26+05:30
