340 Longest Substring with At Most K Distinct Characters

Given a string, find the length of the longest substring T that contains at mostkdistinct characters.

For example, Given s =“eceba”and k = 2,

T is "ece" which its length is 3.

Brute Force

The Idea: Produce all possible substring combinations, and output the largest distinct element.

Complexity: O(n^3) time - O(n^2) for generating all substrings and then an extra O(n) checking for distinctness

vector<string> sub_str_sub(string s) 
{
    vector<string> substrings;
    for (int i = 0; i < s.length(); i++) {
        for (int j = 1; j <= s.length(); j++) {
            substrings.push_back(s.substr(i, j));
        }
        break;
    }

    return substrings;
}

vector<vector<string>> all_substrings(string &s)
{
    vector<vector<string>> all;
    size_t size = s.length();
    for (int i = 0; i < s.length(); i++) {
        all.push_back(sub_str_sub(s.substr(i, size--)));
    }
    return all;
}

int is_k_distinct(string &s, int k)
{
    unordered_set<char> unique_m;

    for (int i = 0; i < s.length(); i++) {
        unique_m.insert(s[i]);
        if (unique_m.size() > k) return false;
    }
    return true;
}

int lengthOfLongestSubstringKDistinct(string s, int k) 
{
    int maxx = 0;
    vector<vector<string>> substrings = all_substrings(s);

    for (int i = 0; i < substrings.size(); i++) {
        for (int j = 0; j < substrings[i].size(); j++) {
            if (is_k_distinct(substrings[i][j], k))
                maxx = max(maxx, (int)substrings[i][j].length());
        }
    }
    return maxx;
}

Better

The Idea: Keep a left and right pointer in the string, and maintain the most recent index of the string into a hash table. Iterate through the string, and once the hash table exceeds the length of k, then we need to do something in order be within the bounds of k distinct elements. At this point any element from the hash table can be removed, but the most optimal one to remove would be the one that maximize the distance between the left and right pointers. So select the minimum index, and set this as the new left, and remove that element from the dictionary.

Complexity: O(n*k) time and O(n) space where n is the length of the string and k is the number of distinct characters allowed. This is because for every element in s, we may have to find the minimum element in the hash table, which will at most have k elements.

class Solution:
    def lengthOfLongestSubstringKDistinct(self, s, k):
        """
        :type s: str
        :type k: int
        :rtype: int
        """
        if not s or k == 0:
            return 0

        d = {}
        left, max_k_dist = 0, 0
        for i, c in enumerate(s):
            d[c] = i
            if len(d) > k:
                min_left = min(d.values())
                left = min_left
                del d[s[left]]
                left += 1  # to ignore the character we removed
            max_k_dist = max(max_k_dist, i - left + 1)

        return max_k_dist

Last updated