[LeetCode] 3. Longest Substring Without Repeating Characters

Table of Content

Problem Statement
Problem Input
Solution

Problem statement

3. Longest Substring Without Repeating Characters

Given a string, find the length of the longest substring without repeating characters.

Example 1

Input: "abcabcbb"
Output: 3 
Explanation: The answer is "abc", with the length of 3.

Example 2

Input: "bbbbb"
Output: 1
Explanation: The answer is "b", with the length of 1.

Example 3

Input: "pwwkew"
Output: 3
Explanation: The answer is "wke", with the length of 3. 
             Note that the answer must be a substring, "pwke" is a subsequence and not a substring.

Input

class Solution:
    def lengthOfLongestSubstring(self, s: str) -> int:

Solution

Source code

Time Complexity: O(N)

Space Complexity: O(N)

There are two keys to this problem.

Have a way to keep track of the duplicated characters.
Have a way to kep track of the max length.

Note that in this problem, we’re looking for substring, not subsequence here, so this is a perfect candidate for the sliding window technique, where we utilize two pointers to track the left and right side of a window. Then we can essentially slide the window by shrinking or extending it based on different conditions.

We’ll use:

A set to help keep track of the duplicated character and a normal variable to keep track of the max length.
A while loop to iterate through the input string and update the set and the max length accordingly.
Two pointers, start, end for our sliding window technique.
A variable, maxLength to track the maximum substring length.

As we iterate through the while loop, we’ll:

Increment start and remove the character at character from the set if the current character is duplicated
Add the current character to the set, increment end, and update maxLength

class Solution:
  def lengthOfLongestSubstring(self, s: str) -> int:
    # Using a set to keep track of duplicated characters.
    duplicate = set()

    length = len(s)
    # Keep two iterators, start and end, and a variable to keep track of our current known longest length.
    maxLength = 0
    start = 0
    end = 0

    # Stop iterating if the end iterator reaches the end of the string, which means that we've exhausted all possibilities.
    while (end < length):
      currentStr = s[end]
      # If end iterator encounters a duplicate, reset the start iterator to the start + 1 iterator. 
      # Reason being that our current substring from start to end contains a inner substring that contains duplicated characters.
      # Because of this condition, we can't find a longer substring without sliding pass the first offending duplicated character.
      # Therefore we increment the start iterator to slide the leftside of our window to the right by one, effectively shrinking our window.
      # Because we're incrementing the start iterator, we also have to remember to remove its old character from the set.
      if (currentStr in duplicate):
        duplicate.remove(s[start])
        start += 1
      # If there's no duplicates encountered so far, add the characters to the set, extend the right side of our window by one, and update our maxLength
      else:
        duplicate.add(s[end])
        end += 1
        maxLength = max(maxLength, end - start)
    return maxLength

Complexity Explanation

Since we’re utilitize a single while loop with only constant operations inside, the time complexity is O(N), where N is the length of the input string.

Even though we removed duplicates from the set. the space complexity, at worst, can be O(N) in the case that input string has no repeating characters, so our set would contain every character.

Optimized solution

Source code

Time Complexity: O(N)

Space Complexity: O(N)

We can slightly optimize the above solution by changing the set into a hash table/dictionary and allow the start pointer to jump more than 1 position at a time.

By changing the set into a dictionary, we can now keep track of the duplicated character’s most recent index, which we will use to determine whether or not the start pointer can skip to. As a trick, we can also save the most recent index as most recent index + 1 to use as the start pointer’s new position.

As a result, we will have the following changes:

if the current string is a duplicated, move start to the duplicated character’s last recorded index + 1 if it is farther right than start’s current position. This will allow us to shrink the window significantly.
Increment end pointer, update maxLength and duplicated hash table.

class Solution2:
  def lengthOfLongestSubstring(self, s: str) -> int:
    # Using a dictionary to keep track of the duplicated character and its most recent index + 1.
    # Because we can ignore everything before the duplicate and move the start iterator to its index + 1.
    duplicate = {}

    length = len(s)
    # Keep two iterators, start and end, and a variable to keep track of our current known longest length.
    maxLength = 0
    start = 0
    end = 0

    while (end < length):
      currentStr = s[end]
      # If the end iterator encounters a duplicate, we can set the left side of our window as the max(duplicate's last recently recorded index, start)
      if currentStr in duplicate:
        start = max(duplicate[currentStr], start) 
      # If no duplicates, extend the right side of our sliding window, update maxLength and the dictionary.
      # If there's a duplicate inside our window, duplicate[currentStr], we can ignore everything before duplicate[currentStr] and set start with the next index.
      # Therefore, we set duplicate[currentStr] as end + 1
      end += 1
      maxLength = max(maxLength, end - start)
      duplicate[currentStr] = end
    return maxLength

Complexity Explanation

Even though, we optimized the previous solution, the big O time and space complexity stays the same. However, now start is able to skip more than 1 index at a time.