Problem

Given a string s, remove duplicate letters so that every letter appears once and only once. You must make sure your result is the smallest in lexicographical order among all possible results.

Examples

Example 1:

Input: s = "bcabc"
Output: "abc"

Example 2:

Input: s = "cbacdcbc"
Output: "acdb"

Solution

Method 1 - Using Stack, Visited Set and Last Index of Char in String

Let’s consider the example cbacdcbc, where the output is acdb. Our goal is to ensure the result is in lexicographical order. When we select a character, we must compare it with the previous character to maintain the correct order. A stack can be used to achieve this.

Algorithm

Here are the data structures used:

  • If the stack is empty, we’ll push the current character onto the stack.
  • We’ll maintain a boolean array to mark whether a character has been seen before. This helps us skip characters that have already been included. The boolean array will have a length of 26 to represent all lowercase alphabets.

Here is the approach:

  1. Track Last Index:
    • Create an array to store the last index of each character in the string.
    • Traverse the string and update the last index accordingly.
  2. Initialize Seen Array and Stack:
    • Create a boolean array to keep track of characters that have already been added to the stack.
    • Initialize an empty stack to manage characters.
  3. Process Each Character:
    • For each character in the string:
      • Skip it if it has already been added (identified using the seen array).
      • While the stack is not empty and the current character is smaller than the character at the top of the stack, and the character at the top of the stack can still appear later (using lastIndex to check):
        • Pop the character at the top of the stack, mark it as unseen.
      • Push the current character onto the stack and mark it as seen.
  4. Build Result String:
    • Create a StringBuilder.
    • Pop characters from the stack and append them to the StringBuilder.
    • Reverse the StringBuilder to get the final result.

Code

Java
public class Solution {
    public String removeDuplicateLetters(String s) {
        int[] lastIndex = new int[26];
        for (int i = 0; i < s.length(); i++) {
            lastIndex[s.charAt(i) - 'a'] = i; // Track the lastIndex of character presence
        }

        boolean[] seen = new boolean[26]; // Keep track of seen characters
        Stack<Character> stack = new Stack<>();

        for (int i = 0; i < s.length(); i++) {
            char curr = s.charAt(i);
            if (seen[curr - 'a']) continue; // Skip if character is already seen
            while (!stack.isEmpty() && stack.peek() > curr && i < lastIndex[stack.peek() - 'a']) {
                seen[stack.pop() - 'a'] = false; // Pop from stack and mark unseen
            }
        stack.push(curr); // Add current character to stack
            seen[curr - 'a'] = true; // Mark current character as seen
        }

        StringBuilder sb = new StringBuilder();
        while (!stack.isEmpty()) {
            sb.append(stack.pop());
        }
        return sb.reverse().toString();
    }
}
Python
class Solution:
    def removeDuplicateLetters(self, s: str) -> str:
        last_index = {c: i for i, c in enumerate(s)}
        seen = [False] * 26
        stack = []

        for i, curr in enumerate(s):
            if seen[ord(curr) - ord('a')]:
                continue
            while stack and stack[-1] > curr and i < last_index[stack[-1]]:
                seen[ord(stack.pop()) - ord('a')] = False
            stack.append(curr)
            seen[ord(curr) - ord('a')] = True

        return ''.join(stack)

Complexity

  • ⏰ Time complexity:  O(n), where n is the length of the input string s. Each character is processed once, and each stack operation (push/pop) is O(1).
  • 🧺 Space complexity: O(1) for the boolean array and last index array which are constant sized. The space for the stack is O(n) in the worst case for storing characters.

Dry Run

Lets take the string cbacdcbc, from example 2.

  • First we see c, and push it onto the stack and mark it as seen.
  • Then we see b and check if b < c to maintain lexicographically order. Since b is smaller than c, we remove c from the stack.
    • But, before removing c, check if there are more instances of c in the string. To do so quickly, we can use an array to keep track of the last index of each character.
    • For c the last index is 7th index.
    • So, we’ll remove c from the stack and mark it as unseen.
    • Now, we push b onto the stack and mark it as seen.
  • Next we see a, which is smaller then b,
    • For b last index is 6
    • So, we remove b from stack and mark it as unseen
    • Now, we push a onto the stack and mark it as seen.
  • Then, we see c, as c > a, we add it to stack and mark it as seen
  • Then, we see d, as d > c, we add it to stack and mark it as seen
  • Then, we encounter c, which is already visited, so we skip
  • Now, we see b. Now, b < d (on stack), but d is no longer present after b in input string (as last index of d is 4, which is less then current index), so we cannot pop d out.So, we add b onto stack and mark it as seen.
  • Finally, we come to c, which is already seen, so we skip.

Now, in stack we “acdb” in stack, so we pop them out and reverse. Here is the GIF: