De Bruijn Sequence

Problem

Given a set of characters C and an integer k, a De Bruijn sequence is a cyclic sequence in which every possible k-length string of characters in C occurs exactly once.

For example, suppose C = {0, 1} and k = 3. Then our sequence should contain the substrings {'000', '001', '010', '011', '100', '101', '110', '111'}, and one possible solution would be 00010111.

Create an algorithm that finds a De Bruijn sequence.

Examples

Example 1

Input: C = ['0', '1'], k = 3
Output: "00010111"
Explanation: This sequence contains all possible 3-length strings: 000, 001, 010, 101, 011, 111, 110, 100 (reading cyclically).

Example 2

Input: C = ['A', 'B'], k = 2
Output: "AABBA"
Explanation: Contains all 2-length strings: AA, AB, BB, BA (reading cyclically).

Example 3

Input: C = ['0', '1', '2'], k = 2
Output: "001122021"
Explanation: Contains all 2-length strings with characters 0, 1, 2.

Solution

Method 1 - Hierholzer's Algorithm (Eulerian Path)

Intuition

A De Bruijn sequence can be found by constructing a directed graph where each node represents a (k-1)-length string, and each edge represents a k-length string. Finding an Eulerian path in this graph gives us the De Bruijn sequence. This is the most elegant and theoretically sound approach.

Approach

Create all possible (k-1)-length strings as nodes
For each k-length string, add a directed edge from its first (k-1) characters to its last (k-1) characters
Find an Eulerian path/cycle in this graph using Hierholzer's algorithm
The sequence of edge labels forms the De Bruijn sequence

Code

C++

class Solution {
public:
    string deBruijnSequence(vector<char>& chars, int k) {
        if (k == 1) {
            string result;
            for (char c : chars) result += c;
            return result;
        }
        
        unordered_set<string> visited;
        string startNode(k - 1, chars[0]);
        string result;
        
        dfs(startNode, chars, k, visited, result);
        return result + startNode;
    }
    
private:
    void dfs(const string& node, vector<char>& chars, int k, 
             unordered_set<string>& visited, string& result) {
        for (char c : chars) {
            string edge = node + c;
            if (visited.find(edge) == visited.end()) {
                visited.insert(edge);
                string nextNode = edge.substr(1);
                dfs(nextNode, chars, k, visited, result);
                result += c;
            }
        }
    }
};

Go

func deBruijnSequence(chars []rune, k int) string {
    if k == 1 {
        result := make([]rune, len(chars))
        copy(result, chars)
        return string(result)
    }
    
    visited := make(map[string]bool)
    startNode := strings.Repeat(string(chars[0]), k-1)
    var result []rune
    
    dfs(startNode, chars, k, visited, &result)
    return string(result) + startNode
}

func dfs(node string, chars []rune, k int, visited map[string]bool, result *[]rune) {
    for _, c := range chars {
        edge := node + string(c)
        if !visited[edge] {
            visited[edge] = true
            nextNode := edge[1:]
            dfs(nextNode, chars, k, visited, result)
            *result = append(*result, c)
        }
    }
}

Java

class Solution {
    public String deBruijnSequence(char[] chars, int k) {
        if (k == 1) {
            return new String(chars);
        }
        
        Set<String> visited = new HashSet<>();
        String startNode = new String(new char[k - 1]).replace('\0', chars[0]);
        StringBuilder result = new StringBuilder();
        
        dfs(startNode, chars, k, visited, result);
        return result.toString() + startNode;
    }
    
    private void dfs(String node, char[] chars, int k, 
                     Set<String> visited, StringBuilder result) {
        for (char c : chars) {
            String edge = node + c;
            if (!visited.contains(edge)) {
                visited.add(edge);
                String nextNode = edge.substring(1);
                dfs(nextNode, chars, k, visited, result);
                result.append(c);
            }
        }
    }
}

Python

class Solution:
    def de_bruijn_sequence(self, chars: List[str], k: int) -> str:
        if k == 1:
            return ''.join(chars)
        
        visited: Set[str] = set()
        start_node = chars[0] * (k - 1)
        result: List[str] = []
        
        self._dfs(start_node, chars, k, visited, result)
        return ''.join(result) + start_node
    
    def _dfs(self, node: str, chars: List[str], k: int, 
             visited: Set[str], result: List[str]) -> None:
        for c in chars:
            edge = node + c
            if edge not in visited:
                visited.add(edge)
                next_node = edge[1:]
                self._dfs(next_node, chars, k, visited, result)
                result.append(c)

Complexity

⏰ Time complexity: O(|C|^k), where |C| is the number of characters and k is the length. We visit each k-length string exactly once.
🧺 Space complexity: O(|C|^k), for storing the visited set and recursion stack.

Method 2 - Backtracking with Greedy Selection

Intuition

We can build the De Bruijn sequence incrementally by maintaining a set of seen k-length substrings and always trying to add the lexicographically largest character that creates a new k-length substring.

Approach

Start with the lexicographically smallest k-length string
At each step, try to append characters in descending order
Check if the new k-length substring has been seen before
If not seen, add it and continue; otherwise try the next character
Stop when we can't add any more unique substrings

Code

C++

class Solution {
public:
    string deBruijnSequence(vector<char>& chars, int k) {
        sort(chars.begin(), chars.end());
        
        string result(k, chars[0]);
        unordered_set<string> seen;
        seen.insert(result);
        
        int totalSubstrings = pow(chars.size(), k);
        
        while (seen.size() < totalSubstrings) {
            string suffix = result.substr(result.length() - k + 1);
            bool found = false;
            
            for (int i = chars.size() - 1; i >= 0; i--) {
                string newSubstring = suffix + chars[i];
                if (seen.find(newSubstring) == seen.end()) {
                    result += chars[i];
                    seen.insert(newSubstring);
                    found = true;
                    break;
                }
            }
            
            if (!found) break;
        }
        
        return result;
    }
};

Java

class Solution {
    public String deBruijnSequence(char[] chars, int k) {
        Arrays.sort(chars);
        
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < k; i++) {
            result.append(chars[0]);
        }
        
        Set<String> seen = new HashSet<>();
        seen.add(result.toString());
        
        int totalSubstrings = (int) Math.pow(chars.length, k);
        
        while (seen.size() < totalSubstrings) {
            String suffix = result.substring(result.length() - k + 1);
            boolean found = false;
            
            for (int i = chars.length - 1; i >= 0; i--) {
                String newSubstring = suffix + chars[i];
                if (!seen.contains(newSubstring)) {
                    result.append(chars[i]);
                    seen.add(newSubstring);
                    found = true;
                    break;
                }
            }
            
            if (!found) break;
        }
        
        return result.toString();
    }
}

Python

class Solution:
    def de_bruijn_sequence(self, chars: List[str], k: int) -> str:
        chars.sort()
        
        result = chars[0] * k
        seen: Set[str] = {result}
        
        total_substrings = len(chars) ** k
        
        while len(seen) < total_substrings:
            suffix = result[-(k-1):]
            found = False
            
            for c in reversed(chars):
                new_substring = suffix + c
                if new_substring not in seen:
                    result += c
                    seen.add(new_substring)
                    found = True
                    break
            
            if not found:
                break
        
        return result

Complexity

⏰ Time complexity: O(|C|^k * |C|), where we generate |C|^k substrings and for each, we try |C| characters.
🧺 Space complexity: O(|C|^k), for storing the seen substrings.

Method 3 - Graph-Based BFS Approach

Intuition

Similar to Method 1 but using BFS to find the Eulerian path. This can be more intuitive for some as it explicitly constructs the graph and then traverses it.

Approach

Build a graph where nodes are (k-1)-length strings
Edges represent k-length strings with their transitions
Use BFS to find an Eulerian path
Reconstruct the De Bruijn sequence from the path

Code

C++

class Solution {
public:
    string deBruijnSequence(vector<char>& chars, int k) {
        if (k == 1) {
            string result;
            for (char c : chars) result += c;
            return result;
        }
        
        // Build all (k-1)-length nodes
        vector<string> nodes;
        generateNodes("", k - 1, chars, nodes);
        
        // Build adjacency list
        unordered_map<string, vector<char>> graph;
        for (const string& node : nodes) {
            for (char c : chars) {
                graph[node].push_back(c);
            }
        }
        
        // Find Eulerian path
        string start = string(k - 1, chars[0]);
        vector<char> path;
        findEulerianPath(start, graph, path);
        
        // Build result
        string result = start;
        for (int i = path.size() - 1; i >= 0; i--) {
            result += path[i];
        }
        
        return result;
    }
    
private:
    void generateNodes(string current, int remaining, 
                       vector<char>& chars, vector<string>& nodes) {
        if (remaining == 0) {
            nodes.push_back(current);
            return;
        }
        
        for (char c : chars) {
            generateNodes(current + c, remaining - 1, chars, nodes);
        }
    }
    
    void findEulerianPath(const string& node, 
                          unordered_map<string, vector<char>>& graph,
                          vector<char>& path) {
        while (!graph[node].empty()) {
            char c = graph[node].back();
            graph[node].pop_back();
            string nextNode = node.substr(1) + c;
            findEulerianPath(nextNode, graph, path);
            path.push_back(c);
        }
    }
};

Java

class Solution {
    public String deBruijnSequence(char[] chars, int k) {
        if (k == 1) {
            return new String(chars);
        }
        
        // Build all (k-1)-length nodes
        List<String> nodes = new ArrayList<>();
        generateNodes("", k - 1, chars, nodes);
        
        // Build adjacency list
        Map<String, Stack<Character>> graph = new HashMap<>();
        for (String node : nodes) {
            graph.put(node, new Stack<>());
            for (int i = chars.length - 1; i >= 0; i--) {
                graph.get(node).push(chars[i]);
            }
        }
        
        // Find Eulerian path
        String start = new String(new char[k - 1]).replace('\0', chars[0]);
        List<Character> path = new ArrayList<>();
        findEulerianPath(start, graph, path);
        
        // Build result
        StringBuilder result = new StringBuilder(start);
        for (int i = path.size() - 1; i >= 0; i--) {
            result.append(path.get(i));
        }
        
        return result.toString();
    }
    
    private void generateNodes(String current, int remaining, 
                               char[] chars, List<String> nodes) {
        if (remaining == 0) {
            nodes.add(current);
            return;
        }
        
        for (char c : chars) {
            generateNodes(current + c, remaining - 1, chars, nodes);
        }
    }
    
    private void findEulerianPath(String node, 
                                  Map<String, Stack<Character>> graph,
                                  List<Character> path) {
        Stack<Character> edges = graph.get(node);
        while (!edges.isEmpty()) {
            char c = edges.pop();
            String nextNode = node.substring(1) + c;
            findEulerianPath(nextNode, graph, path);
            path.add(c);
        }
    }
}

Python

class Solution:
    def de_bruijn_sequence(self, chars: List[str], k: int) -> str:
        if k == 1:
            return ''.join(chars)
        
        # Build all (k-1)-length nodes
        nodes: List[str] = []
        self._generate_nodes("", k - 1, chars, nodes)
        
        # Build adjacency list
        graph: Dict[str, List[str]] = {}
        for node in nodes:
            graph[node] = chars[:]
        
        # Find Eulerian path
        start = chars[0] * (k - 1)
        path: List[str] = []
        self._find_eulerian_path(start, graph, path)
        
        # Build result
        result = start + ''.join(reversed(path))
        return result
    
    def _generate_nodes(self, current: str, remaining: int, 
                        chars: List[str], nodes: List[str]) -> None:
        if remaining == 0:
            nodes.append(current)
            return
        
        for c in chars:
            self._generate_nodes(current + c, remaining - 1, chars, nodes)
    
    def _find_eulerian_path(self, node: str, 
                            graph: Dict[str, List[str]], 
                            path: List[str]) -> None:
        while graph[node]:
            c = graph[node].pop()
            next_node = node[1:] + c
            self._find_eulerian_path(next_node, graph, path)
            path.append(c)

Complexity

⏰ Time complexity: O(|C|^k), for generating all nodes and finding the Eulerian path.
🧺 Space complexity: O(|C|^k), for storing the graph and nodes.

Method 4 - Lyndon Words Generation (Constant Amortized Time)

Intuition

This method generates De Bruijn sequences by creating Lyndon words in Constant Amortized Time (CAT). A Lyndon word is a string that is strictly smaller than any of its nontrivial rotations in lexicographic order. The algorithm is based on the FKM (Fredricksen, Kessler, and Maiorana) algorithm and generates the sequence very efficiently.

Approach

Generate all Lyndon words of length dividing n (where n is the length parameter)
Use a recursive approach with two parameters: t (current position) and p (period)
For each position, try the same character as t-p positions back
Then try all larger characters, each starting a new period
Concatenate all generated Lyndon words to form the De Bruijn sequence

The algorithm achieves constant amortized time by cleverly reusing previous computations and avoiding redundant work.

Code

C++

class Solution {
public:
    string deBruijnSequence(vector<char>& chars, int k) {
        sort(chars.begin(), chars.end());
        
        // Map characters to indices for easier processing
        unordered_map<char, int> charToIndex;
        for (int i = 0; i < chars.size(); i++) {
            charToIndex[chars[i]] = i;
        }
        
        string result;
        vector<int> sequence(k + 1, 0);
        generateLyndonWords(1, 1, chars.size(), k, sequence, result, chars);
        
        return result;
    }
    
private:
    void generateLyndonWords(int t, int p, int alphabetSize, int n, 
                             vector<int>& a, string& sequence, 
                             vector<char>& chars) {
        if (t > n) {
            if (n % p == 0) {
                for (int i = 1; i <= p; i++) {
                    sequence += chars[a[i]];
                }
            }
        } else {
            a[t] = a[t - p];
            generateLyndonWords(t + 1, p, alphabetSize, n, a, sequence, chars);
            
            for (int j = a[t - p] + 1; j < alphabetSize; j++) {
                a[t] = j;
                generateLyndonWords(t + 1, t, alphabetSize, n, a, sequence, chars);
            }
        }
    }
};

Go

func deBruijnSequenceLyndon(chars []rune, k int) string {
    sort.Slice(chars, func(i, j int) bool { return chars[i] < chars[j] })
    
    var result strings.Builder
    sequence := make([]int, k+1)
    
    generateLyndonWords(1, 1, len(chars), k, sequence, &result, chars)
    return result.String()
}

func generateLyndonWords(t, p, alphabetSize, n int, a []int, 
                         sequence *strings.Builder, chars []rune) {
    if t > n {
        if n%p == 0 {
            for i := 1; i <= p; i++ {
                sequence.WriteRune(chars[a[i]])
            }
        }
    } else {
        a[t] = a[t-p]
        generateLyndonWords(t+1, p, alphabetSize, n, a, sequence, chars)
        
        for j := a[t-p] + 1; j < alphabetSize; j++ {
            a[t] = j
            generateLyndonWords(t+1, t, alphabetSize, n, a, sequence, chars)
        }
    }
}

Java

class Solution {
    public String deBruijnSequence(char[] chars, int k) {
        Arrays.sort(chars);
        
        StringBuilder result = new StringBuilder();
        int[] sequence = new int[k + 1];
        generateLyndonWords(1, 1, chars.length, k, sequence, result, chars);
        
        return result.toString();
    }
    
    private void generateLyndonWords(int t, int p, int alphabetSize, int n,
                                     int[] a, StringBuilder sequence, char[] chars) {
        if (t > n) {
            if (n % p == 0) {
                for (int i = 1; i <= p; i++) {
                    sequence.append(chars[a[i]]);
                }
            }
        } else {
            a[t] = a[t - p];
            generateLyndonWords(t + 1, p, alphabetSize, n, a, sequence, chars);
            
            for (int j = a[t - p] + 1; j < alphabetSize; j++) {
                a[t] = j;
                generateLyndonWords(t + 1, t, alphabetSize, n, a, sequence, chars);
            }
        }
    }
}

Python

class Solution:
    def de_bruijn_sequence_lyndon(self, chars: List[str], k: int) -> str:
        chars.sort()
        
        result: List[str] = []
        sequence = [0] * (k + 1)
        
        self._generate_lyndon_words(1, 1, len(chars), k, sequence, result, chars)
        return ''.join(result)
    
    def _generate_lyndon_words(self, t: int, p: int, alphabet_size: int, n: int,
                               a: List[int], sequence: List[str], chars: List[str]) -> None:
        if t > n:
            if n % p == 0:
                for i in range(1, p + 1):
                    sequence.append(chars[a[i]])
        else:
            a[t] = a[t - p]
            self._generate_lyndon_words(t + 1, p, alphabet_size, n, a, sequence, chars)
            
            for j in range(a[t - p] + 1, alphabet_size):
                a[t] = j
                self._generate_lyndon_words(t + 1, t, alphabet_size, n, a, sequence, chars)

Complexity

⏰ Time complexity: O(|C|^k) amortized constant time per character generated. The algorithm is extremely efficient in practice.
🧺 Space complexity: O(k), for the recursion stack and temporary array storage.

Note on CAT (Constant Amortized Time): This algorithm achieves CAT by avoiding redundant computations. While individual operations may not be constant time, the average time per generated character approaches a constant as the sequence length increases. This is similar to dynamic array resizing in data structures like ArrayList in Java.

Real-World Example

For a 4-digit keypad lock (digits 0-9), the traditional brute force would require trying up to 4 × 10^4 = 40,000 combinations. Using a De Bruijn sequence, this reduces to just 10^4 + 3 = 10,003 button presses - a significant optimization for security testing.

Performance Comparison

Method	Time Complexity	Space Complexity	Best Use Case
Hierholzer's Algorithm	O(\|C\|^k)	O(\|C\|^k)	Theoretical understanding
Greedy Backtracking	O(\|C\|^k × \|C\|)	O(\|C\|^k)	Simple implementation
Graph-Based BFS	O(\|C\|^k)	O(\|C\|^k)	Educational purposes
Lyndon Words (CAT)	O(\|C\|^k) CAT	O(k)	Production systems

Applications

De Bruijn sequences have practical applications in:

Cryptography: PIN cracking and lock bypassing
Bioinformatics: DNA sequence analysis
Computer Science: Testing and pseudo-random number generation
Robotics: Path planning and coverage problems

For a k-digit lock with n possible digits, a De Bruijn sequence can reduce brute force attempts from k×n^k to n^k + k-1, significantly optimizing attack strategies.

References

Interesting read: http://www.learner.org/courses/mathilluminated/units/2/textbook/07.php

Problem

Examples

Example 1

Example 2

Example 3

Solution

Method 1 - Hierholzer's Algorithm (Eulerian Path)

Intuition

Approach

Code

C++

Go

Java

Python

Complexity

Method 2 - Backtracking with Greedy Selection

Intuition

Approach

Code

C++

Java

Python

Complexity

Method 3 - Graph-Based BFS Approach

Intuition

Approach

Code

C++

Java

Python

Complexity

Method 4 - Lyndon Words Generation (Constant Amortized Time)

Intuition

Approach

Code

C++

Go

Java

Python

Complexity

Real-World Example

Performance Comparison

Applications

References

Comments