Problem

trie (pronounced as “try”) or prefix tree is a tree data structure used to efficiently store and retrieve keys in a dataset of strings. There are various applications of this data structure, such as autocomplete and spellchecker.

Implement the Trie class:

  • Trie() Initializes the trie object.
  • void insert(String word) Inserts the string word into the trie.
  • boolean search(String word) Returns true if the string word is in the trie (i.e., was inserted before), and false otherwise.
  • boolean startsWith(String prefix) Returns true if there is a previously inserted string word that has the prefix prefix, and false otherwise.

Examples

Example 1:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
Input:
["Trie", "insert", "search", "search", "startsWith", "insert", "search"]
[[], ["apple"], ["apple"], ["app"], ["app"], ["app"], ["app"]]
Output:
[null, null, true, false, true, null, true]


Explanation:
Trie trie = new Trie();
trie.insert("apple");
trie.search("apple");   // return True
trie.search("app");     // return False
trie.startsWith("app"); // return True
trie.insert("app");
trie.search("app");     // return True

Solution

Video explanation

Here is the video explaining below implementations in detail. Please check it out:

Method 1 - Using Map-based Trie

Intuition

The most intuitive way to store a list of words would be in a HashSet or a List. While this works, it’s inefficient for a key part of the problem: checking for prefixes (startsWith). With a HashSet or List, the only way to check for a prefix is to iterate through every single word and see if it starts with the given prefix. This is slow, especially with many words.

The core intuition behind a Trie (Prefix Tree) is to create a more specialized data structure that excels at this. Instead of storing words as standalone strings, we can break them down and store them character by character in a tree. The key insight is that we can share common prefixes. For words like “apple”, “apply”, and “application”, the initial path “a-p-p-l” is stored only once.

Each node in the tree represents a single character. A path from the root to any node represents a prefix. To distinguish between a prefix that is also a complete word (like “app”) and one that is not (like “appl”), we add a boolean flag (isEndOfWord) to each node.

This structure means that all operations (insert, search, startsWith) become proportional only to the length of the word or prefix in question, not the total number of words stored.

Approach

Our approach is to build a Trie class that uses a helper TrieNode class.

TrieNode class
  • This class represents a single node in the trie.
  • It contains a collection (e.g., a HashMap or an array of size 26) to store its children nodes.
  • It contains an isEndOfWord boolean flag, initialized to false.
Trie Class

When the Trie is created, we initialize it with a single, empty root TrieNode. This root acts as the starting point for all operations.

insert(word)

Here is how we can insert in the trie:

  • Start at the root.
  • Search for the word’s first character among the root’s children.
    • For instance, with “AGRA” being the initial insertion, create a ‘A’ child for the root, followed by its descendants ‘G’, ‘R’, and ‘A’.
  • Proceed to the corresponding child node in the Trie and the next letter in the word if such a node is present.
    • For example, when inserting “AYODHYA” after ‘AGRA’, start at the root and locate the existing ‘A’ node. Moving to the ‘A’ node, search for ‘Y’ among its children. Absence of ‘Y’ implies it needs to be created as a new child of ‘A’.
  • On encountering a null child and additional unsaved characters in the word, sequentially append them as children. Thus ‘Y’ would be linked under ‘A’, ‘O’ under ‘Y’, and so forth.
  • At the end of word, set isEnd to true.
search(word)
  • Start with a pointer at the root.
  • Iterate through each character of the word.
  • For each character, check if a corresponding child node exists. If at any point a node is missing, the word is not in the trie, so return false.
  • Move the pointer down the path.
  • If the loop completes, it means the path exists. Now, we must return the value of the isEndOfWord flag of the final node. This ensures we are matching a complete word and not just a prefix.
startsWith(prefix)
  • This is almost identical to search.
  • Start with a pointer at the root and iterate through the characters of the prefix.
  • If any character’s node is missing along the path, return false.
  • If the loop completes, it means the path for the prefix exists. We can immediately return true without needing to check the isEndOfWord flag.

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
// Now lets define Trie
class Trie {
	// Lets define TrieNode
    static class TrieNode {
        public Map<Character, TrieNode> children;
        public boolean isEndOfWord;

        public TrieNode() {
            children = new HashMap<>();
            isEndOfWord = false;
        }
    }

    private TrieNode root;
	
	/** Initialize your data structure here. */
    public Trie() {
        root = new TrieNode();
    }
	
	/** Inserts a word into the trie. */
    public void insert(String word) {
        TrieNode curr = root;
        for (char c : word.toCharArray()) {
            curr.children.putIfAbsent(c, new TrieNode());
            curr = curr.children.get(c);
        }
        curr.isEndOfWord = true;
    }
	
	/** Returns if the word is in the trie. */
    public boolean search(String word) {
        TrieNode curr = root;
        for (char c : word.toCharArray()) {
            if (!curr.children.containsKey(c)) {
                return false;
            }
            curr = curr.children.get(c);
        }
        return curr.isEndOfWord;
    }
	
	/** Returns if there is any word in the trie that starts with the given prefix. */
    public boolean startsWith(String prefix) {
        TrieNode curr = root;
        for (char c : prefix.toCharArray()) {
            if (!curr.children.containsKey(c)) {
                return false;
            }
            curr = curr.children.get(c);
        }
        return true;
    }
}
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
class TrieNode:
    def __init__(self):
        self.children: Dict[str, TrieNode] = {}
        self.is_end_of_word: bool = False

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        curr = self.root
        for char in word:
            if char not in curr.children:
                curr.children[char] = TrieNode()
            curr = curr.children[char]
        curr.is_end_of_word = True

    def search(self, word: str) -> bool:
        curr = self.root
        for char in word:
            if char not in curr.children:
                return False
            curr = curr.children[char]
        return curr.is_end_of_word

    def startsWith(self, prefix: str) -> bool:
        curr = self.root
        for char in prefix:
            if char not in curr.children:
                return False
            curr = curr.children[char]
        return True

Complexity

  • ⏰ Time complexity: O(L). For any operation—insert, search, or startsWith—we simply trace a path for the given word. This means performance is proportional to the length of the word, L, not the total number of words in the trie.
  • 🧺 Space complexity: O(N*L). In the worst-case scenario (if no words share any prefixes), we have to store every character of every word. However, the space is often much better in practice thanks to prefix sharing. For example, after storing “apple”, storing “app” requires no new nodes, just flipping a flag.

Method 2 - Using Array-based Trie

As we have only 26 character set, we can use a array of those alphabets.

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
class Trie {
	static class TrieNode {
	    public TrieNode[] children;
	    public boolean isEndOfWord;
	
	    public TrieNode() {
	        children = new TrieNode[26];
	        isEndOfWord = false;
	    }
	}

    private TrieNode root;

    public Trie() {
        root = new TrieNode();
    }

    public void insert(String word) {
        TrieNode curr = root;
        for (char c : word.toCharArray()) {
            if (curr.children[c - 'a'] == null) {
                curr.children[c - 'a'] = new TrieNode();
            }
            curr = curr.children[c - 'a'];
        }
        curr.isEndOfWord = true;
    }

    public boolean search(String word) {
        TrieNode curr = root;
        for (char c : word.toCharArray()) {
            if (curr.children[c - 'a'] == null) {
                return false;
            }
            curr = curr.children[c - 'a'];
        }
        return curr.isEndOfWord;
    }

    public boolean startsWith(String prefix) {
        TrieNode curr = root;
        for (char c : prefix.toCharArray()) {
            if (curr.children[c - 'a'] == null) {
                return false;
            }
            curr = curr.children[c - 'a'];
        }
        return true;
    }
}

Complexity

  • ⏰ Time complexity: O(L)
  • 🧺 Space complexity: O(n*L)

Comparison with other data structures

List Data structure

For e.g. for the list,

  • insert: We can just add a word to the list. That’s fast.
  • search: We have to scan the list to find an exact match.
  • startsWith: This is the real problem. We’d have to iterate through every single word in our list and call the startsWith() method on it. If we have a million words, we do a million checks. Every. Single. Time. This is too slow.

HashMap Data structure

Same would have been a problem with the Hashmap. Of course, with insert and exact search, a HashMap is very fast as its lookup takes O(1) time. However, for thestartsWith implementation, say we have to check for the prefix "app", you would still have to iterate over every key in the HashMap and check them one by one. We’re back to the same problem as the list.

Summary

Operation List HashMap Trie
insert(word) O(1) O(L) O(L)
search(word) O(N * L) O(L) O(L)
startsWith(prefix) O(N * L) O(N * L) O(L)

Where N is the number of words and L is the length of the word/prefix.