Implement Trie
Problem
A trie (pronounced as "try") or prefix tree is a tree data structure used to efficiently store and retrieve keys in a dataset of strings. There are various applications of this data structure, such as autocomplete and spellchecker.
Implement the Trie class:
Trie()Initializes the trie object.void insert(String word)Inserts the stringwordinto the trie.boolean search(String word)Returnstrueif the stringwordis in the trie (i.e., was inserted before), andfalseotherwise.boolean startsWith(String prefix)Returnstrueif there is a previously inserted stringwordthat has the prefixprefix, andfalseotherwise.
Examples
Example 1:
Input:
["Trie", "insert", "search", "search", "startsWith", "insert", "search"]
[[], ["apple"], ["apple"], ["app"], ["app"], ["app"], ["app"]]
Output:
[null, null, true, false, true, null, true]
Explanation:
Trie trie = new Trie();
trie.insert("apple");
trie.search("apple"); // return True
trie.search("app"); // return False
trie.startsWith("app"); // return True
trie.insert("app");
trie.search("app"); // return True
Solution
Video explanation
Here is the video explaining below implementations in detail. Please check it out:
<div class="youtube-embed"><iframe src="https://www.youtube.com/embed/XS5P3MnWRz0" frameborder="0" allowfullscreen></iframe></div>
Method 1 - Using Map-based Trie
Intuition
The most intuitive way to store a list of words would be in a HashSet or a List. While this works, it's inefficient for a key part of the problem: checking for prefixes (startsWith). With a HashSet or List, the only way to check for a prefix is to iterate through every single word and see if it starts with the given prefix. This is slow, especially with many words.
The core intuition behind a Trie (Prefix Tree) is to create a more specialized data structure that excels at this. Instead of storing words as standalone strings, we can break them down and store them character by character in a tree. The key insight is that we can share common prefixes. For words like "apple", "apply", and "application", the initial path "a-p-p-l" is stored only once.
Each node in the tree represents a single character. A path from the root to any node represents a prefix. To distinguish between a prefix that is also a complete word (like "app") and one that is not (like "appl"), we add a boolean flag (isEndOfWord) to each node.
This structure means that all operations (insert, search, startsWith) become proportional only to the length of the word or prefix in question, not the total number of words stored.
Approach
Our approach is to build a Trie class that uses a helper TrieNode class.
TrieNode class
- This class represents a single node in the trie.
- It contains a collection (e.g., a
HashMapor an array of size 26) to store itschildrennodes. - It contains an
isEndOfWordboolean flag, initialized tofalse.
Trie Class
When the Trie is created, we initialize it with a single, empty root TrieNode. This root acts as the starting point for all operations.
insert(word)
Here is how we can insert in the trie:
- Start at the root.
- Search for the word's first character among the root's children.
- For instance, with "AGRA" being the initial insertion, create a 'A' child for the root, followed by its descendants 'G', 'R', and 'A'.
- Proceed to the corresponding child node in the Trie and the next letter in the word if such a node is present.
- For example, when inserting "AYODHYA" after 'AGRA', start at the root and locate the existing 'A' node. Moving to the 'A' node, search for 'Y' among its children. Absence of 'Y' implies it needs to be created as a new child of 'A'.
- On encountering a null child and additional unsaved characters in the word, sequentially append them as children. Thus 'Y' would be linked under 'A', 'O' under 'Y', and so forth.
- At the end of word, set
isEndto true.
search(word)
- Start with a pointer at the
root. - Iterate through each character of the
word. - For each character, check if a corresponding child node exists. If at any point a node is missing, the word is not in the trie, so return
false. - Move the pointer down the path.
- If the loop completes, it means the path exists. Now, we must return the value of the
isEndOfWordflag of the final node. This ensures we are matching a complete word and not just a prefix.
startsWith(prefix)
- This is almost identical to
search. - Start with a pointer at the
rootand iterate through the characters of theprefix. - If any character's node is missing along the path, return
false. - If the loop completes, it means the path for the prefix exists. We can immediately return
truewithout needing to check theisEndOfWordflag.
Code
Java
// Now lets define Trie
class Trie {
// Lets define TrieNode
static class TrieNode {
public Map<Character, TrieNode> children;
public boolean isEndOfWord;
public TrieNode() {
children = new HashMap<>();
isEndOfWord = false;
}
}
private TrieNode root;
/** Initialize your data structure here. */
public Trie() {
root = new TrieNode();
}
/** Inserts a word into the trie. */
public void insert(String word) {
TrieNode curr = root;
for (char c : word.toCharArray()) {
curr.children.putIfAbsent(c, new TrieNode());
curr = curr.children.get(c);
}
curr.isEndOfWord = true;
}
/** Returns if the word is in the trie. */
public boolean search(String word) {
TrieNode curr = root;
for (char c : word.toCharArray()) {
if (!curr.children.containsKey(c)) {
return false;
}
curr = curr.children.get(c);
}
return curr.isEndOfWord;
}
/** Returns if there is any word in the trie that starts with the given prefix. */
public boolean startsWith(String prefix) {
TrieNode curr = root;
for (char c : prefix.toCharArray()) {
if (!curr.children.containsKey(c)) {
return false;
}
curr = curr.children.get(c);
}
return true;
}
}
Python
class TrieNode:
def __init__(self):
self.children: Dict[str, TrieNode] = {}
self.is_end_of_word: bool = False
class Trie:
def __init__(self):
self.root = TrieNode()
def insert(self, word: str) -> None:
curr = self.root
for char in word:
if char not in curr.children:
curr.children[char] = TrieNode()
curr = curr.children[char]
curr.is_end_of_word = True
def search(self, word: str) -> bool:
curr = self.root
for char in word:
if char not in curr.children:
return False
curr = curr.children[char]
return curr.is_end_of_word
def startsWith(self, prefix: str) -> bool:
curr = self.root
for char in prefix:
if char not in curr.children:
return False
curr = curr.children[char]
return True
Complexity
- ⏰ Time complexity:
O(L). For any operation—insert,search, orstartsWith—we simply trace a path for the given word. This means performance is proportional to the length of the word,L, not the total number of words in the trie. - 🧺 Space complexity:
O(N*L). In the worst-case scenario (if no words share any prefixes), we have to store every character of every word. However, the space is often much better in practice thanks to prefix sharing. For example, after storing "apple", storing "app" requires no new nodes, just flipping a flag.
Method 2 - Using Array-based Trie
As we have only 26 character set, we can use a array of those alphabets.
Code
Java
class Trie {
static class TrieNode {
public TrieNode[] children;
public boolean isEndOfWord;
public TrieNode() {
children = new TrieNode[26];
isEndOfWord = false;
}
}
private TrieNode root;
public Trie() {
root = new TrieNode();
}
public void insert(String word) {
TrieNode curr = root;
for (char c : word.toCharArray()) {
if (curr.children[c - 'a'] == null) {
curr.children[c - 'a'] = new TrieNode();
}
curr = curr.children[c - 'a'];
}
curr.isEndOfWord = true;
}
public boolean search(String word) {
TrieNode curr = root;
for (char c : word.toCharArray()) {
if (curr.children[c - 'a'] == null) {
return false;
}
curr = curr.children[c - 'a'];
}
return curr.isEndOfWord;
}
public boolean startsWith(String prefix) {
TrieNode curr = root;
for (char c : prefix.toCharArray()) {
if (curr.children[c - 'a'] == null) {
return false;
}
curr = curr.children[c - 'a'];
}
return true;
}
}
Complexity
- ⏰ Time complexity:
O(L) - 🧺 Space complexity:
O(n*L)
Comparison with other data structures
List Data structure
For e.g. for the list,
insert: We can just add a word to the list. That's fast.search: We have to scan the list to find an exact match.startsWith: This is the real problem. We'd have to iterate through every single word in our list and call thestartsWith()method on it. If we have a million words, we do a million checks. Every. Single. Time. This is too slow.
HashMap Data structure
Same would have been a problem with the Hashmap. Of course, with insert and exact search, a HashMap is very fast as its lookup takes O(1) time. However, for thestartsWith implementation, say we have to check for the prefix "app", you would still have to iterate over every key in the HashMap and check them one by one. We're back to the same problem as the list.
Summary
| Operation | List | HashMap | Trie |
|---|---|---|---|
insert(word) | O(1) | O(L) | O(L) |
search(word) | O(N * L) | O(L) | O(L) |
startsWith(prefix) | O(N * L) | O(N * L) | O(L) |
Where N is the number of words and L is the length of the word/prefix.