Problem
A trie (pronounced as “try”) or prefix tree is a tree data structure used to efficiently store and retrieve keys in a dataset of strings. There are various applications of this data structure, such as autocomplete and spellchecker.
Implement the Trie class:
Trie()
Initializes the trie object.void insert(String word)
Inserts the stringword
into the trie.boolean search(String word)
Returnstrue
if the stringword
is in the trie (i.e., was inserted before), andfalse
otherwise.boolean startsWith(String prefix)
Returnstrue
if there is a previously inserted stringword
that has the prefixprefix
, andfalse
otherwise.
Examples
Example 1:
|
|
Solution
Video explanation
Here is the video explaining below implementations in detail. Please check it out:
Method 1 - Using Map-based Trie
Intuition
The most intuitive way to store a list of words would be in a HashSet
or a List
. While this works, it’s inefficient for a key part of the problem: checking for prefixes (startsWith
). With a HashSet
or List
, the only way to check for a prefix is to iterate through every single word and see if it starts with the given prefix. This is slow, especially with many words.
The core intuition behind a Trie (Prefix Tree) is to create a more specialized data structure that excels at this. Instead of storing words as standalone strings, we can break them down and store them character by character in a tree. The key insight is that we can share common prefixes. For words like “apple”, “apply”, and “application”, the initial path “a-p-p-l” is stored only once.
Each node in the tree represents a single character. A path from the root to any node represents a prefix. To distinguish between a prefix that is also a complete word (like “app”) and one that is not (like “appl”), we add a boolean flag (isEndOfWord
) to each node.
This structure means that all operations (insert
, search
, startsWith
) become proportional only to the length of the word or prefix in question, not the total number of words stored.
Approach
Our approach is to build a Trie
class that uses a helper TrieNode
class.
TrieNode class
- This class represents a single node in the trie.
- It contains a collection (e.g., a
HashMap
or an array of size 26) to store itschildren
nodes. - It contains an
isEndOfWord
boolean flag, initialized tofalse
.
Trie Class
When the Trie
is created, we initialize it with a single, empty root
TrieNode
. This root
acts as the starting point for all operations.
insert(word)
Here is how we can insert in the trie:
- Start at the root.
- Search for the word’s first character among the root’s children.
- For instance, with “AGRA” being the initial insertion, create a ‘A’ child for the root, followed by its descendants ‘G’, ‘R’, and ‘A’.
- Proceed to the corresponding child node in the Trie and the next letter in the word if such a node is present.
- For example, when inserting “AYODHYA” after ‘AGRA’, start at the root and locate the existing ‘A’ node. Moving to the ‘A’ node, search for ‘Y’ among its children. Absence of ‘Y’ implies it needs to be created as a new child of ‘A’.
- On encountering a null child and additional unsaved characters in the word, sequentially append them as children. Thus ‘Y’ would be linked under ‘A’, ‘O’ under ‘Y’, and so forth.
- At the end of word, set
isEnd
to true.
search(word)
- Start with a pointer at the
root
. - Iterate through each character of the
word
. - For each character, check if a corresponding child node exists. If at any point a node is missing, the word is not in the trie, so return
false
. - Move the pointer down the path.
- If the loop completes, it means the path exists. Now, we must return the value of the
isEndOfWord
flag of the final node. This ensures we are matching a complete word and not just a prefix.
startsWith(prefix)
- This is almost identical to
search
. - Start with a pointer at the
root
and iterate through the characters of theprefix
. - If any character’s node is missing along the path, return
false
. - If the loop completes, it means the path for the prefix exists. We can immediately return
true
without needing to check theisEndOfWord
flag.
Code
|
|
|
|
Complexity
- ⏰ Time complexity:
O(L)
. For any operation—insert
,search
, orstartsWith
—we simply trace a path for the given word. This means performance is proportional to the length of the word,L
, not the total number of words in the trie. - 🧺 Space complexity:
O(N*L)
. In the worst-case scenario (if no words share any prefixes), we have to store every character of every word. However, the space is often much better in practice thanks to prefix sharing. For example, after storing “apple”, storing “app” requires no new nodes, just flipping a flag.
Method 2 - Using Array-based Trie
As we have only 26 character set, we can use a array of those alphabets.
Code
|
|
Complexity
- ⏰ Time complexity:
O(L)
- 🧺 Space complexity:
O(n*L)
Comparison with other data structures
List Data structure
For e.g. for the list,
insert
: We can just add a word to the list. That’s fast.search
: We have to scan the list to find an exact match.startsWith
: This is the real problem. We’d have to iterate through every single word in our list and call thestartsWith()
method on it. If we have a million words, we do a million checks. Every. Single. Time. This is too slow.
HashMap Data structure
Same would have been a problem with the Hashmap. Of course, with insert
and exact search
, a HashMap is very fast as its lookup takes O(1)
time. However, for thestartsWith
implementation, say we have to check for the prefix "app"
, you would still have to iterate over every key in the HashMap and check them one by one. We’re back to the same problem as the list.
Summary
Operation | List | HashMap | Trie |
---|---|---|---|
insert(word) |
O(1) |
O(L) |
O(L) |
search(word) |
O(N * L) |
O(L) |
O(L) |
startsWith(prefix) |
O(N * L) |
O(N * L) |
O(L) |
Where N
is the number of words and L
is the length of the word/prefix.