Problem
Gist: Given a sorted dictionary (array of words) of an alien language, find order of characters in the language.
Detailed: There is a new alien language which uses the latin alphabet. However, the order among letters are unknown to you. You receive a list of non-empty words from the dictionary, where words are sorted lexicographically by the rules of this new language. Derive the order of letters in this language.
Examples
Example 1:
Input:["wrt","wrf","er","ett","rftt"]
Output:"wertf"
Explanation:
from "wrt"and"wrf" ,we can get 't'<'f'
from "wrt"and"er" ,we can get 'w'<'e'
from "er"and"ett" ,we can get 'r'<'t'
from "ett"and"rftt" ,we can get 'e'<'r'
So return "wertf"
Example 2:
Input:["z","x"]
Output:"zx"
Explanation:
from "z" and "x",we can get 'z' < 'x'
So return "zx"
Solution
Method 1 - Topological Sorting in Graph
To understand task better:
- First of all, we need to understand that order of words is very important, since as it is told in the task: words are sorted lexicographically by the rules of this new language.
- That means that in case [“wrt”, “ert”] “w” should come before “e”.
- Also, we need to make sure that length of the word can also be different.
So, we have to form the graph based on the chars
in the list of words. Source will be char from word[i]
and destination will be char from word [i+x]
for eg.
Then if there is cycle, we return “”. If not, we return topological sort of string. Also, if the graph is not strongly connected, we can have multiple solutions.
Also, normal DFS will not work. For eg. Look at this graph:
graph LR; A --> B & C B --> C
We can get 2 answers - ABC, ACB. But ACB
is wrong, as C
is dependant on B
.
Hence, need to do something call post order DFS
. Look at Topological Sorting.
Approach
Build the Graph:
- Create a directed graph using adjacency lists to depict character precedence.
- Add edges from the first differing character of adjacent words.
DFS with Cycle Detection:
- Use a
visited
map (or array) having three states:0
: Not visited.1
: Visited but still in the recursive stack (used to detect cycles).2
: Fully processed.
- If a node is already being visited (state
1
), a cycle exists.
- Use a
Post-Order DFS for Topological Sort:
- Process all dependencies of a character before adding it to the result stack.
- Collect the characters in reverse topological order for proper sorting.
Code
Java
class Solution {
public String alienOrder(String[] words) {
// Step 1: Build the Graph
Map<Character, List<Character>> graph = new HashMap<>();
Map<Character, Integer> visited = new HashMap<>(); // 0 = not visited, 1 = visiting, 2 = processed
for (String word : words) {
for (char ch : word.toCharArray()) {
if (!graph.containsKey(ch)) {
graph.put(ch, new ArrayList<>());
visited.put(ch, 0);
}
}
}
for (int i = 0; i < words.length - 1; i++) {
String w1 = words[i], w2 = words[i + 1];
if (w1.length() > w2.length() && w1.startsWith(w2)) {
return ""; // Invalid case: prefix issue
}
for (int j = 0; j < Math.min(w1.length(), w2.length()); j++) {
char ch1 = w1.charAt(j), ch2 = w2.charAt(j);
if (ch1 != ch2) {
graph.get(ch1).add(ch2);
break;
}
}
}
// Step 2: DFS function
List<Character> ans = new ArrayList<>();
for (char ch : visited.keySet()) {
if (visited.get(ch) == 0 && !dfs(ch, graph, visited, ans)) {
return "";
}
}
// Step 3: Return result in reverse order
Collections.reverse(ans);
StringBuilder res = new StringBuilder();
for (char ch : ans) {
res.append(ch);
}
return res.toString();
}
private boolean dfs(char ch, Map<Character, List<Character>> graph, Map<Character, Integer> visited, List<Character> ans) {
if (visited.get(ch) == 1) { // Cycle detected
return false;
}
if (visited.get(ch) == 2) { // Already processed
return true;
}
visited.put(ch, 1); // Mark as visiting
for (char neighbour : graph.getOrDefault(ch, new ArrayList<>())) {
if (!dfs(neighbour, graph, visited, ans)) {
return false;
}
}
visited.put(ch, 2); // Mark as processed
ans.add(ch);
return true;
}
}
Python
class Solution:
def alienOrder(self, words: List[str]) -> str:
# Step 1: Build the Graph
graph: Dict[str, List[str]] = defaultdict(list)
visited: Dict[str, int] = {}
for word in words:
for char in word:
visited[char] = 0
for i in range(len(words) - 1):
w1, w2 = words[i], words[i + 1]
if len(w1) > len(w2) and w1[:len(w2)] == w2:
return "" # Invalid case: prefix issue
for ch1, ch2 in zip(w1, w2):
if ch1 != ch2:
graph[ch1].append(ch2)
break
# Step 2: DFS function
def dfs(ch: str) -> bool:
if visited[ch] == 1: # Cycle detected
return False
if visited[ch] == 2: # Already processed
return True
visited[ch] = 1 # Mark as visiting
for neighbour in graph[ch]:
if not dfs(neighbour):
return False
visited[ch] = 2 # Mark as fully processed
ans.append(ch)
return True
# Step 3: Perform DFS for all nodes
ans: List[str] = []
for char in visited.keys():
if visited[char] == 0:
if not dfs(char):
return ""
return "".join(reversed(ans))
Complexity
- ⏰ Time complexity:
O(N * L + V + E)
, whereN
is the number of words andL
is the average length of a word,V
is the number of unique characters andE
is the number of precedence edges.- Graph Construction: O(
N * L
) - DFS Traversal: O(
V + E
)`
- Graph Construction: O(
- 🧺 Space complexity:
O(
V + E)
for graph storage.O(V)
for Visited and recursion stack. OverallO(V+E)