Problem

Gist: Given a sorted dictionary (array of words) of an alien language, find order of characters in the language.

Detailed: There is a new alien language which uses the latin alphabet. However, the order among letters are unknown to you. You receive a list of non-empty words from the dictionary, where words are sorted lexicographically by the rules of this new language. Derive the order of letters in this language.

Examples

Example 1:

Input["wrt","wrf","er","ett","rftt"]
Output"wertf"
Explanation
from "wrt"and"wrf" ,we can get 't'<'f'
from "wrt"and"er" ,we can get 'w'<'e'
from "er"and"ett" ,we can get 'r'<'t'
from "ett"and"rftt" ,we can get 'e'<'r'
So return "wertf"

Example 2:

Input["z","x"]
Output"zx"
Explanation
from "z" and "x"we can get 'z' < 'x'
So return "zx"

Solution

Method 1 - Topological Sorting in Graph

To understand task better:

  1. First of all, we need to understand that order of words is very important, since as it is told in the task: words are sorted lexicographically by the rules of this new language.
  2. That means that in case [“wrt”, “ert”] “w” should come before “e”.
  3. Also, we need to make sure that length of the word can also be different.

So, we have to form the graph based on the chars in the list of words. Source will be char from word[i] and destination will be char from word [i+x] for eg. Then if there is cycle, we return “”. If not, we return topological sort of string. Also, if the graph is not strongly connected, we can have multiple solutions.

Also, normal DFS will not work. For eg. Look at this graph:

graph LR;
	
	A --> B & C
	B --> C
  

We can get 2 answers - ABC, ACB. But ACB is wrong, as C is dependant on B. Hence, need to do something call post order DFS. Look at Topological Sorting.

Approach

  1. Build the Graph:

    • Create a directed graph using adjacency lists to depict character precedence.
    • Add edges from the first differing character of adjacent words.
  2. DFS with Cycle Detection:

    • Use a visited map (or array) having three states:
      • 0: Not visited.
      • 1: Visited but still in the recursive stack (used to detect cycles).
      • 2: Fully processed.
    • If a node is already being visited (state 1), a cycle exists.
  3. Post-Order DFS for Topological Sort:

    • Process all dependencies of a character before adding it to the result stack.
    • Collect the characters in reverse topological order for proper sorting.

Code

Java
class Solution {
    public String alienOrder(String[] words) {
        // Step 1: Build the Graph
        Map<Character, List<Character>> graph = new HashMap<>();
        Map<Character, Integer> visited = new HashMap<>(); // 0 = not visited, 1 = visiting, 2 = processed
        for (String word : words) {
            for (char ch : word.toCharArray()) {
                if (!graph.containsKey(ch)) {
                    graph.put(ch, new ArrayList<>());
                    visited.put(ch, 0);
                }
            }
        }

        for (int i = 0; i < words.length - 1; i++) {
            String w1 = words[i], w2 = words[i + 1];
            if (w1.length() > w2.length() && w1.startsWith(w2)) {
                return ""; // Invalid case: prefix issue
            }
            
            for (int j = 0; j < Math.min(w1.length(), w2.length()); j++) {
                char ch1 = w1.charAt(j), ch2 = w2.charAt(j);
                if (ch1 != ch2) {
                    graph.get(ch1).add(ch2);
                    break;
                }
            }
        }

        // Step 2: DFS function
        List<Character> ans = new ArrayList<>();
        for (char ch : visited.keySet()) {
            if (visited.get(ch) == 0 && !dfs(ch, graph, visited, ans)) {
                return "";
            }
        }

        // Step 3: Return result in reverse order
        Collections.reverse(ans);
        StringBuilder res = new StringBuilder();
        for (char ch : ans) {
            res.append(ch);
        }
        return res.toString();
    }

    private boolean dfs(char ch, Map<Character, List<Character>> graph, Map<Character, Integer> visited, List<Character> ans) {
        if (visited.get(ch) == 1) { // Cycle detected
            return false;
        }
        if (visited.get(ch) == 2) { // Already processed
            return true;
        }

        visited.put(ch, 1); // Mark as visiting
        for (char neighbour : graph.getOrDefault(ch, new ArrayList<>())) {
            if (!dfs(neighbour, graph, visited, ans)) {
                return false;
            }
        }

        visited.put(ch, 2); // Mark as processed
        ans.add(ch);
        return true;
    }
}
Python
class Solution:
    def alienOrder(self, words: List[str]) -> str:
        # Step 1: Build the Graph
        graph: Dict[str, List[str]] = defaultdict(list)
        visited: Dict[str, int] = {}
        for word in words:
            for char in word:
                visited[char] = 0
        
        for i in range(len(words) - 1):
            w1, w2 = words[i], words[i + 1]
            if len(w1) > len(w2) and w1[:len(w2)] == w2:
                return ""  # Invalid case: prefix issue
            
            for ch1, ch2 in zip(w1, w2):
                if ch1 != ch2:
                    graph[ch1].append(ch2)
                    break
        
        # Step 2: DFS function
        def dfs(ch: str) -> bool:
            if visited[ch] == 1:  # Cycle detected
                return False
            if visited[ch] == 2:  # Already processed
                return True

            visited[ch] = 1  # Mark as visiting
            for neighbour in graph[ch]:
                if not dfs(neighbour):
                    return False

            visited[ch] = 2  # Mark as fully processed
            ans.append(ch)
            return True

        # Step 3: Perform DFS for all nodes
        ans: List[str] = []
        for char in visited.keys():
            if visited[char] == 0:
                if not dfs(char):
                    return ""

        return "".join(reversed(ans))

Complexity

  • ⏰ Time complexity: O(N * L + V + E), where N is the number of words and L is the average length of a word, V is the number of unique characters and E is the number of precedence edges.
    • Graph Construction: O(N * L)
    • DFS Traversal: O(V + E)`
  • 🧺 Space complexity: O(V + E) for graph storage. O(V) for Visited and recursion stack. Overall O(V+E)