Problem

Given the root of a binary tree, return all duplicate subtrees.

For each kind of duplicate subtrees, you only need to return the root node of any one of them.

Two trees are duplicate if they have the same structure with the same node values.

Examples

Example 1:

graph TD
    A(1)
    A(1) --- B(2):::duplicate
    A --- C(3)
    B --- D(4):::duplicate
    B ~~~ N1:::hidden
    C --- E(2):::duplicate
    C --- F(4)
    E --- G(4):::duplicate
    E ~~~ N2:::hidden

classDef duplicate fill:#FFD700,stroke:#000,stroke-width:1px,color:#000;
classDef hidden display:none
  
1
2
Input: root = [1,2,3,4,null,2,4,null,null,4]
Output: [[2,4],[4]]

Example 2:

graph TD
    A(2)
    A(2) --- B(1):::duplicate
    A(2) --- C(1):::duplicate

classDef duplicate fill:#FFD700,stroke:#000,stroke-width:1px,color:#000;
  
1
2
Input: root = [2,1,1]
Output: [[1]]

Example 3:

graph TD
    A(2)
    A --- B(2):::duplicate
    A --- C(2):::duplicate
    B --- D(3):::duplicate
    B ~~~ N1:::hidden
    C --- E(3):::duplicate
    C ~~~ N2:::hidden

classDef duplicate fill:#FFD700,stroke:#000,stroke-width:1px,color:#000;
classDef hidden display:none
  
1
2
Input: root = [2,2,2,3,null,3,null]
Output: [[2,3],[3]]

Solution

Method 1 - Using Serialization and Postorder Traversal

To solve the problem, the key is to uniquely identify each subtree using a serialization (string representation). Once we can serialize a subtree, we can check for duplicates. If we encounter the same serialization more than once, it means the subtree is a duplicate.

The approach involves:

  1. Traversing the tree using post-order traversal (left, right, root), as this ensures that every subtree is serialized only after its children are serialized.
  2. Using a map/dictionary to record the frequency of each subtree’s serialization.
  3. Returning the root of the subtrees that appear more than once.

Approach

  1. Post-order Traversal:
    • Serialise each subtree based on its structure and node values.
    • Include information from both children and the current node.
  2. HashMap in Java / Dictionary in Python:
    • Store the frequency of each resulting serialization.
    • Only save the root node of duplicate subtrees if the frequency is exactly 2.
  3. Result List:
    • Maintain a list to store the nodes that represent duplicate subtrees.

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
class Solution {
    public List<TreeNode> findDuplicateSubtrees(TreeNode root) {
        Map<String, Integer> count = new HashMap<>();
        List<TreeNode> ans = new ArrayList<>();
        serialize(root, count, ans);
        return ans;
    }
    
    private String serialize(TreeNode node, Map<String, Integer> count, List<TreeNode> ans) {
        if (node == null) return "#";
        String serial = serialize(node.left, count, ans) + "," + serialize(node.right, count, ans) + "," + node.val;
        count.put(serial, count.getOrDefault(serial, 0) + 1);
        if (count.get(serial) == 2) ans.add(node);
        return serial;
    }
}
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
class Solution:
    def findDuplicateSubtrees(self, root: Optional[TreeNode]) -> List[TreeNode]:
        count: dict[str, int] = defaultdict(int)
        ans: list[TreeNode] = []
        self.serialize(root, count, ans)
        return ans
    
    def serialize(self, node: Optional[TreeNode], count: dict[str, int], ans: list[TreeNode]) -> str:
        if not node:
            return "#"
        serial = f"{self.serialize(node.left, count, ans)},{self.serialize(node.right, count, ans)},{node.val}"
        count[serial] += 1
        if count[serial] == 2:
            ans.append(node)
        return serial

Complexity

  • ⏰ Time complexity: O(n). Each node is visited once during traversal, and the serialization operation is constant for each node.
  • 🧺 Space complexity: O(n). The space used is mainly due to the HashMap/Dictionary to store the serialization of subtrees and recursion stack during traversal.