652. Find Duplicate Subtrees - Detailed Explanation

Free Coding Questions Catalog
Boost your coding skills with our essential coding questions catalog. Take a step towards a better tech career now!

Problem Statement

Given the root of a binary tree, return all duplicate subtrees. For each kind of duplicate subtree, only return the root node of one of them. Two trees are considered duplicates if they have the same structure and identical node values.

Example Input & Output 1

  • Input:
    Tree represented as:
           1
          / \
         2   3
        /   / \
       4   2   4
          /
         4
    
  • Output: A list containing the roots of the duplicate subtrees.
  • Explanation:
    • There are two duplicate subtrees: one is the subtree rooted at node 4 (which appears three times) and the other is the subtree:
         2
        /
       4
      
      (appearing twice).

Example Input & Output 2

  • Input:
    Tree represented as:
          2
         / \
        1   1
    
  • Output: A list containing the root node of the subtree with value 1.
  • Explanation:
    • The duplicate subtree is a single node with value 1.

Example Input & Output 3

  • Input:
    Tree represented as a single node:
        1
    
  • Output: An empty list
  • Explanation:
    • There are no duplicate subtrees since the tree consists of only one node.

Constraints:

  • The number of nodes in the tree is in the range [1, 5000].
  • -200 ≤ Node.val ≤ 200

Hints Before the Solution

  • Hint:
    Think about how you can uniquely represent the structure and values of any subtree. One common idea is to perform a postorder traversal and serialize each subtree (using a string or tuple) that includes markers for null children. This helps in identifying when two subtrees are exactly the same.

  • Hint:
    Use a hash map (or dictionary) to count how many times you have seen each serialized subtree. When the count for a specific serialization reaches 2, record the root of that subtree in your result list.

Approaches

Brute Force Approach

  • Idea:
    For every node, compare the subtree rooted at that node with every other subtree in the tree.
  • Why It Fails:
    This method would require comparing every pair of subtrees, leading to an exponential number of comparisons and a time complexity that is not feasible for trees with thousands of nodes.

Optimal Approach: Serialization with Postorder Traversal

  • Precomputation of Subtree Representations:
    Use a postorder traversal to serialize every subtree. For each node, generate a string (or tuple) representation that uniquely describes its subtree. For example, the serialization can be formed as:

    serial = node.val + "," + left_serial + "," + right_serial
    

    Use a special marker (like #) to denote null children.

  • Using a Hash Map to Count Occurrences:
    Maintain a dictionary where the key is the serialized subtree and the value is the frequency. When the frequency becomes 2 (i.e., the duplicate is found for the first time), add that node to the answer list.

  • Result:
    After processing the whole tree, the list of recorded nodes will represent all the roots of duplicate subtrees.

Python Code

Python3
Python3

. . . .

Java Code

Java
Java

. . . .

Complexity Analysis

  • Time Complexity:
    The algorithm visits every node once during the postorder traversal. Serialization of each node takes O(k) time, where k is the number of nodes in that subtree. In the worst-case scenario (for example, in a skewed tree), the overall time complexity can reach O(n²). However, for balanced trees, it is close to O(n).

  • Space Complexity:
    Extra space is used by the recursion stack (O(n) in the worst case) and by the hash map to store the serializations, which also takes O(n) space.

Step-by-Step Walkthrough with Visual Examples

  • Tree Construction and Serialization:
    For a given node, first serialize its left subtree, then its right subtree, and finally combine these with the node’s value to form the serialization string.
    For example, for the node with value 2 and left child 4, the serialization might look like:

    "2,4,#,#,#"
    

    Here, 4,#,# represents the left subtree (node 4 with no children) and the trailing # denotes the absence of a right child.

  • Recording Duplicates:
    As you build the serialization for each node, update the count in your hash map. When the count becomes 2, add the node to the results. This ensures that you report each duplicate subtree only once.

  • Result Compilation:
    After processing all nodes, the result list will contain the roots of all duplicate subtrees.

Common Mistakes

  • Missing Null Markers:
    Failing to include a marker for null nodes in the serialization can lead to different structures having the same serialization string.

  • Incorrect String Construction:
    Not using delimiters (like commas) when combining node values can cause collisions between different subtrees.

  • Overcounting Duplicates:
    Adding a subtree root to the result list every time you encounter it instead of only the first duplicate (when count equals 2) may lead to duplicate entries in the final answer.

Edge Cases

  • Single Node Tree:
    When the tree contains only one node, there are no duplicates; the result should be an empty list.

  • No Duplicates:
    If every subtree in the tree is unique, ensure that your algorithm returns an empty list.

  • All Nodes are Identical:
    In a tree where every node has the same value and structure, the serialization method should correctly identify the duplicate subtrees without overcounting.

TAGS
leetcode
CONTRIBUTOR
Design Gurus Team
-

GET YOUR FREE

Coding Questions Catalog

Design Gurus Newsletter - Latest from our Blog
Boost your coding skills with our essential coding questions catalog.
Take a step towards a better tech career now!
Explore Answers
Does Palantir pay more than Google?
How does a tech resume look?
What do I use for a max-heap implementation in Python?
Related Courses
Image
Grokking the Coding Interview: Patterns for Coding Questions
Grokking the Coding Interview Patterns in Java, Python, JS, C++, C#, and Go. The most comprehensive course with 476 Lessons.
Image
Grokking Modern AI Fundamentals
Master the fundamentals of AI today to lead the tech revolution of tomorrow.
Image
Grokking Data Structures & Algorithms for Coding Interviews
Unlock Coding Interview Success: Dive Deep into Data Structures and Algorithms.
Image
One-Stop Portal For Tech Interviews.
Copyright © 2025 Design Gurus, LLC. All rights reserved.
;