View on NeetCode
View on LeetCode

Problem

Serialization is the process of converting a data structure or object into a sequence of bits so that it can be stored in a file or memory buffer, or transmitted across a network connection link to be reconstructed later in the same or another computer environment.

Design an algorithm to serialize and deserialize a binary tree. There is no restriction on how your serialization/deserialization algorithm should work. You just need to ensure that a binary tree can be serialized to a string and this string can be deserialized to the original tree structure.

Example 1:

Input: root = [1,2,3,null,null,4,5]
Output: [1,2,3,null,null,4,5]

Example 2:

Input: root = []
Output: []

Constraints:

  • The number of nodes in the tree is in the range [0, 10^4].
  • -1000 <= Node.val <= 1000

Solution

Approach: Preorder DFS with Delimiters

The key insight is to use preorder traversal for serialization (easy to reconstruct) and use a delimiter to separate values and markers for null nodes.

Implementation

class Codec:
    def serialize(self, root):
        """Encodes a tree to a single string."""
        def dfs(node):
            if not node:
                result.append("null")
                return
            result.append(str(node.val))
            dfs(node.left)
            dfs(node.right)

        result = []
        dfs(root)
        return ",".join(result)

    def deserialize(self, data):
        """Decodes your encoded data to tree."""
        def dfs():
            val = next(values)
            if val == "null":
                return None
            node = TreeNode(int(val))
            node.left = dfs()
            node.right = dfs()
            return node

        values = iter(data.split(","))
        return dfs()

Alternative (BFS with Level Order):

from collections import deque

class Codec:
    def serialize(self, root):
        if not root:
            return ""

        result = []
        queue = deque([root])

        while queue:
            node = queue.popleft()
            if node:
                result.append(str(node.val))
                queue.append(node.left)
                queue.append(node.right)
            else:
                result.append("null")

        return ",".join(result)

    def deserialize(self, data):
        if not data:
            return None

        values = data.split(",")
        root = TreeNode(int(values[0]))
        queue = deque([root])
        i = 1

        while queue:
            node = queue.popleft()

            if i < len(values) and values[i] != "null":
                node.left = TreeNode(int(values[i]))
                queue.append(node.left)
            i += 1

            if i < len(values) and values[i] != "null":
                node.right = TreeNode(int(values[i]))
                queue.append(node.right)
            i += 1

        return root

Complexity Analysis

  • Time Complexity: O(n) for both serialize and deserialize, where n is the number of nodes.
  • Space Complexity: O(n) for storing the serialized string and recursion/queue.

Key Insights

  1. Preorder for DFS: Preorder traversal (root, left, right) makes reconstruction straightforward - we know the first value is the root.

  2. Null Markers: We must explicitly mark null nodes so we know when subtrees are empty during deserialization.

  3. Iterator Pattern: Using an iterator for deserialization allows us to consume values sequentially without tracking indices.

  4. Delimiter Choice: Comma is a good delimiter since node values are integers, not strings containing commas.

  5. DFS vs BFS: DFS is simpler and more elegant. BFS produces level-order serialization which may be more intuitive but requires more bookkeeping during deserialization.