Encode and Decode Strings

Problem

Design an algorithm to encode a list of strings to a single string. The encoded string is then decoded back to the original list of strings.

Please implement encode and decode.

Example 1:

Input: ["neet","code","love","you"]
Output: ["neet","code","love","you"]
Explanation: The strings are encoded and then decoded back to the original list.

Example 2:

Input: ["we","say",":","yes"]
Output: ["we","say",":","yes"]

Constraints:

0 <= strs.length < 200
0 <= strs[i].length < 200
strs[i] contains any possible characters out of 256 valid ASCII characters.

Follow up: Could you write a generalized algorithm to work on any possible set of characters?

Solution

Approach: Length-Prefixed Encoding

The key insight is to prefix each string with its length followed by a delimiter. This way, we know exactly how many characters to read for each string, even if the strings contain special characters or delimiters.

The format is: length#string for each string. For example, ["neet","code"] becomes "4#neet4#code".

This approach handles all edge cases:

Empty strings (encoded as "0#")
Strings with special characters (including #)
Any ASCII characters

Implementation

class Solution:
    def encode(self, strs: List[str]) -> str:
        result = ""
        for s in strs:
            result += str(len(s)) + "#" + s
        return result

    def decode(self, s: str) -> List[str]:
        result = []
        i = 0

        while i < len(s):
            # Find the delimiter '#'
            j = i
            while s[j] != '#':
                j += 1

            # Extract length
            length = int(s[i:j])

            # Extract string of that length
            i = j + 1  # Start after '#'
            result.append(s[i:i + length])

            # Move to next encoded string
            i += length

        return result

Alternative (Using Join for Encoding):

class Solution:
    def encode(self, strs: List[str]) -> str:
        return ''.join(f"{len(s)}#{s}" for s in strs)

    def decode(self, s: str) -> List[str]:
        result = []
        i = 0

        while i < len(s):
            # Find the delimiter '#'
            delimiter_pos = s.index('#', i)

            # Extract length
            length = int(s[i:delimiter_pos])

            # Extract string of that length
            start = delimiter_pos + 1
            result.append(s[start:start + length])

            # Move to next encoded string
            i = start + length

        return result

Complexity Analysis

Time Complexity: O(n) for both encode and decode, where n is the total number of characters across all strings. We process each character once.
Space Complexity: O(1) extra space (not counting the output). We build the result as we go without using additional data structures proportional to input size.

Key Insights

Length Prefix is Key: By encoding the length before each string, we know exactly how many characters to read, eliminating ambiguity even with special characters.
Delimiter Choice: We use # as a delimiter between length and string. Since we know the exact length, it doesn’t matter if # appears in the string itself.
Handles Edge Cases: This approach naturally handles:
- Empty strings: encoded as "0#"
- Single character strings: encoded as "1#x"
- Strings with numbers, delimiters, or any ASCII characters
No Escaping Needed: Unlike approaches that use a single delimiter (like ","), we don’t need to escape special characters in the strings.
Generalized Solution: This algorithm works for any possible set of characters, satisfying the follow-up requirement. The length-prefix approach is language-agnostic and robust.