Twin Song: Shortest Common Supersequence Length
๐ฆ Twin Song: Shortest Common Supersequence Length
Difficulty: Advanced Tags: dp, strings, dynamic-programming, lcs, subsequence Series: Algorithms I
Problem
The Twin Song finds the shortest melody that contains both unicorn songs. Find the length of the shortest string that contains both given strings as subsequences.
A supersequence is a string that contains both input strings as subsequences (not necessarily contiguous). Find the shortest such supersequence and return its length.
Subsequence Definition: - Characters appear in same order as original - Characters need not be consecutive - Example: "ace" is subsequence of "abcde"
Shortest Common Supersequence (SCS):
- Contains string a as subsequence
- Contains string b as subsequence
- Has minimum possible length
Real-World Application
SCS algorithms are used in: - DNA sequence analysis - finding minimal sequences containing genetic markers - Text diff tools - computing minimal edit sequences (diff/patch) - Version control - merging file changes (git, svn) - Data compression - finding common patterns in data streams - Bioinformatics - sequence alignment and assembly - File synchronization - rsync, cloud storage sync - Spell checkers - finding minimal corrections - Plagiarism detection - identifying shared content
Input
data = {
'a': str, # First string
'b': str # Second string
}
Output
int # Length of shortest common supersequence
Constraints
- 1 โค len(a), len(b) โค 2,000
- Strings contain lowercase letters
- Empty string is valid input
Examples
Example 1: Basic SCS
Input: {'a': 'abac', 'b': 'cab'}
Output: 5
Explanation:
- One shortest common supersequence is "cabac"
- Check a = "abac" is subsequence: ca-b-a-c โ
- Check b = "cab" is subsequence: c-a-b-ac โ
- Length: 5
Example 2: Empty String
Input: {'a': '', 'b': 'abc'}
Output: 3
Explanation: - If one string is empty, SCS is the other string - Result: "abc" (length 3)
Example 3: Identical Strings
Input: {'a': 'abc', 'b': 'abc'}
Output: 3
Explanation: - If strings are identical, SCS is the string itself - Result: "abc" (length 3)
Example 4: No Common Characters
Input: {'a': 'abc', 'b': 'def'}
Output: 6
Explanation: - No common characters, must include all from both - One SCS: "abcdef" (length 6)
Example 5: One is Subsequence of Other
Input: {'a': 'abc', 'b': 'aec'}
Output: 4
Explanation: - SCS: "abec" contains both - abc โ abec โ - a-e-c โ abec โ
What You'll Learn
- Dynamic programming on strings
- Longest Common Subsequence (LCS) relationship
- 2D DP table construction
- Sequence alignment algorithms
- Space optimization techniques
Why This Matters
SCS is a fundamental DP problem that: - Demonstrates DP on two sequences - Builds on LCS (simpler problem) - Has practical applications in bioinformatics - Tests understanding of subsequences vs substrings
Companies that ask this: Google, Amazon, Microsoft (rare but important)
Starter Code
def challenge_function(data):
"""
Find the length of the shortest common supersequence.
A supersequence is a string that contains both input strings as
subsequences. Find the shortest such string and return its length.
Args:
data: dict with keys:
- 'a' (str): first string
- 'b' (str): second string
Returns:
int: length of shortest common supersequence
"""
# Your implementation here
pass