The Shortest Common Supersequence (SCS) is finding the shortest supersequence Z of given sequences X and Y such that both X and Y are subsequences of Z.

For example,

Consider the two following sequences, X and Y:
X: ABCBDAB
Y: BDCABA
 
The length of the SCS is 9
SCS are ABCBDCABA, ABDCABDAB, and ABDCBDABA

Practice this problem

1. Finding the Shortest Common Supersequence

The idea is to fill the lookup table of SCS by finding the length of SCS of given sequences, X[0…m-1] and Y[0…n-1], in a bottom-up manner and then recursively find the SCS in a top-down manner using values in the lookup table, i.e., follow the path from the bottom-right corner of lookup table to its top-left corner when reading out an SCS.

  • If the current characters of X and Y are equal, they are part of the SCS, and move diagonally in the lookup table (i.e., recur to find SCS of sequence X[0…m-2], Y[0…n-2]).
  • If the current characters of X and Y are different, go up or left, depending on which cell has a lower number.
    1. If a top cell of the current cell has less value than the left cell, include the current character of sequence X in SCS and recur to find SCS of sequence X[0…m-2], Y[0…n-1].
    2. If a left cell of the current cell has less value than the top cell, include the current character of sequence Y in SCS and recur to find SCS of sequence X[0…m-1], Y[0…n-2].
  • If the end of either sequence is reached, then the other sequence forms part of the SCS.

Following is the implementation of the above approach in C++, Java, and Python:

C++


Download  Run Code

Output:

The shortest common supersequence of ABCBDAB and BDCABA is ABCBDCABA

Java


Download  Run Code

Output:

The shortest common supersequence of ABCBDAB and BDCABA is ABCBDCABA

Python


Download  Run Code

Output:

The shortest common supersequence of ABCBDAB and BDCABA is ABCBDCABA

The time complexity of the above solution is O(m.n) and requires O(m.n) extra space, where m is the length of the first string and n is the length of the second string.

2. Finding All Shortest Common Supersequence

The shortest common supersequence for two strings is not unique, and the above code doesn’t generate all SCS of the given sequences. It will always print one SCS. The problem with the above code is when the last characters of X and Y are different, it either goes up or left in the lookup table depending upon which side has a smaller value.

How can we modify the code, so it prints all possible SCS?

The idea remains the same except if the top cell is equal to the left cell, consider both top and left directions; otherwise, go in the direction of smaller value.

In other words, if the last characters of X and Y are not the same, then SCS can be constructed from either the top side or from the left side of the lookup table, depending upon which side has a smaller value. If both the values are equal, then SCS can be constructed from both directions in the lookup table. So, we go in the direction of a smaller value or go in both directions if the values are equal.

Following is the C++, Java, and Python implementation based on the above idea:

C++


Download  Run Code

Output:

The shortest common supersequence of ABCBDAB and BDCABA are:
 
ABCBDCABA
ABDCABDAB
ABDCBDABA

Java


Download  Run Code

Output:

The shortest common supersequence of ABCBDAB and BDCABA are: [ABDCBDABA, ABDCABDAB, ABCBDCABA]

Python


Download  Run Code

Output:

The shortest common supersequence of ABCBDAB and BDCABA are: {‘ABCBDCABA’, ‘ABDCBDABA’, ‘ABDCABDAB’}