The Longest Repeating Subsequence (LRS) problem is finding the longest subsequences of a string that occurs at least twice.

The problem differs from the problem of finding the longest repeating substring. Unlike substrings, subsequences are not required to occupy consecutive positions within the original string.

 
For example, consider the sequence ATACTCGGA.

The length of the longest repeating subsequence is 4
The longest repeating subsequence is ATCG
 
A T A C T C G G A
A T A C T C G G A
 
Note that repeated characters holds a different index in the input string.

Practice this problem

The longest repeating subsequence problem is a classic variation of the Longest Common Subsequence (LCS) problem. The idea is to find the LCS of the given string with itself, i.e., call LCS(X, X) and exclude the cases when indexes are the same (i = j) since repeated characters hold a different index in the input string.

The LRS problem has optimal substructure. We can recursively define the problem as:

            | 0                                    (if i = 0 or j = 0)
LRS[i][j] = | LRS[i-1][j-1] + 1                    (if X[i-1] = X[j-1] & i!=j)
            | max (LRS[i-1][j], LRS[i][j-1])       (if X[i-1] != X[j-1])

The following implementation in C++, Java, and Python recursively finds the length of the longest repeated subsequence of a sequence using the optimal substructure property of the LRS problem:

C++


Download  Run Code

Output:

The length of the longest repeating subsequence is 4

Java


Download  Run Code

Output:

The length of the longest repeating subsequence is 4

Python


Download  Run Code

Output:

The length of the longest repeating subsequence is 4

The worst-case time complexity of the above solution is O(2n) and occupies space in the call stack, where n is the length of the input string. The worst case happens when there is no repeated character present in X (i.e., LRS length is 0), and each recursive call will end up in two recursive calls.
 

 
The LRS problem also exhibits overlapping subproblems. Let’s consider the recursion tree for a sequence of length 5 having all distinct characters whose LRS length is 0.

Longest Repeated subsequence

 
As we can see above, the same subproblems (highlighted in the same color) are getting computed repeatedly. Since both optimal substructure and overlapping subproblems properties of dynamic programming are satisfied, we can save subproblem solutions in memory rather than computed them repeatedly.

The algorithm can be implemented as follows in C++, Java, and Python:

C++


Download  Run Code

Output:

The length of the longest repeating subsequence is 4

Java


Download  Run Code

Output:

The length of the longest repeating subsequence is 4

Python


Download  Run Code

Output:

The length of the longest repeating subsequence is 4

The time complexity of the above top-down solution is O(n2) and requires O(n2) extra space, where n is the length of the input string.
 

 
The above memoized version follows the top-down approach since we first break the problem into subproblems and then calculate and store values. We can also solve this problem in a bottom-up manner by calculating the smaller values of LRS(i, j) first and then building larger values from them. This is demonstrated below in C++, Java, and Python:

C++


Download  Run Code

Output:

The length of the longest repeating subsequence is 4

Java


Download  Run Code

Output:

The length of the longest repeating subsequence is 4

Python


Download  Run Code

Output:

The length of the longest repeating subsequence is 4

The time complexity of the above bottom-up solution is O(n2) and requires O(n2) extra space, where n is the length of the input string. The space complexity of the above solution can be improved to O(n) as calculating the LRS of a row of the LRS table requires only the solutions to the current row and the previous row.

How to extend the above solution for printing the longest repeating subsequence?

The idea is very similar to printing the Longest Common Subsequence (LCS) of two strings. Refer to this post for details.

This approach is demonstrated below in C++, Java, and Python:

C++


Download  Run Code

Output:

The longest repeating subsequence is ATCG

Java


Download  Run Code

Output:

The longest repeating subsequence is ATCG

Python


Download  Run Code

Output:

The longest repeating subsequence is ATCG

Exercise: Write space optimized code for iterative version.