Word Break Problem – Dynamic Programming
Word Break Problem: Given a string and a dictionary of words, determine if the string can be segmented into a space-separated sequence of one or more dictionary words.
For example,
dict[] = { this, th, is, famous, Word, break, b, r, e, a, k, br, bre, brea, ak, problem };
word = Wordbreakproblem
Output:
Word b r e a k problem
Word b r e ak problem
Word br e a k problem
Word br e ak problem
Word bre a k problem
Word bre ak problem
Word brea k problem
Word break problem
The idea is to use recursion to solve this problem. We consider all prefixes of the current string one by one and check if the current prefix is present in the dictionary or not. If the prefix is a valid word, add it to the output string and recur for the remaining string. The recursion’word base case is when the string becomes empty, and we print the output string.
Following is the C++, Java, and Python implementation of the idea:
C++
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 |
#include <iostream> #include <vector> #include <algorithm> using namespace std; // Function to segment a given string into a space-separated // sequence of one or more dictionary words void wordBreak(vector<string> const &dict, string word, string out) { // if the end of the string is reached, // print the output string if (word.size() == 0) { cout << out << endl; return; } for (int i = 1; i <= word.size(); i++) { // consider all prefixes of the current string string prefix = word.substr(0, i); // if the prefix is present in the dictionary, add it to the // output string and recur for the remaining string if (find(dict.begin(), dict.end(), prefix) != dict.end()) { wordBreak(dict, word.substr(i), out + " " + prefix); } } } // Word Break Problem Implementation in C++ int main() { // vector of strings to represent a dictionary // we can also use a Trie or a set to store a dictionary vector<string> dict = { "this", "th", "is", "famous", "Word", "break", "b", "r", "e", "a", "k", "br", "bre", "brea", "ak", "problem" }; // input string string word = "Wordbreakproblem"; wordBreak(dict, word, ""); return 0; } |
Java
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 |
import java.util.Arrays; import java.util.List; class Main { // Function to segment given string into a space-separated // sequence of one or more dictionary words public static void wordBreak(List<String> dict, String word, String out) { // if the end of the string is reached, // print the output string if (word.length() == 0) { System.out.println(out); return; } for (int i = 1; i <= word.length(); i++) { // consider all prefixes of the current string String prefix = word.substring(0, i); // if the prefix is present in the dictionary, add it to the // output string and recur for the remaining string if (dict.contains(prefix)) { wordBreak(dict, word.substring(i), out + " " + prefix); } } } // Word Break Problem Implementation in Java public static void main(String[] args) { // List of strings to represent a dictionary List<String> dict = Arrays.asList("this", "th", "is", "famous", "Word", "break", "b", "r", "e", "a", "k", "br", "bre", "brea", "ak", "problem"); // input string String word = "Wordbreakproblem"; wordBreak(dict, word, ""); } } |
Python
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 |
# Function to segment given string into a space-separated # sequence of one or more dictionary words def wordBreak(words, word, out=''): # if the end of the string is reached, # print the output string if not word: print(out) return for i in range(1, len(word) + 1): # consider all prefixes of the current string prefix = word[:i] # if the prefix is present in the dictionary, add it to the # output string and recur for the remaining string if prefix in words: wordBreak(words, word[i:], out + ' ' + prefix) # Word Break Problem Implementation in Python if __name__ == '__main__': # List of strings to represent a dictionary words = [ 'self', 'th', 'is', 'famous', 'Word', 'break', 'b', 'r', 'e', 'a', 'k', 'br', 'bre', 'brea', 'ak', 'problem' ] # input string word = 'Wordbreakproblem' wordBreak(words, word) |
Word b r e a k problem
Word b r e ak problem
Word br e a k problem
Word br e ak problem
Word bre a k problem
Word bre ak problem
Word brea k problem
Word break problem
There is a very famous alternate version of the above problem in which we only have to determine if a string can be segmented into a space-separated sequence of one or more dictionary words or not, and not actually print all sequences. This version is demonstrated below in C++, Java, and Python:
C++
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 |
#include <iostream> #include <vector> #include <algorithm> using namespace std; // Function to determine if a string can be segmented into space-separated // sequence of one or more dictionary words bool wordBreak(vector<string> const &dict, string word) { // return true if the end of the string is reached if (word.size() == 0) { return true; } for (int i = 1; i <= word.size(); i++) { // consider all prefixes of the current string string prefix = word.substr(0, i); // return true if the prefix is present in the dictionary and the remaining // string also forms a space-separated sequence of one or more // dictionary words if (find(dict.begin(), dict.end(), prefix) != dict.end() && wordBreak(dict, word.substr(i))) { return true; } } // return false if the string can't be segmented return false; } // Word Break Problem Implementation in C++ int main() { // vector of strings to represent a dictionary // we can also use a Trie or a set to store a dictionary vector<string> dict = { "this", "th", "is", "famous", "Word", "break", "b", "r", "e", "a", "k", "br", "bre", "brea", "ak", "problem" }; // input string string word = "Wordbreakproblem"; if (wordBreak(dict, word)) { cout << "The string can be segmented"; } else { cout << "The string can't be segmented"; } return 0; } |
Output:
The string can be segmented
Java
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 |
import java.util.Arrays; import java.util.List; class Main { // Function to determine if a string can be segmented into a // space-separated sequence of one or more dictionary words public static boolean wordBreak(List<String> dict, String word) { // return true if the end of the string is reached, if (word.length() == 0) { return true; } for (int i = 1; i <= word.length(); i++) { // consider all prefixes of the current string String prefix = word.substring(0, i); // return true if the prefix is present in the dictionary and the // remaining string also forms a space-separated sequence of // one or more dictionary words if (dict.contains(prefix) && wordBreak(dict, word.substring(i))) { return true; } } // return false if the string can't be segmented return false; } // Word Break Problem Implementation in Java public static void main(String[] args) { // List of strings to represent a dictionary List<String> dict = Arrays.asList("this", "th", "is", "famous", "Word", "break", "b", "r", "e", "a", "k", "br", "bre", "brea", "ak", "problem"); // input string String word = "Wordbreakproblem"; if (wordBreak(dict, word)) { System.out.println("The string can be segmented"); } else { System.out.println("The string can't be segmented"); } } } |
Output:
The string can be segmented
Python
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 |
# Function to determine if a string can be segmented into space-separated # sequence of one or more dictionary words def wordBreak(words, word): # return true if the end of the string is reached, if not word: return True for i in range(1, len(word) + 1): # consider all prefixes of the current string prefix = word[:i] # return true if the prefix is present in the dictionary and the remaining # string also forms a space-separated sequence of one or more # dictionary words if prefix in words and wordBreak(words, word[i:]): return True # return false if the string can't be segmented return False # Word Break Problem Implementation in Python if __name__ == '__main__': # List of strings to represent a dictionary words = [ 'self', 'th', 'is', 'famous', 'Word', 'break', 'b', 'r', 'e', 'a', 'k', 'br', 'bre', 'brea', 'ak', 'problem' ] # input string word = 'Wordbreakproblem' if wordBreak(words, word): print('The string can be segmented') else: print('The string can\'t be segmented') |
Output:
The string can be segmented
The time complexity of the above solution is exponential and occupies space in the call stack.
The word-break problem has optimal substructure. We have seen that the problem can be broken down into smaller subproblem, which can further be broken down into yet smaller subproblem, and so on. The word-break problem also exhibits overlapping subproblems, so we will end up solving the same subproblem over and over again. If we draw the recursion tree, we can see that the same subproblems are getting computed repeatedly.
The problems having optimal substructure and overlapping subproblem can be solved by dynamic programming, in which subproblem solutions are memoized rather than computed repeatedly. This method is demonstrated below in C++, Java, and Python:
C++
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 |
#include <iostream> #include <string> #include <unordered_set> #include <algorithm> using namespace std; // Function to determine if a string can be segmented into space-separated // sequence of one or more dictionary words bool wordBreak(unordered_set<string> const &dict, string word, vector<int> &lookup) { // `n` stores length of the current substring int n = word.size(); // return true if the end of the string is reached if (n == 0) { return true; } // if the subproblem is seen for the first time if (lookup[n] == -1) { // mark subproblem as seen (0 initially assuming string // can't be segmented) lookup[n] = 0; for (int i = 1; i <= n; i++) { // consider all prefixes of the current string string prefix = word.substr(0, i); // if the prefix is found in the dictionary, then recur for the suffix if (find(dict.begin(), dict.end(), prefix) != dict.end() && wordBreak(dict, word.substr(i), lookup)) { // return true if the string can be segmented return lookup[n] = 1; } } } // return solution to the current subproblem return lookup[n]; } // Word Break Problem Implementation in C++ int main() { // set of strings to represent a dictionary // we can also use a Trie or a vector to store a dictionary unordered_set<string> dict = { "this", "th", "is", "famous", "Word", "break", "b", "r", "e", "a", "k", "br", "bre", "brea", "ak", "problem" }; // input string string word = "Wordbreakproblem"; // lookup array to store solutions to subproblems // lookup[i] stores if substring word[n-i…n) can be segmented or not vector<int> lookup(word.length() + 1, -1); if (wordBreak(dict, word, lookup)) { cout << "The string can be segmented"; } else { cout << "The string can't be segmented"; } return 0; } |
Output:
The string can be segmented
Java
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 |
import java.util.Arrays; import java.util.Set; import java.util.stream.Collectors; import java.util.stream.Stream; class Main { // Function to determine if a string can be segmented into // space-separated sequence of one or more dictionary words public static boolean wordBreak(Set<String> dict, String word, int[] lookup) { // `n` stores length of the current substring int n = word.length(); // return true if the end of the string is reached if (n == 0) { return true; } // if the subproblem is seen for the first time if (lookup[n] == -1) { // mark subproblem as seen (0 initially assuming string // can't be segmented) lookup[n] = 0; for (int i = 1; i <= n; i++) { // consider all prefixes of the current string String prefix = word.substring(0, i); // if the prefix is found in the dictionary, then recur for the suffix if (dict.contains(prefix) && wordBreak(dict, word.substring(i), lookup)) { // return true if the string can be segmented lookup[n] = 1; return true; } } } // return solution to the current subproblem return lookup[n] == 1; } // Word Break Problem Implementation in Java public static void main(String[] args) { // Set of strings to represent a dictionary // we can also use a Trie or a List to store a dictionary Set<String> dict = Stream.of("this", "th", "is", "famous", "Word", "break", "b", "r", "e", "a", "k", "br", "bre", "brea", "ak", "problem") .collect(Collectors.toSet()); // input string String word = "Wordbreakproblem"; // lookup array to store solutions to subproblems // lookup[i] stores if substring word[n-i…n) can be segmented or not int[] lookup = new int[word.length() + 1]; Arrays.fill(lookup, -1); if (wordBreak(dict, word, lookup)) { System.out.println("The string can be segmented"); } else { System.out.println("The string can't be segmented"); } } } |
Output:
The string can be segmented
Python
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 |
# Function to determine if a string can be segmented into space-separated # sequence of one or more dictionary words def wordBreak(words, word, lookup): # `n` stores length of the current substring n = len(word) # return true if the end of the string is reached if n == 0: return True # if the subproblem is seen for the first time if lookup[n] == -1: # mark subproblem as seen (0 initially assuming string # can't be segmented) lookup[n] = 0 for i in range(1, n + 1): # consider all prefixes of the current string prefix = word[:i] # if the prefix is found in the dictionary, then recur for the suffix if prefix in words and wordBreak(words, word[i:], lookup): # return true if the string can be segmented lookup[n] = 1 return True # return solution to the current subproblem return lookup[n] == 1 # Word Break Problem Implementation in Python if __name__ == '__main__': # Set of strings to represent a dictionary # we can also use a Trie or a List to store a dictionary words = { 'self', 'th', 'is', 'famous', 'Word', 'break', 'b', 'r', 'e', 'a', 'k', 'br', 'bre', 'brea', 'ak', 'problem' } # input string word = 'Wordbreakproblem' # lookup table to store solutions to subproblems # lookup[i] stores if substring word[n-i…n) can be segmented or not lookup = [-1] * (len(word) + 1) if wordBreak(words, word, lookup): print('The string can be segmented') else: print('The string can\'t be segmented') |
Output:
The string can be segmented
The time complexity of the above solution is O(n2) and requires O(n) extra space, where n is the length of the input string.
Exercise: Implement a bottom-up version of the above solution.
Thanks for reading.
To share your code in the comments, please use our online compiler that supports C, C++, Java, Python, JavaScript, C#, PHP, and many more popular programming languages.
Like us? Refer us to your friends and support our growth. Happy coding :)