Word Break II: Sentence Segmentation
Given a string and a dictionary of words, the goal is to find all possible ways to segment the string into a sequence of words from the dictionary, forming valid sentences. This problem is a classic example of dynamic programming or recursion with memoization, frequently encountered in natural language processing and text analysis tasks where understanding word boundaries is crucial.
Problem Description
You are given a non-empty string s and a dictionary of strings wordDict. Your task is to determine all possible sentences that can be formed by breaking the string s into a space-separated sequence of one or more dictionary words. Each word in the dictionary may be used multiple times.
Key Requirements:
- The entire string
smust be segmented. - Each segment must be a valid word present in
wordDict. - The output should be a list of all possible valid sentences.
Expected Behavior:
- For each valid segmentation, construct a sentence by joining the words with spaces.
- The order of words in a sentence matters.
Edge Cases to Consider:
- The input string
smight not be segmentable at all. In this case, return an empty list. - The
wordDictcould be empty. - The
wordDictmight contain very long words or duplicate words.
Examples
Example 1:
Input:
s = "catsanddog"
wordDict = ["cat", "cats", "and", "sand", "dog"]
Output:
["cats and dog", "cat sand dog"]
Explanation:
"catsanddog" can be segmented as "cats and dog" (using "cats", "and", "dog") or "cat sand dog" (using "cat", "sand", "dog").
Example 2:
Input:
s = "pineapplepenapple"
wordDict = ["apple", "pen", "applepen", "pine", "pineapple"]
Output:
["pine apple pen apple", "pineapple pen apple", "pine applepen apple"]
Explanation:
The string can be segmented in three ways:
1. "pine" + "apple" + "pen" + "apple"
2. "pineapple" + "pen" + "apple"
3. "pine" + "applepen" + "apple"
Example 3:
Input:
s = "catsandog"
wordDict = ["cats", "dog", "sand", "and", "cat"]
Output:
[]
Explanation:
The string "catsandog" cannot be segmented into any sequence of words from the given dictionary.
Constraints
1 <= s.length <= 201 <= wordDict.length <= 10001 <= wordDict[i].length <= 20sandwordDict[i]consist of only lowercase English letters.- All strings in
wordDictare unique.
Notes
- Consider using recursion with memoization (dynamic programming) to efficiently solve this problem.
- A good approach involves breaking down the problem into smaller subproblems: can the suffix of the string be segmented?
- You might find it helpful to first determine if a string can be segmented at all (similar to the Word Break I problem) before attempting to generate all segmentations. This can act as a pruning step.