Implementing a Suffix Tree in JavaScript
Suffix trees are powerful data structures used for efficient string searching and pattern matching. They represent all suffixes of a string in a tree-like structure, allowing for rapid identification of occurrences of patterns within the original string. This challenge asks you to implement a suffix tree in JavaScript, providing a foundation for advanced text processing applications.
Problem Description
You are tasked with creating a JavaScript class called SuffixTree that represents a suffix tree for a given string. The class should support the following functionalities:
- Construction: The
SuffixTreeclass should be initialized with a string. During initialization, it should construct the suffix tree for that string. - Contains: A method
contains(pattern)that takes a stringpatternas input and returnstrueif thepatternis a substring of the original string, andfalseotherwise. This should be determined by traversing the suffix tree. - CountOccurrences: A method
countOccurrences(pattern)that takes a stringpatternas input and returns the number of times thepatternappears as a substring in the original string. This also requires traversing the suffix tree.
The suffix tree should be implemented using nodes with children (a map/object where keys are characters and values are child nodes) and an optional suffixIndex property for leaf nodes, indicating the starting index of the suffix in the original string. Internal nodes do not have a suffixIndex.
Key Requirements:
- The implementation should be reasonably efficient. While a highly optimized implementation is not required, avoid excessively inefficient algorithms.
- The code should be well-structured and readable.
- The
containsmethod should correctly identify whether a pattern exists within the original string. - The
countOccurrencesmethod should accurately count the number of occurrences of a pattern.
Expected Behavior:
- The constructor should build the suffix tree without errors.
containsshould returntruefor patterns that are substrings andfalseotherwise.countOccurrencesshould return the correct number of occurrences.
Edge Cases to Consider:
- Empty input string.
- Empty pattern string.
- Pattern longer than the original string.
- Overlapping occurrences of the pattern.
- Pattern not present in the original string.
Examples
Example 1:
Input: string = "banana", pattern = "ana"
Output: true
Explanation: "ana" is a substring of "banana".
Example 2:
Input: string = "banana", pattern = "xyz"
Output: false
Explanation: "xyz" is not a substring of "banana".
Example 3:
Input: string = "banana", pattern = "na"
Output: 2
Explanation: "na" appears twice in "banana" (at indices 2 and 4).
Example 4:
Input: string = "aaaa", pattern = "aa"
Output: 3
Explanation: "aa" appears three times in "aaaa" (at indices 0, 1, and 2).
Example 5:
Input: string = "", pattern = "a"
Output: false
Explanation: The string is empty, so no pattern can be found.
Constraints
- The input string will contain only lowercase English letters (a-z).
- The pattern string will also contain only lowercase English letters (a-z).
- The length of the input string will be between 0 and 1000 characters (inclusive).
- The length of the pattern string will be between 0 and 100 characters (inclusive).
- Performance: The
containsandcountOccurrencesmethods should complete within a reasonable time (e.g., less than 1 second) for the given constraints.
Notes
- Consider using a map (JavaScript
Mapobject) to represent thechildrenof each node for efficient character-based lookup. - The suffix tree construction can be a complex process. Focus on the
containsandcountOccurrencesmethods first, ensuring they work correctly before optimizing the tree construction. - Think about how to efficiently traverse the tree to find patterns and count occurrences. You'll need to handle cases where the pattern is a prefix of a suffix.
- The suffix tree doesn't need to be perfectly balanced; the primary goal is to correctly represent the suffixes and enable efficient searching.