Suffix Array Construction in JavaScript
Suffix arrays are a fundamental data structure in string algorithms, used for efficient pattern searching, longest common substring identification, and more. This challenge asks you to implement a suffix array construction algorithm in JavaScript. The suffix array of a string lists the starting indices of all of its suffixes, ordered so that the suffixes themselves are in sorted order.
Problem Description
You are tasked with creating a JavaScript function buildSuffixArray(str) that takes a string str as input and returns a suffix array representing that string. The suffix array should be an array of integers, where each integer represents the starting index of a suffix in the original string. The suffixes should be sorted lexicographically (alphabetical order).
Key Requirements:
- The function must accept a string as input.
- The function must return an array of integers representing the suffix array.
- The suffixes must be sorted lexicographically.
- The returned array should contain the starting indices of the sorted suffixes.
- The function should handle empty strings and strings with repeated characters correctly.
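The requirements above can be checked mechanically. The following is a minimal sketch of such a check, not part of the challenge itself: the helper name isValidSuffixArray is an assumption made for illustration. It verifies that the result is a permutation of the indices 0..n-1 and that each adjacent pair of suffixes is in strictly increasing order. It compares whole suffixes, so it is only meant for small test inputs.

```javascript
// Hypothetical helper (not part of the challenge): checks whether `sa`
// is a valid suffix array for `str`. Intended for small test inputs only.
function isValidSuffixArray(str, sa) {
  const n = str.length;
  if (!Array.isArray(sa) || sa.length !== n) return false;

  // Every index 0..n-1 must appear exactly once.
  const seen = new Array(n).fill(false);
  for (const i of sa) {
    if (!Number.isInteger(i) || i < 0 || i >= n || seen[i]) return false;
    seen[i] = true;
  }

  // Adjacent suffixes must be strictly increasing. The < operator compares
  // strings by UTF-16 code units, which is lexicographic order for a-z.
  for (let j = 1; j < n; j++) {
    if (!(str.slice(sa[j - 1]) < str.slice(sa[j]))) return false;
  }
  return true;
}
```

For example, isValidSuffixArray("banana", [5, 3, 1, 0, 4, 2]) returns true, and an empty string with an empty array also passes.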
Expected Behavior:
The function should return an array containing the starting indices of the suffixes of the input string, sorted in lexicographical order.
Edge Cases to Consider:
- Empty String: If the input string is empty, the function should return an empty array.
- String with Repeated Characters: The function should correctly handle strings with repeated characters, ensuring that the suffixes are sorted correctly.
- Large Strings: While not a strict requirement, consider the efficiency of your algorithm for potentially large input strings.
Examples
Example 1:
Input: "banana"
Output: [5, 3, 1, 0, 4, 2]
Explanation: The suffixes of "banana" are: "banana", "anana", "nana", "ana", "na", "a". Sorted lexicographically, they are: "a", "ana", "anana", "banana", "na", "nana". The corresponding starting indices are: 5, 3, 1, 0, 4, 2.
Example 2:
Input: "abaaba"
Output: [5, 2, 3, 0, 4, 1]
Explanation: The suffixes of "abaaba" are: "abaaba", "baaba", "aaba", "aba", "ba", "a". Sorted lexicographically, they are: "a", "aaba", "aba", "abaaba", "ba", "baaba". The corresponding starting indices are: 5, 2, 3, 0, 4, 1.
Example 3:
Input: ""
Output: []
Explanation: An empty string has no suffixes, so the suffix array is empty.
Constraints
- The input string str will consist of lowercase English letters (a-z).
- The length of the input string str will be between 0 and 100,000 characters (inclusive).
- The time complexity of your solution should be reasonable for strings of length up to 100,000. While O(n^2 log n) is acceptable, more efficient algorithms (e.g., O(n log n)) are preferred.
- The space complexity should be considered, but is less critical than time complexity for this problem.
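One well-known way to avoid quadratic behavior is prefix doubling: rank suffixes by their first k characters, then combine pairs of ranks to sort by the first 2k characters, doubling k until all ranks are distinct. The sketch below is one possible implementation under the stated constraints, not a reference solution. The function name buildSuffixArrayDoubling is an assumption. It uses the built-in comparison sort in each round, which gives O(n log^2 n) overall; replacing that sort with a radix sort is the usual route to O(n log n).

```javascript
// Sketch of prefix doubling (hypothetical name, not the required
// buildSuffixArray). Runs in O(n log^2 n) with the built-in sort.
function buildSuffixArrayDoubling(str) {
  const n = str.length;
  if (n === 0) return [];

  // sa holds suffix start indices; rank[i] is the current rank of suffix i.
  const sa = Array.from({ length: n }, (_, i) => i);
  let rank = Array.from({ length: n }, (_, i) => str.charCodeAt(i));
  let tmp = new Array(n).fill(0);

  for (let k = 1; ; k *= 2) {
    // Rank of the second half (characters k..2k-1); -1 if it runs off the end,
    // so that a shorter suffix sorts before a longer one sharing its prefix.
    const secondKey = (i) => (i + k < n ? rank[i + k] : -1);
    const compare = (a, b) =>
      rank[a] - rank[b] || secondKey(a) - secondKey(b);

    sa.sort(compare);

    // Assign new ranks: equal (first, second) key pairs share a rank.
    tmp[sa[0]] = 0;
    for (let i = 1; i < n; i++) {
      tmp[sa[i]] = tmp[sa[i - 1]] + (compare(sa[i - 1], sa[i]) < 0 ? 1 : 0);
    }
    [rank, tmp] = [tmp, rank];

    // All ranks distinct means the order is final.
    if (rank[sa[n - 1]] === n - 1) break;
  }
  return sa;
}
```

Each round compares two integers per suffix pair instead of whole substrings, so the cost of a comparison no longer depends on how long the shared prefix is. That matters most for inputs with long repeats, such as a string of 100,000 identical characters.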
Notes
- You can use any standard JavaScript data structures and algorithms.
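- The result is an array of numeric indices, so whichever sort you use needs a comparator that orders two indices by the suffixes they start. Called without a comparator, Array.prototype.sort compares elements as strings.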
- A common approach is to create an array of objects, where each object contains the suffix and its starting index, sort the array of objects, and then extract the starting indices to form the suffix array.
- There are several algorithms for suffix array construction (e.g., naive O(n^2 log n) sorting, more advanced O(n log n) algorithms). Choose an algorithm that you are comfortable with and that meets the performance constraints.
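The object-based approach described in the notes above can be sketched as follows. This is a minimal baseline, assuming the a-z input constraint. The name buildSuffixArrayNaive is chosen here for illustration and is not the required function name. Each comparison may scan a long shared prefix, so the worst case is O(n^2 log n).

```javascript
// Naive baseline (hypothetical name): pair each suffix with its start index,
// sort the pairs by suffix, then keep only the indices.
function buildSuffixArrayNaive(str) {
  const suffixes = [];
  for (let i = 0; i < str.length; i++) {
    suffixes.push({ suffix: str.slice(i), index: i });
  }

  // < and > compare strings by UTF-16 code units, which matches
  // lexicographic order for lowercase a-z. Two suffixes of the same
  // string are never equal, but the 0 branch keeps the comparator well-formed.
  suffixes.sort((a, b) =>
    a.suffix < b.suffix ? -1 : a.suffix > b.suffix ? 1 : 0
  );

  return suffixes.map((entry) => entry.index);
}
```

Calling buildSuffixArrayNaive("banana") returns [5, 3, 1, 0, 4, 2], and an empty string yields an empty array because the loop never runs. This version is easy to verify by hand, which makes it a useful cross-check for a faster implementation on small inputs.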