Hone logo
Hone
Problems

Regular Expression Matching with Wildcards

This challenge explores the fundamental problem of regular expression matching, a crucial skill in text processing and pattern recognition. You'll implement a function that determines if a given string matches a regular expression pattern, where the pattern can contain special characters like '.' (matches any single character) and '*' (matches zero or more occurrences of the preceding element). This is useful for tasks like data validation, search, and text manipulation.

Problem Description

You are tasked with writing a function that takes two strings as input: a text string (text) and a regular expression pattern (pattern). The pattern can contain the following special characters:

  • . (dot): Matches any single character.
  • * (asterisk): Matches zero or more occurrences of the preceding character.

The function should return true if the text string matches the pattern, and false otherwise. A match means that the entire text string can be generated by concatenating substrings that match the pattern.

Key Requirements:

  • The pattern can start or end with *.
  • The pattern can contain consecutive * characters.
  • The pattern can contain multiple . characters.
  • The text string and pattern can be empty.

Expected Behavior:

The function should accurately determine whether the text string matches the pattern, considering the special characters and their meanings. It should handle various edge cases, including empty strings and patterns with only special characters.

Edge Cases to Consider:

  • Empty text string and empty pattern.
  • Empty text string and a pattern containing only *.
  • Pattern starting with *.
  • Pattern ending with *.
  • Consecutive * characters in the pattern.
  • Text string containing characters not matched by the pattern.
  • Pattern containing characters not present in the text string.

Examples

Example 1:

Input: text = "aa", pattern = "a"
Output: false
Explanation: The pattern "a" cannot match the text "aa" because the text is longer.

Example 2:

Input: text = "aa", pattern = "a*"
Output: true
Explanation: The pattern "a*" matches zero or more 'a' characters.  In this case, it matches both 'a's in the text.

Example 3:

Input: text = "ab", pattern = ".*"
Output: true
Explanation: The pattern ".*" matches any sequence of characters (including an empty sequence).  Therefore, it matches the entire text "ab".

Example 4:

Input: text = "aab", pattern = "c*a*b"
Output: true
Explanation: The pattern "c*a*b" matches the text "aab" because "c*" matches zero 'c's, "a*" matches both 'a's, and "b" matches the single 'b'.

Example 5:

Input: text = "mississippi", pattern = "mis*is*p*."
Output: false
Explanation: The pattern "mis*is*p*." does not match the text "mississippi".

Constraints

  • The length of the text string (text) is between 0 and 200.
  • The length of the pattern string (pattern) is between 0 and 200.
  • Both text and pattern will only contain lowercase English letters, and the characters '.', and '*'.
  • The solution should aim for a time complexity of O(m*n), where 'm' is the length of the text and 'n' is the length of the pattern. While a more optimized solution might exist, this is a reasonable target.

Notes

Consider using dynamic programming or recursion with memoization to efficiently solve this problem. The key is to break down the problem into smaller subproblems and reuse the solutions to those subproblems. Think about how the . and * characters affect the matching process and how to handle them correctly. A recursive approach can be intuitive, but be mindful of potential stack overflow issues for very long strings. Dynamic programming can help avoid redundant calculations.

Loading editor...
plaintext