Minimal TOML Parser in Rust
TOML (Tom's Obvious, Minimal Language) is a configuration file format that is easy to read and write. Implementing a TOML parser allows you to easily load configuration data into your Rust programs, making them more flexible and adaptable. This challenge asks you to build a simplified TOML parser that can handle basic TOML structures.
Problem Description
You are tasked with creating a Rust program that parses a simplified TOML file and extracts key-value pairs from the top-level table. The parser should handle strings, integers, and booleans. Arrays and nested tables are not required for this challenge. The parser should gracefully handle invalid TOML by returning an error.
Key Requirements:
- Data Types: Support parsing of strings (enclosed in double quotes), integers, and booleans (true and false).
- Key-Value Pairs: Extract key-value pairs from the top-level table. Keys are strings, and values can be strings, integers, or booleans.
- Error Handling: Return an error if the TOML is invalid (e.g., invalid syntax, unsupported data types).
- Top-Level Table Only: The parser should only process the top-level table. Nested tables or arrays are not required.
- Whitespace: Ignore whitespace around keys and values.
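The data-type and whitespace requirements above could be sketched as a single value-parsing function. This is a minimal illustration, not a reference solution: the function name parse_value, the derive list, and the choice of i64 for integers are assumptions, and escape sequences inside strings are not handled. An empty value is rejected here; Example 3 below treats it differently, which a caller could special-case.

```rust
/// The three value types the challenge requires.
/// (The derives and the i64 width are assumptions of this sketch.)
#[derive(Debug, Clone, PartialEq)]
pub enum TOMLValue {
    String(String),
    Integer(i64),
    Boolean(bool),
}

/// Parses the text to the right of `=` into a `TOMLValue`.
/// Hypothetical helper; escape sequences in strings are not handled.
pub fn parse_value(raw: &str) -> Result<TOMLValue, String> {
    let raw = raw.trim(); // ignore whitespace around the value
    match raw {
        "true" => return Ok(TOMLValue::Boolean(true)),
        "false" => return Ok(TOMLValue::Boolean(false)),
        _ => {}
    }
    // A string starts and ends with a double quote and has no bare quote inside.
    if raw.len() >= 2 && raw.starts_with('"') && raw.ends_with('"') {
        let inner = &raw[1..raw.len() - 1];
        if inner.contains('"') {
            return Err(format!("unexpected quote inside string: {raw}"));
        }
        return Ok(TOMLValue::String(inner.to_string()));
    }
    // Anything else must parse as a signed 64-bit integer.
    raw.parse::<i64>()
        .map(TOMLValue::Integer)
        .map_err(|_| format!("unsupported or invalid value: {raw}"))
}
```

With this sketch, arrays such as [1, 2] and floats such as 1.5 fall through to the integer branch and come back as errors, which matches the requirement to reject unsupported data types.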
Expected Behavior:
The program should take a TOML string as input and return a Result containing a HashMap<String, TOMLValue>. If parsing is successful, the HashMap should contain the extracted key-value pairs. If parsing fails, the Result should contain an error message.
Edge Cases to Consider:
- Empty TOML string.
- TOML string with only whitespace.
- Invalid key names (e.g., containing spaces or special characters).
- Invalid values (e.g., an unquoted word such as abc, which is not a quoted string, an integer, or a boolean).
- Missing values after a key.
- Comments (ignore them).
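Two of the edge cases above, comments and invalid key names, can be isolated in small helpers. The sketch below is one possible approach under stated assumptions: the names strip_comment and is_valid_key are hypothetical, keys are restricted to TOML's bare-key characters (ASCII letters, digits, underscore, hyphen), and escaped quotes inside strings are not handled.

```rust
/// Returns the part of `line` before the first `#` that sits outside a
/// double-quoted string. Hypothetical helper; escaped quotes (`\"`) are
/// not handled in this sketch.
pub fn strip_comment(line: &str) -> &str {
    let mut in_string = false;
    for (i, c) in line.char_indices() {
        match c {
            '"' => in_string = !in_string,
            '#' if !in_string => return &line[..i],
            _ => {}
        }
    }
    line
}

/// Accepts bare keys only: one or more ASCII letters, digits, `_` or `-`.
/// Hypothetical helper; quoted keys are out of scope here.
pub fn is_valid_key(key: &str) -> bool {
    !key.is_empty()
        && key
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
}
```

Tracking whether the scan is inside a string keeps a value like "a # b" intact, while a comment-only line reduces to an empty slice that the caller can skip together with blank lines.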
Examples
Example 1:
Input: "key1 = \"value1\"\nkey2 = 123\nkey3 = true"
Output: HashMap containing: {"key1": TOMLValue::String("value1".to_string()), "key2": TOMLValue::Integer(123), "key3": TOMLValue::Boolean(true)}
Explanation: The input TOML string is parsed, and the key-value pairs are extracted into a HashMap.
Example 2:
Input: "key1 = \"value1\"\nkey2 = false\nkey3 = \"another value\""
Output: HashMap containing: {"key1": TOMLValue::String("value1".to_string()), "key2": TOMLValue::Boolean(false), "key3": TOMLValue::String("another value".to_string())}
Explanation: Handles different data types correctly.
Example 3: (Edge Case)
Input: "key1 = \nkey2 = 123"
Output: HashMap containing: {"key1": TOMLValue::String("".to_string()), "key2": TOMLValue::Integer(123)}
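Explanation: A key with nothing after the equals sign is treated as an empty string. Standard TOML rejects a key without a value, so this behavior is specific to this challenge.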
Constraints
- Input Size: The input TOML string will be no longer than 1024 characters.
- Data Types: Only support strings, integers, and booleans.
- Performance: The parsing time should be reasonable for input strings of the specified size. Optimization is not a primary concern.
- Error Messages: Error messages should be descriptive enough to help identify the parsing error.
Notes
- You'll need to define an enum called TOMLValue to represent the different data types that can be stored in the HashMap.
- Consider using regular expressions or string manipulation techniques to parse the TOML string.
- Focus on correctness and clarity over extreme optimization.
- Comments in TOML are denoted by a '#' character. These should be ignored during parsing.
- The order of key-value pairs in the output HashMap is not important.
- You can use std::collections::HashMap for storing the parsed data.
- The Result type should be Result<HashMap<String, TOMLValue>, String>.
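Putting the notes together, one possible shape for a solution is sketched below, using plain string manipulation rather than regular expressions. It is an illustration under assumptions, not a reference answer: the names parse_toml, parse_value, and ParseResult are hypothetical, integers are stored as i64, escape sequences in strings are not handled, an empty value is mapped to an empty string to match Example 3, and duplicate keys simply overwrite because the challenge does not specify otherwise.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum TOMLValue {
    String(String),
    Integer(i64), // width is an assumption of this sketch
    Boolean(bool),
}

/// Alias (hypothetical name) for the Result type given in the notes.
pub type ParseResult = Result<HashMap<String, TOMLValue>, String>;

/// Parses one already-trimmed value. An empty value becomes an empty
/// string, following Example 3 (standard TOML would reject it).
fn parse_value(raw: &str) -> Result<TOMLValue, String> {
    match raw {
        "" => Ok(TOMLValue::String(String::new())),
        "true" => Ok(TOMLValue::Boolean(true)),
        "false" => Ok(TOMLValue::Boolean(false)),
        s if s.len() >= 2 && s.starts_with('"') && s.ends_with('"') => {
            let inner = &s[1..s.len() - 1];
            if inner.contains('"') {
                Err(format!("unexpected quote inside string {s}"))
            } else {
                Ok(TOMLValue::String(inner.to_string()))
            }
        }
        s => s
            .parse::<i64>()
            .map(TOMLValue::Integer)
            .map_err(|_| format!("unsupported or invalid value `{s}`")),
    }
}

/// Parses top-level `key = value` lines into a map.
pub fn parse_toml(input: &str) -> ParseResult {
    let mut table = HashMap::new();
    for (idx, raw_line) in input.lines().enumerate() {
        let line_no = idx + 1;
        // Cut the line at the first '#' that is not inside a quoted string.
        let mut in_string = false;
        let mut end = raw_line.len();
        for (i, c) in raw_line.char_indices() {
            match c {
                '"' => in_string = !in_string,
                '#' if !in_string => {
                    end = i;
                    break;
                }
                _ => {}
            }
        }
        let line = raw_line[..end].trim();
        if line.is_empty() {
            continue; // blank or comment-only line
        }
        let (key, value) = line
            .split_once('=')
            .ok_or_else(|| format!("line {line_no}: expected `key = value`"))?;
        let key = key.trim();
        let bare = |c: char| c.is_ascii_alphanumeric() || c == '_' || c == '-';
        if key.is_empty() || !key.chars().all(bare) {
            return Err(format!("line {line_no}: invalid key `{key}`"));
        }
        let value = parse_value(value.trim()).map_err(|e| format!("line {line_no}: {e}"))?;
        // Duplicate keys are not covered by the challenge; here the last one wins.
        table.insert(key.to_string(), value);
    }
    Ok(table)
}
```

Calling parse_toml on the input from Example 1 should yield a three-entry map, and each error string carries the line number so the failing line is easy to locate.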