Rust TOML Parser Challenge

The TOML (Tom's Obvious, Minimal Language) format is a popular configuration file format due to its human-readable nature. This challenge asks you to implement a parser for a simplified subset of TOML in Rust. Building a TOML parser is a great way to practice string manipulation, data structure design, and error handling in Rust.

Problem Description

Your task is to create a Rust library that can parse a string containing a simplified TOML configuration into a structured Rust representation. The parser should handle basic data types, key-value pairs, nested tables (sections), and arrays.

Key Requirements:

  • Data Types: Support for strings, integers, floats, booleans, and arrays.
  • Key-Value Pairs: Parse simple key = value assignments. Keys are bare keys made of ASCII letters, digits, and underscores (e.g., connection_max).
  • Tables (Sections): Support nested tables using dot notation (e.g., [section.subsection]).
  • Arrays: Support arrays of primitive types and other arrays.
  • Error Handling: The parser should return informative errors for invalid TOML syntax.
  • Rust Representation: The parsed TOML should be represented using Rust enums and structs that you define.

Expected Behavior:

The parser will take a &str as input and return a Result containing either a structured representation of the TOML data or a custom error type.
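As a starting point, the return type described above could be modeled as follows. This is only a sketch: the names TomlValue, Table, ParseError, and ParseResult are assumptions chosen to match the examples and notes in this challenge, and your own design may differ.

```rust
use std::collections::HashMap;
use std::fmt;

/// One parsed TOML value (hypothetical name; variants follow the challenge notes).
#[derive(Debug, Clone, PartialEq)]
pub enum TomlValue {
    String(String),
    Integer(i64),
    Float(f64),
    Boolean(bool),
    Array(Vec<TomlValue>),
    Table(Table),
}

/// A table maps keys to values; nested tables are stored as `TomlValue::Table`.
pub type Table = HashMap<String, TomlValue>;

/// Custom error type carrying the line number and a short description.
#[derive(Debug, Clone, PartialEq)]
pub enum ParseError {
    InvalidSyntax { line: usize, message: String },
}

impl fmt::Display for ParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ParseError::InvalidSyntax { line, message } => {
                write!(f, "line {}: {}", line, message)
            }
        }
    }
}

impl std::error::Error for ParseError {}

/// What a top-level `parse(input: &str)` function would return under this design.
pub type ParseResult = Result<Table, ParseError>;
```

Implementing Display and std::error::Error lets callers print the error or propagate it with the ? operator alongside other error types.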

Edge Cases to Consider:

  • Empty input string.
  • Comments (introduced by #, either on a line of their own or trailing a value or table header).
  • Whitespace handling (leading/trailing whitespace around keys, values, and table headers).
  • Empty tables.
  • Arrays with mixed types (for this challenge, you may assume every array is homogeneous).
  • Duplicate keys within the same table (for simplicity, the last occurrence wins).
  • Invalid syntax (e.g., missing values, misplaced brackets).
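Comment and whitespace handling can be isolated in one small helper that runs before any other parsing. The sketch below is one possible approach; the function name strip_comment is hypothetical. It assumes single-line, double-quoted strings with backslash escapes, so that a # inside a string is not mistaken for a comment.

```rust
/// Removes a trailing `#` comment and surrounding whitespace from one line.
/// A `#` inside a double-quoted string is kept as part of the value.
pub fn strip_comment(line: &str) -> &str {
    let mut in_string = false;
    let mut escaped = false;
    for (i, c) in line.char_indices() {
        if in_string {
            if escaped {
                // The character after a backslash never ends the string.
                escaped = false;
            } else if c == '\\' {
                escaped = true;
            } else if c == '"' {
                in_string = false;
            }
        } else if c == '"' {
            in_string = true;
        } else if c == '#' {
            return line[..i].trim();
        }
    }
    line.trim()
}
```

A line that comes back empty is either blank or a pure comment, so the main loop can skip it while still counting it for error line numbers.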

Examples

Example 1:

Input:
name = "Rust Parser"
version = 1.0
enabled = true

[owner]
name = "Hone"
dob = 1979-05-27T07:32:00Z # First class dates are not supported in this challenge
Output:
// Conceptual Rust representation (e.g., using nested HashMaps or custom structs)
{
    "name": "Rust Parser",
    "version": 1.0,
    "enabled": true,
    "owner": {
        "name": "Hone",
        "dob": "1979-05-27T07:32:00Z" // Represent as string for this challenge
    }
}

Explanation: This TOML defines top-level key-value pairs and a nested table [owner] with its own key-value pairs.
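Each key-value line in an example like this one can be split at the first = sign, because bare keys cannot contain that character. The helper below is a minimal sketch under that assumption; split_key_value is a hypothetical name, and it expects comments to have been removed already.

```rust
/// Splits `key = value` into trimmed key and value slices.
/// Only the first `=` is significant, so `=` may still appear inside a quoted value.
pub fn split_key_value(line: &str) -> Result<(&str, &str), String> {
    let (key, value) = line
        .split_once('=')
        .ok_or_else(|| format!("expected 'key = value', found '{}'", line.trim()))?;
    let (key, value) = (key.trim(), value.trim());

    // Assumption: bare keys use ASCII letters, digits, and underscores only.
    if key.is_empty() || !key.chars().all(|c| c.is_ascii_alphanumeric() || c == '_') {
        return Err(format!("invalid key '{}'", key));
    }
    if value.is_empty() {
        return Err(format!("missing value for key '{}'", key));
    }
    Ok((key, value))
}
```

The returned value slice is still raw text; turning it into a typed value is a separate step.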

Example 2:

Input:
[database]
server = "192.168.1.1"
ports = [ 8001, 8001, 8002 ]
connection_max = 5000
enabled = true

[database.connection]
type = "postgresql"
Output:
// Conceptual Rust representation
{
    "database": {
        "server": "192.168.1.1",
        "ports": [8001, 8001, 8002],
        "connection_max": 5000,
        "enabled": true,
        "connection": {
            "type": "postgresql"
        }
    }
}

Explanation: This example shows a nested table, [database.connection], and an array of integers, ports.
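One way to handle a header such as [database.connection] is to walk the dotted path from the root table, creating intermediate tables as needed, and then insert the following key-value pairs into the table that is returned. The sketch below assumes a HashMap-based representation; table_at_path and the trimmed-down TomlValue shown here are hypothetical names for illustration.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum TomlValue {
    String(String),
    Integer(i64),
    Table(Table),
}

pub type Table = HashMap<String, TomlValue>;

/// Returns the table at `path`, creating any missing tables along the way.
/// Fails if a path segment already holds a non-table value.
pub fn table_at_path<'a>(root: &'a mut Table, path: &[&str]) -> Result<&'a mut Table, String> {
    let mut current = root;
    for &segment in path {
        let entry = current
            .entry(segment.to_string())
            .or_insert_with(|| TomlValue::Table(HashMap::new()));
        current = match entry {
            TomlValue::Table(table) => table,
            _ => return Err(format!("key '{}' is not a table", segment)),
        };
    }
    Ok(current)
}
```

Because HashMap::insert overwrites an existing entry, inserting into the returned table also gives the "last occurrence wins" behavior for duplicate keys.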

Example 3: (Error Case)

Input:
name = "Invalid TOML"
version = 1.0

[[invalid_array] # Incorrect syntax for array of tables
Output:
// A descriptive error indicating the syntax error, e.g.:
Err(ParseError::InvalidSyntax { line: 4, message: "Expected ']]' or identifier, found '['" })

Explanation: The fourth line of the input opens with [[ but closes with a single ]. Arrays of tables are not supported by this simplified parser in any case, so the parser should report a syntax error that includes the line number.
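A table-header parser is a natural place to produce this kind of error. The following sketch assumes the caller tracks 1-based line numbers and passes in a line with comments already stripped; parse_table_header and the error messages are hypothetical, and the error type mirrors the shape shown in the example output.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum ParseError {
    InvalidSyntax { line: usize, message: String },
}

/// Parses `[a.b.c]` into its path segments, or reports a syntax error for `line_no`.
pub fn parse_table_header(line: &str, line_no: usize) -> Result<Vec<String>, ParseError> {
    let err = |message: &str| ParseError::InvalidSyntax {
        line: line_no,
        message: message.to_string(),
    };

    let trimmed = line.trim();
    if trimmed.starts_with("[[") {
        return Err(err("arrays of tables ('[[...]]') are not supported"));
    }
    let inner = trimmed
        .strip_prefix('[')
        .and_then(|rest| rest.strip_suffix(']'))
        .ok_or_else(|| err("expected a table header of the form [a.b]"))?;

    let mut path = Vec::new();
    for segment in inner.split('.') {
        let segment = segment.trim();
        // Assumption: segments follow the same rules as bare keys.
        let valid = !segment.is_empty()
            && segment.chars().all(|c| c.is_ascii_alphanumeric() || c == '_');
        if !valid {
            return Err(err("table name segments must be non-empty and alphanumeric"));
        }
        path.push(segment.to_string());
    }
    Ok(path)
}
```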

Constraints

  • The input TOML string will not exceed 1000 lines.
  • Keys consist of ASCII letters, digits, and underscores; table headers additionally use periods to express nesting.
  • Values will be strings (enclosed in double quotes), integers, floats, booleans (true/false), or arrays of these types.
  • The parser should aim to be reasonably efficient, completing parsing within a few milliseconds for typical inputs.
  • Do not use external TOML parsing crates (e.g., toml, toml_edit). You must implement the core parsing logic yourself.

Notes

  • Consider defining an enum to represent the different TOML value types (String, Integer, Float, Boolean, Array, Table).
  • A HashMap<String, TomlValue> could be a good starting point for representing tables.
  • Think about how to manage the current table context as you parse the file.
  • Error messages should be as helpful as possible, indicating the line number and a brief description of the problem.
  • For this challenge, you do not need to support dates, times, datetimes, floats with exponents, or multi-line strings. Treat all dates/times as regular strings if they appear.
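To illustrate the notes above, here is a rough sketch of turning the text on the right-hand side of a key = value line into a typed value. The names parse_value and split_top_level are hypothetical. The sketch deliberately simplifies: it does not process escape sequences inside strings, does not check that arrays are homogeneous, and treats any bare token that starts with a digit and looks like a date or time as a string.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum TomlValue {
    String(String),
    Integer(i64),
    Float(f64),
    Boolean(bool),
    Array(Vec<TomlValue>),
}

/// Splits the inside of `[...]` on commas that are not within a string or inner array.
fn split_top_level(inner: &str) -> Result<Vec<&str>, String> {
    let mut items = Vec::new();
    let (mut depth, mut in_string, mut start) = (0usize, false, 0usize);
    for (i, c) in inner.char_indices() {
        match c {
            '"' => in_string = !in_string, // simplification: escaped quotes are not handled
            '[' if !in_string => depth += 1,
            ']' if !in_string => {
                depth = depth.checked_sub(1).ok_or("unbalanced ']' in array")?;
            }
            ',' if !in_string && depth == 0 => {
                items.push(&inner[start..i]);
                start = i + 1;
            }
            _ => {}
        }
    }
    if in_string || depth != 0 {
        return Err("unterminated string or array".to_string());
    }
    // A trailing comma leaves an empty tail, which is ignored.
    let last = &inner[start..];
    if !last.trim().is_empty() {
        items.push(last);
    }
    Ok(items)
}

pub fn parse_value(text: &str) -> Result<TomlValue, String> {
    let text = text.trim();
    match text {
        "" => return Err("missing value".to_string()),
        "true" => return Ok(TomlValue::Boolean(true)),
        "false" => return Ok(TomlValue::Boolean(false)),
        _ => {}
    }
    if text.len() >= 2 && text.starts_with('"') && text.ends_with('"') {
        return Ok(TomlValue::String(text[1..text.len() - 1].to_string()));
    }
    if text.starts_with('[') {
        let inner = text
            .strip_prefix('[')
            .and_then(|rest| rest.strip_suffix(']'))
            .ok_or("array is missing its closing ']'")?;
        let items = split_top_level(inner)?
            .into_iter()
            .map(parse_value)
            .collect::<Result<Vec<_>, _>>()?;
        return Ok(TomlValue::Array(items));
    }
    if let Ok(int) = text.parse::<i64>() {
        return Ok(TomlValue::Integer(int));
    }
    // Only plain decimal floats: no exponents, `inf`, or `nan`.
    let float_like = text.contains('.')
        && text.chars().all(|c| c.is_ascii_digit() || matches!(c, '.' | '+' | '-'));
    if float_like {
        return text
            .parse::<f64>()
            .map(TomlValue::Float)
            .map_err(|_| format!("invalid float '{}'", text));
    }
    // Assumption: bare date/time tokens are kept as strings, as the challenge allows.
    let date_like = text.starts_with(|c: char| c.is_ascii_digit())
        && text.chars().all(|c| c.is_ascii_alphanumeric() || matches!(c, '-' | ':' | '.' | '+'));
    if date_like {
        return Ok(TomlValue::String(text.to_string()));
    }
    Err(format!("unrecognized value '{}'", text))
}
```

Returning a plain String error here keeps the sketch short; a caller that knows the current line number can wrap the message in its own error type.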