Building a Simple Configuration DSL Parser in Python

This challenge focuses on creating a Domain-Specific Language (DSL) parser for a custom configuration format. You will implement a Python program that can read and interpret a simplified configuration language, transforming it into a structured data representation. This is a fundamental skill for many software development tasks, including building configuration systems, scripting languages, and data processing tools.

Problem Description

Your task is to build a Python parser for a simple configuration DSL. This DSL is designed to define settings with key-value pairs, nested sections, and basic list structures. The parser should take a string representing the DSL code as input and produce a Python dictionary representing the parsed configuration.

Key Requirements:

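  1. Section Handling: The DSL supports nested sections. A section is a name followed by a delimited block: the examples use curly braces {} and, for the inner section of Example 2, square brackets [].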
  2. Key-Value Pairs: Within sections or at the top level, settings are defined as key = value. Values can be strings, integers, booleans (true, false), or lists.
  3. List Support: Lists are defined using parentheses () with comma-separated elements. Elements within a list can be of any supported type.
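  4. Comments: Lines starting with # should be ignored. As Example 1 shows, a # comment may also trail a value on the same line, and that trailing part should be dropped too.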
  5. Whitespace: Leading/trailing whitespace around keys, values, and section names should be ignored.
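
The comment and whitespace rules can be sketched as a small line-cleaning helper. This is a minimal sketch, not a reference solution: the name strip_comment is hypothetical, and treating a # inside double quotes as literal text is an assumption the problem statement does not spell out.

```python
def strip_comment(line):
    """Return the line without its '#' comment and surrounding whitespace.

    Assumption: a '#' inside a double-quoted string is part of the value.
    """
    in_quotes = False
    for index, char in enumerate(line):
        if char == '"':
            in_quotes = not in_quotes
        elif char == "#" and not in_quotes:
            # Everything from the first unquoted '#' onward is a comment.
            return line[:index].strip()
    return line.strip()
```

A line that is empty after cleaning (a blank line or a full-line comment) can simply be skipped by the caller.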

Expected Behavior:

The parser should correctly interpret the DSL syntax and convert it into a nested Python dictionary.

  • Strings should be represented as Python strings.
  • Integers should be represented as Python integers.
  • Booleans true and false should be represented as Python booleans True and False.
  • Lists should be represented as Python lists.
  • Nested sections should translate to nested dictionaries.
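
The scalar conversions listed above might look like the following sketch. The function name parse_scalar is hypothetical, and accepting a leading minus sign on integers is an assumption; the statement only shows non-negative integers. Quotes are checked first so that a quoted value such as "1.0" or "true" always stays a string.

```python
import re


def parse_scalar(token):
    """Convert one raw value token into a str, int, or bool."""
    token = token.strip()
    # Quoted values are always strings, whatever they look like inside.
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return token[1:-1]
    if token == "true":
        return True
    if token == "false":
        return False
    # Assumption: an optional minus sign followed by ASCII digits is an integer.
    if re.fullmatch(r"-?[0-9]+", token):
        return int(token)
    # Anything else is an unquoted string.
    return token
```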

Edge Cases to Consider:

  • Empty input string.
  • Input containing only comments or whitespace.
  • Values containing spaces (the spaces are part of the string value).
  • Empty sections.
  • Empty lists.
  • Values that look like numbers but should be treated as strings (e.g., version = "1.0").
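
The empty-list case is easy to get wrong when splitting on commas, because splitting an empty string still yields one (empty) element. The sketch below shows one way to split a list literal into raw element tokens. The name split_list_items is hypothetical, and the sketch makes several assumptions: the whole list sits on one line, commas inside double quotes do not separate elements, a trailing comma is tolerated, and nested lists are not handled.

```python
def split_list_items(text):
    """Split '( a, b, c )' into raw element tokens; '()' gives []."""
    text = text.strip()
    if not (text.startswith("(") and text.endswith(")")):
        raise ValueError(f"not a list literal: {text!r}")
    items = []
    current = []
    in_quotes = False
    for char in text[1:-1]:
        if char == '"':
            in_quotes = not in_quotes
            current.append(char)
        elif char == "," and not in_quotes:
            # An unquoted comma ends the current element.
            items.append("".join(current).strip())
            current = []
        else:
            current.append(char)
    last = "".join(current).strip()
    if last:
        # Skipping an empty final token covers both '()' and a trailing comma.
        items.append(last)
    return items
```

Each returned token still needs to be converted to its Python type before it goes into the result.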

Examples

Example 1:

Input:
# This is a sample configuration
database {
  host = localhost
  port = 5432
  enabled = true
}

logging {
  level = info
  file = app.log
  rotations = ( 7, 30, 365 ) # Weekly, monthly, yearly
}
Output:
{
  "database": {
    "host": "localhost",
    "port": 5432,
    "enabled": True
  },
  "logging": {
    "level": "info",
    "file": "app.log",
    "rotations": [7, 30, 365]
  }
}

Explanation: The input defines two top-level sections, database and logging. database contains simple key-value pairs. logging contains a list rotations with integer elements. Comments and whitespace are ignored.

Example 2:

Input:
api_key = "your_secret_key_here"
timeout_seconds = 30
feature_flags = ( enabled, beta_test, new_ui )
nested {
  setting1 = value1
  sub_nested [
    option_a = 100
    option_b = false
  ]
}
Output:
{
  "api_key": "your_secret_key_here",
  "timeout_seconds": 30,
  "feature_flags": ["enabled", "beta_test", "new_ui"],
  "nested": {
    "setting1": "value1",
    "sub_nested": {
      "option_a": 100,
      "option_b": False
    }
  }
}

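Explanation: This example demonstrates top-level key-value pairs (a quoted string and an integer), a list of unquoted strings, and nested sections holding different value types, including a boolean. Note that sub_nested is delimited by square brackets rather than curly braces.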

Example 3:

Input:
# Empty configuration
Output:
{}

Explanation: An empty input or input containing only comments should result in an empty dictionary.

Constraints

  • The input DSL string will not exceed 10,000 characters.
  • Section names and keys will be alphanumeric strings (a-z, A-Z, 0-9, and underscore _).
  • Values can be strings (enclosed in double quotes ", or unquoted as in host = localhost), integers, booleans (true, false), or lists.
  • Strings may contain spaces but not escaped quotes.
  • The parser should be reasonably efficient, capable of parsing typical configuration files within a few milliseconds.

Notes

  • You can approach this problem using regular expressions for simpler parsing or by implementing a more robust lexer/parser combination (e.g., using libraries or custom logic).
  • Consider how to handle unquoted values that could be mistaken for booleans or numbers. For this DSL, a quoted value is always a string; an unquoted value is an integer if it is an integer literal, a boolean if it is exactly true or false, and a string otherwise.
  • A good starting point is to iterate through lines, clean them up, and then parse each line's content.
  • For lists and nested structures, you'll need to manage a parsing state or use recursive parsing.
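
One way to manage the parsing state mentioned in the last note is a stack of open sections. The sketch below is an illustration under stated assumptions, not a reference solution: parse_sections, SECTION_RE, CLOSER_FOR, and the convert hook are all hypothetical names; each section header and each closing delimiter is assumed to sit on its own line, as in the examples; and comment removal is deliberately naive. Value typing is left to the convert hook so that the nesting logic stays visible.

```python
import re

# Assumed header shape: a name, then "{" or "[" at the end of the line.
SECTION_RE = re.compile(r"([A-Za-z0-9_]+)\s*([{\[])")
CLOSER_FOR = {"{": "}", "[": "]"}


def parse_sections(source, convert=lambda text: text):
    """Parse sections and 'key = value' lines into nested dicts.

    `convert` is a hypothetical hook that turns raw value text into a
    Python value; by default values stay as stripped strings.
    """
    root = {}
    # Each stack entry pairs an open dict with the delimiter that closes it.
    stack = [(root, None)]
    for lineno, raw in enumerate(source.splitlines(), start=1):
        # Naive comment removal: does not protect '#' inside quoted strings.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        current, closer = stack[-1]
        if line in ("}", "]"):
            if line != closer:
                raise ValueError(f"line {lineno}: unexpected {line!r}")
            stack.pop()
            continue
        header = SECTION_RE.fullmatch(line)
        if header:
            name, opener = header.groups()
            section = {}
            current[name] = section
            stack.append((section, CLOSER_FOR[opener]))
            continue
        key, sep, value = line.partition("=")
        if not sep or not key.strip():
            raise ValueError(f"line {lineno}: expected 'key = value'")
        current[key.strip()] = convert(value.strip())
    if len(stack) > 1:
        raise ValueError("unclosed section at end of input")
    return root
```

Because the top of the stack is always the innermost open section, nesting depth needs no special handling, and an empty section naturally becomes an empty dictionary. A recursive-descent parser over a token stream is the usual alternative when values or sections may span lines in less regular ways.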