Hone logo
Hone
Problems

Simple Task Definition Language (TDL) Parser

This challenge asks you to build a simple Domain-Specific Language (DSL) parser in Python. TDL is designed to define a series of tasks with dependencies. A parser will take a TDL definition as input and produce a data structure representing the task graph, allowing for later processing (e.g., task scheduling).

Problem Description

You need to create a Python parser that can interpret a TDL definition. The TDL syntax is as follows:

  • Tasks: Tasks are defined using the task keyword, followed by a task name, and optionally, a list of dependencies enclosed in parentheses.
  • Dependencies: Dependencies are specified as a comma-separated list of task names within the parentheses.
  • Task Definition: The entire definition consists of a series of task declarations, each on a new line.
  • Whitespace: Whitespace (spaces, tabs, newlines) is ignored except within the parentheses of dependencies.

The parser should produce a dictionary where keys are task names and values are lists of their dependencies. If a task has no dependencies, its value should be an empty list.

Key Requirements:

  • The parser must handle task names consisting of alphanumeric characters (a-z, A-Z, 0-9) and underscores.
  • The parser must correctly identify and extract task names and dependencies.
  • The parser must handle tasks with no dependencies.
  • The parser must handle tasks with multiple dependencies.
  • The parser must ignore any lines that do not conform to the task declaration format.

Expected Behavior:

The parser should take a string containing the TDL definition as input and return a dictionary representing the task graph.

Edge Cases to Consider:

  • Empty input string.
  • Lines that start with task but are incomplete (e.g., task my_task).
  • Invalid task names (e.g., containing spaces or special characters other than underscores). These lines should be ignored.
  • Dependencies containing invalid task names. These should be ignored.
  • Multiple spaces within the dependency list.
  • Empty dependency list (e.g., task my_task()).

Examples

Example 1:

Input:
task task1
task task2(task1)
task task3(task1,task2)

Output:

{
    'task1': [],
    'task2': ['task1'],
    'task3': ['task1', 'task2']
}

Explanation: The parser correctly identifies the tasks and their dependencies.

Example 2:

Input:
task task1
task task2
task task3(task1, task2)

Output:

{
    'task1': [],
    'task2': [],
    'task3': ['task1', 'task2']
}

Explanation: Tasks without dependencies have an empty list as their value.

Example 3:

Input:
task task1
task task2(task1, task3)
invalid line
task task4(task1, task2, task5)

Output:

{
    'task1': [],
    'task2': ['task1'],
    'task4': ['task1', 'task2']
}

Explanation: The parser ignores the invalid line and dependencies that don't exist.

Constraints

  • Input Size: The input string can be up to 1000 characters.
  • Task Name Length: Task names can be up to 32 characters long.
  • Dependency List Length: The dependency list can contain up to 10 dependencies.
  • Performance: The parser should complete within 0.5 seconds for the given input size.
  • Input Format: The input will always be a string.

Notes

  • You can use regular expressions or string manipulation techniques to parse the TDL definition.
  • Consider breaking down the problem into smaller, manageable functions (e.g., a function to extract the task name and dependencies from a line).
  • Error handling is not strictly required, but ignoring invalid lines is a key requirement. Focus on correctly parsing valid TDL definitions.
  • The order of tasks in the output dictionary does not matter.
  • Assume that task names will be unique within the input.
Loading editor...
python