Python JSON Schema Validator

This challenge involves implementing a JSON Schema validator in Python. You will create a system that can check if a given JSON document conforms to a predefined JSON Schema. This is crucial for data validation, ensuring data integrity, and facilitating robust communication between different systems or services that exchange JSON data.

Problem Description

Your task is to write a Python function that takes two arguments: a JSON Schema and a JSON document. The function should return True if the JSON document is valid according to the schema, and False otherwise.

Key Requirements:

  1. Basic Type Checking: The validator must support the basic JSON Schema types: string, number (integers and floats), integer, boolean, null, object, and array.
  2. Object Validation:
    • properties: Validate fields within an object against their respective schemas.
    • required: Ensure that specified properties are present in the JSON object.
    • additionalProperties: Control whether properties not listed under the properties keyword are allowed. The value can be a boolean or a schema.
  3. Array Validation:
    • items: Validate elements within an array. This can be a single schema for all items or an array of schemas for specific positional items.
    • minItems and maxItems: Enforce the minimum and maximum number of items in an array.
  4. String Validation:
    • minLength and maxLength: Enforce the minimum and maximum length of a string.
    • pattern: Validate strings against a regular expression.
  5. Number Validation:
    • minimum and maximum: Enforce the minimum and maximum values for numbers.
    • exclusiveMinimum and exclusiveMaximum: Enforce strict inequality for minimum and maximum values.
  6. Recursive Validation: The validator should correctly handle nested schemas, including objects within arrays and arrays within objects. Support for $ref references is optional for this challenge.
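
One Python-specific subtlety in the basic type check is that bool is a subclass of int, so a naive isinstance test would accept true as a number. A minimal sketch of the type check follows. The helper name matches_type is hypothetical, and it assumes that only Python int values count as "integer" and that unknown type names never match.

```python
def matches_type(value, type_name):
    """Return True if `value` (as produced by json.loads) has JSON type `type_name`."""
    if type_name == "null":
        return value is None
    if type_name == "boolean":
        return isinstance(value, bool)
    if type_name == "integer":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    if type_name == "number":
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if type_name == "string":
        return isinstance(value, str)
    if type_name == "array":
        return isinstance(value, list)
    if type_name == "object":
        return isinstance(value, dict)
    # Assumption: an unrecognized type name is treated as a mismatch.
    return False


print(matches_type(30, "number"))     # True
print(matches_type(True, "integer"))  # False
```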

Expected Behavior:

The function should return a boolean value indicating validity. For simplicity, you do not need to return detailed error messages about why a document is invalid, only whether it is valid or not.

Edge Cases to Consider:

  • Empty JSON documents, such as {} or [].
  • Schemas that are themselves invalid (though this challenge assumes valid input schemas).
  • null values for optional fields.
  • Schemas with no properties or items defined.

Examples

Example 1: Basic Type and Property Validation

Input Schema:
{
  "type": "object",
  "properties": {
    "name": {"type": "string"},
    "age": {"type": "number"},
    "isStudent": {"type": "boolean"}
  },
  "required": ["name", "age"]
}

Input JSON:
{
  "name": "Alice",
  "age": 30,
  "isStudent": false
}

Output:
True

Explanation: The JSON object has all required properties ("name", "age") and their types match the schema. The optional "isStudent" property is also of the correct type.
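
The required check from this example can be sketched in a few lines. The snippet below parses both inputs with json.loads and lists the required properties that are absent. The helper name missing_required is hypothetical, and the sketch covers only the required keyword, not the property types.

```python
import json

schema = json.loads("""
{
  "type": "object",
  "properties": {
    "name": {"type": "string"},
    "age": {"type": "number"},
    "isStudent": {"type": "boolean"}
  },
  "required": ["name", "age"]
}
""")
document = json.loads('{"name": "Alice", "age": 30, "isStudent": false}')


def missing_required(schema, document):
    """Return the names listed under "required" that the object lacks."""
    return [name for name in schema.get("required", []) if name not in document]


print(missing_required(schema, document))         # []
print(missing_required(schema, {"name": "Bob"}))  # ['age']
```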

Example 2: Array Validation and Constraints

Input Schema:
{
  "type": "array",
  "items": {"type": "integer"},
  "minItems": 2,
  "maxItems": 5
}

Input JSON:
[1, 2, 3, 4]

Output:
True

Explanation: The JSON is an array, all items are integers, and the number of items (4) is between the min (2) and max (5) specified.
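
The array checks in this example might look like the sketch below. The names check_array and is_integer are hypothetical. The validate argument is a caller-supplied callback that stands in for the full validator. The sketch assumes it is called only for schemas whose type is "array", and that positions beyond a positional items list are unconstrained.

```python
def check_array(schema, value, validate):
    """Check minItems, maxItems, and items for an array schema.

    `validate(subschema, item)` is a callback returning True or False.
    """
    # Assumption: the schema declares "type": "array", so non-lists fail.
    if not isinstance(value, list):
        return False
    if len(value) < schema.get("minItems", 0):
        return False
    if "maxItems" in schema and len(value) > schema["maxItems"]:
        return False
    items = schema.get("items")
    if isinstance(items, dict):
        # A single schema applies to every element.
        return all(validate(items, item) for item in value)
    if isinstance(items, list):
        # Positional schemas: zip stops at the shorter sequence, so
        # elements without a matching schema are not checked here.
        return all(validate(sub, item) for sub, item in zip(items, value))
    return True


def is_integer(subschema, item):
    """Toy callback: accepts integers only, ignoring the subschema."""
    return isinstance(item, int) and not isinstance(item, bool)


schema = {"type": "array", "items": {"type": "integer"}, "minItems": 2, "maxItems": 5}
print(check_array(schema, [1, 2, 3, 4], is_integer))  # True
print(check_array(schema, [1], is_integer))           # False
```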

Example 3: String Pattern and Additional Properties

Input Schema:
{
  "type": "object",
  "properties": {
    "email": {"type": "string", "pattern": "^\\S+@\\S+\\.\\S+$"}
  },
  "additionalProperties": false
}

Input JSON:
{
  "email": "test@example.com",
  "id": 123
}

Output:
False

Explanation: The "email" property is a valid string according to the pattern. However, the JSON object also contains an "id" property that is not defined in the schema's "properties". Because "additionalProperties" is set to false, extra properties are not allowed, so the document is invalid.
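
The two checks in this example can be sketched separately. JSON Schema patterns are not implicitly anchored, so re.search is used rather than re.match or re.fullmatch. The helper names pattern_ok and extra_properties are hypothetical, and the sketch handles only the boolean form of additionalProperties.

```python
import re


def pattern_ok(pattern, text):
    """Return True if `pattern` matches anywhere in `text`."""
    return re.search(pattern, text) is not None


def extra_properties(schema, document):
    """Return the keys of `document` that are not declared under "properties"."""
    declared = schema.get("properties", {})
    return sorted(key for key in document if key not in declared)


schema = {
    "type": "object",
    "properties": {
        "email": {"type": "string", "pattern": r"^\S+@\S+\.\S+$"}
    },
    "additionalProperties": False,
}
document = {"email": "test@example.com", "id": 123}

email_pattern = schema["properties"]["email"]["pattern"]
print(pattern_ok(email_pattern, document["email"]))  # True

extras = extra_properties(schema, document)
print(extras)  # ['id']

# With additionalProperties set to false, any extra key makes the object invalid.
is_valid = not (schema.get("additionalProperties") is False and extras)
print(is_valid)  # False
```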

Example 4: Nested Structures

Input Schema:
{
  "type": "object",
  "properties": {
    "user": {
      "type": "object",
      "properties": {
        "id": {"type": "number"},
        "tags": {
          "type": "array",
          "items": {"type": "string"}
        }
      },
      "required": ["id"]
    }
  },
  "required": ["user"]
}

Input JSON:
{
  "user": {
    "id": 456,
    "tags": ["python", "ai"]
  }
}

Output:
True

Explanation: The nested "user" object is valid. Its "id" is present and is a number. Its "tags" array contains only strings.
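
Nested structures like this one lend themselves to a recursive function that validates a value and then calls itself on each child. The sketch below is deliberately partial: it handles only type, required, properties, and the single-schema form of items, which is enough for this example. The function name validate and the _TYPES table are illustrative choices, not a prescribed design.

```python
_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
}


def validate(schema, value):
    """Recursively validate `value` against a subset of JSON Schema keywords."""
    expected = schema.get("type")
    if expected == "null":
        if value is not None:
            return False
    elif expected in _TYPES:
        # bool is a subclass of int, so reject it for non-boolean types.
        if isinstance(value, bool) and expected != "boolean":
            return False
        if not isinstance(value, _TYPES[expected]):
            return False

    if isinstance(value, dict):
        if any(name not in value for name in schema.get("required", [])):
            return False
        for name, subschema in schema.get("properties", {}).items():
            # Only properties that are present are validated; absence is
            # handled by "required" above.
            if name in value and not validate(subschema, value[name]):
                return False

    if isinstance(value, list) and isinstance(schema.get("items"), dict):
        return all(validate(schema["items"], item) for item in value)

    return True


schema = {
    "type": "object",
    "properties": {
        "user": {
            "type": "object",
            "properties": {
                "id": {"type": "number"},
                "tags": {"type": "array", "items": {"type": "string"}},
            },
            "required": ["id"],
        }
    },
    "required": ["user"],
}

print(validate(schema, {"user": {"id": 456, "tags": ["python", "ai"]}}))  # True
print(validate(schema, {"user": {"tags": ["python"]}}))                   # False
```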

Constraints

  • Your implementation should focus on the core validation logic. You can assume the input JSON Schema and JSON document are well-formed JSON strings that can be parsed by Python's json module.
  • The maximum depth of nested objects/arrays in the schema and JSON document will not exceed 20.
  • No object will have more than 100 properties, and no array will have more than 100 items.
  • Regular expressions for pattern validation will be syntactically correct.
  • Performance is a secondary concern; correctness and adherence to the schema rules are paramount.

Notes

  • You will likely need to use Python's built-in json module to parse the input JSON strings into Python dictionaries and lists.
  • Consider a recursive approach for validating nested structures.
  • Think about how to handle different types of validation keywords (e.g., type, properties, required, items, pattern, minLength, maxLength, minimum, maximum).
  • For items in arrays, you might need to distinguish between a single schema for all items and a list of schemas for positional validation.
  • Implementing additionalProperties as a boolean is simpler than as a schema. Prioritize the boolean case.
  • Full $ref support is advanced and optional. For this challenge, you can assume that schemas neither contain circular references nor rely on external schema files.
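
The numeric keywords from the Number Validation requirement can be checked with plain comparisons. The sketch below assumes the numeric form of exclusiveMinimum and exclusiveMaximum used in JSON Schema draft 6 and later; draft 4 instead defines them as booleans that modify minimum and maximum, and the problem statement does not say which form applies. The helper name check_number is hypothetical, and the caller is assumed to have already confirmed that the value is a number.

```python
def check_number(schema, value):
    """Check minimum, maximum, exclusiveMinimum, and exclusiveMaximum.

    Assumption: the exclusive bounds are numbers (draft 6+ semantics).
    """
    if "minimum" in schema and value < schema["minimum"]:
        return False
    if "maximum" in schema and value > schema["maximum"]:
        return False
    if "exclusiveMinimum" in schema and value <= schema["exclusiveMinimum"]:
        return False
    if "exclusiveMaximum" in schema and value >= schema["exclusiveMaximum"]:
        return False
    return True


bounds = {"minimum": 0, "exclusiveMaximum": 10}
print(check_number(bounds, 0))   # True  (minimum is inclusive)
print(check_number(bounds, 10))  # False (exclusiveMaximum is strict)
```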