Python JSON Schema Validator
This challenge involves implementing a JSON Schema validator in Python. You will create a system that can check if a given JSON document conforms to a predefined JSON Schema. This is crucial for data validation, ensuring data integrity, and facilitating robust communication between different systems or services that exchange JSON data.
Problem Description
Your task is to write a Python function that takes two arguments: a JSON Schema and a JSON document. The function should return True if the JSON document is valid according to the schema, and False otherwise.
Key Requirements:
- Basic Type Checking: The validator must support the basic JSON types: string, number (integers and floats), boolean, null, object, and array.
- Object Validation:
  - properties: Validate fields within an object against their respective schemas.
  - required: Ensure that specified properties are present in the JSON object.
  - additionalProperties: Control whether properties not defined in properties are allowed. This can be a boolean or a schema.
- Array Validation:
  - items: Validate elements within an array. This can be a single schema for all items or an array of schemas for specific positional items.
  - minItems and maxItems: Enforce the minimum and maximum number of items in an array.
- String Validation:
  - minLength and maxLength: Enforce the minimum and maximum length of a string.
  - pattern: Validate strings against a regular expression.
- Number Validation:
  - minimum and maximum: Enforce the minimum and maximum values for numbers.
  - exclusiveMinimum and exclusiveMaximum: Enforce strict inequality for minimum and maximum values.
- Recursive Validation: The validator should correctly handle nested schemas, including objects within arrays, arrays within objects, and recursive references (though full $ref support is optional for this challenge).
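The requirements above can be combined into one recursive function. The following is a minimal sketch rather than a reference solution, and it rests on several assumptions that the challenge text does not settle: the names `is_valid`, `_type_ok`, and `_is_number` are hypothetical, `type` is assumed to be a single string (not a list of types), `integer` is accepted as a type name because Example 2 uses it, `exclusiveMinimum` and `exclusiveMaximum` are assumed to be numeric bounds, and `$ref` is not handled.

```python
import re

# Python types that json.loads produces for each non-numeric JSON Schema type.
_PYTHON_TYPES = {"string": str, "boolean": bool, "null": type(None),
                 "object": dict, "array": list}


def _is_number(value):
    # bool is a subclass of int in Python, so it is excluded explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def _type_ok(name, value):
    """Check a single type name (assumed to be a string) against a value."""
    if name == "integer":
        return isinstance(value, int) and not isinstance(value, bool)
    if name == "number":
        return _is_number(value)
    expected = _PYTHON_TYPES.get(name)
    return expected is not None and isinstance(value, expected)


def is_valid(schema, instance):
    """Return True if the parsed instance satisfies the supported keywords."""
    if "type" in schema and not _type_ok(schema["type"], instance):
        return False

    if isinstance(instance, dict):
        if any(key not in instance for key in schema.get("required", [])):
            return False
        props = schema.get("properties", {})
        extra = schema.get("additionalProperties", True)
        for key, value in instance.items():
            if key in props:
                if not is_valid(props[key], value):
                    return False
            elif extra is False:
                return False
            elif isinstance(extra, dict) and not is_valid(extra, value):
                return False

    elif isinstance(instance, list):
        if len(instance) < schema.get("minItems", 0):
            return False
        if "maxItems" in schema and len(instance) > schema["maxItems"]:
            return False
        items = schema.get("items", {})
        if isinstance(items, dict):
            # One schema applied to every element.
            if not all(is_valid(items, element) for element in instance):
                return False
        elif not all(is_valid(s, e) for s, e in zip(items, instance)):
            # Positional schemas; elements beyond the list are unconstrained.
            return False

    elif isinstance(instance, str):
        if len(instance) < schema.get("minLength", 0):
            return False
        if "maxLength" in schema and len(instance) > schema["maxLength"]:
            return False
        if "pattern" in schema and re.search(schema["pattern"], instance) is None:
            return False

    elif _is_number(instance):
        if "minimum" in schema and instance < schema["minimum"]:
            return False
        if "maximum" in schema and instance > schema["maximum"]:
            return False
        if "exclusiveMinimum" in schema and instance <= schema["exclusiveMinimum"]:
            return False
        if "exclusiveMaximum" in schema and instance >= schema["exclusiveMaximum"]:
            return False

    return True
```

Each keyword is applied only when the instance has the Python type that the keyword concerns, so a schema without `type` still constrains whatever value it receives. The function expects already-parsed Python objects, so JSON strings would need to go through `json.loads` first.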
Expected Behavior:
The function should return a boolean value indicating validity. For simplicity, you do not need to return detailed error messages about why a document is invalid, only whether it is valid or not.
Edge Cases to Consider:
- Empty JSON documents.
- Schemas that are themselves invalid (though this challenge assumes valid input schemas).
- null values for optional fields.
- Schemas with no properties or items defined.
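Two of these edge cases have a Python-specific twist. In Python, `bool` is a subclass of `int`, so a naive `isinstance(value, int)` check would accept `true` as a number. JSON `null` also parses to `None`, which is a present value rather than a missing key. The sketch below illustrates both points; the helper names `is_json_number`, `is_json_string`, and `optional_field_ok` are hypothetical and exist only for this illustration.

```python
def is_json_number(value):
    """True for int and float values, but not for bool."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def is_json_string(value):
    return isinstance(value, str)


def optional_field_ok(obj, key, type_check):
    """An optional field passes if it is absent or if its value type-checks.

    A key that is present with the value None is NOT absent, so it must
    still satisfy the type check.
    """
    if key not in obj:
        return True
    return type_check(obj[key])


print(is_json_number(3))      # True
print(is_json_number(True))   # False: booleans are not numbers in JSON

print(optional_field_ok({}, "nickname", is_json_string))                  # True
print(optional_field_ok({"nickname": None}, "nickname", is_json_string))  # False
```

Under standard JSON Schema semantics, a schema that defines no keywords at all (the empty object `{}`) accepts every document, so an object schema without `properties` only checks that the value is an object.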
Examples
Example 1: Basic Type and Property Validation
Input Schema:
{
  "type": "object",
  "properties": {
    "name": {"type": "string"},
    "age": {"type": "number"},
    "isStudent": {"type": "boolean"}
  },
  "required": ["name", "age"]
}
Input JSON:
{
  "name": "Alice",
  "age": 30,
  "isStudent": false
}
Output:
True
Explanation: The JSON object has all required properties ("name", "age") and their types match the schema. The optional "isStudent" property is also of the correct type.
Example 2: Array Validation and Constraints
Input Schema:
{
  "type": "array",
  "items": {"type": "integer"},
  "minItems": 2,
  "maxItems": 5
}
Input JSON:
[1, 2, 3, 4]
Output:
True
Explanation: The JSON is an array, all items are integers, and the number of items (4) is between the min (2) and max (5) specified.
Example 3: String Pattern and Additional Properties
Input Schema:
{
  "type": "object",
  "properties": {
    "email": {"type": "string", "pattern": "^\\S+@\\S+\\.\\S+$"}
  },
  "additionalProperties": false
}
Input JSON:
{
  "email": "test@example.com",
  "id": 123
}
Output:
False
Explanation: The "email" property is a valid string according to the pattern. However, the JSON object contains an "id" property, which is not defined in the schema's "properties" and "additionalProperties" is set to false, disallowing extra properties.
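The two checks in this example can be sketched in isolation. JSON Schema patterns are not implicitly anchored, so `re.search` is a closer fit than `re.match` or `re.fullmatch`; the pattern in this example anchors itself with `^` and `$`. The function names `pattern_matches` and `extra_keys` are hypothetical. Python's `re` dialect also differs in places from the ECMA 262 dialect that JSON Schema refers to, so this is an approximation.

```python
import re


def pattern_matches(pattern, text):
    """True if the pattern matches anywhere in the text (unanchored search)."""
    return re.search(pattern, text) is not None


def extra_keys(schema, obj):
    """Return the keys of obj that are not declared under "properties"."""
    return set(obj) - set(schema.get("properties", {}))


schema = {
    "type": "object",
    "properties": {
        "email": {"type": "string", "pattern": r"^\S+@\S+\.\S+$"}
    },
    "additionalProperties": False,
}
document = {"email": "test@example.com", "id": 123}

email_pattern = schema["properties"]["email"]["pattern"]
print(pattern_matches(email_pattern, document["email"]))  # True
print(extra_keys(schema, document))                       # {'id'}
```

Because `additionalProperties` is false and the set of extra keys is not empty, the document is rejected even though the `email` value itself is acceptable.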
Example 4: Nested Structures
Input Schema:
{
  "type": "object",
  "properties": {
    "user": {
      "type": "object",
      "properties": {
        "id": {"type": "number"},
        "tags": {
          "type": "array",
          "items": {"type": "string"}
        }
      },
      "required": ["id"]
    }
  },
  "required": ["user"]
}
Input JSON:
{
  "user": {
    "id": 456,
    "tags": ["python", "ai"]
  }
}
Output:
True
Explanation: The nested "user" object is valid. Its "id" is present and is a number. Its "tags" array contains only strings.
Constraints
- Your implementation should focus on the core validation logic. You can assume the input JSON Schema and JSON document are well-formed JSON strings that can be parsed by Python's json module.
- The maximum depth of nested objects/arrays in the schema and JSON document will not exceed 20.
- The total number of properties in any object and items in any array will not exceed 100.
- Regular expressions for pattern validation will be syntactically correct.
- Performance is a secondary concern; correctness and adherence to the schema rules are paramount.
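Since both inputs arrive as JSON strings, a thin entry point can parse them before any validation logic runs. The sketch below is one possible shape, not a required interface: `validate_text` and its `validator` parameter are hypothetical names, and the lambda passed in is only a stand-in for a real validator. The first lines show which Python types `json.loads` produces, which is what the validation logic has to inspect.

```python
import json

# json.loads maps JSON values onto built-in Python types.
parsed = json.loads('[1, 2.5, "x", true, null, {}]')
print([type(value).__name__ for value in parsed])
# ['int', 'float', 'str', 'bool', 'NoneType', 'dict']


def validate_text(schema_text, document_text, validator):
    """Parse both JSON strings, then delegate to validator(schema, document).

    `validator` is any callable taking the parsed schema and the parsed
    document and returning a truthy or falsy result.
    """
    schema = json.loads(schema_text)
    document = json.loads(document_text)
    return bool(validator(schema, document))


# Stand-in validator for illustration only: it checks just the top-level type.
result = validate_text('{"type": "object"}', '{}',
                       lambda schema, doc: isinstance(doc, dict))
print(result)  # True
```

With nesting capped at 20 levels, a recursive validator stays far below CPython's default recursion limit, so no iterative rewrite should be needed for these inputs.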
Notes
- You will likely need to use Python's built-in json module to parse the input JSON strings into Python dictionaries and lists.
- Consider a recursive approach for validating nested structures.
- Think about how to handle different types of validation keywords (e.g., type, properties, required, items, pattern, minLength, maxLength, minimum, maximum).
- For items in arrays, you might need to distinguish between a single schema for all items and a list of schemas for positional validation.
- Implementing additionalProperties as a boolean is simpler than as a schema. Prioritize the boolean case.
- While full $ref support is advanced, you can assume schemas do not contain circular references or rely on external schema files for this challenge.
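One way to organize the keyword handling mentioned in these notes is a lookup table from keyword name to a small checker function, plus a separate helper that distinguishes the two forms of items. This is a sketch of that design under stated assumptions, not the only reasonable structure: `CHECKERS`, `check_scalar_keywords`, and `check_items` are hypothetical names, and only a handful of string and number keywords are included.

```python
import re


def _is_number(value):
    # bool is a subclass of int, so exclude it from numeric checks.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


# Each checker takes (keyword_value, instance). A checker passes
# automatically when the instance is not of the type it applies to.
CHECKERS = {
    "minLength": lambda n, v: not isinstance(v, str) or len(v) >= n,
    "maxLength": lambda n, v: not isinstance(v, str) or len(v) <= n,
    "pattern": lambda p, v: not isinstance(v, str) or re.search(p, v) is not None,
    "minimum": lambda m, v: not _is_number(v) or v >= m,
    "maximum": lambda m, v: not _is_number(v) or v <= m,
}


def check_scalar_keywords(schema, instance):
    """Apply every keyword in the schema that has a registered checker."""
    return all(
        CHECKERS[keyword](value, instance)
        for keyword, value in schema.items()
        if keyword in CHECKERS
    )


def check_items(items, array, check):
    """Validate array elements with check(schema, element).

    A dict is one schema for all elements; a list is positional, and
    elements beyond the end of the list are left unconstrained here.
    """
    if isinstance(items, dict):
        return all(check(items, element) for element in array)
    return all(check(sub, element) for sub, element in zip(items, array))


print(check_scalar_keywords({"minLength": 2, "maxLength": 4}, "abc"))  # True
print(check_scalar_keywords({"minimum": 5}, 3))                        # False
print(check_items({"minLength": 1}, ["a", ""], check_scalar_keywords)) # False
```

Adding a keyword then means adding one table entry instead of another branch, and passing the recursive validator as `check` keeps `check_items` independent of the rest of the implementation.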