
JSON Schema Validator in Python

JSON Schema is a vocabulary for annotating and validating JSON documents. This challenge asks you to implement a basic JSON Schema validator in Python. Validating data at the boundary, especially data from external APIs or users, catches malformed input before it causes errors or security problems further down the line.

Problem Description

You are tasked with creating a Python function validate_json(json_data, schema) that validates a given JSON document against a provided JSON Schema. The function should return True if the JSON data conforms to the schema and False otherwise.

The schema will define the expected structure and data types of the JSON data. For simplicity, this implementation will support the following schema keywords:

  • type: Specifies the expected JSON type (e.g., "object", "array", "string", "integer", "boolean", "null").
  • properties: For objects, this keyword defines a dictionary where keys are property names and values are schemas for the corresponding property values.
  • items: For arrays, this keyword defines the schema for the elements within the array.
  • required: For objects, this keyword defines a list of property names that must be present in the JSON data.
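To illustrate how properties and required interact, here is a minimal sketch. The helper name missing_required is hypothetical and not part of the challenge. It assumes that a property listed under properties is optional unless it is also named in required, which is what Example 1 below implies for "isStudent".

```python
def missing_required(data, schema):
    """Return the required property names that are absent from data.

    Hypothetical helper for illustration; assumes data is a dict and
    treats a missing "required" keyword as an empty list.
    """
    return [name for name in schema.get("required", []) if name not in data]


schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
    "required": ["name"],
}

# "age" is declared under properties but not required, so omitting it is fine.
print(missing_required({"name": "Ada"}, schema))  # []

# "name" is required, so its absence is reported.
print(missing_required({"age": 36}, schema))  # ['name']
```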

The json_data will be a Python dictionary or list representing the JSON document to be validated. The schema will be a Python dictionary representing the JSON Schema.

Expected Behavior:

  • The function should raise a TypeError if either json_data or schema is not of the expected type (dictionary or list for json_data, dictionary for schema).
  • The function should return True if the json_data is valid according to the schema.
  • The function should return False if the json_data is invalid according to the schema.
  • If a required property is missing, the function should return False.
  • Type mismatches should result in a False return value.
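The behavior listed above could be sketched roughly as follows. This is one possible approach, not a reference solution. The helpers _matches_type and _validate and the table _JSON_TYPES are hypothetical names. The sketch also makes assumptions the problem statement leaves open: unknown type names never match, properties absent from properties are ignored, and a bool does not count as an integer.

```python
# Hypothetical lookup table: JSON Schema type name -> Python type.
_JSON_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "integer": int,
    "boolean": bool,
    "null": type(None),
}


def _matches_type(value, type_name):
    """Check a single value against a JSON Schema type name."""
    expected = _JSON_TYPES.get(type_name)
    if expected is None:
        return False  # assumption: unknown type names never match
    # bool is a subclass of int in Python, so exclude it explicitly.
    if type_name == "integer" and isinstance(value, bool):
        return False
    return isinstance(value, expected)


def _validate(value, schema):
    """Recursively validate value against schema (both already type-checked)."""
    if "type" in schema and not _matches_type(value, schema["type"]):
        return False
    if isinstance(value, dict):
        # Every required property must be present.
        if any(name not in value for name in schema.get("required", [])):
            return False
        # Validate only the properties that are actually present.
        for name, subschema in schema.get("properties", {}).items():
            if name in value and not _validate(value[name], subschema):
                return False
    elif isinstance(value, list) and "items" in schema:
        # Every element must satisfy the single "items" schema.
        return all(_validate(item, schema["items"]) for item in value)
    return True


def validate_json(json_data, schema):
    """Return True if json_data conforms to schema, False otherwise."""
    if not isinstance(json_data, (dict, list)):
        raise TypeError("json_data must be a dict or a list")
    if not isinstance(schema, dict):
        raise TypeError("schema must be a dict")
    return _validate(json_data, schema)
```

Keeping the TypeError checks in the public function and the recursion in a private helper means nested values such as strings or integers can be validated without tripping the top-level input guard.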

Examples

Example 1:

Input:
json_data = {"name": "John Doe", "age": 30, "isStudent": False}
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "isStudent": {"type": "boolean"}
    },
    "required": ["name", "age"]
}
Output: True
Explanation: The JSON data is an object that contains the required properties "name" and "age", and every property present, including the optional "isStudent", matches the type declared in the schema.

Example 2:

Input:
json_data = {"name": "John Doe", "age": "30"}
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"}
    },
    "required": ["name", "age"]
}
Output: False
Explanation: The value of "age" in the JSON data is the string "30", but the schema specifies "integer".

Example 3:

Input:
json_data = {"name": "John Doe"}
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"}
    },
    "required": ["name", "age"]
}
Output: False
Explanation: The "age" property is missing from the JSON data, and it is a required property according to the schema.

Example 4:

Input:
json_data = [1, 2, 3]
schema = {
    "type": "array",
    "items": {"type": "integer"}
}
Output: True
Explanation: The JSON data is a valid array of integers.

Constraints

  • json_data will be a Python dictionary or list.
  • schema will be a Python dictionary.
  • The schema will only contain the keywords type, properties, items, and required.
  • The function must handle invalid input types gracefully by raising a TypeError.
  • The function should be reasonably efficient for typical JSON document sizes (up to a few hundred kilobytes).

Notes

  • Consider using Python's built-in type checking capabilities (e.g., isinstance()) to validate data types.
  • Recursion can be helpful for traversing nested JSON structures and schemas.
  • Focus on implementing the core validation logic for the specified schema keywords. More complex schema features (e.g., patterns, enums) are beyond the scope of this challenge.
  • Think about how to handle the required keyword: a property listed under properties is optional unless it is also named in required.
  • Error handling is important; ensure your function behaves predictably with invalid input.
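
One pitfall with the isinstance() approach mentioned in the notes is that bool is a subclass of int in Python, so a plain isinstance(value, int) check accepts True and False as integers. The sketch below shows one way to guard against that. The helper name is_json_integer is hypothetical, and rejecting floats such as 3.0 is a simplifying assumption of this sketch.

```python
# bool is a subclass of int, so this check alone is too permissive.
print(isinstance(True, int))  # True


def is_json_integer(value):
    """Return True for Python ints that are not booleans.

    Hypothetical helper; floats such as 3.0 are rejected in this sketch.
    """
    return isinstance(value, int) and not isinstance(value, bool)


print(is_json_integer(30))    # True
print(is_json_integer(True))  # False
print(is_json_integer("30"))  # False
```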