Extracting Product Information from JSON Product Catalog
You are tasked with developing a system to manage a product catalog stored in a JSON format within a SQL database. This challenge will test your ability to use SQL's JSON functions to efficiently query and extract specific product details from this semi-structured data. Understanding how to work with JSON in SQL is crucial for modern data management, allowing you to leverage the flexibility of JSON while maintaining the power and structure of relational databases.
Problem Description
You are provided with a table named products that contains a single column, product_data, which stores product information as JSON objects. Each JSON object represents a single product and contains various details such as product_id, name, price, category, and an optional tags array.
Your goal is to write SQL queries that extract specific information from this JSON data. You need to retrieve a list of product names and their prices, but only for products that belong to a specific category and have a certain price threshold.
Key Requirements:
- Select Product Name and Price: Retrieve the
nameandpriceof each product. - Filter by Category: Only include products where the
categorymatches a given string. - Filter by Price: Only include products where the
priceis greater than a given numerical value. - Handle Missing Tags (Optional but good practice): While not strictly required for the primary task, be mindful of how to handle products that might not have a
tagsfield.
Expected Behavior:
The queries should return rows, each containing the name and price of a product that satisfies the specified category and price conditions.
Edge Cases:
- Missing Fields: While the prompt specifies
product_id,name,price, andcategory, a real-world scenario might involve products with missing optional fields liketags. Your JSON extraction logic should be robust enough not to error out if a field is absent. - Data Type Mismatches: Ensure your price comparison correctly handles the price as a numerical value, even if it's stored as a string within the JSON.
Examples
Example 1:
Input:
products table:
| product_data |
|---|
{"product_id": "A101", "name": "Laptop", "price": 1200.50, "category": "Electronics", "tags": ["computer", "portable"]} |
{"product_id": "B202", "name": "T-Shirt", "price": 25.00, "category": "Apparel", "tags": ["clothing", "casual"]} |
{"product_id": "C303", "name": "Keyboard", "price": 75.99, "category": "Electronics"} |
{"product_id": "D404", "name": "Jeans", "price": 50.00, "category": "Apparel", "tags": ["clothing", "denim"]} |
Target Category: "Electronics"
Minimum Price: 100
Output:
| name | price |
|---|---|
| Laptop | 1200.50 |
Explanation: The "Laptop" is in the "Electronics" category and its price (1200.50) is greater than 100. The "Keyboard" is also in "Electronics" but its price (75.99) is not greater than 100.
Example 2:
Input: (Same products table as Example 1)
Target Category: "Apparel"
Minimum Price: 30
Output:
| name | price |
|---|---|
| Jeans | 50.00 |
Explanation: The "Jeans" are in the "Apparel" category and their price (50.00) is greater than 30. The "T-Shirt" is in "Apparel" but its price (25.00) is not greater than 30.
Example 3 (Handling Missing Optional Field):
Input:
products table:
| product_data |
|---|
{"product_id": "E505", "name": "Monitor", "price": 300.00, "category": "Electronics"} |
{"product_id": "F606", "name": "Mouse", "price": 25.00, "category": "Electronics", "tags": ["accessory"]} |
Target Category: "Electronics"
Minimum Price: 200
Output:
| name | price |
|---|---|
| Monitor | 300.00 |
Explanation: The "Monitor" is in the "Electronics" category and its price (300.00) is greater than 200. The "Mouse" is in "Electronics" but its price (25.00) is not greater than 200. The query successfully processed the "Monitor" even though it lacked a tags field.
Constraints
- The
productstable will contain at least one row. - The
product_datacolumn will always contain valid JSON. product_id,name,price, andcategorywill be present in the JSON for all products.pricewill be a numerical value (integer or floating-point) or a string representation of a numerical value.categorywill be a string.namewill be a string.
Notes
- You will need to use specific JSON functions provided by your SQL dialect to extract values from the JSON string. Common functions include
JSON_EXTRACT,JSON_VALUE, or similar depending on your RDBMS (e.g., PostgreSQL, MySQL, SQL Server, SQLite). - Consider how to cast JSON values to appropriate SQL data types (e.g., numeric types for price) for comparisons.
- The
tagsfield is an array of strings. You don't need to query the contents of thetagsarray for this challenge, but your extraction method should gracefully handle its presence or absence.