Hone logo
Hone
Problems

Sales Performance Analysis with Pivoted Categories and Conditional Aggregation

This challenge requires you to implement a pivot table-like functionality in SQL, incorporating conditional aggregation using the CASE statement. The goal is to analyze sales data, grouping it by product category and calculating total sales for different customer segments based on their purchase amounts. This is a common requirement in business intelligence for summarizing and comparing data across multiple dimensions.

Problem Description

You are given a table named sales with the following columns:

  • product_category (VARCHAR): The category of the product sold (e.g., 'Electronics', 'Clothing', 'Books').
  • customer_id (INT): A unique identifier for each customer.
  • purchase_amount (DECIMAL): The amount spent by the customer on the purchase.

Your task is to write a SQL query that generates a report showing the total sales for each product category, broken down by customer segments. The customer segments are defined as follows:

  • High Value: Customers with purchase_amount greater than or equal to 100.
  • Medium Value: Customers with purchase_amount between 50 (inclusive) and 99 (inclusive).
  • Low Value: Customers with purchase_amount less than 50.

The output should be a table with the following columns:

  • product_category (VARCHAR): The product category.
  • high_value_sales (DECIMAL): Total sales for high-value customers in that category.
  • medium_value_sales (DECIMAL): Total sales for medium-value customers in that category.
  • low_value_sales (DECIMAL): Total sales for low-value customers in that category.

The query should effectively pivot the customer segments (High, Medium, Low) into separate columns, aggregating the purchase_amount based on the product_category and the defined customer segments.

Examples

Example 1:

Input:
sales table:
| product_category | customer_id | purchase_amount |
|------------------|-------------|-----------------|
| Electronics      | 1           | 120             |
| Clothing         | 2           | 60              |
| Electronics      | 3           | 40              |
| Books            | 4           | 80              |
| Clothing         | 5           | 150             |
| Books            | 6           | 30              |

Output:
| product_category | high_value_sales | medium_value_sales | low_value_sales |
|------------------|------------------|--------------------|-----------------|
| Electronics      | 120              | 0                  | 40              |
| Clothing         | 150              | 60                 | 0               |
| Books            | 80               | 0                  | 30              |

Explanation:
- Electronics: High Value = 120, Medium Value = 0, Low Value = 40
- Clothing: High Value = 150, Medium Value = 60, Low Value = 0
- Books: High Value = 80, Medium Value = 0, Low Value = 30

Example 2:

Input:
sales table:
| product_category | customer_id | purchase_amount |
|------------------|-------------|-----------------|
| Electronics      | 1           | 50              |
| Clothing         | 2           | 50              |
| Electronics      | 3           | 100             |
| Books            | 4           | 100              |
| Clothing         | 5           | 100             |
| Books            | 6           | 50              |

Output:
| product_category | high_value_sales | medium_value_sales | low_value_sales |
|------------------|------------------|--------------------|-----------------|
| Electronics      | 100              | 0                  | 50              |
| Clothing         | 100              | 50                 | 0               |
| Books            | 100              | 0                  | 50              |

Explanation:
- Electronics: High Value = 100, Medium Value = 0, Low Value = 50
- Clothing: High Value = 100, Medium Value = 50, Low Value = 0
- Books: High Value = 100, Medium Value = 0, Low Value = 50

Constraints

  • The sales table will always contain at least one row.
  • purchase_amount will always be a non-negative decimal value.
  • product_category will always be a non-empty string.
  • The query should be efficient and perform well on reasonably sized datasets (up to 1 million rows).
  • The SQL dialect should be compatible with standard SQL (e.g., PostgreSQL, MySQL, SQL Server).

Notes

  • You can use the CASE statement to define the customer segments based on the purchase_amount.
  • Consider using conditional aggregation (e.g., SUM(CASE WHEN ... THEN ... ELSE ... END)) to calculate the total sales for each segment within each product category.
  • The order of the columns in the output table is important. Ensure the columns are in the specified order: product_category, high_value_sales, medium_value_sales, low_value_sales.
  • Think about how to group the data by product_category while simultaneously applying the conditional aggregation for the customer segments. A single query is expected.
  • No external libraries or functions are allowed. Only standard SQL functions are permitted.
  • The solution should be readable and well-formatted.
  • Consider edge cases where a product category might have no sales in a particular segment (e.g., no high-value customers for a specific category). In such cases, the corresponding sales value should be 0.
Loading editor...
plaintext