Sales Data Transformation with PIVOT and UNPIVOT
This challenge focuses on transforming sales data between a wide format (multiple columns representing different product categories) and a long format (a single column representing product category alongside sales figures). Understanding how to use PIVOT and UNPIVOT is crucial for data analysis and reporting, allowing you to reshape data for different visualization or analytical needs. You will be provided with a dataset in a wide format and asked to reshape it into a long format using SQL.
Problem Description
You are given a table named SalesData containing sales information for various products across different categories. The table currently has columns for each product category (e.g., ElectronicsSales, ClothingSales, HomeGoodsSales). Your task is to transform this data into a long format, where you have a single Category column and a corresponding Sales column. This transformation will be achieved using the UNPIVOT operator.
What needs to be achieved:
- Reshape the
SalesDatatable from a wide format to a long format. - Create a new table (or a result set) with columns
CategoryandSales. - The
Categorycolumn should contain the names of the original product category columns. - The
Salescolumn should contain the corresponding sales values.
Key Requirements:
- The solution must use the
UNPIVOToperator. - The solution should handle cases where some categories might have NULL sales values.
- The solution should be efficient and avoid unnecessary complexity.
Expected Behavior:
The query should return a result set where each row represents a single product category and its corresponding sales value. The Category column should contain the name of the category, and the Sales column should contain the sales amount for that category.
Edge Cases to Consider:
NULLvalues in the sales columns. These should be included in the output with aNULLvalue in theSalescolumn.- The number of product categories might vary. The
UNPIVOToperator should be able to handle this dynamically. - The table might be empty. The query should return an empty result set in this case.
Examples
Example 1:
Input:
SalesData Table:
| ProductID | ElectronicsSales | ClothingSales | HomeGoodsSales |
|-----------|------------------|---------------|----------------|
| 1 | 100 | 50 | 75 |
| 2 | 150 | 75 | 100 |
| 3 | 200 | 100 | 125 |
Output:
Category | Sales
------- | ------
ElectronicsSales | 100
ElectronicsSales | 150
ElectronicsSales | 200
ClothingSales | 50
ClothingSales | 75
ClothingSales | 100
HomeGoodsSales | 75
HomeGoodsSales | 100
HomeGoodsSales | 125
Explanation: The query successfully unpivots the ElectronicsSales, ClothingSales, and HomeGoodsSales columns into a Category and Sales format.
Example 2:
Input:
SalesData Table:
| ProductID | ElectronicsSales | ClothingSales | HomeGoodsSales |
|-----------|------------------|---------------|----------------|
| 1 | 100 | NULL | 75 |
| 2 | 150 | 75 | NULL |
Output:
Category | Sales
------- | ------
ElectronicsSales | 100
ElectronicsSales | 150
ClothingSales | NULL
ClothingSales | 75
HomeGoodsSales | 75
HomeGoodsSales | NULL
Explanation: The query correctly handles NULL values in the sales columns, including them in the output with a NULL value in the Sales column.
Constraints
- The
SalesDatatable will always have aProductIDcolumn and at least three sales category columns. - Sales values will be numeric (integer or decimal).
- The database system used supports the
UNPIVOToperator (e.g., SQL Server, Oracle). - The solution should be reasonably efficient; avoid excessively complex or nested queries.
Notes
- Consider using the
FOR XML PATHtrick (if your database supports it) as an alternative toUNPIVOTifUNPIVOTis not directly available. However, the primary goal is to demonstrate understanding of theUNPIVOTconcept. - Think about how to dynamically handle the product category columns. You don't want to hardcode the column names.
- The order of the categories in the output is not strictly defined, but a consistent order is preferred.
- Pseudocode for the general approach:
// Assume SalesData table exists with ProductID and multiple SalesCategory columns
// 1. Identify the SalesCategory columns dynamically (if possible)
// or hardcode them if dynamic identification is not feasible.
// 2. Use UNPIVOT to transform the data:
// - Input Table: SalesData
// - Columns to Unpivot: SalesCategory columns identified in step 1
// - Output Columns:
// - Category: Name of the SalesCategory column
// - Sales: Value from the SalesCategory column
// 3. Return the resulting table with Category and Sales columns.