Analyzing Sales Performance with Window Functions
Many businesses need to analyze sales data to identify top performers, track rankings over time, and understand relative performance within different categories. This challenge focuses on using SQL window functions (specifically ROW_NUMBER and RANK) to derive insights from a sales dataset, allowing you to determine sales representative rankings and identify top performers within specific regions.
Problem Description
You are given a table named Sales containing sales data for a company. The table has the following columns:
SalespersonID(INT): Unique identifier for each salesperson.SalespersonName(VARCHAR): Name of the salesperson.Region(VARCHAR): The region the salesperson operates in.SalesAmount(DECIMAL): The total sales amount for the salesperson.
Your task is to write SQL queries that utilize window functions to:
- Rank Salespeople Globally: Calculate the overall rank of each salesperson based on their
SalesAmountacross all regions. UseRANKto handle ties (salespeople with the same sales amount should have the same rank). - Rank Salespeople by Region: Calculate the rank of each salesperson within their respective
Regionbased on theirSalesAmount. UseROW_NUMBERto assign a unique rank even in the case of ties (breaking ties arbitrarily). - Identify Top 3 Salespeople per Region: Determine the top 3 salespeople in each
Regionbased onSalesAmount. UseROW_NUMBERto assign a rank within each region and then filter the results.
Expected Behavior:
The queries should return tables with the following columns:
- For Global Ranking:
SalespersonID,SalespersonName,Region,SalesAmount,GlobalRank - For Regional Ranking:
SalespersonID,SalespersonName,Region,SalesAmount,RegionalRank - For Top 3 per Region:
SalespersonID,SalespersonName,Region,SalesAmount,RegionalRank
Examples
Example 1:
Input:
Sales Table:
SalespersonID | SalespersonName | Region | SalesAmount
-------------|-----------------|--------|-------------
1 | Alice | North | 100000
2 | Bob | South | 120000
3 | Charlie | North | 90000
4 | David | East | 110000
5 | Eve | West | 80000
6 | Frank | South | 120000
Output (Global Ranking):
SalespersonID | SalespersonName | Region | SalesAmount | GlobalRank
-------------|-----------------|--------|-------------|------------
2 | Bob | South | 120000 | 1
6 | Frank | South | 120000 | 1
4 | David | East | 110000 | 3
1 | Alice | North | 100000 | 4
3 | Charlie | North | 90000 | 5
5 | Eve | West | 80000 | 6
Explanation: Bob and Frank tie for the highest sales amount, so they both receive rank 1. The remaining salespeople are ranked sequentially.
Example 2:
Input:
Sales Table: (Same as Example 1)
Output (Regional Ranking):
SalespersonID | SalespersonName | Region | SalesAmount | RegionalRank
-------------|-----------------|--------|-------------|--------------
1 | Alice | North | 100000 | 1
3 | Charlie | North | 90000 | 2
2 | Bob | South | 120000 | 1
6 | Frank | South | 120000 | 1
4 | David | East | 110000 | 1
5 | Eve | West | 80000 | 1
Explanation: Within each region, salespeople are ranked by sales amount. Bob and Frank tie in the South, so they both receive rank 1.
Example 3:
Input:
Sales Table: (Same as Example 1)
Output (Top 3 per Region):
SalespersonID | SalespersonName | Region | SalesAmount | RegionalRank
-------------|-----------------|--------|-------------|--------------
1 | Alice | North | 100000 | 1
3 | Charlie | North | 90000 | 2
2 | Bob | South | 120000 | 1
6 | Frank | South | 120000 | 1
4 | David | East | 110000 | 1
Explanation: For each region, only the top 3 salespeople based on sales amount are returned.
Constraints
- The
Salestable will contain at least 10 rows. SalesAmountwill be a non-negative decimal number.Regionwill be a string with a maximum length of 50 characters.- The queries should be efficient and perform well on a table with up to 1000 rows.
Notes
- Remember to use the
OVER()clause with the appropriate partitioning and ordering for each window function. - Consider the difference between
RANK()andROW_NUMBER()when handling ties.RANK()assigns the same rank to tied values, whileROW_NUMBER()assigns a unique rank to each row. - For the "Top 3 per Region" query, you'll need to partition the data by
Regionwithin the window function and then filter the results based on the assigned rank. - The specific database system (e.g., PostgreSQL, MySQL, SQL Server) is not specified, so the SQL should be standard and portable.
- Focus on clarity and correctness of the SQL queries. Performance optimization is secondary. Pseudocode for the queries:
Global Ranking:
SELECT
SalespersonID,
SalespersonName,
Region,
SalesAmount,
RANK() OVER (ORDER BY SalesAmount DESC) AS GlobalRank
FROM
Sales;
Regional Ranking:
SELECT
SalespersonID,
SalespersonName,
Region,
SalesAmount,
ROW_NUMBER() OVER (PARTITION BY Region ORDER BY SalesAmount DESC) AS RegionalRank
FROM
Sales;
Top 3 per Region:
SELECT
SalespersonID,
SalespersonName,
Region,
SalesAmount,
ROW_NUMBER() OVER (PARTITION BY Region ORDER BY SalesAmount DESC) AS RegionalRank
FROM
Sales
WHERE
RegionalRank <= 3;