Hone logo
Hone
Problems

Product Sales Analysis II: Top Performing Products by Region

Imagine you're a data analyst at a large retail company. You need to identify the top-performing products within each region to optimize inventory and marketing strategies. This challenge requires you to analyze sales data and determine the product with the highest total sales revenue for each region.

Problem Description

You are given a dataset representing product sales across different regions. The dataset is structured as a list of sales records, where each record contains the product name, region, and sales revenue for a single transaction. Your task is to process this data and determine the product with the highest total sales revenue for each region. The output should be a dictionary (or equivalent data structure in your chosen language) where keys are regions and values are the names of the top-performing products in that region.

Key Requirements:

  • Calculate Total Sales per Product per Region: For each region, calculate the total sales revenue for each product.
  • Identify Top-Performing Product: Within each region, determine the product with the highest total sales revenue.
  • Handle Ties: If multiple products have the same highest sales revenue in a region, return the product that appears first in the input data for that region.
  • Handle Empty Input: If the input data is empty, return an empty dictionary.
  • Case-Insensitive Product Names: Treat product names as case-insensitive when calculating totals and identifying the top performer.

Expected Behavior:

The function should take a list of sales records as input and return a dictionary mapping regions to their top-performing products. The function should be robust and handle various input scenarios, including empty datasets, regions with only one product, and ties in sales revenue.

Edge Cases to Consider:

  • Empty input list.
  • Regions with only one product.
  • Multiple products having the same highest sales revenue in a region.
  • Product names with mixed casing.
  • Regions with no sales data.

Examples

Example 1:

Input: [
    {"product": "Laptop", "region": "North", "revenue": 1200},
    {"product": "Keyboard", "region": "North", "revenue": 100},
    {"product": "Mouse", "region": "North", "revenue": 50},
    {"product": "Laptop", "region": "South", "revenue": 1500},
    {"product": "Monitor", "region": "South", "revenue": 800},
    {"product": "Laptop", "region": "East", "revenue": 1000},
    {"product": "Keyboard", "region": "East", "revenue": 200}
]
Output: {"North": "Laptop", "South": "Laptop", "East": "Laptop"}
Explanation: In North, "Laptop" has the highest revenue (1300). In South, "Laptop" has the highest revenue (1500). In East, "Laptop" has the highest revenue (1200).

Example 2:

Input: [
    {"product": "Laptop", "region": "North", "revenue": 1200},
    {"product": "Keyboard", "region": "North", "revenue": 1200},
    {"product": "Mouse", "region": "North", "revenue": 50},
    {"product": "Laptop", "region": "South", "revenue": 1500},
    {"product": "Monitor", "region": "South", "revenue": 800}
]
Output: {"North": "Laptop", "South": "Laptop"}
Explanation: In North, "Laptop" and "Keyboard" have the same revenue. "Laptop" is returned because it appears first. In South, "Laptop" has the highest revenue.

Example 3: (Edge Case - Empty Input)

Input: []
Output: {}
Explanation: The input list is empty, so an empty dictionary is returned.

Constraints

  • The input list will contain between 0 and 1000 sales records.
  • Each sales record will be a dictionary with keys "product" (string), "region" (string), and "revenue" (number).
  • Product names will consist of alphanumeric characters and spaces.
  • Region names will consist of alphanumeric characters and spaces.
  • Revenue will be a non-negative number.
  • The solution should have a time complexity of O(N), where N is the number of sales records.

Notes

  • Consider using a dictionary to store the total sales revenue for each product within each region.
  • Iterate through the input data once to calculate the total sales revenue for each product in each region.
  • After calculating the total sales revenue, iterate through the data again to determine the top-performing product for each region.
  • Remember to handle ties appropriately by returning the product that appears first in the input data.
  • Pay attention to case-insensitive product name comparisons. You might need to convert product names to lowercase before processing.
  • Think about how to efficiently update the total sales revenue for each product in each region as you iterate through the input data.
  • Consider using a defaultdict (or equivalent) to simplify the process of initializing the sales revenue dictionaries.
  • The order of regions in the output dictionary is not important.
Loading editor...
plaintext