Hone logo
Hone
Problems

Combining Customer Data from Multiple Sources with UNION

Imagine you have customer data spread across several databases or tables, each containing slightly different information. You need to create a unified view of all customers, combining data from these disparate sources into a single result set. This challenge will test your ability to use the UNION operator in SQL to achieve this, demonstrating a crucial skill for data integration and reporting.

Problem Description

You are tasked with creating a SQL query that combines customer data from two tables: Customers_US and Customers_EU. Both tables contain customer information, but they have slightly different column names. The goal is to create a single result set with a consistent schema, including CustomerID, CustomerName, City, and Country.

The Customers_US table has columns CustomerID, Name, City, and Country. The Customers_EU table has columns CustID, CustomerName, Town, and Region.

You need to map the columns from Customers_EU to the desired output schema as follows:

  • CustID maps to CustomerID
  • CustomerName maps to CustomerName
  • Town maps to City
  • Region maps to Country

The query should return all customers from both tables, with the specified column names. The UNION operator should be used to combine the results. Ensure that the data types of the corresponding columns in both SELECT statements are compatible.

Expected Behavior:

The query should return a result set with the following columns: CustomerID, CustomerName, City, Country. The rows should include all customers from both Customers_US and Customers_EU. The order of the rows in the final result set is not important.

Edge Cases to Consider:

  • What happens if one table is empty? The query should still function correctly and return the data from the non-empty table.
  • What if there are duplicate customers (same CustomerID/CustID, CustomerName, City, Country) in both tables? The UNION operator, by default, will return duplicates. If you need to remove duplicates, you'll need to use UNION ALL and potentially add a DISTINCT clause. (This is not explicitly required for this challenge, but good to consider).
  • Data type mismatches between columns. The query must handle these gracefully, potentially requiring explicit type casting.

Examples

Example 1:

Customers_US:
CustomerID | CustomerName | City      | Country
-----------|--------------|-----------|---------
1          | Alice        | New York  | USA
2          | Bob          | Los Angeles| USA
3          | Charlie      | Chicago   | USA

Customers_EU:
CustID | CustomerName | Town      | Region
-------|--------------|-----------|---------
4      | David        | London    | UK
5      | Eve          | Paris     | France
Output:
CustomerID | CustomerName | City      | Country
-----------|--------------|-----------|---------
1          | Alice        | New York  | USA
2          | Bob          | Los Angeles| USA
3          | Charlie      | Chicago   | USA
4          | David        | London    | UK
5          | Eve          | Paris     | France

Explanation: All customers from both tables are combined into a single result set with the desired column names.

Example 2:

Customers_US:
CustomerID | CustomerName | City      | Country
-----------|--------------|-----------|---------
1          | Alice        | New York  | USA
2          | Bob          | Los Angeles| USA

Customers_EU:
CustID | CustomerName | Town      | Region
-------|--------------|-----------|---------
(Empty Table)
Output:
CustomerID | CustomerName | City      | Country
-----------|--------------|-----------|---------
1          | Alice        | New York  | USA
2          | Bob          | Los Angeles| USA

Explanation: Since Customers_EU is empty, only the customers from Customers_US are returned.

Constraints

  • The SQL dialect should be compatible with standard SQL (e.g., PostgreSQL, MySQL, SQL Server).
  • The input tables Customers_US and Customers_EU will always exist.
  • The number of rows in each table will be less than 1000.
  • The column names in the input tables will always be as described in the Problem Description.

Notes

  • Focus on using the UNION operator correctly to combine the results.
  • Pay close attention to column names and data types. You may need to use aliases (AS) to rename columns and/or type casting functions to ensure compatibility.
  • Consider the potential for duplicate rows and how UNION vs. UNION ALL affects the result. While removing duplicates is not a requirement, understanding the difference is important.
  • The challenge is primarily focused on the UNION operator; complex error handling or data validation is not required.
  • Think about how you would handle situations where the column order is different in the two tables. The provided example assumes a straightforward mapping.
Loading editor...
plaintext