Simple Query Optimizer for Relational Data

This challenge asks you to implement a basic query optimizer for a simplified relational database system. Query optimization is crucial for efficient data retrieval, and this exercise will help you understand the core concepts of choosing the best execution plan for a given query. You'll focus on optimizing simple SELECT queries with WHERE clauses involving equality comparisons.

Problem Description

You are tasked with building a query optimizer that analyzes a simplified query and selects the most efficient join order for executing it. The query will involve two tables, TableA and TableB, and a WHERE clause that filters based on an equality condition between a column in TableA and a column in TableB. The optimizer should estimate the cost of different join orders (TableA x TableB vs. TableB x TableA) based on the estimated sizes of the tables and the selectivity of the join condition.

What needs to be achieved:

Table Size Estimation: You will be provided with the estimated number of rows in TableA and TableB.
Selectivity Estimation: You will be provided with the estimated selectivity of the join condition (e.g., a selectivity of 0.1 means that approximately 10% of the rows in TableA will match a row in TableB based on the join condition).
Cost Calculation: Calculate the estimated cost of two possible join orders:
- TableA x TableB: Cost = Size(TableA) * Size(TableB)
- TableB x TableA: Cost = Size(TableB) * Size(TableA)
Optimal Join Order Selection: Return the join order with the lower estimated cost. If the costs are equal, return either order.

Key Requirements:

The optimizer should take the table sizes and selectivity as input.
The optimizer should accurately calculate the cost of each join order.
The optimizer should return the optimal join order (either "TableA x TableB" or "TableB x TableA").

Expected Behavior:

The optimizer should consistently choose the join order that minimizes the estimated cost. It should handle cases where the costs are equal gracefully.

Edge Cases to Consider:

Zero Table Size: If either table has a size of 0, the cost of joining with that table is 0.
Selectivity of 0: If the selectivity is 0, no rows will match, and the cost should be 0.
Selectivity of 1: If the selectivity is 1, all rows in the smaller table will match rows in the larger table.

Examples

Example 1:

Input: table_a_size = 1000, table_b_size = 500, selectivity = 0.2
Output: TableA x TableB
Explanation:
Cost(TableA x TableB) = 1000 * 500 = 500000
Cost(TableB x TableA) = 500 * 1000 = 500000
Since the costs are equal, either order is acceptable.  We return TableA x TableB.

Example 2:

Input: table_a_size = 1000, table_b_size = 2000, selectivity = 0.1
Output: TableA x TableB
Explanation:
Cost(TableA x TableB) = 1000 * 2000 = 2000000
Cost(TableB x TableA) = 2000 * 1000 = 2000000
Costs are equal, so either order is acceptable. We return TableA x TableB.

Example 3:

Input: table_a_size = 5000, table_b_size = 100, selectivity = 0.05
Output: TableB x TableA
Explanation:
Cost(TableA x TableB) = 5000 * 100 = 500000
Cost(TableB x TableA) = 100 * 5000 = 500000
Costs are equal, so either order is acceptable. We return TableB x TableA.

Constraints

table_a_size and table_b_size will be non-negative integers.
selectivity will be a float between 0.0 and 1.0 (inclusive).
The function must return a string: either "TableA x TableB" or "TableB x TableA".
The solution should be efficient enough to execute within a reasonable time (e.g., less than 1 second).

Notes

This is a simplified model. Real-world query optimizers consider many more factors, such as indexes, data distribution, and hardware characteristics.
Focus on the core logic of cost estimation and comparison.
You don't need to implement any actual database functionality; just the query optimization logic.
The selectivity represents the fraction of rows in TableA that will match rows in TableB based on the join condition. It's an estimate, not an exact value.