584. Find Customer Referee - Detailed Explanation
Problem Statement
Description:
You are given a table called Customers that contains customer information. The table has the following columns:
-
customer_id: A unique identifier for each customer.
-
customer_name: The name of the customer.
-
referee_id: The ID of the referee who referred the customer. This value can be NULL if the customer was not referred by anyone. If not NULL, it will refer to another customer in the same table.
Your task is to write an SQL query to report the customer along with the name of their referee. If a customer does not have a referee, return NULL for the referee’s name.
Example 1:
customer_id | customer_name | referee_id |
---|---|---|
1 | Alice | NULL |
2 | Bob | 1 |
3 | Charlie | 1 |
4 | David | 2 |
Expected Output:
customer_id | customer_name | referee_name |
---|---|---|
1 | Alice | NULL |
2 | Bob | Alice |
3 | Charlie | Alice |
4 | David | Bob |
Example 2:
customer_id | customer_name | referee_id |
---|---|---|
1 | John | NULL |
2 | Emma | 1 |
3 | Liam | 2 |
Expected Output:
customer_id | customer_name | referee_name |
---|---|---|
1 | John | NULL |
2 | Emma | John |
3 | Liam | Emma |
Example 3:
customer_id | customer_name | referee_id |
---|---|---|
1 | Sarah | NULL |
2 | Michael | NULL |
3 | Olivia | 2 |
Expected Output:
customer_id | customer_name | referee_name |
---|---|---|
1 | Sarah | NULL |
2 | Michael | NULL |
3 | Olivia | Michael |
Constraints:
- Every non-NULL referee_id in the table references a valid customer_id in the same table.
- The table can contain any number of rows, so efficiency matters.
Hints Before You Start
- Hint 1: Think about how you can join the Customers table with itself. You need to “look up” the referee’s name for each customer.
- Hint 2: Use a LEFT JOIN so that customers who do not have a referee (i.e., where referee_id is NULL) are still included in the result, with a NULL value for the referee’s name.
Approaches
Using a Self-Join
Concept:
Since both the customer and their referee are stored in the same table, you can treat one instance of the table as the primary customer data and a second instance as the reference data for referees. Use table aliases to differentiate between these two roles.
Steps:
- Alias the Table:
- Use alias c for the main customer records.
- Use alias r for the referee records.
- Join Condition:
- Join on c.referee_id = r.customer_id.
- Use LEFT JOIN:
- This ensures that if c.referee_id is NULL (or there is no matching referee), the customer still appears in the result with a NULL value for the referee's name.
Using a Subquery
Concept:
Another approach is to use a subquery to fetch the referee’s name. For each row in the Customers table, select the corresponding referee’s name from the same table using a subquery.
Steps:
- For each customer, write a subquery to select the referee's name where the referee’s customer_id matches the customer’s referee_id.
- This approach can be less efficient than a self-join for larger datasets.
SQL Code Implementation
Using a Self-Join
SELECT c.customer_id, c.customer_name, r.customer_name AS referee_name FROM Customers c LEFT JOIN Customers r ON c.referee_id = r.customer_id;
Explanation:
- FROM Customers c:
Treats the Customers table as the primary table (alias c). - LEFT JOIN Customers r ON c.referee_id = r.customer_id:
Joins the same table (alias r) using the condition that referee_id from the primary table matches customer_id from the referee table. A LEFT JOIN ensures that customers without a referee are included. - SELECT ...:
Selects the customer’s id, name, and the referee's name (if available).
Using a Subquery (Alternative Approach)
SELECT customer_id, customer_name, ( SELECT customer_name FROM Customers WHERE customer_id = c.referee_id ) AS referee_name FROM Customers c;
Explanation:
- For each row in the Customers table (aliased as c), a subquery is executed to retrieve the customer_name from the Customers table where the customer_id matches c.referee_id.
- If no matching row exists (i.e., when referee_id is NULL), the subquery returns NULL.
Complexity Analysis
-
Time Complexity:
- With proper indexing (especially on customer_id), the self-join performs efficiently even on large tables. The join operation is effectively O(n) where n is the number of rows in the Customers table.
-
Space Complexity:
- The query does not require extra space beyond the standard execution environment and the temporary result set.
Step-by-Step Walkthrough and Visual Examples
Visual Example for Self-Join Approach
Given the following Customers table:
customer_id | customer_name | referee_id |
---|---|---|
1 | Alice | NULL |
2 | Bob | 1 |
3 | Charlie | 1 |
4 | David | 2 |
Step 1: Alias the table
- c: Represents the main customer records.
- r: Represents the referee records.
Step 2: Join Condition
- For customer with customer_id = 2 (Bob):
- c.referee_id is 1.
- The join finds a match in alias r where r.customer_id = 1 (Alice).
- For customer with customer_id = 1 (Alice):
- c.referee_id is NULL, so no match is found, and referee_name becomes NULL.
Step 3: Result Construction
-
The final result table:
customer_id customer_name referee_name 1 Alice NULL 2 Bob Alice 3 Charlie Alice 4 David Bob
Common Mistakes
-
Using an INNER JOIN Instead of LEFT JOIN:
An inner join would exclude customers with a NULL referee_id (i.e., those without a referee). -
Forgetting to Alias Tables:
Without aliases, column names become ambiguous, making it unclear which column belongs to which instance of the table. -
Not Handling NULL Values Properly:
Ensure that when a customer does not have a referee, the query returns NULL for the referee's name.
Edge Cases
-
No Referee Present:
Customers with referee_id as NULL must still appear in the final result with referee_name as NULL. -
Self-Referral (if allowed):
If a customer can refer themselves, the query should correctly return the same name for both the customer and referee. -
Data Integrity:
If there are any inconsistencies (e.g., a referee_id that does not exist), a LEFT JOIN will safely return NULL for the missing referee.
Related Problems
GET YOUR FREE
Coding Questions Catalog
