175. Combine Two Tables - Detailed Explanation

Free Coding Questions Catalog
Boost your coding skills with our essential coding questions catalog. Take a step towards a better tech career now!

Problem Statement

We are given two tables: Person and Address. The Person table contains information about people, and the Address table contains address details that may correspond to a person. The goal is to combine the two tables so that each person’s information is accompanied by their address details (if available). If a person does not have a corresponding address, the address fields (city and state) should be returned as NULL. The final output must be sorted by personId.

Below are the table schemas:

Person

Column NameType
personIdint
firstNamevarchar
lastNamevarchar

Address

Column NameType
addressIdint
personIdint
cityvarchar
statevarchar

Example 1

Person

personIdfirstNamelastName
1JohnDoe

Address

addressIdpersonIdcitystate
11New YorkNY

Output

personIdfirstNamelastNamecitystate
1JohnDoeNew YorkNY

Explanation: John Doe has an address in New York, so his row is fully populated.

Example 2

Person

personIdfirstNamelastName
1JohnDoe
2JaneSmith

Address

addressIdpersonIdcitystate
12BostonMA

Output

personIdfirstNamelastNamecitystate
1JohnDoeNULLNULL
2JaneSmithBostonMA

Explanation: Jane Smith has an address while John Doe does not, so John’s address fields are NULL.

Example 3

Person

personIdfirstNamelastName
1JohnDoe
2JaneSmith
3BobLee

Address

addressIdpersonIdcitystate
11ChicagoIL
23SeattleWA

Output

personIdfirstNamelastNamecitystate
1JohnDoeChicagoIL
2JaneSmithNULLNULL
3BobLeeSeattleWA

Explanation: John and Bob have addresses, while Jane does not.

Constraints:

  • The Person table always has at least one row.
  • A person may or may not have an associated address in the Address table.
  • The final output must be sorted by personId.

Hints

  1. Join on the Common Key: Identify that both tables share the column personId. This is the key to combine the tables.
  2. LEFT JOIN Concept: Since some persons might not have an address, think of using a LEFT JOIN (or its equivalent when coding) so that every person appears in the output even if the address is missing.

Approaches Overview

There are two primary approaches to solve this problem when simulating the SQL join in code:

Approach 1: Brute Force (Nested Loop Join)

  • Idea: For each row in the Person table, iterate through the Address table to find a matching personId.
  • Explanation:
    • Loop through every person.
    • For each person, loop through all addresses to check if the personId matches.
    • If a match is found, attach the address information; otherwise, set the address fields as NULL.
  • Complexity: O(n * m), where n is the number of persons and m is the number of addresses.
  • Drawbacks: This approach can be inefficient when dealing with large datasets.

Approach 2: Optimal Approach (Hash Join / Left Join)

  • Idea: Use a hash map (or dictionary) to map each personId from the Address table to its address details.

  • Explanation:

    • First, build a hash map from the Address table using personId as the key.
    • Then, iterate over the Person table and quickly look up the corresponding address in the hash map.
    • If found, merge the details; if not, attach NULL for the address fields.
  • Complexity: O(n + m) because building the map is O(m) and iterating through persons is O(n).

  • Advantages: Much more efficient for large datasets.

Below are detailed code implementations for the optimal approach in Python and Java.

Python Implementation (Optimal Approach)

Python3
Python3

. . . .

Python Walkthrough

  1. Building the Address Map:
    A dictionary comprehension creates a mapping of personId to address details, allowing constant time lookup.

  2. Iterating through Persons:
    For each person, we check if there is a corresponding address in the dictionary. If not, we default to {'city': None, 'state': None}.

  3. Sorting and Output:
    The result is sorted by personId (ensuring correct order) and then printed row by row.

Java Implementation (Optimal Approach)

Java
Java

. . . .

Java Walkthrough

  1. Data Classes:
    We define classes for Person, Address, and JoinedRow to encapsulate the table data.

  2. Building the Address Map:
    A HashMap is used to map each personId from the addresses to its corresponding address details.

  3. Performing the Join:
    We iterate over the persons list and for each person, use the map to find the corresponding address. If an address is missing, null values are used for city and state.

  4. Sorting and Printing:
    The resulting list is sorted by personId and printed to the console.

Complexity Analysis

  • Brute Force Approach: O(n * m) time complexity (inefficient for large datasets).
  • Optimal (Hash Join) Approach: O(n + m) time complexity, where n is the number of persons and m is the number of addresses. This is much more efficient.

Step-by-Step Walkthrough and Visual Example

  1. Initialization:
    Imagine you have three persons and two addresses. Create a mapping for addresses using personId as the key.
  2. Processing Each Person:
    • For person with personId = 1, find the address from the map.
    • For person with personId = 2, no address exists; assign NULL to city and state.
    • For person with personId = 3, find the address from the map.
  3. Final Output:
    The joined table is then sorted by personId and printed.

Common Mistakes

  • Using an Inner Join Instead of a Left Join:
    This would omit persons without an address.
  • Not Sorting the Result:
    The final output must be sorted by personId.
  • Inefficient Looping:
    Using nested loops without a hash map can lead to poor performance with larger datasets.

Edge Cases

  • No Address Records:
    All address fields should be NULL.

  • Empty Person Table:
    Although the problem guarantees at least one row, always consider handling empty inputs gracefully.

  • Multiple Addresses for One Person:
    This problem assumes a one-to-one relationship; if multiple addresses exist, decide whether to return one record or multiple records per person.

Alternative Variations

  • Inner Join Variation:
    Only return persons who have an associated address.

  • Aggregation Variation:
    If a person has multiple addresses, you might need to aggregate the address details into a single row (for example, concatenating cities).

  • Additional Filtering:
    You could extend the query to filter results based on conditions in either table (e.g., only persons from a specific state).

TAGS
leetcode
CONTRIBUTOR
Design Gurus Team
-

GET YOUR FREE

Coding Questions Catalog

Design Gurus Newsletter - Latest from our Blog
Boost your coding skills with our essential coding questions catalog.
Take a step towards a better tech career now!
Explore Answers
Can I be a software engineer without a CS degree?
Is it possible to negotiate a higher salary?
What is a behavioural based job interview?
Related Courses
Image
Grokking the Coding Interview: Patterns for Coding Questions
Grokking the Coding Interview Patterns in Java, Python, JS, C++, C#, and Go. The most comprehensive course with 476 Lessons.
Image
Grokking Modern AI Fundamentals
Master the fundamentals of AI today to lead the tech revolution of tomorrow.
Image
Grokking Data Structures & Algorithms for Coding Interviews
Unlock Coding Interview Success: Dive Deep into Data Structures and Algorithms.
Image
One-Stop Portal For Tech Interviews.
Copyright © 2025 Design Gurus, LLC. All rights reserved.
;