819. Most Common Word - Detailed Explanation

Free Coding Questions Catalog
Boost your coding skills with our essential coding questions catalog. Take a step towards a better tech career now!

Problem Statement

Given a string paragraph and an array of strings banned, return the most frequent word in the paragraph that is not banned. The answer should be returned in lowercase.

  • The paragraph may contain punctuation and mixed case letters.
  • Words are defined as a sequence of letters, and punctuation should be ignored.
  • It is guaranteed that at least one word that is not banned exists and that the answer is unique.

Example 1

Input:

paragraph = "Bob hit a ball, the hit BALL flew far after it was hit."
banned = ["hit"]

Output:

"ball"

Explanation:
After converting all words to lowercase and removing punctuation, the paragraph becomes:

["bob", "hit", "a", "ball", "the", "hit", "ball", "flew", "far", "after", "it", "was", "hit"]

The word "hit" appears 3 times but is banned. The word "ball" appears 2 times and is the most frequent non-banned word.

Example 2

Input:

paragraph = "a, a, a, a, b,b,b,c, c"
banned = ["a"]

Output:

"b"

Explanation:
After processing, the words are: ["a", "a", "a", "a", "b", "b", "b", "c", "c"]. With "a" banned, "b" is the most frequent word.

Constraints

  • 1 ≤ paragraph.length ≤ 1000
  • 0 ≤ banned.length ≤ 100
  • 1 ≤ banned[i].length ≤ 10
  • The answer is unique, and it is guaranteed that at least one word that is not banned exists.

Hints

  1. String Processing:

    • Convert the entire paragraph to lowercase to handle case-insensitivity.
    • Use regular expressions (or other string manipulation techniques) to replace punctuation with spaces so that words are properly separated.
  2. Frequency Count:

    • Use a hash map (dictionary) to count the frequency of each word in the processed paragraph.
    • Skip any word that is in the banned list.
  3. Finding the Maximum:

    • Iterate over the frequency map to find the word with the highest count among those not banned.

Approach 1: Dictionary (Hash Map) Based Frequency Count

Explanation

  1. Normalize the Paragraph:

    • Convert to lowercase.
    • Remove punctuation (e.g., !?',;.) by replacing them with spaces.
  2. Split into Words:

    • Split the processed paragraph into a list of words using whitespace as a delimiter.
  3. Count Word Frequencies:

    • Use a dictionary to count how many times each word appears.
  4. Filter Banned Words:

    • Iterate over the dictionary and ignore words present in the banned list.
  5. Determine the Most Frequent Word:

    • Track the word with the highest frequency that is not banned.

Complexity

  • Time Complexity: O(n + m)

    • O(n) for processing the paragraph (where n is the number of characters)

    • O(m) for iterating through the words (where m is the number of words)

  • Space Complexity: O(m + k)

    • m for storing the words and their counts, and k for the banned words set.

Python Code

Python3
Python3

. . . .

Java Code

Java
Java

. . . .

Step-by-Step Walkthrough

  1. Normalize the Input:

    • Convert the paragraph to lowercase so that "Bob" and "bob" are treated the same.
    • Replace all punctuation with spaces using a regular expression (Python) or replaceAll (Java).
  2. Split the Paragraph:

    • Split the resulting string into words based on whitespace.
  3. Create a Banned Set:

    • Convert the banned words array into a set to allow O(1) lookups.
  4. Count Frequencies:

    • Iterate over each word in the list.
    • If the word is not in the banned set, update its frequency count in a dictionary (Python) or HashMap (Java).
  5. Determine the Most Frequent Non-Banned Word:

    • Traverse the frequency map and track the word with the highest frequency.

Common Mistakes and Edge Cases

  • Punctuation Handling:

    • Failing to replace punctuation correctly can lead to words being miscounted (e.g., "ball," vs "ball").
  • Case Sensitivity:

    • Not converting the paragraph to lowercase may cause duplicates in different cases.
  • Empty or All Banned Words:

    • Ensure that the banned list is handled properly. The problem guarantees at least one valid word exists, but your solution should still be robust.
  • Multiple Spaces:

    • Extra spaces from punctuation replacement can lead to empty strings when splitting. Ensure your splitting mechanism ignores these.

Alternative Variations

  • Custom Separators:
    • The problem could be modified to use different sets of delimiters for words.
  • Top K Frequent Words:
    • A variation might require returning the top K frequent non-banned words instead of just the most frequent one.
TAGS
leetcode
CONTRIBUTOR
Design Gurus Team
-

GET YOUR FREE

Coding Questions Catalog

Design Gurus Newsletter - Latest from our Blog
Boost your coding skills with our essential coding questions catalog.
Take a step towards a better tech career now!
Explore Answers
Related Courses
Grokking the Coding Interview: Patterns for Coding Questions
Grokking the Coding Interview Patterns in Java, Python, JS, C++, C#, and Go. The most comprehensive course with 476 Lessons.
Grokking Modern AI Fundamentals
Master the fundamentals of AI today to lead the tech revolution of tomorrow.
Grokking Data Structures & Algorithms for Coding Interviews
Unlock Coding Interview Success: Dive Deep into Data Structures and Algorithms.
Image
One-Stop Portal For Tech Interviews.
Copyright © 2025 Design Gurus, LLC. All rights reserved.
;