Problem

Given a string paragraph and a string array of the banned words banned, return the most frequent word that is not banned. It is guaranteed there is at least one word that is not banned, and that the answer is unique.

The words in paragraph are case-insensitive and the answer should be returned in lowercase.

Examples

Example 1:

Input: paragraph = "Bob hit a ball, the hit BALL flew far after it was hit.", banned = ["hit"]
Output: "ball"
Explanation: 
"hit" occurs 3 times, but it is a banned word.
"ball" occurs twice (and no other word does), so it is the most frequent non-banned word in the paragraph. 
Note that words in the paragraph are not case sensitive,
that punctuation is ignored (even if adjacent to words, such as "ball,"), 
and that "hit" isn't the answer even though it occurs more because it is banned.

Example 2:

Input: paragraph = "a.", banned = []
Output: "a"

Solution

Method 1 - Using Set

  1. Normalize the Paragraph:
    • Convert the paragraph to lowercase to handle case-insensitivity.
    • Replace all punctuation marks with spaces.
    • Split the paragraph into words using whitespace.
  2. Use a Set for Banned Words:
    • Convert the list of banned words into a set for O(1) time complexity checks.
  3. Count Frequencies of Non-Banned Words:
    • Use a dictionary (hashmap) to count the occurrences of each word that is not in the banned set.
  4. Find the Most Frequent Word:
    • Iterate through the dictionary to find the word with the highest count.

Code

Java
public class Solution {
    public String mostCommonWord(String paragraph, String[] banned) {
        Set<String> bannedSet = new HashSet<>(Arrays.asList(banned));
        Map<String, Integer> freq = new HashMap<>();
        
        paragraph = paragraph.replaceAll("[^a-zA-Z]", " ").toLowerCase();
        String[] words = paragraph.split("\\s+");
        
        for (String word : words) {
            if (!bannedSet.contains(word)) {
                freq.put(word, freq.getOrDefault(word, 0) + 1);
            }
        }
        
        String ans = "";
        int maxCount = 0;
        for (Map.Entry<String, Integer> entry : freq.entrySet()) {
            if (entry.getValue() > maxCount) {
                maxCount = entry.getValue();
                ans = entry.getKey();
            }
        }
        
        return ans;
    }
}
Python
class Solution:
    def mostCommonWord(self, paragraph: str, banned: List[str]) -> str:
        banned_set = set(banned)
        words = re.findall(r'\w+', paragraph.lower())
        
        freq = defaultdict(int)
        
        for word in words:
            if word not in banned_set:
                freq[word] += 1
        
        ans = ''
        max_count = 0
        for word, count in freq.items():
            if count > max_count:
                max_count = count
                ans = word
        
        return ans

Complexity

  • ⏰ Time complexity: O(n + m), where n is the length of the paragraph and m is the number of banned words. This accounts for processing the paragraph, and for checking and counting word frequencies.
  • 🧺 Space complexity: O(n + m)  for storing the words, their frequencies, and the set of banned words.