Word Subsets

Problem

You are given two string arrays words1 and words2.

A string b is a subset of string a if every letter in b occurs in a including multiplicity.

For example, "wrr" is a subset of "warrior" but is not a subset of "world".

A string a from words1 is universal if for every string b in words2, b is a subset of a.

Return an array of all the universal strings in words1. You may return the answer in any order.

Examples

Example 1:

1
2


Input: words1 = ["amazon","apple","facebook","google","leetcode"], words2 = ["e","o"]
Output: ["facebook","google","leetcode"]

Example 2:

1
2


Input: words1 = ["amazon","apple","facebook","google","leetcode"], words2 = ["l","e"]
Output: ["apple","google","leetcode"]

Constraints

1 <= words1.length, words2.length <= 10^4
1 <= words1[i].length, words2[i].length <= 10
words1[i] and words2[i] consist only of lowercase English letters.
All the strings of words1 are unique.

Solution

Video explanation

Here is the video explaining below methods in detail. Please check it out:

Method 1 - Naive

The naive approach will be:

For each string in words1, check if it satisfies the subset condition for each string in words2.
And if any point we find out that b is not subset of a, we will skip the word a, otherwise if all words in word2 are subset of word a, we will add a to the list of universal strings.
1. And to find out if b is subset of a, we can use frequency map for both a and b and match the frequencies…that is, for each character in b, the frequency in a is at least as much as that in b.

Method 2 - Using hashmap

To solve this problem, we need to determine which strings from words1 are universal with respect to all the strings in words2. This means we need to find which strings in words1 contain all the letters (with their multiplicities) present in every string in words2.

Approach

Frequency Map:
- We create a frequency map for each string in words2 to track the maximum count of each character needed across all of words2.
Requirement Map:
- Using the frequency maps from words2, we create a composite frequency map that represents the highest count required for each character. This combined map is what each string in words1 must satisfy to be considered universal.
Check Each Word:
- For each word in words1, we check if it contains at least the number of each character specified in the composite frequency map.

Code

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37


class Solution {
    public List<String> wordSubsets(String[] words1, String[] words2) {
        int[] bmax = new int[26];
        for (String b : words2) {
            int[] bCount = count(b);
            for (int i = 0; i < 26; ++i) {
                bmax[i] = Math.max(bmax[i], bCount[i]);
            }
        }

        List<String> ans = new ArrayList<>();
        for (String a : words1) {
            int[] aCount = count(a);
            if (isSubset(aCount, bmax)) {
                ans.add(a);
            }
        }
        return ans;
    }

    private int[] count(String word) {
        int[] frequency = new int[26];
        for (char c : word.toCharArray()) {
            frequency[c - 'a']++;
        }
        return frequency;
    }

    private boolean isSubset(int[] aCount, int[] bCount) {
        for (int i = 0; i < 26; i++) {
            if (aCount[i] < bCount[i]) {
                return false;
            }
        }
        return true;
    }
}

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25


class Solution:
    def wordSubsets(self, words1: List[str], words2: List[str]) -> List[str]:
        def count(word: str) -> Counter:
            return Counter(word)
        
        # Calculate bmax frequency map
        bmax = Counter()
        for b in words2:
            bCount = count(b)
            for ch in bCount:
                bmax[ch] = max(bmax[ch], bCount[ch])
        
        ans: List[str] = []
        for a in words1:
            aCount = count(a)
            if self._isSubset(aCount, bmax):
                ans.append(a)
        
        return ans
    
    def _isSubset(self, a: Counter, b: Counter) -> bool:
        for char in b:
            if a[char] < b[char]:
                return False
        return True

Complexity

⏰ Time complexity: O(n*lB + m*lA), where m and n are the number of strings in words1 and and word2. lB and lA are the average lengths of a strings in words2 and word1 respectively. OR we can write it as O(B + A) where B is sum of length of all words in word array b, and A is sum of length of words in a.
- Building the frequency map for words2 requires O(n * lB)
- Checking each word in words1 requires O(m * lA)
🧺 Space complexity: O(1) for constant space usage because the frequency maps are limited to 26 characters for the English alphabet.

Problem#

Examples#

Constraints#

Solution#

Video explanation#

Method 1 - Naive#

Method 2 - Using hashmap#

Approach#

Code#

Complexity#