Sentence Similarity 2 - with transitive word pairs

Problem

We can represent a sentence as an array of words, for example, the sentence "I am happy with leetcode" can be represented as arr = ["I","am",happy","with","leetcode"].

Given two sentences sentence1 and sentence2 each represented as a string array and given an array of string pairs similarPairs where similarPairs[i] = [xi, yi] indicates that the two words xi and yi are similar.

Return true if sentence1 and sentence2 are similar, or false if they are not similar.

Two sentences are similar if:

They have the same length (i.e., the same number of words)
sentence1[i] and sentence2[i] are similar.

Notice that a word is always similar to itself, also notice that the similarity relation is transitive. For example, if the words a and b are similar, and the words b and c are similar, then a and c are similar.

Examples

Example 1:

Input: sentence1 = ["great","acting","skills"], sentence2 = ["fine","drama","talent"], similarPairs = [["great","good"],["fine","good"],["drama","acting"],["skills","talent"]]
Output: true
Explanation: The two sentences have the same length and each word i of sentence1 is also similar to the corresponding word in sentence2.

Example 2:

Input: sentence1 = ["I","love","leetcode"], sentence2 = ["I","love","onepiece"], similarPairs = [["manga","onepiece"],["platform","anime"],["leetcode","platform"],["anime","manga"]]
Output: true
Explanation: "leetcode" --> "platform" --> "anime" --> "manga" --> "onepiece".
Since "leetcode is similar to "onepiece" and the first two words are the same, the two sentences are similar.

Example 3:

Input: sentence1 = ["I","love","leetcode"], sentence2 = ["I","love","onepiece"], similarPairs = [["manga","hunterXhunter"],["platform","anime"],["leetcode","platform"],["anime","manga"]]
Output: false
Explanation: "leetcode" is not similar to "onepiece".

Solution

Method 1 - Union Find

To determine if two sentences are similar given transitive similarity relations between word pairs, we need to:

Check that both sentences have the same number of words.
Use a data structure to keep track of the equivalence classes of similar words, leveraging the transitivity of similarity relations.

Here is the approach:

Check Sentence Lengths:
- First, ensure that both sentences have the same number of words, as they can only be similar if their lengths match.
Union-Find Data Structure:
- We’ll use the Union-Find (Disjoint Set Union - DSU) data structure to efficiently manage the groups of similar words based on the provided pairs. Union-Find helps in handling the dynamic connectivity problem efficiently.
Build the Union-Find Structure:
- Initialize Union-Find for all unique words in the similar pairs.
- For each pair, union the two words, ensuring they belong to the same group or equivalence class.
Check Similarity:
- For each word position in the sentences, ensure that the corresponding words either match directly or are in the same equivalence class within the Union-Find structure.

Code

Java

public class Solution {
    public boolean areSentencesSimilar(String[] sentence1, String[] sentence2, List<List<String>> similarPairs) {
        if (sentence1.length != sentence2.length) {
            return false;
        }

        Map<String, String> parent = new HashMap<>();
        for (List<String> pair : similarPairs) {
            union(pair.get(0), pair.get(1), parent);
        }

        for (int i = 0; i < sentence1.length; i++) {
            if (!sentence1[i].equals(sentence2[i]) && !find(sentence1[i], parent).equals(find(sentence2[i], parent))) {
                return false;
            }
        }

        return true;
    }

    private void union(String word1, String word2, Map<String, String> parent) {
        String root1 = find(word1, parent);
        String root2 = find(word2, parent);
        if (!root1.equals(root2)) {
            parent.put(root1, root2);
        }
    }

    private String find(String word, Map<String, String> parent) {
        if (!parent.containsKey(word)) {
            parent.put(word, word);
        }
        while (!word.equals(parent.get(word))) {
            parent.put(word, parent.get(parent.get(word)));  // path compression
            word = parent.get(word);
        }
        return word;
    }
}

Python

class Solution:
    def areSentencesSimilar(self, sentence1: List[str], sentence2: List[str], similarPairs: List[List[str]]) -> bool:
        if len(sentence1) != len(sentence2):
            return False

        parent = {}

        def find(word):
            if word not in parent:
                parent[word] = word
            if parent[word] != word:
                parent[word] = find(parent[word])  # path compression
            return parent[word]

        def union(word1, word2):
            root1 = find(word1)
            root2 = find(word2)
            if root1 != root2:
                parent[root1] = root2

        for word1, word2 in similarPairs:
            union(word1, word2)

        for w1, w2 in zip(sentence1, sentence2):
            if w1 != w2 and find(w1) != find(w2):
                return False

        return True

Complexity

⏰ Time complexity: O(n + k * α(n))
- n is the length of the sentences.
- k is the number of similar pairs.
- α is the inverse Ackermann function, which grows extremely slowly, making k * α(n) nearly linear in practice.
🧺 Space complexity: O(n + k)
- Space complexity mainly comes from storing the parent pointer for each unique word in the similar pairs, which can total to the number of unique words in both sentences and pairs combined.

Problem#

Examples#

Solution#

Method 1 - Union Find#

Code#

Java#

Python#

Complexity#