Problem
Two strings, X
and Y
, are considered similar if either they are identical or we can make them equivalent by swapping at most two letters (in distinct positions) within the string X
.
For example, "tars"
and "rats"
are similar (swapping at positions 0
and 2
), and "rats"
and "arts"
are similar, but "star"
is not similar to "tars"
, "rats"
, or "arts"
.
Together, these form two connected groups by similarity: {"tars", "rats", "arts"}
and {"star"}
. Notice that "tars"
and "arts"
are in the same group even though they are not similar. Formally, each group is such that a word is in the group if and only if it is similar to at least one other word in the group.
We are given a list strs
of strings where every string in strs
is an anagram of every other string in strs
. How many groups are there?
Examples
Example 1:
|
|
Example 2:
|
|
Solution
Method 1 - Using Union Find Data structure
The problem is to determine the number of groups of similar strings in a list where each string is an anagram of the others. Two strings are considered similar if they are identical or can be made equivalent by swapping at most two characters.
To solve this problem, we can use depth-first search (DFS) or union-find to group the strings based on their similarities. We will use the union-find approach for this solution, also known as Disjoint Set Union (DSU).
- Union-Find Data Structure:
- Initialize a parent array where each index points to itself initially.
- Define
find
andunion
functions to manage the disjoint sets.
- Checking Similarity:
- Two strings are similar if they can become identical by swapping at most two characters. We will write a helper function to check this condition.
- Processing the Strings:
- Iterate through all pairs of strings. For each pair, if they are similar, union their sets.
- Count the Number of Groups:
- After processing all pairs, count the number of unique parents to get the number of groups.
Code
|
|
|
|
Complexity
- ⏰ Time complexity:
O(n^2 * m)
, wheren
is the number of strings andm
is the length of each string. This accounts for comparing all pairs of strings. - 🧺 Space complexity:
O(n)
, for the union-find data structure.