Problem

Given an array of strings wordsDict and two different strings that already exist in the array word1 and word2, return the shortest distance between these two words in the list.

OR

You have a large text file containing words. Given any two words, find the shortest distance (in terms of number of words) between them in the file. Can you make the searching operation in O(1) time? What about the space complexity for your solution?

OR

Find the minimum distance between 2 numbers in the array?

Examples

Example 1:

Input: file = "as was is the as the yahoo you me was the and", word = ["the", "was"]
Output: 0
Explanation: As was is the "as the yahoo you me **was** **the** and"

Example 2:

Input: file = "as was is the as the yahoo you me was the and", words = ["you", "the"]
Output: 2
Explanation: (as was is the as **the** yahoo **you** me was the and)

OR

Example 1:

Input: wordsDict = ["practice", "makes", "perfect", "coding", "makes"], word1 = "coding", word2 = "practice"
Output: 3

Example 2:

Input: wordsDict = ["practice", "makes", "perfect", "coding", "makes"], word1 = "makes", word2 = "coding"
Output: 1

Solution

Method 1 - Two indices and current min

We have already solved this example here for array of integers, same logic has to be applied on array of words. Here is the approach:

  • Maintain two pointers to track the positions of word1 and word2.
  • Iterate through the wordsDict.
  • Update the positions when word1 or word2 is encountered and calculate the distance.
  • Keep track of the minimum distance found during the iteration.

Code

Java
public class Solution {
    public int shortestDistance(String[] wordsDict, String word1, String word2) {
        int p1 = -1, p2 = -1, min = Integer.MAX_VALUE;
        for (int i = 0; i < wordsDict.length; i++) {
            if (wordsDict[i].equals(word1)) {
                p1 = i;
            }
            if (wordsDict[i].equals(word2)) {
                p2 = i;
            }
            if (p1 != -1 && p2 != -1) {
                min = Math.min(min, Math.abs(p1 - p2));
            }
        }
        return min;
    }
}
Python
class Solution:
    def shortestDistance(self, wordsDict: List[str], word1: str, word2: str) -> int:
        p1: int = -1
        p2: int = -1
        min_distance: int = float("inf")
        
        for i, word in enumerate(wordsDict):
            if word == word1:
                p1 = i
            if word == word2:
                p2 = i
            if p1 != -1 and p2 != -1:
                min_distance = min(min_distance, abs(p1 - p2))
                
        return min_distance

Complexity

  • ⏰ Time complexity: O(n), because we are iterating through wordsDict once.
  • 🧺 Space complexity: O(1), since we use only a few variables to track indices and the minimum distance.

Method 2 - Using hashtable

Here is the approach:

  • Preprocess the text by storing the positions of each word in a hash table, which requires a single scan through the text.
  • Query the hash table for the positions of word1 and word2, then compute the minimum distance.

Code

Java
public class Solution {
    private Map<String, ArrayList<Integer>> locations = new HashMap<>();
    // Method to calculate the shortest distance
    public int shortestDistance(String[] wordsDict, String word1, String word2) {
        storeLocations(wordsDict);
        int minDistance = Integer.MAX_VALUE;
        for (int pos1 : locations.get(word1)) {
            int nearestPos2 = modifiedBinarySearch(locations.get(word2), pos1);
            minDistance = Math.min(minDistance, Math.abs(pos1 - nearestPos2));
        }
        return minDistance;
    }
    
    // Preprocessing step to store locations
    public void storeLocations(String[] wordsDict) {
        for (int i = 0; i < wordsDict.length; i++) {
            String word = wordsDict[i];
            locations.computeIfAbsent(word, k -> new ArrayList<>()).add(i);
        }
    }

    // Modified binary search to find the closest position
    private int modifiedBinarySearch(ArrayList<Integer> array, int target) {
        int low = 0, high = array.size();
        while (low < high) {
            int mid = low + (high - low) / 2;
            if (array.get(mid) == target)
                return array.get(mid);
            else if (array.get(mid) > target)
                high = mid;
            else
                low = mid + 1;
        }
        return low < array.size() ? array.get(low) : array.get(array.size() - 1);
    }
}
Python
class Solution:
    def __init__(self):
        self.locations: Dict[str, List[int]] = defaultdict(list)

    def storeLocations(self, wordsDict: List[str]) -> None:
        for index, word in enumerate(wordsDict):
            self.locations[word].append(index)
    
    def modified_binary_search(self, array: List[int], target: int) -> int:
        low, high = 0, len(array)
        while low < high:
            mid = (low + high) // 2
            if array[mid] == target:
                return array[mid]
            elif array[mid] > target:
                high = mid
            else:
                low = mid + 1
        return array[low] if low < len(array) else array[-1]
        
    def shortestDistance(self, wordsDict: List[str], word1: str, word2: str) -> int:
        self.storeLocations(wordsDict)
        min_distance: int = float("inf")
        for pos1 in self.locations[word1]:
            nearest_pos2 = self.modified_binary_search(self.locations[word2], pos1)
            min_distance = min(min_distance, abs(pos1 - nearest_pos2))
        return min_distance

Complexity

  • ⏰ Time complexity: O(n), for preprocessing and nearly O(1) for queries.
  • 🧺 Space complexity: O(n), due to the hash table storing positions of words.