Problem

Given a list of folders folder, return the folders after removing all sub-folders in those folders. You may return the answer in any order.

If a folder[i] is located within another folder[j], it is called a sub-folder of it. A sub-folder of folder[j] must start with folder[j], followed by a "/". For example, "/a/b" is a sub-folder of "/a", but "/b" is not a sub-folder of "/a/b/c".

The format of a path is one or more concatenated strings of the form: '/' followed by one or more lowercase English letters.

  • For example, "/leetcode" and "/leetcode/problems" are valid paths while an empty string and "/" are not.

Examples

Example 1:

Input: folder = ["/a","/a/b","/c/d","/c/d/e","/c/f"]
Output: ["/a","/c/d","/c/f"]
Explanation: Folders "/a/b" is a subfolder of "/a" and "/c/d/e" is inside of folder "/c/d" in our filesystem.

Example 2:

Input: folder = ["/a","/a/b/c","/a/b/d"]
Output: ["/a"]
Explanation: Folders "/a/b/c" and "/a/b/d" will be removed because they are subfolders of "/a".

Example 3:

Input: folder = ["/a/b/c","/a/b/ca","/a/b/d"]
Output: ["/a/b/c","/a/b/ca","/a/b/d"]

Solution

Method 1 - Sorting

Video explanation

Here is the video explaining this method in detail. Please check it out:

Here is the approach:

  1. Sort the List:
    • First, we sort the list of folders in lexicographical order. This ensures that any sub-folder will immediately follow its parent folder in the list.
  2. Iterate and Check for Sub-folders:
    • We initialize an empty list result to store folders without their sub-folders.
    • We iterate over the sorted list and for each folder, we check if it’s a sub-folder of the last folder added to result. This can be simply done by checking if the current folder starts with the last folder in result followed by a ‘/’.
    • If it is not a sub-folder, we add it to the result list; otherwise, we skip it.

Code

Java
public class Solution {
    public List<String> removeSubfolders(String[] folder) {
        // Sort the array of folders
        Arrays.sort(folder);
        List<String> result = new ArrayList<>();

        for (String f : folder) {
            // If result is empty or f is not a sub-folder of the last element in result
            if (result.isEmpty() || !f.startsWith(result.get(result.size() - 1) + "/")) {
                result.add(f);
            }
        }

        return result;
    }
}
Python
class Solution:
    def removeSubfolders(self, folder):
        # Sort the list of folders
        folder.sort()
        result = []

        for f in folder:
            # If result is empty or f is not a sub-folder of the last element in result
            if not result or not f.startswith(result[-1] + "/"):
                result.append(f)
        
        return result

Complexity

  • ⏰ Time complexity: O(n log n + n*L), where n is the number of folders and L is length of longest parent subfolder
  • Sorting the list takes O(n log n) where n is the number of folders.
  • Iterating through the list takes O(n), but then for each list checking if starts with other word, takes O(n*L)
  • Thus, the overall time complexity is O(n log n + n), which simplifies to O(n log n).
  • 🧺 Space complexity: O(n), for storing the result list res.

Method 2 - Using Trie

We can also use trie like this:

  1. Define a Trie Data Structure:
    • Create a TrieNode class to help with storing paths in a tree structure.
  2. Insert Paths into the Trie:
    • Insert each folder path into the Trie while maintaining the structure. If a node is already marked as the end of a valid path, skip further insertion to avoid adding sub-folders.
  3. Retrieve Valid Paths:
    • Traverse the Trie to collect all paths that are valid and not part of a larger folder structure.

Code

Java
class TrieNode {
    boolean isEnd;
    TrieNode[] children;

    public TrieNode() {
        this.children = new TrieNode[26]; // 26 for each alphabet letter (lowercase)
        this.isEnd = false;
    }
}

public class Solution {
    private TrieNode root;

    public Solution() {
        root = new TrieNode();
    }

    // Method to insert paths into the Trie
    private void insert(String path) {
        TrieNode current = root;
        String[] parts = path.split("/");

        for (String part : parts) {
            if (part.isEmpty()) continue; // Skip empty parts caused by split
            int index = part.charAt(0) - 'a';
            if (current.children[index] == null) {
                current.children[index] = new TrieNode();
            }
            current = current.children[index];
            if (current.isEnd) {
                return; // Early exit if this node marks the end of another valid folder
            }
        }
        current.isEnd = true;
        Arrays.fill(current.children, null); // Clear children to avoid sub-folders
    }

    // Method to collect paths from the Trie
    private void collectPaths(TrieNode node, String path, List<String> res) {
        if (node.isEnd) {
            res.add(path);
            return;
        }
        for (int i = 0; i < 26; i++) {
            if (node.children[i] != null) {
                collectPaths(node.children[i], path + "/" + (char) ('a' + i), res);
            }
        }
    }

    public List<String> removeSubfolders(String[] folder) {
        Arrays.sort(folder);  // Sort folders lexicographically

        for (String f : folder) {
            insert(f);
        }

        List<String> res = new ArrayList<>();
        collectPaths(root, "", res);

        return res;
    }
}
Python
class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_end = False

class Solution:
    def removeSubfolders(self, folder):
        # Create the root TrieNode
        root = TrieNode()

        # Function to insert a folder into the Trie
        def insert(path):
            current = root
            parts = path.split('/')[1:]  # Split path and ignore the first empty part
            for part in parts:
                if part not in current.children:
                    current.children[part] = TrieNode()
                current = current.children[part]
                if current.is_end:
                    return False  # If we already have a valid end folder, no need to continue inserting
            current.is_end = True
            current.children = {}  # Clear children to avoid sub-folders

        # Insert each folder into the Trie
        for f in sorted(folder):
            insert(f)

        # Function to collect valid paths from the Trie
        def collect_paths(node, path, res):
            if node.is_end:
                res.append(path)
                return
            for part in node.children:
                collect_paths(node.children[part], path + '/' + part, res)

        # Collect the result paths
        res = []
        collect_paths(root, '', res)
        
        return res

Complexity

  • ⏰ Time complexity: O(n *L),
    • Each insertion operation takes O(L) where L is the length of the path. For n folders, this results in a total insertion time of O(L * n).
    • Collecting all valid paths involves traversing the Trie, which is O(L * n) in the worst case for n folders of average length L.
  • 🧺 Space complexity: O(n * L)