1233. Remove Sub-Folders from the Filesystem - Detailed Explanation
Problem Statement
You are given a list of folder paths from a virtual filesystem. Your task is to remove all sub-folders and return only the root folders. A folder /a/b
is considered a sub-folder if there exists a folder /a
in the list.
Constraints
- (1 \leq folders.length \leq 10^4)
- (2 \leq folders[i].length \leq 100)
- All folder paths are unique.
Examples
Example 1
Input
folders = ["/a","/a/b","/c/d","/c/d/e","/c/f"]
Output
["/a", "/c/d", "/c/f"]
Explanation
/a/b
is a sub-folder of/a
, so it is removed./c/d/e
is a sub-folder of/c/d
, so it is removed./c/f
is not a sub-folder of any other folder, so it remains.
Example 2
Input
folders = ["/a","/a/b/c","/a/b/d"]
Output
["/a"]
Explanation
/a/b/c
and/a/b/d
are sub-folders of/a
, so they are removed.
Example 3
Input
folders = ["/a/b/c","/a/b/ca","/a/b/d"]
Output
["/a/b/c", "/a/b/ca", "/a/b/d"]
Explanation
- No folder is a sub-folder of another.
Hints
- Sorting the folder list lexicographically might help in grouping sub-folders under parent folders.
- Using a Trie or Set can help in efficiently filtering sub-folders.
Approach 1: Brute Force (Checking Each Folder Against All Others)
Idea
- For each folder, check if it is a sub-folder of any other folder in the list.
- A folder
/a/b
is a sub-folder of/a
if/a/
is a prefix of/a/b
.
Algorithm
- Iterate through each folder and check if it starts with any other folder in the list.
- If it is a sub-folder, discard it; otherwise, keep it in the result list.
Time Complexity
- O(N²) in the worst case, where N is the number of folders.
Python Implementation
Python3
Python3
. . . .
Java Implementation
Java
Java
. . . .
Approach 2: Sorting + Greedy Approach (Optimized)
Idea
- Sorting the list lexicographically ensures that a parent folder always appears before its sub-folders.
- Using a single pass, we can check if a folder is a sub-folder of the previous folder.
Algorithm
- Sort the folder list lexicographically.
- Use a single pass to check if each folder starts with the last added folder (
result[-1] + "/"
). - If yes, it's a sub-folder, so we skip it; otherwise, we add it to the result list.
Time Complexity
- O(N log N) due to sorting.
- O(N) for scanning through the folders.
- Overall Complexity: O(N log N).
Python Implementation
Python3
Python3
. . . .
Java Implementation
Java
Java
. . . .
Approach 3: Using Trie (Efficient Approach)
Idea
- Use a Trie (prefix tree) to store folder paths.
- Only insert a folder into the result list if it’s not inside an existing parent folder in the Trie.
Algorithm
- Sort the folder paths.
- Insert each folder into the Trie.
- While inserting, stop if a parent folder is already present in the Trie.
- Keep only top-level folders in the result list.
Time Complexity
- O(N log N) for sorting.
- O(M) for inserting and checking folders, where M is the total number of characters in all folder paths.
- Overall Complexity: O(N log N + M).
Python Implementation
Python3
Python3
. . . .
Java Implementation
Java
Java
. . . .
Complexity Analysis for Each Approach
Approach | Sorting Complexity | Checking Complexity | Total Complexity |
---|---|---|---|
Brute Force | - | (O(N^2)) (Nested loop checking) | (O(N^2)) |
Sorting + Greedy | (O(N \log N)) | (O(N)) (Single pass filtering) | (O(N \log N)) |
Trie-Based Approach | (O(N \log N)) | (O(M)) (where M is total characters in all paths) | (O(N \log N + M)) |
- Brute Force is inefficient for large inputs.
- Sorting + Greedy is optimal for practical use cases.
- Trie-Based Approach is useful when we frequently insert/update folders dynamically.
Step-by-Step Walkthroughs with Visual Examples
Example Input
folders = ["/a","/a/b","/c/d","/c/d/e","/c/f"]
Step 1: Lexicographic Sorting
Index | Folder |
---|---|
1 | /a |
2 | /a/b |
3 | /c/d |
4 | /c/d/e |
5 | /c/f |
Step 2: Iteration (Sorting + Greedy Approach)
Index | Folder | Last Added Folder | Is Sub-folder? | Result |
---|---|---|---|---|
1 | /a | - | No | [/a ] |
2 | /a/b | /a | Yes | [/a ] |
3 | /c/d | /a | No | [/a , /c/d ] |
4 | /c/d/e | /c/d | Yes | [/a , /c/d ] |
5 | /c/f | /c/d | No | [/a , /c/d , /c/f ] |
Trie Insertion Step-by-Step
- Insert
/a
→ Root →a (end=True)
. - Insert
/a/b
→ Stops ata
, asend=True
. - Insert
/c/d
→ Root →c → d (end=True)
. - Insert
/c/d/e
→ Stops atc/d
, asend=True
. - Insert
/c/f
→ Root →c → f (end=True)
.
Final Trie Structure:
Root ├── a (end=True) │ └── b └── c ├── d (end=True) │ └── e └── f (end=True)
Final Output: ["/a", "/c/d", "/c/f"]
Common Mistakes
-
Not Sorting the Input
- Sorting ensures parents appear before sub-folders.
- Example:
["/c/d", "/c/d/e", "/a", "/a/b"]
fails without sorting.
-
Using StartsWith() Incorrectly
folder.startswith(parent)
might incorrectly detect/a/b1
as a sub-folder of/a/b
.
-
Ignoring Edge Cases
- Duplicate paths: Ensure unique folder paths.
- Deeply nested folders: Handle up to length
100
.
Edge Cases
Scenario | Example Input | Expected Output |
---|---|---|
All folders are root | ["/a", "/b", "/c"] | ["/a", "/b", "/c"] |
Deep nesting | ["/a", "/a/b", "/a/b/c/d/e"] | ["/a"] |
Long paths | ["/a" + "/b" * 50] | ["/a" + "/b" * 50] |
No sub-folders | ["/a/b/c", "/a/b/d", "/x/y/z"] | ["/a/b/c", "/a/b/d", "/x/y/z"] |
Alternative Variations
-
Finding Common Parent Paths
- Input:
["/a/b/c", "/a/b", "/a/b/d"]
- Output:
["/a/b"]
- Variation: Find the shortest common path.
- Input:
-
Handling Files Alongside Folders
- Input:
["/a/b/file.txt", "/a/b/c", "/a/b/d"]
- Output:
["/a/b/file.txt", "/a/b"]
- Input:
-
Dynamic Insertion & Querying
- Instead of a static list, allow dynamic insertion of folders.
- Useful in file-system-like implementations.
Related Problems for Further Practice
Problem | Description |
---|---|
[Leetcode 208: Implement Trie (Prefix Tree)] | Learn Trie implementation for prefix searching. |
[Leetcode 648: Replace Words] | Use a Trie to replace words efficiently. |
[Leetcode 1268: Search Suggestions System] | Auto-complete using Trie and sorting. |
[Leetcode 211: Design Add and Search Words Data Structure] | Extend Trie to handle regex-like queries. |
Final Thoughts
- Sorting + Greedy is the best approach for practical use.
- Trie-based approach is great when folders are inserted dynamically.
- Edge cases like deeply nested paths should be handled carefully.
TAGS
leetcode
CONTRIBUTOR
Design Gurus Team
-
GET YOUR FREE
Coding Questions Catalog
Boost your coding skills with our essential coding questions catalog.
Take a step towards a better tech career now!
Explore Answers
Related Courses
Grokking the Coding Interview: Patterns for Coding Questions
Grokking the Coding Interview Patterns in Java, Python, JS, C++, C#, and Go. The most comprehensive course with 476 Lessons.
4.6
(69,299 learners)
New
Grokking Modern AI Fundamentals
Master the fundamentals of AI today to lead the tech revolution of tomorrow.
3.9
(1,107 learners)
Grokking Data Structures & Algorithms for Coding Interviews
Unlock Coding Interview Success: Dive Deep into Data Structures and Algorithms.
4
(26,683 learners)
One-Stop Portal For Tech Interviews.
Copyright © 2025 Design Gurus, LLC. All rights reserved.