What is a prefix tree?

A prefix tree, also known as a trie (from the word "retrieval"), is a specialized tree data structure that is used to store a dynamic set or associative array where the keys are usually strings. Unlike a binary search tree, where each node represents a single key within the tree, a trie represents a set of keys by storing them in a compact and structured format where common prefixes are shared.

Structure and Characteristics of a Trie

Nodes: Each node in a trie typically contains multiple branches, each corresponding to possible characters of the keys. For instance, in a simple English alphabet trie, each node could have up to 26 children, one for each letter from 'a' to 'z'.
Common Prefixes: Trie structures are highly efficient for operations involving prefixes, as they share the common prefixes of keys. This allows for fast and space-efficient storage and retrieval of keys with common prefixes.
Termination of Words: Nodes in a trie often contain a flag or use a special child node to indicate the end of a key. This helps distinguish between a key that is a prefix of another key and a complete key itself.
Path from Root to Leaf: In a trie, any path from the root to a leaf, or to a node marked as the end of a key, can represent a string stored in the trie.

Advantages of Tries

Space Efficiency: Tries can be more space efficient when storing large sets of strings that share significant prefixes.
Time Complexity: Lookup, insertion, and deletion operations can be performed in (O(m)) time complexity, where (m) is the length of the string. This is because operations are generally performed directly on the length of the string, independent of the number of keys in the trie.
Prefix Related Queries: Tries are particularly well-suited for solving problems that involve searching by prefixes, such as auto-completion in search engines or prefix-based filters.

Example Usage

Consider implementing a simple trie for storing a set of words and checking the existence of words:

class TrieNode {
    private TrieNode[] children = new TrieNode[26];
    private boolean isEndOfWord = false;

    public boolean containsKey(char ch) {
        return children[ch - 'a'] != null;
    }

    public TrieNode get(char ch) {
        return children[ch - 'a'];
    }

    public void put(char ch, TrieNode node) {
        children[ch - 'a'] = node;
    }

    public void setEnd() {
        isEndOfWord = true;
    }

    public boolean isEnd() {
        return isEndOfWord;
    }
}

class Trie {
    private TrieNode root;

    public Trie() {
        root = new TrieNode();
    }

    public void insert(String word) {
        TrieNode node = root;
        for (char ch : word.toCharArray()) {
            if (!node.containsKey(ch)) {
                node.put(ch, new TrieNode());
            }
            node = node.get(ch);
        }
        node.setEnd();
    }

    public boolean search(String word) {
        TrieNode node = root;
        for (char ch : word.toCharArray()) {
            if (!node.containsKey(ch)) {
                return false;
            }
            node = node.get(ch);
        }
        return node.isEnd();
    }
}

Applications

Autocomplete Features: Tries are perfect for implementing autocomplete features in text fields or search engines.
Spell Checkers: Used to store dictionaries for fast lookup and suggestions.
IP Routing: Certain trie variants, like radix tries or Patricia tries, are used in IP routing to handle prefixes efficiently.
Bioinformatics: Tries can be used to handle common sequences in DNA strand analysis.

Conclusion

Tries are a powerful and versatile data structure for handling sets of strings, especially when operations involving prefixes are frequent. They offer efficient performance for add, remove, and search operations and can be adapted to various applications that require prefix-based solutions.

TAGS

Coding Interview

CONTRIBUTOR

Design Gurus Team

GET YOUR FREE

Coding Questions Catalog