Tries and Applications

String Matching
- starting from a text T, i.e. a string composed of letters in the alphabet A
- find occurrences of a pattern P, i.e. another string composed of letters in the alphabet A
- inside T, the pattern string P could occur many times
  (some occurrences of P could even overlap over other occurrences of P)
- variations of the problem could include "wildcard characters", GREP patterns, etc.
String Matching starting from a static (possibly very large) text string P
- goal: to search for the string P in the string T, we would like to:
- spend no more than O(P) time to search the data structure
- using no more than O(T) space to maintain the data structure in memory

Recall that BST can be extremely unbalanced, and the shape of a BST is determined by the order in which its data records are inserted, a result of object space decomposition.
Alternative to object space decomposition is to predefine the splitting position within the key range for each node in the tree. So the root could be predefined to split the key range into two equal halves, regardless of the particular values or order of insertion for the data records.
A tree data structure based on key space decomposition is called a trie (pronounced as "try"). A trie stores data only in leaf nodes.
A trie is called a binary trie when the trie structure is based on the value of the key interpreted as a binary number (Huffman coding tree is an example of a binary trie). The binary value of the key determines whether to select the left or right branch at any given point during the search. The most significant bit determines the branch direction at the root.

See an example of binary trie:
Alphabet trie (compressed trie)
A compressed trie is space optimized: if a node is the only child of its parent, merge the node with the parent.