Interactive Byte Pair Encoding (BPE)

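An educational tool for visualizing the BPE tokenization algorithm. BPE starts with a vocabulary of individual characters and iteratively merges the most frequent adjacent pair of symbols into a single new symbol, which is added to the vocabulary. Merging typically continues until a chosen number of merges is reached or no adjacent pairs remain.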
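The merge loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the function names (`train_bpe`, `merge_pair`, `encode`), the whitespace pre-splitting of the input, and the tie-breaking rule (lexicographically smallest pair among equally frequent ones) are all assumptions made for the example.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace each left-to-right occurrence of `pair` with one merged symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def train_bpe(text, num_merges):
    """Learn up to `num_merges` merge rules; returns (merges, vocab)."""
    # Assumption: words are split on whitespace and merges never cross words.
    words = Counter(tuple(word) for word in text.split())
    # Initial vocabulary: the individual characters seen in the text.
    vocab = sorted({ch for symbols in words for ch in symbols})
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by how often each word occurs.
        pairs = Counter()
        for symbols, freq in words.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break  # nothing left to merge
        # Most frequent pair; ties go to the lexicographically smallest pair
        # (an arbitrary but deterministic choice made for this sketch).
        best = min(pairs, key=lambda p: (-pairs[p], p))
        merges.append(best)
        vocab.append(best[0] + best[1])
        # Apply the new merge to every word before the next iteration.
        updated = Counter()
        for symbols, freq in words.items():
            updated[merge_pair(symbols, best)] += freq
        words = updated
    return merges, vocab


def encode(word, merges):
    """Tokenize one word by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    merges, vocab = train_bpe("aaabdaaabac", 3)
    print(merges)  # [('a', 'a'), ('a', 'b'), ('aa', 'ab')]
    print(vocab)   # ['a', 'b', 'c', 'd', 'aa', 'ab', 'aaab']
    print(encode("aaabdaaabac", merges))  # ['aaab', 'd', 'aaab', 'a', 'c']
```

In this sketch, the list of merges plays the role of the merge steps, the growing `vocab` list corresponds to the vocabulary, and `encode` produces the final tokenized output by replaying the merges in the order they were learned.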

1. Input

2. Results

Vocabulary

Merge Steps

3. Final Tokenized Output