Byte Pair Encoding · Subword Tokenizer
unseel.com · Merge the most frequent pair · The tokenizer behind GPT
Initial state: 0 merges, vocabulary of 10 symbols.
Legend: symbol, top pair, merged token, merge rule.
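
The loop behind these labels (find the top pair, add the merged token to the vocabulary, record the merge rule) can be sketched in Python. This is a minimal character-level sketch under stated assumptions, not the page's own implementation: the names `get_pair_counts`, `merge_pair`, and `train_bpe` are hypothetical, the tie-breaking rule is an arbitrary choice made for determinism, and GPT-style tokenizers work on bytes with a pre-tokenization step rather than on whitespace-split characters.

```python
from collections import Counter


def get_pair_counts(words):
    """Count adjacent symbol pairs, weighted by how often each word occurs."""
    counts = Counter()
    for symbols, freq in words.items():
        for left, right in zip(symbols, symbols[1:]):
            counts[(left, right)] += freq
    return counts


def merge_pair(symbols, pair):
    """Replace each left-to-right occurrence of `pair` with one merged symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def train_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a whitespace-split corpus."""
    # Each word starts as a tuple of single characters (assumption: no bytes).
    words = Counter(tuple(word) for word in corpus.split())
    vocab = {symbol for word in words for symbol in word}
    rules = []
    for _ in range(num_merges):
        counts = get_pair_counts(words)
        if not counts:
            break  # every word is a single symbol, nothing left to merge
        # Top pair: highest count; ties go to the lexicographically smallest
        # pair (an assumption, chosen only to make the result deterministic).
        top = min(counts, key=lambda p: (-counts[p], p))
        rules.append(top)            # the merge rule
        vocab.add(top[0] + top[1])   # the merged token
        new_words = Counter()
        for symbols, freq in words.items():
            new_words[merge_pair(symbols, top)] += freq
        words = new_words
    return rules, vocab


rules, vocab = train_bpe("aaabdaaabac", num_merges=3)
print(rules)
print(sorted(vocab))
```

Each merge adds exactly one token to the vocabulary, so the vocabulary size is the number of starting symbols plus the number of merges performed.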