Unicode

Unicode basics: one code point per character, round-trip conversion to and from the major preexisting character sets. But what's a character?

Some font variants have taken on independent meaning, e.g. ℜ was originally just a Fraktur R, but is now math-ese for the real part of a complex number (and, in some texts, the reals)
Some otherwise equivalent characters appear separately in preexisting character sets, e.g. mainland Chinese sets that contain both traditional and simplified characters (simplified 坛 vs. traditional 壇, 罈)
Some characters look the same, but aren't, e.g. Latin "o" vs. Cyrillic "о": a security issue (look-alike spoofing of names and identifiers)
Composed vs. decomposed forms (accented letters like é, Hangul syllables): the same text can be several different code point sequences, so normalize before search and comparison
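
A minimal Python sketch of the points above, using only the standard-library unicodedata module. The helper name same_text and the sample strings are illustrative assumptions, not part of the original notes; NFC is one reasonable choice of normalization form for comparison, not the only one.

```python
import unicodedata

# Look-alikes: Latin "o" and Cyrillic "о" are distinct code points.
latin_o = "o"            # U+006F
cyrillic_o = "\u043e"    # U+043E
print(unicodedata.name(latin_o))     # LATIN SMALL LETTER O
print(unicodedata.name(cyrillic_o))  # CYRILLIC SMALL LETTER O
print(latin_o == cyrillic_o)         # False

# Composed vs. decomposed: two encodings of the same accented letter.
composed = "\u00e9"      # é as a single code point
decomposed = "e\u0301"   # e followed by COMBINING ACUTE ACCENT


def same_text(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(composed == decomposed)           # False: raw code points differ
print(same_text(composed, decomposed))  # True: equal after normalization

# Hangul: one precomposed syllable decomposes into three jamo under NFD.
hangul = "\ud55c"        # 한
hangul_nfd = unicodedata.normalize("NFD", hangul)
print([f"U+{ord(c):04X}" for c in hangul_nfd])  # ['U+1112', 'U+1161', 'U+11AB']

# Font variant with its own code point: ℜ (U+211C). Compatibility
# normalization (NFKC) folds it back to a plain "R", which loses the
# distinction, so NFKC is a deliberate choice rather than a default.
fraktur_r = "\u211c"
print(unicodedata.name(fraktur_r))               # BLACK-LETTER CAPITAL R
print(unicodedata.normalize("NFKC", fraktur_r))  # R
```

Note that normalization does not help with the look-alike problem: "o" and Cyrillic "о" stay distinct under every normalization form, so spoofing checks need a separate mechanism.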