I transcode everything to Unicode internally, then transcode back to UTF-8 on output. (At first I thought I didn't need to touch any encoding that doesn't support Han characters, but & escapes (character references) may still need to be transcoded.) Figuring out the original encoding can be painful:
<META http-equiv="Content-Type" content="text/html; charset=EUC-JP">. This declaration should appear as early as possible in the HEAD element. (I currently do a regex search for this pattern on the raw bytes before transcoding.)
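
A rough sketch of that regex-then-transcode step in Python follows. It is not my actual code: the names `META_CHARSET`, `sniff_charset`, and `to_utf8`, the 4 KB search window, and the latin-1 fallback are all assumptions made for illustration. It also assumes the page is in an ASCII-compatible encoding, since otherwise the META tag could not be found in the raw bytes at all.

```python
import re

# Hypothetical pattern: finds "charset=..." inside a META tag in the raw,
# still-undecoded bytes. Assumes an ASCII-compatible encoding.
META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?\s*([A-Za-z0-9_.:-]+)',
    re.IGNORECASE,
)


def sniff_charset(raw: bytes, default: str = "latin-1") -> str:
    """Return the charset named in a META tag, or `default` if none is found."""
    # The declaration should sit early in HEAD, so only scan the first 4 KB
    # (an arbitrary limit chosen for this sketch).
    match = META_CHARSET.search(raw[:4096])
    if match is None:
        return default
    return match.group(1).decode("ascii")


def to_utf8(raw: bytes) -> bytes:
    """Decode using the sniffed charset, then re-encode as UTF-8."""
    charset = sniff_charset(raw)
    try:
        text = raw.decode(charset, errors="replace")
    except LookupError:
        # The page named an encoding Python does not know; fall back.
        text = raw.decode("latin-1")
    return text.encode("utf-8")
```

For example, a page encoded as EUC-JP that carries the META tag above would be decoded with Python's `euc_jp` codec (codec names are looked up case-insensitively) and written back out as UTF-8. Pages without a declaration, or with a wrong one, still need some other guess.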