May 18, 2004 archives
msn.co.kr gets header encoding wrong?
someone posted a question to one of the mysql mailing lists, and the archives weren’t displaying the korean characters correctly.
the encoded bit of the From header looks like =?ks_c_5601-1987?B?7JygIOywve2YuA==?=, but if i treat the content as utf-8, it is displayed correctly (at least identically to how my mail program displays it). but for the body, i need to recode the content from mscp949 (another name for ks_c_5601-1987, according to this email to ietf-charsets) to utf-8 to get it to display correctly (or at least something resembling correct).
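for anyone who wants to poke at the header themselves, here is a minimal python sketch of that check. it is only an illustration, not the code the archive actually runs: it uses the standard library's email.header.decode_header to undo the base64, then ignores the declared charset and decodes the bytes as utf-8. the variable names are made up for the example.

```python
from email.header import decode_header

# the encoded From header quoted above
raw_header = "=?ks_c_5601-1987?B?7JygIOywve2YuA==?="

# decode_header undoes the base64 but leaves the payload as bytes,
# returned alongside the charset the header claims to use
payload, declared = decode_header(raw_header)[0]

# ignore the declared charset and treat the bytes as utf-8 instead
name = payload.decode("utf-8")

print(declared)
print(name)
```

a strict utf-8 decode raises UnicodeDecodeError on bytes that aren't valid utf-8, so the fact that this runs without complaint is at least some evidence that the header really is utf-8 under a ks_c_5601-1987 label.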
so i can get the message to display correctly (i think), but only by cheating: treating ks_c_5601-1987 as utf-8 when recoding the headers, and as mscp949 when recoding the body. that’s just a little gross.
this will surely cause problems in the face of another mail client that actually uses that character set correctly. it appears to be unique to microsoft mailers, though, so perhaps they got it wrong consistently.
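one way to hedge against that, sketched in python: try a strict utf-8 decode first, and only fall back to the korean codepage when that fails. this is an assumption about what might work, not what the archive does. decode_claimed_korean is a hypothetical helper name, and "cp949" is python's codec name standing in for mscp949.

```python
def decode_claimed_korean(raw: bytes, fallback: str = "cp949") -> str:
    """hypothetical helper for text labelled ks_c_5601-1987."""
    try:
        # strict utf-8 rejects most cp949 text, so success is a strong hint
        # that the sender mislabelled utf-8 bytes
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # not valid utf-8, so take the label at its word;
        # bytes the codec can't map become replacement characters
        return raw.decode(fallback, errors="replace")


# a correctly labelled client and a mislabelling one both come out readable
print(decode_claimed_korean("한글".encode("cp949")))
print(decode_claimed_korean("한글".encode("utf-8")))
```

it is still a guess: a short cp949 string could happen to be valid utf-8 and be misread, though that seems unlikely for real korean text.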
and, of course, it is entirely possible that the results i’m displaying now are completely wrong, and it says rude things about elmo where the MSN advertisement is supposed to be. (although judging from the babelfish translation of the text, i think it is correct.)