someone put peanut butter in my шоколад

i’m a little undecided as to whether i really, really hate trying to track down problems with character encodings, or really enjoy it. there’s something about groveling through hex dumps trying to figure out which bytes are missing, incorrect, or shouldn’t be there in some EUC-JP encoded text, causing it to render funny little chinese characters instead of the correct funny little japanese characters.
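for the curious, the groveling looks roughly like this. it’s only a sketch in python using the standard codecs: `hex_dump` and `find_bad_offset` are names i made up for illustration, the sample string is just an example, and decoding as GB2312 is one assumed way of getting the wrong characters, not necessarily what any particular browser does.

```python
def hex_dump(data: bytes) -> str:
    """Return the bytes as space-separated two-digit hex values."""
    return " ".join(f"{b:02x}" for b in data)


def find_bad_offset(data: bytes, encoding: str = "euc_jp"):
    """Return the offset of the first undecodable byte, or None if all is well.

    Hypothetical helper: it leans on the `start` attribute that Python
    sets on UnicodeDecodeError.
    """
    try:
        data.decode(encoding)
    except UnicodeDecodeError as err:
        return err.start
    return None


text = "日本語"                    # sample text, chosen for illustration
data = text.encode("euc_jp")      # the bytes as they would sit in the file

print(hex_dump(data))             # what you stare at in the hex dump

# Same bytes, wrong label: decoding as a Chinese encoding (an assumed
# mislabeling) produces different characters instead of an error.
wrong = data.decode("gb2312", errors="replace")
print(wrong)

# Drop the last byte to simulate a missing byte in a two-byte sequence.
print(find_bad_offset(data[:-1]))
```

the point is that a wrong label usually fails quietly (you get different characters), while a missing byte fails loudly at a specific offset, which is at least something to go on.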

i think it is a little surprising that there are only two talks at the o’reilly open source conference that touch on internationalization and localization.

at least i’m getting some practical experience getting stuff like this to render correctly. or so i’m told. i may actually know what i’m talking about by the time i have to give the talk.

sam ruby has been writing various interesting things on this topic recently.

it’s a shame in particular that there’s no perl talk dealing with unicode issues. i’m still foggy on what magic perl does under the hood in that regard.

» Tuesday, April 27, 2004 @ 9:22pm » code » 3 comments


Comments

Well, we often use the "Jcode.pm" module (perl) to deal with them. It is downloadable from CPAN. Try out "Jcode.pm" chocolate :)

では、また！ (See you again!)

» Tetsuya Kitahata (link) » Tuesday, April 27, 2004 @ 10:58pm

I gave such a talk a few years ago at OSCon, with tutorial notes (PDF) available on-line. See also the Unicode chapter in the forthcoming 2nd edition of Advanced Perl Programming, which I promise is really really good.

» Simon (link) » Wednesday, April 28, 2004 @ 2:49pm

I had some info on this topic at ApacheCon in my XML and Internationalization talk. The topic could definitely use more exposure.

» Sander van Zoest (link) » Wednesday, April 28, 2004 @ 6:32pm