someone put peanut butter in my шоколад
i’m a little undecided as to whether i really, really hate trying to track down problems with character encodings, or really enjoy it. there’s something about groveling through hex dumps trying to figure out which bytes are missing, incorrect, or shouldn’t be there in some EUC-JP encoded text, causing it to render funny little chinese characters instead of the correct funny little japanese characters.
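for the curious, here is a minimal sketch of how that kind of mix-up can happen. the sample string and the choice of gb2312 as the "wrong" codec are my own assumptions for illustration, not the actual bug i was chasing. EUC-JP and EUC-CN both use two high-bit bytes per character, so the same bytes can decode without complaint into something else entirely.

```python
# a sketch only: the same bytes read with the right and the wrong codec.
# the sample text and the gb2312 guess are assumptions for illustration.
text = "日本語"
raw = text.encode("euc_jp")      # two bytes per character in EUC-JP

# the hex dump you end up staring at
print(raw.hex(" "))

right = raw.decode("euc_jp")     # what the author meant
# what a reader assuming EUC-CN sees; errors="replace" keeps it from raising
wrong = raw.decode("gb2312", errors="replace")

print(right)
print(wrong)
```

the point is that nothing fails loudly: the bytes are fine, and only the label attached to them is wrong.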
i think it is a little surprising that there are only two talks at the o’reilly open source conference that touch on internationalization and localization.
at least i’m getting some practical experience getting stuff like this to render correctly. or so i’m told. i may actually know what i’m talking about by the time i have to give the talk.
sam ruby has been writing various interesting things on this topic recently.
it’s a shame in particular that there’s no perl talk dealing with unicode issues. i’m still foggy on what magic it is that perl does under the hood with regard to that.
Comments
I gave such a talk a few years ago at OSCon, with tutorial notes (PDF) available on-line. See also the Unicode chapter in the forthcoming 2nd edition of Advanced Perl Programming, which I promise is really really good.
I had some info on this topic at ApacheCon in my XML and Internationalization talk. The topic could definitely use more exposure.
Well, we often use the "Jcode.pm" module (perl) to deal with them. It is downloadable from CPAN. Try out "Jcode.pm" chocolate :)
では、また! (See you later!)