blog names and

one of the current problems with is that it does not handle blogs whose names aren't encoded as iso-8859-1 very well. this would be fairly easy to fix, except for the feeds from and blogger: they aren't very consistent in what shows up in the name attribute. sometimes it is actually iso-8859-1 text, sometimes numeric entities are used correctly, and sometimes html named entities (like &acirc;) and numeric entities (like &#8364;) are double-encoded, so they arrive as &amp;acirc; and &amp;#8364;.

my preference would be for names to not be double-encoded. there's no reason to do that, as far as i can determine. (i don't think allowing html markup in blog names is necessary.)

but i assume the blogger and feeds will never get fixed, so i'm going to have to code up some heuristics to handle the names picked up from those sites.
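as a rough idea of what those heuristics could look like, here is a minimal python sketch. it is not the code the site runs. the function names (`decode_name_bytes`, `normalize_blog_name`) and the three-pass limit are made up for illustration, and it assumes a name that isn't valid utf-8 is iso-8859-1.

```python
import html


def decode_name_bytes(data: bytes) -> str:
    """Hypothetical helper: guess the encoding of a raw name.

    Assumption: try strict UTF-8 first; if that fails, treat the
    bytes as ISO-8859-1, which can decode any byte sequence.
    """
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return data.decode("iso-8859-1")


def normalize_blog_name(raw: str, max_passes: int = 3) -> str:
    """Hypothetical helper: undo single- or double-encoded entities.

    Unescapes repeatedly until the text stops changing, so both
    "&eacute;" and "&amp;eacute;" end up as the same character.
    max_passes is an arbitrary safety limit, not a known requirement.
    """
    name = raw
    for _ in range(max_passes):
        decoded = html.unescape(name)
        if decoded == name:
            break
        name = decoded
    return name


if __name__ == "__main__":
    print(normalize_blog_name("caf&amp;eacute;"))   # double-encoded named entity
    print(normalize_blog_name("&#8364;uro blog"))   # correctly encoded numeric entity
    print(decode_name_bytes(b"caf\xe9"))            # iso-8859-1 bytes
```

the catch with decoding until nothing changes is that a name that really was meant to display a literal "&amp;" gets decoded too far. since i don't think html markup belongs in blog names anyway, that seems like an acceptable trade.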
