june 22, 2004 archives

php’s dumb xml parsing behavior

steve minutillo, author of feed on feeds, runs headlong into the execrable character encoding behavior of php’s xml parsing functions. hey, i was complaining about that just last year... (via phil ringnalda.)

and a related link, this article from the w3c explains how to deal with encoding issues in forms and has a nice regex that verifies whether a string is valid utf-8.
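for a sense of what such a check looks like, here is a minimal python sketch. it is not copied from the w3c article: the byte ranges are the standard well-formed utf-8 ranges, written out from scratch, and the names `UTF8_SEQUENCE` and `is_valid_utf8` are made up for illustration. unlike a form-input check, this version accepts every ascii byte, including control characters.

```python
import re

# assumption: these are the standard well-formed utf-8 byte ranges,
# one alternative per lead-byte range; not quoted from the w3c article.
UTF8_SEQUENCE = re.compile(
    rb"""
    (?:
        [\x00-\x7F]                        # ascii
      | [\xC2-\xDF][\x80-\xBF]             # 2-byte, no overlongs
      | \xE0[\xA0-\xBF][\x80-\xBF]         # 3-byte, excluding overlongs
      | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
      | \xED[\x80-\x9F][\x80-\xBF]         # 3-byte, excluding surrogates
      | \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
      | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
      | \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
    )*
    """,
    re.VERBOSE,
)


def is_valid_utf8(data: bytes) -> bool:
    """return True if every byte of data belongs to a well-formed utf-8 sequence."""
    return UTF8_SEQUENCE.fullmatch(data) is not None


if __name__ == "__main__":
    print(is_valid_utf8("caf\u00e9".encode("utf-8")))    # utf-8 encoded text
    print(is_valid_utf8("caf\u00e9".encode("latin-1")))  # a lone 0xE9 byte
```

the handy property is that text in a legacy single-byte encoding such as latin-1 almost never passes by accident, so a failed match is a reasonable cue to fall back to another charset.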

here are some links culled from an i18n discussion on the twiki site:

Now that I've looked a bit more, there are many algorithms out there for charset detection, but most are aimed at HTML page auto-detection, and may well not work well for URLs:

i really need to write the slides for my talk at oscon, which will cover exactly this sort of thing.

perspective

it popped into my head to check something recently. the number of blogs added to blo.gs, per day, since june 15:

+------------+-----------+
| added      | new blogs |
+------------+-----------+
| 2004-06-15 |      8118 |
| 2004-06-16 |      8170 |
| 2004-06-17 |      7362 |
| 2004-06-18 |      2512 |
| 2004-06-19 |      4299 |
| 2004-06-20 |      7802 |
| 2004-06-21 |      9264 |
+------------+-----------+
