what does a good mailing list web archive look like?

i know the by-date archiving needs to use the time that the mail actually arrived at the server, or you get the sort of oddities that mailman archives get. by-month, threaded archives don't scale particularly well for busy lists. i'm not a big fan of the ezmlm-cgi-style list of threads with links to the latest post. do you want a single message at a time, or multiple posts from a thread at a time, google news-style? (with frames?) i think there's something to be said for a simple list of the most recent posts, with no threading. (but perhaps sorted with the newest at the top is better?)

i think displaying the message thread with each message is a must-have feature. i think doing some intelligent coloring of quoted passages is a really neat feature. so is intelligent handling of attachments. using jwz's threading algorithm is probably a good idea (although i would probably use a database to store the results).

obviously encoding or obscuring or omitting email addresses in message headers is a good idea, and it's probably a good thing to do in message bodies, too. being able to get the list of messages from a particular author is a nice feature. closing the loop by making it possible to reply to messages using the web interface could satisfy desires for web-based forums. setting up an nntp server is obviously a good thing to do, too.
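
to make the arrival-time point concrete, here's a rough sketch in python of sorting by the timestamp in the topmost Received header rather than the sender-supplied Date header. the function names (`arrival_time`, `sort_by_arrival`) are made up for illustration, and it assumes the archiving server's own Received header is the first one in the message, which is the usual case but not guaranteed.

```python
from datetime import timezone
from email import message_from_string
from email.utils import parsedate_to_datetime


def arrival_time(raw_message):
    """Return the timestamp of the topmost Received header, or None.

    Hypothetical helper: assumes the first Received header was added by
    the server doing the archiving.
    """
    msg = message_from_string(raw_message)
    received = msg.get_all("Received", [])
    if not received:
        return None
    # Each hop prepends its own Received header, so the first one is the
    # most recent hop. Its timestamp follows the last semicolon.
    _, sep, stamp = received[0].rpartition(";")
    if not sep:
        return None
    try:
        when = parsedate_to_datetime(stamp.strip())
    except (TypeError, ValueError):
        return None
    # A "-0000" zone yields a naive datetime; treat it as UTC so that all
    # results can be compared with each other.
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    return when


def sort_by_arrival(raw_messages):
    """Sort raw messages oldest-first by arrival time; undatable ones go last."""
    stamped = [(arrival_time(raw), raw) for raw in raw_messages]
    dated = sorted((p for p in stamped if p[0] is not None), key=lambda p: p[0])
    undated = [p for p in stamped if p[0] is None]
    return [raw for _, raw in dated + undated]
```

a message whose sender has a badly wrong clock still lands where it actually showed up on the list, which is the oddity the Date header can't protect against.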

(yes, this is obviously a work-related musing. don't get too excited, it's not a high-priority item right now.)


I'm happy to say most of those features appear in Archivist :) I still want to improve the threading though - I had a go at implementing jwz's algorithm a couple of days ago but it doesn't mesh very well with a PHP/MySQL solution (though I did use a tip from it to make a small improvement).

» Simon Willison (link) » april 5, 2003 12:27pm

We have most of the threading stuff done over at perl.org actually (with the nntp backend), but some berkeleydb @#$@# on the box where the nntp part is running right now prevents us from deploying it. Planning to move it in the next few weeks.

» Ask Bjoern Hansen (link) » april 5, 2003 2:09pm

yeah, replacing the berkeleydb stuff with mysql in colobus will probably be the first step in anything i do. i just don't know berkeleydb well enough to deal with its quirks. (and that way the web interface could end up either directly using the database, or going over nntp.)

archivist looks pretty cool, but until the source is opened up, it's only really useful for stealing ideas from.

» jim (link) » april 5, 2003 2:26pm
