March 6, 2024 archives
Introducing Frozen Soup
I made a new thing, which I decided to call Frozen Soup. It creates a single-file version of an HTML page by inlining all of the images using data: URLs, and pulling in any CSS and JavaScript files.
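To make the data: URL idea concrete, here is a minimal sketch of how a file's bytes can be turned into one. This is not Frozen Soup's actual code; the function name `to_data_url` and the choice to guess the MIME type from the filename are assumptions made for illustration.

```python
import base64
import mimetypes


def to_data_url(content: bytes, filename: str) -> str:
    """Encode raw bytes as a base64 data: URL (hypothetical helper)."""
    # Guess the MIME type from the file extension, with a generic fallback.
    mime, _ = mimetypes.guess_type(filename)
    mime = mime or "application/octet-stream"
    # base64 output is plain ASCII, so it can be embedded directly in HTML.
    encoded = base64.b64encode(content).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

An `<img src="photo.png">` could then be rewritten to carry the result of `to_data_url(image_bytes, "photo.png")` as its `src`, so the page no longer needs the separate image file.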
It is loosely inspired by SingleFile, which is a browser extension that does a similar thing. There are also tools built on top of it that let you automate it, but then you’re spinning up a headless browser, and it all felt very heavyweight. The venerable wget will also pull down a page and its prerequisites and rewrite the URLs to be relative, but I don’t think it has a comparable single-file output.
This may also exist in other incarnations; this is mostly an excuse for me to practice with Python. As such, it is a very crude first draft right now, but I hope to keep tinkering with it for at least a little while longer.
I have also been contributing some changes and test cases to ArchiveBox; this is a different project, though a little related.
Release early, release often
One of the benefits of starting Frozen Soup from a project template is that someone very smart (Simon) has done all the heavy lifting to make publishing it into the Python ecosystem really easy to do. So after I added a new feature today (inlining external url(...) references in CSS as data: URLs), I went ahead and registered the project on PyPI, tagged the release on GitHub, and let the GitHub Actions that were part of the project template do the work of publishing the release. It worked on the first try, which is lovely.
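The CSS feature can be sketched roughly as follows. This is an illustration under my own assumptions, not the code that shipped: the regular expression, the name `inline_css_urls`, and the `fetch` callback (which is assumed to return the bytes and MIME type for a reference) are all hypothetical.

```python
import base64
import re
from typing import Callable, Tuple

# Matches url(...) with optional single or double quotes around the reference.
CSS_URL = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""")


def inline_css_urls(css: str, fetch: Callable[[str], Tuple[bytes, str]]) -> str:
    """Replace each url(...) reference in a stylesheet with a data: URL."""

    def replace(match: "re.Match[str]") -> str:
        # fetch is supplied by the caller and returns (content, mime_type).
        content, mime = fetch(match.group(2))
        encoded = base64.b64encode(content).decode("ascii")
        return f'url("data:{mime};base64,{encoded}")'

    return CSS_URL.sub(replace, css)
```

A real version would also need to resolve relative references against the stylesheet’s own URL (for example with `urllib.parse.urljoin`) before fetching them; the sketch leaves that to the `fetch` callback.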
I pushed more changes after I did that release, adding a way to set timeouts and fixing the first issue (which I also filed) about pre-existing data: URLs getting mangled. I also added a quick-and-dirty server version which returns the single-file HTML version of a page, and makes it a little easier to play around with the single-file version of live URLs without having to deal with saving and opening the files.
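Both of those fixes are easy to picture in a few lines. The sketch below is a guess at the general shape rather than the real implementation: `should_inline` and `fetch` are hypothetical names, and using `urllib.request` from the standard library is my assumption.

```python
import urllib.request


def should_inline(url: str) -> bool:
    """Return False for references that are already data: URLs."""
    # Anything already inlined is left untouched instead of being re-fetched.
    return not url.strip().lower().startswith("data:")


def fetch(url: str, timeout: float = 10.0) -> bytes:
    """Download a resource, giving up after `timeout` seconds."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()
```

Checking `should_inline` before calling `fetch` means a reference such as `data:image/png;base64,...` passes through unchanged, and a slow host can only stall the run for as long as the timeout allows.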
So I did a second release.