January 30, 2024 archives
Some notes on Flickr data migration
I recently decided to stop renewing my Flickr subscription, to give myself an incentive to self-host my photos and integrate them more closely here. Before my subscription lapsed, I requested an archive of all of my Flickr data, and now I am finally getting around to working with it.
When you download your Flickr data, it includes JSON files named like photo_50626142.json and JPEG files named like young-mimes_50626142_o.jpg, and the name of the JPEG is not in the JSON data.
You can probably generate it from the name and ID, but I’m not sure what the rules are for turning the name field into the hyphenated (kebab-case) form.
Except that images without a name have JPEG files named like 17483805680_f57f81feb5_o.jpg. The ID is at the beginning, and the other bit appears to be random. (It looks like this is the same filename used for the original URL in the JSON.)
The way to go seems to be just matching on the ID embedded in the filename. (That’s what the one other tool I’ve seen that uses the export data does.)
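A minimal sketch of that ID matching, assuming the JSON filenames always follow the photo_<id>.json pattern and that the ID appears in the JPEG name as its own underscore-separated piece. The function names known_ids and photo_id_for are made up for illustration; this is not taken from any existing tool.

```python
import re
from pathlib import Path

# Metadata files look like photo_50626142.json; the digits are the photo ID.
JSON_NAME = re.compile(r"^photo_(\d+)\.json$")


def known_ids(json_names):
    """Collect photo IDs from metadata filenames, ignoring anything else."""
    ids = set()
    for name in json_names:
        match = JSON_NAME.match(name)
        if match:
            ids.add(match.group(1))
    return ids


def photo_id_for(jpeg_name, ids):
    """Return the known photo ID embedded in a JPEG filename, or None.

    Assumption: the ID is one of the underscore-separated pieces of the name,
    as in young-mimes_50626142_o.jpg or 17483805680_f57f81feb5_o.jpg.
    """
    stem = Path(jpeg_name).stem  # drop the ".jpg" extension
    for token in stem.split("_"):
        if token in ids:
            return token
    return None
```

Checking tokens against the set of IDs that actually exist avoids having to guess how Flickr turns the name field into the filename, and it handles the named and unnamed forms the same way.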
And when working through all of this, I found that I must not have downloaded one of the archive files from Flickr, because I was missing 83 JPEG files. I was able to use the JSON files to rescue them.
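One plausible way to do that kind of rescue, sketched under assumptions: the JSON and JPEG files sit in the same directory, and each JSON file stores the original-size URL under an "original" key. The key name, the flat directory layout, and the function names find_missing and rescue are all assumptions for illustration, and the details of the actual rescue are not described here.

```python
import json
import urllib.request
from pathlib import Path


def find_missing(export_dir):
    """Return (photo_id, original_url) pairs for metadata with no matching JPEG."""
    export_dir = Path(export_dir)

    # Gather every underscore-separated piece of every JPEG name;
    # the photo IDs that have an image are among them.
    tokens = set()
    for jpeg in export_dir.glob("*.jpg"):
        tokens.update(jpeg.stem.split("_"))

    missing = []
    for meta_path in sorted(export_dir.glob("photo_*.json")):
        photo_id = meta_path.stem[len("photo_"):]
        if photo_id in tokens:
            continue  # this photo already has a JPEG
        meta = json.loads(meta_path.read_text(encoding="utf-8"))
        # Assumption: the original-size URL lives under the "original" key.
        url = meta.get("original")
        if url:
            missing.append((photo_id, url))
    return missing


def rescue(export_dir, missing):
    """Download each missing original, keeping the filename from its URL."""
    for _photo_id, url in missing:
        dest = Path(export_dir) / url.rsplit("/", 1)[-1]
        urllib.request.urlretrieve(url, dest)
```

Running find_missing first and looking over the list before calling rescue keeps the network step separate from the bookkeeping.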
Now that I know that I actually have all of the data and all of the images are in a Backblaze B2 bucket fronted by Gumlet, the next step will be loading all of the relevant metadata into a database table and then wiring up some ways to browse the images here.
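As a rough sketch of what that loading step could look like, here is a version that uses SQLite from the Python standard library. The choice of SQLite, the photos table, its columns, and the JSON keys ("name", "description", "date_taken", "original") are all assumptions for illustration; the real table will depend on which metadata turns out to be relevant.

```python
import json
import sqlite3
from pathlib import Path

# Hypothetical schema: one row per photo, keyed by the Flickr photo ID.
SCHEMA = """
CREATE TABLE IF NOT EXISTS photos (
    id TEXT PRIMARY KEY,
    name TEXT,
    description TEXT,
    date_taken TEXT,
    original_url TEXT
)
"""


def load_metadata(conn, export_dir):
    """Insert one row per photo_*.json file and return how many were read."""
    conn.execute(SCHEMA)
    count = 0
    for meta_path in sorted(Path(export_dir).glob("photo_*.json")):
        meta = json.loads(meta_path.read_text(encoding="utf-8"))
        conn.execute(
            "INSERT OR REPLACE INTO photos VALUES (?, ?, ?, ?, ?)",
            (
                meta_path.stem[len("photo_"):],  # ID taken from the filename
                meta.get("name"),                # assumed key names below
                meta.get("description"),
                meta.get("date_taken"),
                meta.get("original"),
            ),
        )
        count += 1
    conn.commit()
    return count
```

Using INSERT OR REPLACE means the load can be re-run after fixing or adding files without creating duplicate rows.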