Engaging `s3cmd` to Download and Resize Images from DO Spaces
Earlier in this blog there's a
micropost that lays out the problem at hand. In a nutshell, most of the photos I saved in DigitalOcean Spaces are too big to be effective in this blog. They suck up lots of time and bandwidth every time this blog gets rebuilt, and then they are rendered at a fraction of their original size. This is exceptionally wasteful and unnecessary.
So, my task this morning is to install, configure, and engage the s3cmd tools to help get this done. There's DigitalOcean-specific guidance provided in Setting Up s3cmd 2.x with DigitalOcean Spaces, so that's where I'll begin. Here goes...
This looks to be very simple and straightforward,
brew install s3cmd should do the trick. In a substantially abridged form, that looked like this:
```
╭─[email protected] ~/GitHub/blog-eleventy-ghost ‹main›
╰─$ brew install s3cmd
Running `brew update --auto-update`...
==> Auto-updated Homebrew!
==> Updated Homebrew from 4.0.10 (931327df1) to 4.0.13 (ac0663ae5).
Updated 4 taps (dart-lang/dart, sass/sass, homebrew/core and homebrew/cask).
...
You have 55 outdated formulae and 1 outdated cask installed.
...
==> Installing s3cmd
==> Pouring s3cmd--2.3.0.ventura.bottle.1.tar.gz
🍺  /usr/local/Cellar/s3cmd/2.3.0: 837 files, 11.4MB
==> Running `brew cleanup s3cmd`...
Disable this behaviour by setting HOMEBREW_NO_INSTALL_CLEANUP.
Hide these hints with HOMEBREW_NO_ENV_HINTS (see `man brew`).
...
```
That took about 10 minutes, but should be worth it.
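A quick sanity check confirms the install landed (the guard is just defensive, in case Homebrew didn't put s3cmd on the PATH):

```shell
# Verify the install; the guard keeps this from erroring on machines
# where s3cmd isn't (yet) on the PATH.
if command -v s3cmd >/dev/null 2>&1; then
  s3cmd --version
else
  echo "s3cmd not found on PATH"
fi
```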
Next, to begin configuration I need to fetch the access key:value pair for
images-summittdweller that I created 4 years ago. I have the credentials necessary to view that information at https://cloud.digitalocean.com/account/api/spaces so that's what I've done.
Whoa, that didn't work! I tried to use my old access keys, but apparently they're in the wrong format. So, I created a new key named
key-for-s3cmd and applied that (see below) and it works.
```
╭─[email protected] ~/GitHub/blog-eleventy-ghost ‹main›
╰─$ s3cmd --configure

Enter new values or accept defaults in brackets with Enter.
Refer to user manual for detailed description of all options.

Access key and Secret key are your identifiers for Amazon S3. Leave them empty for using the env variables.
Access Key [images-summittdweller]: ***This is a secret***
Secret Key [4YPWYNXZZXFDIR6CRPHG]: ***An even bigger secret***
Default Region [US]:

Use "s3.amazonaws.com" for S3 Endpoint and not modify it to the target Amazon S3.
S3 Endpoint [nyc3.digitaloceanspaces.com]:

Use "%(bucket)s.s3.amazonaws.com" to the target Amazon S3. "%(bucket)s" and "%(location)s" vars can be used
if the target S3 system supports dns based buckets.
DNS-style bucket+hostname:port template for accessing a bucket [%(bucket)s.nyc3.digitaloceanspaces.com]:

Encryption password is used to protect your files from reading
by unauthorized persons while in transfer to S3
Encryption password [SummittDweller]:
Path to GPG program [/usr/local/bin/gpg]:

When using secure HTTPS protocol all communication with Amazon S3
servers is protected from 3rd party eavesdropping. This method is
slower than plain HTTP, and can only be proxied with Python 2.7 or newer
Use HTTPS protocol [Yes]:

On some networks all internet access must go through a HTTP proxy.
Try setting it here if you can't connect to S3 directly
HTTP Proxy server name:

New settings:
  Access Key: ***This is a secret***
  Secret Key: ***An even bigger secret***
  Default Region: US
  S3 Endpoint: nyc3.digitaloceanspaces.com
  DNS-style bucket+hostname:port template for accessing a bucket: %(bucket)s.nyc3.digitaloceanspaces.com
  Encryption password: SummittDweller
  Path to GPG program: /usr/local/bin/gpg
  Use HTTPS protocol: True
  HTTP Proxy server name:
  HTTP Proxy server port: 0

Test access with supplied credentials? [Y/n] Y
Please wait, attempting to list all buckets...
Success. Your access key and secret key worked fine :-)

Now verifying that encryption works...
Success. Encryption and decryption worked fine :-)

Save settings?  [y/N] y
Configuration saved to '/Users/mark/.s3cfg'
```
The "More Information" section at the end of the document I used above says that... "You can use our quick reference on s3cmd usage to get started." Ok, let's have a look... very nice! I'm moving immediately to the Save files to your local computer bits.
Fetching Files of the August 2020 Derecho
The photos I saved of damage from our August 2020 Derecho, in an
images-summittdweller/2020-Aug-10-Derecho bucket, are rather depressing and certainly don't need to be super high-res for the web, so let's retrieve those and practice this workflow there first.
The quick reference linked above says to use a command of the form
s3cmd get s3://spacename/path/ --recursive to get all files from a particular folder. In my case, that's...
```
╭─[email protected] ~/Downloads/2020-derecho-photos
╰─$ s3cmd get s3://images-summittdweller/2020-Aug-10-Derecho/ --recursive
WARNING: Empty object name on S3 found, ignoring.
download: 's3://images-summittdweller/2020-Aug-10-Derecho/Derecho - 1.png' -> './Derecho - 1.png'  [1 of 82]
 23486944 of 23486944   100% in    1s    21.04 MB/s  done
download: 's3://images-summittdweller/2020-Aug-10-Derecho/Derecho - 10.png' -> './Derecho - 10.png'  [2 of 82]
 20861132 of 20861132   100% in    0s    23.45 MB/s  done
...
download: 's3://images-summittdweller/2020-Aug-10-Derecho/Photo-2020-08-20-19-51.jpeg' -> './Photo-2020-08-20-19-51.jpeg'  [81 of 82]
 6277783 of 6277783   100% in    0s    16.86 MB/s  done
ERROR: Creation of file './for-the-web/' failed (Reason: Is a directory)
```
Ok, 81 images downloaded. That was sooooooooo much better than the 1000-or-so clicks it would have taken to do that manually! The only gotcha... that last error message about not downloading the
for-the-web directory, which is currently empty anyway. Before I go further, I'm going to open that DigitalOcean Spaces bucket and change the errant directory name to
web-ready; that's where my "thumbnail" copies of all these images will be uploaded shortly.
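For the record, the "81 images" count is easy to double-check from the shell, assuming you're sitting in the download directory:

```shell
# Count the files s3cmd just downloaded; -maxdepth 1 ignores
# subdirectories (like the empty for-the-web folder) and counts
# only regular files.
find . -maxdepth 1 -type f | wc -l
```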
Next Step, Resize Those Images
I found some concise and timely advice in the first web post I Googled: How to quickly resize multiple images at once on Mac. It uses exactly the tool I had in mind, the Mac-native Preview app.
Before beginning that process I wanted to see just how much space my 81 images occupy, like so:
```
╭─[email protected] ~/Downloads/2020-derecho-photos
╰─$ du -sh
2.7G    .
```
2.7G isn't all that much, but it's still much bigger than it needs to be. I'm firing up
Preview now to see if I can reduce all images proportionally to a maximum width of 800 pixels. Be back in a couple of minutes...
Drum roll please...
```
╭─[email protected] ~/Downloads/2020-derecho-photos
╰─$ du -sh
163M    .
```
Not bad at all. It only took a couple of minutes and reduced the total image size from 2.7G down to 163M.
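Incidentally, the same proportional shrink could be scripted without Preview using macOS's built-in `sips` tool. A hypothetical sketch (macOS only; I used Preview instead):

```shell
# Resize every PNG in the current directory to a maximum width of
# 800px, preserving aspect ratio. NOTE: sips edits files in place,
# so run this on copies, not your only originals.
for f in *.png; do
  [ -e "$f" ] || continue            # no-op when no PNGs are present
  sips --resampleWidth 800 "$f" >/dev/null
done
```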
Uploads Complete... Next?
In order to upload the resized images back to DigitalOcean Spaces, I simply used the nice graphical interface available to me at https://cloud.digitalocean.com/spaces/images-summittdweller?path=2020-Aug-10-Derecho%2Fweb-ready and the
Upload option there. It worked nicely, but for whatever reason two of the images failed to upload:
Derecho - 38.png and
Derecho - 39.png. After a little review and some retries, I couldn't figure out why these wouldn't upload, but that's OK because they're both largely redundant with
Derecho - 40.png, so we'll just stick with that one.
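In hindsight, the browser upload wasn't strictly necessary; s3cmd can push a whole directory back up. A dry-run sketch (the leading `echo` just prints the command; drop it to actually upload):

```shell
# Upload the resized copies with s3cmd instead of the web UI.
# The `echo` makes this a dry run; remove it to really upload.
echo s3cmd put --recursive ./web-ready/ \
  s3://images-summittdweller/2020-Aug-10-Derecho/web-ready/
```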
So, in the original folder image
Derecho - 1.png has an endpoint of
https://images.summittdweller.com/2020-Aug-10-Derecho/Derecho%20-%201.png. Makes sense. In my new
web-ready directory the reduced-size copy of the same image naturally has an endpoint of
https://images.summittdweller.com/2020-Aug-10-Derecho/web-ready/Derecho%20-%201.png. Fortunately, that's a "match" except for the addition of the
web-ready subdirectory in the path. Exactly what I hoped would happen.
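That path relationship is mechanical enough to express in shell: split the original URL at its last slash and re-join with the extra segment.

```shell
# Derive the web-ready URL from an original image URL using POSIX
# parameter expansion (no external tools needed).
orig='https://images.summittdweller.com/2020-Aug-10-Derecho/Derecho%20-%201.png'
dir="${orig%/*}"      # everything before the last slash
file="${orig##*/}"    # the filename itself
echo "${dir}/web-ready/${file}"
# → https://images.summittdweller.com/2020-Aug-10-Derecho/web-ready/Derecho%20-%201.png
```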
This means the next step is to...
Alter Content to Include the web-ready Path
I used the admin search (magnifying glass) feature of the Ghost instance behind this blog to find all references to
https://images.summittdweller.com/2020-Aug-10-Derecho and it found: NOTHING. Yikes. Turns out all of the references were in one post, and it was a draft. So I edited the draft "manually" and published it. Now let's try that search again... NOPE. So, the text inside those
figure tags evidently isn't discoverable. ☹️ Just as well since I think a different approach is needed.
Since editing the content (posts) to pick up the new
web-ready directory is a royal pain, what I'm going to do now is create a new
originals sub-directory in each bucket, download all the images, resize them locally as I did above, but replace the existing high-resolution images with the smaller copies AFTER I make a backup of the original images in a new
originals directory. That way I don't have to edit any of my posts in order to get the smaller web-ready images, but I won't lose the originals either. Beautimous.
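That plan, sketched as a dry run (each `echo` prints the step rather than running it; bucket and prefix names assumed, and the `sips` step is macOS-only):

```shell
BUCKET='s3://images-summittdweller'
PREFIX='2020-Aug-10-Derecho'

# 1. Pull down the full-size originals.
echo s3cmd get --recursive "$BUCKET/$PREFIX/"
# 2. Back them up to a new originals/ sub-directory in the bucket.
echo s3cmd put --recursive . "$BUCKET/$PREFIX/originals/"
# 3. Shrink the local copies in place.
echo sips --resampleWidth 800 ./*.png
# 4. Replace the bucket's high-res images with the smaller versions.
echo s3cmd put --recursive . "$BUCKET/$PREFIX/"
```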
Belay That Order!
While I'm confident this approach (above) will work, it may not be necessary. I looked, for example, in
images-summittdweller/Norway-Photos-2019/Bergen - Hordaland, June 19, 2019 expecting to find some large images, but there were none over 800px in width. In fact, I believe they are all under 640px wide, so nothing to do there, and that's a good thing.
That's All Folks
A detailed search of
images-summittdweller didn't turn up any more enormous images that might get pushed to the web, so... "That's All Folks", for now at least.