The ssstraight story - iameven.com

The not so straight story about moving this blog to Amazon s3.

HTTP vs S

I believe in the move towards an encrypted Internet. Not because I want to hide information, but to make sure things like user credentials never leak. Even on my static website I want to know that what I write is what you read. Any information is trivial to change on an unencrypted connection, especially at a public hot spot. So I've maintained an SSL certificate for this domain for the last 2 years.

I have a Dokku server where adding these certificates is easy, and that's where I've been hosting this site for the last 2 years. However, I do tend to experiment on this server, and would like to do so without taking out all my web sites (since I have more than 1, you know).

I had been thinking about using s3 (Simple Storage Service) for a while, and had already moved my heavy files there. The only thing holding me back was that S. Then I found out that if I use Amazon Route 53 (Amazon's DNS service) and route my traffic through Amazon CloudFront, I can host my SSL certificate there and proxy all the traffic to an s3 bucket. And it is quite cheap using SNI (which Amazon warns has poor support, but even IE8 is on the list of browsers that support it, partially, like everything else with IE8).

iameven.com => Route 53 => CloudFront => s3

So it might take 3 services instead of one, but I figured, what the hell.

Enable website hosting

There are 3 ways to configure an s3 bucket.

The top one, with website hosting disabled, is the standard and only serves files requested by their full path.

When a bucket is set to 'Enable website hosting' there are some handy settings exposed for defining an Index Document and an Error Document. The Index Document is the file s3 serves when a folder is requested. A sane default is index.html, as that is what both Apache and Nginx use. If the folder has no index.html, or the requested file does not exist, s3 falls back to the Error Document (I named mine 404.html). This lets you display some helpful information when a page does not exist, instead of the default XML error message s3 usually generates. I would go as far as saying it's essential.
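As a rough illustration of that lookup, here is a toy model in Python. It is not how s3 is implemented, just a sketch of the behaviour described above. The function name `resolve` and the example file names are made up for illustration, and the model ignores details such as requests for a folder without a trailing slash.

```python
def resolve(keys, path, index_document="index.html", error_document="404.html"):
    """Toy model of a website-hosting lookup; returns (status, key served)."""
    key = path.lstrip("/")
    # A folder-style request (empty or ending in "/") maps to the Index Document.
    if key == "" or key.endswith("/"):
        key += index_document
    if key in keys:
        return 200, key
    # Anything that is missing falls back to the Error Document.
    return 404, error_document


# Hypothetical bucket contents.
bucket = {"index.html", "404.html", "posts/index.html", "style.css"}

print(resolve(bucket, "/"))           # (200, 'index.html')
print(resolve(bucket, "/posts/"))     # (200, 'posts/index.html')
print(resolve(bucket, "/nope.html"))  # (404, '404.html')
```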

Redirect is handy for setting up a redirect from www to the root domain, or the reverse.
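For anyone who prefers scripts over clicking radio buttons, the hosting and redirect settings can also be written down as data. The sketch below only builds the dictionaries, in the shape that boto3's `put_bucket_website` takes as its `WebsiteConfiguration` argument. The helper names and the bucket and host names are assumptions for illustration, not something taken from my actual setup.

```python
def hosting_config(index_document="index.html", error_document="404.html"):
    """Settings for the 'Enable website hosting' option."""
    return {
        "IndexDocument": {"Suffix": index_document},
        "ErrorDocument": {"Key": error_document},
    }


def redirect_config(host_name, protocol="https"):
    """Settings for the redirect option, e.g. www to the root domain."""
    return {"RedirectAllRequestsTo": {"HostName": host_name, "Protocol": protocol}}


site = hosting_config()
www = redirect_config("iameven.com")
print(site)
print(www)

# With boto3 installed and credentials configured, these could be applied
# roughly like this (bucket names are hypothetical):
#   import boto3
#   s3 = boto3.client("s3")
#   s3.put_bucket_website(Bucket="iameven.com", WebsiteConfiguration=site)
#   s3.put_bucket_website(Bucket="www.iameven.com", WebsiteConfiguration=www)
```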

There's a bucket address and a bucket address

bucket.s3-website-region.amazonaws.com
bucket.s3.amazonaws.com

One works and the other doesn't, for static website hosting through CloudFront, that is. CloudFront very helpfully offers the one that doesn't work as a selectable option. It works if you only use it as a CDN, but if you want it to locate that index.html or 404.html, it times out.
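To make the difference between the two addresses concrete, here is a small sketch that builds both from a bucket name, following the two patterns listed above. The function names and the region `eu-west-1` are assumptions for illustration. Some regions separate the region from `s3-website` with a dot instead of a dash, so check the address the s3 console shows for your own bucket.

```python
def website_endpoint(bucket, region):
    """Address that understands Index and Error Documents."""
    return f"{bucket}.s3-website-{region}.amazonaws.com"


def rest_endpoint(bucket):
    """Plain bucket address; only full paths to files work here."""
    return f"{bucket}.s3.amazonaws.com"


def is_website_endpoint(host):
    """Rough check for which of the two kinds of address a host is."""
    return ".s3-website" in host


print(website_endpoint("iameven.com", "eu-west-1"))
# iameven.com.s3-website-eu-west-1.amazonaws.com
print(rest_endpoint("iameven.com"))
# iameven.com.s3.amazonaws.com
```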

We have to use the top one, the s3-website address, to get the behavior we want. The problem I had with that...

s3 is HTTP only

s3 uses HTTPS all the time except when it doesn't. This really surprised me, as I've accessed tons of files through an amazonaws.com domain on HTTPS. To get website hosting, you have to sacrifice that S. I did not know that, and Amazon did not mention it anywhere, except the occasional googleable forum post. A good place to mention this would be beneath the radio button that expands when you select it, before you save.

Is it really a problem? No. Is this limitation weird? I think so. It did cost me a bunch of time to find this information after it didn't work as expected.

The problem I had was that I configured CloudFront to use the same protocol as the client was using:

http://iameven.com => CloudFront => http://iameven.com.s3-website.amazonaws.com
https://iameven.com => CloudFront => https://iameven.com.s3-website.amazonaws.com

The top one works, I just didn't test that.

15 minutes is a lot of hours

Every config change on CloudFront takes around 15 minutes to propagate. Which I guess is fair, you know what they say, there are only 2 hard problems in computer science:

  1. Cache invalidation (what CloudFront has to do whenever a setting is changed)
  2. Naming things
  3. Off-by-one errors

The documentation on AWS is hard to read, and for everything you want to do there are 3 different pages with 7 different answers depending on your situation. Trying one thing, waiting 15 minutes while reading more documentation, only to discover that it failed is quite demotivating.

By pure chance I discovered one of those forum posts while googling around.

http://iameven.com => CloudFront => http://iameven.com.s3-website.amazonaws.com
https://iameven.com => CloudFront => https://iameven.com.s3-website.amazonaws.com
-----------------------------------------^
# Removing that S
https://iameven.com => CloudFront => http://iameven.com.s3-website.amazonaws.com

As you can see, all that trouble for a single S. Remove that S, and it suddenly just works.
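In CloudFront's configuration this choice is the origin protocol policy of a custom origin, which can be `http-only`, `https-only` or `match-viewer`. The sketch below is a toy model of what each policy makes CloudFront request from the origin. The function names are made up, and `origin_answers` simply encodes the assumption from this post that the s3 website address only answers plain HTTP.

```python
def origin_url(viewer_scheme, origin_host, policy):
    """URL CloudFront would request from the origin under a given policy."""
    if policy == "http-only":
        scheme = "http"
    elif policy == "https-only":
        scheme = "https"
    elif policy == "match-viewer":
        # Same protocol as the visitor used; this was my original mistake.
        scheme = viewer_scheme
    else:
        raise ValueError(f"unknown policy: {policy}")
    return f"{scheme}://{origin_host}/"


def origin_answers(url):
    """Assumption from this post: the website address is HTTP only."""
    return url.startswith("http://")


origin = "iameven.com.s3-website.amazonaws.com"
for policy in ("match-viewer", "http-only"):
    url = origin_url("https", origin, policy)
    print(policy, url, origin_answers(url))
# match-viewer https://iameven.com.s3-website.amazonaws.com/ False
# http-only http://iameven.com.s3-website.amazonaws.com/ True
```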

Doesn't that make it unsafe?

Looking back at me selling that S so hard, only to remove it later, makes me look like a hypocrite. However, the visiting browser only needs to talk to CloudFront. CloudFront does an insecure GET when there is no fresh cache and serves that file back to the visitor. Which means I put all my trust in Amazon to get the files I tell it to get without modifying them. Hopefully they are a reliable company.

Route 53

Route 53 is quite uneventful to set up; there, the options presented actually work. My struggle was mainly getting CloudFront to talk to s3 without also exposing my content on the bucket URL. As I'm still not sure how to do that correctly, I won't write about it.

Certificates

Amazon expected the key certificate to be in the RSA x509 format. Which is fine, but the error message I got was something along the lines of "this is not a valid chain". I think there are areas of usability Amazon could improve. Also, they should be stored in CloudFront, not an IAM user as the documentation tells you.

Have you seen The Straight Story?

While you're still here... The Straight Story is something as awesomely weird as a David Lynch movie produced by Disney. Which means the language is a bit toned down compared to his other movies. It is about an old man who wants to visit his dying brother and sees the lawnmower as his only option to get from Iowa to Wisconsin. It's slow going, with several funny and sad encounters. I highly recommend watching it.

I started thinking about that movie as a metaphor for things not always going as you plan. And, for some reason, I started comparing it to my not very drastic (but somewhat frustrating) adventure of configuring web servers.
