A Better Solution! A couple of years after this blog post was written, Amazon added features to S3/Route 53 to allow static sites with or without the “www” name, so you don’t need to follow the below rather complicated routine. See their blog post on the matter for full details and a walkthrough of the process in the AWS Console.
Warning: this post has a very high geek factor. I wouldn’t read this one unless you’re familiar with both Amazon S3, and with DNS, and interested in setting up a Default Root Object in CloudFront (which lets you serve a website from S3 without needing a filename on the end of the URL.)
S3 now (since February 2011) allows you to configure a default document for your S3 folders, which means you don’t need the CloudFront workaround for default root objects described as part of this blog entry. See Amazon’s blog entry for the announcement. I figured I’d leave this entry up, as it still shows you how to host a website entirely in S3 and how to use Amazon’s CloudFront CDN to serve it up efficiently across the planet, so it’s mostly still useful information. Thanks to Eric Pecoraro for pointing this out in the comments!
Last year I’d idly wondered about hosting an entire website on Amazon S3. It’s pretty cheap, and you only pay for what you use. There’s no fixed monthly fee, unlike most hosting plans. It can serve files directly to the web quite easily, and you can point your own domains at it.
Turned out the main fly in the ointment was that S3 doesn’t have the concept of a default file to serve. That is, if I were to host www.example.com on S3, and someone visited it using “http://www.example.com”, the S3 server – unlike normal web servers – doesn’t look for an “index.html”, or whatever. Instead, S3 serves up either a listing of the underlying S3 bucket, in XML format, or a 403 Forbidden if your bucket isn’t world-readable. You can see what it looks like in your browser by going to my basic S3 bucket address here.
If you want to send people to your S3-based content, therefore, you’ll either need your own traditional web server to wrap your S3 content, or always provide them with a filename on the end of the URL you give them, e.g. http://www.example.com/index.html. This isn’t always ideal.
Enter the recent CloudFront announcement:
“Amazon CloudFront, the easy to use content delivery network, now supports the ability to assign a default root object to your HTTP or HTTPS distribution. This default object will be served when Amazon CloudFront receives a request for the root of your distribution…”
CloudFront isn’t S3, but it is Amazon’s own content delivery system, and it delivers the content straight from S3’s buckets. So, you can now use CloudFront to solve the default file problem – put your index.html file into your S3 bucket, and use CloudFront to serve it as the default file when someone visits your site.
In this article, I’ll go through how I created a simple website in an S3 bucket, then started serving it out under my own domain name.
Into The Bucket
First, I put my website in a bucket. I fired up Transmit and created a (North American area) bucket called “www.onthescales.com”. Then I threw my website’s files in it, including the main page, “index.html”. I made the bucket and all the files “world readable” in S3. (Any S3 client, for example the S3Fox Firefox extension, should let you do this.)
Because of the way S3 interacts with web requests, you can see the website if you point your browser at http://www.onthescales.com.s3.amazonaws.com/index.html. The S3 servers take everything before the “s3.amazonaws.com” as a bucket name, then find the named file on the path provided, in this case “index.html”.
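That mapping is simple enough to sketch. Here's a tiny, hypothetical helper (the function name is mine, not anything Amazon provides) showing how the virtual-hosted-style URL is assembled from bucket and key:

```python
# Illustration only: S3 treats everything before ".s3.amazonaws.com" in the
# hostname as the bucket name, and the URL path as the key within it.
def s3_url(bucket, key):
    return "http://%s.s3.amazonaws.com/%s" % (bucket, key)

print(s3_url("www.onthescales.com", "index.html"))
```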
To make things friendlier, I took the domain name that I owned, “www.onthescales.com” and created a DNS CNAME record for it that pointed it at “s3.amazonaws.com”. Now, http://www.onthescales.com/index.html worked – I could see the website there.
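In zone-file terms, that record looks something like the following (the exact interface varies by DNS provider; many just give you a "CNAME" box to type the target into):

```
; Hypothetical zone-file entry: point the www name at S3's web endpoint
www.onthescales.com.    IN    CNAME    s3.amazonaws.com.
```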
Unfortunately, this is as far as you can get using S3 alone. If you were to go to http://www.onthescales.com with this setup, all you’d see would be that list of the files in the bucket. So, if I wanted to tell everyone about my website, I’d still have to make sure they included “index.html” on the end of the URL: http://www.onthescales.com/index.html. Not very friendly.
CloudFront to the Rescue
Enter CloudFront. First off, I needed to start serving my files using CloudFront pointing at my existing S3 bucket.
I followed the “Migrating from Amazon S3 to CloudFront” instructions in the Amazon CloudFront Developer Guide:
First I signed up for CloudFront. If you already have an S3 account, it’s just a couple of clicks, and a few seconds waiting for the “you’re signed up” email.
Then, I signed in to the AWS management console, and headed for the CloudFront section. As I’d not done anything with CloudFront before, I only had one option: Create Distribution. A Distribution is the basic object in CloudFront. A Distribution is configured to serve up an S3 bucket in a particular way.
The Create Distribution dialog prompted me for the S3 bucket I wanted to serve through CloudFront. I chose www.onthescales.com, and also told it the CNAME I wanted to use (the same value in this case: www.onthescales.com). I hit “Create”.
When you make changes in CloudFront, they take a while to filter through. The status of your Distribution in the Console will be “InProgress” while stuff is still going on. My Distribution held at “InProgress” for five or ten minutes before changing to “Deployed”. (Just hit the “Refresh” button every now and again to check.)
Each Distribution is assigned a DNS name. I could see from the console that my new distribution was “d3suf24s2p7o32.cloudfront.net”. Were my S3 objects being served there? I went to http://d3suf24s2p7o32.cloudfront.net/index.html, and lo and behold! There was my website.
So, the next stage was to re-point my domain name at the CloudFront distribution. I fired up my domain provider’s DNS panel and changed the CNAME record for www.onthescales.com to point it at d3suf24s2p7o32.cloudfront.net instead of s3.amazonaws.com.
Happily, I use OpenDNS’s DNS servers, so rather than waiting for the change in the CNAME record to trickle through I just went and prodded their CacheCheck service, and could see that “www.onthescales.com” was now pointing at CloudFront.
Now, while it’s impressive that my tiny website is now being delivered by an international CDN, at this point nothing’s actually changed for the end-user. You’d still have to go to http://www.onthescales.com/index.html to see the site.
Default Root Object
That’s where CloudFront’s Default Root Object comes in. All I needed to do was set my Distribution’s Default Root Object to “index.html” and I’d be done.
Unfortunately, it’s surprisingly difficult to set the Default Root Object. The “standard” way is to send an authenticated POST request to update your distribution’s configuration, as described in the Developer Guide, under “Creating a Default Root Object” section under “Working with Objects”.
But that’s just too damn complicated, and actually requires some programming. Was there an easier way?
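To give a flavour of what the Guide is asking for, here's a rough Python sketch of the two pieces involved: the Authorization header (the 2010-era CloudFront REST API signed requests with a base64-encoded HMAC-SHA1 of the Date header) and the distribution config XML with a DefaultRootObject element added. The keys, CallerReference, and the API version string in the namespace are all dummy/illustrative values, and nothing here actually sends a request — the real flow also involves fetching the current config, preserving its ETag, and sending it back with an If-Match header.

```python
# Sketch only: how the CloudFront REST API authenticates a config update.
import base64, hmac, hashlib

def cloudfront_auth_header(access_key, secret_key, date_header):
    """Authorization header value: 'AWS <AccessKeyId>:<base64 HMAC-SHA1 of Date>'."""
    sig = base64.b64encode(
        hmac.new(secret_key.encode(), date_header.encode(), hashlib.sha1).digest()
    ).decode()
    return "AWS %s:%s" % (access_key, sig)

# The config you'd PUT back to the distribution, now with DefaultRootObject.
# (Element layout and version string are illustrative of the early API.)
config_xml = """<DistributionConfig xmlns="http://cloudfront.amazonaws.com/doc/2010-08-01/">
  <Origin>www.onthescales.com.s3.amazonaws.com</Origin>
  <CallerReference>example-ref</CallerReference>
  <CNAME>www.onthescales.com</CNAME>
  <Enabled>true</Enabled>
  <DefaultRootObject>index.html</DefaultRootObject>
</DistributionConfig>"""

# Dummy credentials -- just to show the header shape.
print(cloudfront_auth_header("AKIDEXAMPLE", "dummysecret",
                             "Thu, 14 Aug 2010 17:08:48 GMT"))
```

You can see why I wanted a tool to do this for me.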
The AWS Management Console doesn’t do the job, sadly. And Panic’s Transmit, my traditional Mac FTP and S3 client, doesn’t seem to have much in the way of CloudFront options.
Luckily, Transmit’s free, open source competitor Cyberduck just seems to be getting better and better, and gives you a nice simple interface to this slightly esoteric bit of CloudFront. (Anyone know if there’s an option for Windows or Linux users? Cyberduck seems to be in private beta for Windows at the moment, at least, so maybe there’s some hope there…)
EDIT: Andy from CloudBerry Lab says that CloudBerry Explorer Freeware has this capability under Windows, so that’s worth a look.
In Cyberduck, I connected to my S3 account, then right-clicked on my “www.onthescales.com” bucket and chose “Info”. Then I went to the “Distribution (CDN)” tab and chose “index.html” from the drop-down list that appeared under “Default Root File”. Voilà! Vastly easier than fiddling with authenticated POST requests and XML files yourself. Thank you, Cyberduck! Worth a donation, I reckon.
However, http://www.onthescales.com/ still just gave me a raw text listing of the bucket contents. I headed into my AWS console to see what I could see. Aha! The “State” was back to “InProgress” rather than “Deployed”.
I went to have a cup of tea. When I came back, the State of my distribution was “Deployed” again. And going to http://www.onthescales.com/ now showed me my website. Mission accomplished!
The Unfortunately Hacky Icing on the Cake
The final icing on the cake was to get the non-www version – http://onthescales.com – working. Unfortunately, I couldn’t get this to work using S3 and CloudFront alone.
As discussed in this ServerFault answer and the Amazon discussion thread it points to, it’s hard to map the non-WWW version of a domain name to CloudFront, because you can’t CNAME a root name (such as “onthescales.com”) in DNS. This is one of the more irritating limitations of DNS.
Also, you can’t do the normal workaround of using an A record pointing at the destination server’s IP address, because the whole point of the internationally-distributed, load-balancing CloudFront system is that your stuff is on multiple servers, allocated dynamically depending on your visitor’s whereabouts, so pointing your records at a single IP address would be rather dumb.
So, for the non-WWW address, there’s no clean AWS-only solution. The best I could do was to rent some cheap traditional hosting space, and point onthescales.com at that server’s IP address using an A record rather than a CNAME. Then just make that a 301 redirect to www.onthescales.com (I used a .htaccess file.)
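The redirect itself only needs a line of Apache config. Something like this hypothetical setup is all it takes — the bare domain gets an A record pointing at whatever IP your cheap host gives you, and the host serves a one-line .htaccess:

```
# Hypothetical .htaccess on the cheap host. DNS has an A record for the
# bare name (onthescales.com. IN A <that server's IP>); this sends every
# request on to the www name, which CNAMEs to CloudFront.
Redirect permanent / http://www.onthescales.com/
```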
It’s a bit nasty, but as far as I can tell, it’s the only way to ensure that the non-WWW version of a site will wind up pointing at CloudFront. At least it should be a fairly minimal expense.
And that’s the final solution that I used. If you go to http://www.onthescales.com, only CloudFront and S3 are involved. You’ll be sent to a CloudFront server that will serve local-to-you cached copies of the files in my www.onthescales.com S3 bucket. Hackily, though, if you go to http://onthescales.com, you’ll hit some cheap GoDaddy hosting that’ll 301-redirect you to http://www.onthescales.com, where CloudFront will take over.
Even with the slightly hacky ending, I think this was still an interesting experiment. If you have a pretty static website, with no server code, especially one with lots of big content – media files, say – then you can definitely now use S3 and CloudFront to do all the hosting and delivery for you, and not bother with setting up your own web server, whether it’s traditional hosting or an Amazon EC2 instance. You can serve directly from your S3 bucket, and it should be very fast, very reliable, and relatively cheap, depending on how much you’re serving.
- There are fourteen pounds to a stone, and I’m rubbish at mental arithmetic, especially when I have to work out remainders when converting back to the British weirdness. ↩