Sandstorm static hosting proxying and Not Found handling with nginx

From 2020-05-09

I host all my sites on Sandstorm using Hugo and hugo-sandstorm. This is super convenient for a couple of reasons:

  • I don't have to worry about deploys for new or existing sites; a git push takes care of it.
  • I don't need to architect the git push setup; the app does that for me.
  • I can easily run the blog locally for testing.
  • Because of sandcats, each site magically gets a subdomain I can point at, either with a CNAME or a proxy like HAProxy or nginx.

However, there are limitations:

  • No 404 pages.
  • No easy way to use your own HTTPS certificates, like ones from Let's Encrypt.
  • No way to clamp down on subdomains so www.johnbintz.com redirects properly.

I was solving the latter two for a while with HAProxy. But after setting up the site for The Industrious Rabbit, I realized I wanted to shift pages and sections around as the dust settles on this hit new comic, and I wanted 404 pages so folks could still find what they were looking for.

Here's the nginx config I eventually came up with. You'll still have to add the site's public ID as a TXT record on your domain, as indicated in your static config setup, and make sure the Host header is sent along correctly so Sandstorm can do the DNS lookup:

snippet.nginx
# redirect http to https without wildcards
server {
  listen 80;

  # regex match for any subdomain; dots are escaped so they only match a literal "."
  server_name ~^.*\.johnbintz\.com$;

  return 301 https://johnbintz.com$request_uri;
}
 
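# serve subdomainless from sandstorm
server {
  listen 443 ssl;

  # let's encrypt certificates go here

  # exact name match, so no regex is needed
  server_name johnbintz.com;

  location / {
    # Sandstorm uses the Host header to look up the site's TXT record
    proxy_set_header Host johnbintz.com;
    proxy_pass https://the-sandcats-url-sandstorm-static-publishing-gives-you;
    # let nginx handle error responses from Sandstorm instead of passing them through
    proxy_intercept_errors on;
    # serve the /404/ page for missing pages while keeping the 404 status code
    error_page 404 /404/;
  }
}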

Then, if you're using Hugo, create content/404.md with the contents of your 404 page. When a page is missing, the visitor gets the content of this 404 page along with an HTTP 404 Not Found status, so search engines will do the right thing. It's also way better than the blank Cannot GET /this-page-does-not-exist page you normally get with Sandstorm static publishing.
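
One thing to note about the config above: it only issues redirects on port 80, so a request for a subdomain that arrives over HTTPS is not explicitly redirected. A minimal sketch of an extra server block that might cover that case follows. It is an illustration, not part of the original setup, and it assumes you have a certificate that also covers the subdomains being redirected (for example, one that lists www.johnbintz.com):

```nginx
# hypothetical addition: redirect https requests for any subdomain to the bare domain
server {
  listen 443 ssl;

  # assumption: certificates covering the subdomains (e.g. www.johnbintz.com) go here

  # regex match for anything ending in .johnbintz.com, but not the bare domain itself
  server_name ~^.+\.johnbintz\.com$;

  return 301 https://johnbintz.com$request_uri;
}
```

Because the bare domain is matched by exact name in its own server block, nginx should still pick that block for johnbintz.com and only use this one for subdomains.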