When you visit a web page, you’ll often see the URL change as it loads. For example, if you attempt to visit http://mythic-beasts.com you’ll end up at https://www.mythic-beasts.com . This is achieved using HTTP redirects, a response from a server that tells your browser that the page it is trying to load has moved.
HTTP redirects come in two flavours:
- Permanent (301)
- This tells the client that the page requested has moved permanently, and crucially, if it wants to load the page again, it needn’t bother checking the old URL to see if the situation has changed. This is a good way of redirecting something that you never want to undo, for example, if you’re permanently moving a website from one domain to another.
- Temporary (302)
- As the name suggests, this tells the client that the page has moved, but only temporarily, so the client should continue requesting the old URL if it wants to load the page again. This is a good way of telling users that your site is down for maintenance, that they they don’t have enough credit to access a site, or of some other issue that is likely to change.
Getting this wrong can be a massive pain for your users. For example, Three use a permanent redirect if you’ve run out of credit on your data plan, or you’re trying to use tethering in the wrong country, or some other temporary problem.
So imagine what happens when you run out of data on your plan. You attempt to visit your favourite website, say, http://www.xkcd.com . Three tell you that that page has been replaced by http://tethering.three.co.uk/TetherNoProductPost. Permanently.
Now find a working internet connection, attempt to load http://www.xkcd.com, and find that your browser quite reasonably takes you straight to the Three fail page, even if you’re no longer using a Three connection. Shift+Reload doesn’t help, even restarting your browser may not help.
Three have told your browser that every page you visited whilst out of credit has moved permanently to their fail page.
Expiring permanent redirects
The example given above is very obviously a place where a temporary 302 redirect should be used, but webmasters are often encouraged to prefer 301s in the name of improving search rankings. 301 redirects allow you to tell search engines that your .co.uk site really is the same site as your .com site, thus accumulating all your google juice in the right place. They also save a small amount of time in loading the page by avoiding an unnecessary HTTP request.
Even when used legitimately, 301 redirects are obviously hazardous, as there’s no way to undo a permanent redirect once it’s been cached by a client.
The safe way to do a 301 redirect is to specify that it will expire, even if you don’t expect to ever change it. This can be done using the Cache-Control header. For example, the redirect that we issue for http://mythic-beasts.com includes the following header:
This tells clients that they can remember the redirect for at most one hour, allowing us to change it relatively easily at some point in the future. We use the mod_expires Apache module to create this header, which also produces an equivalent “Expires” header (the old HTTP 1.0 equivalent of Cache-Control).
The above can be implemented using a .htaccess file as follows:
ExpiresDefault "access plus 1 hour"
Redirect 301 / https://www.mythic-beasts.com/
Redirects are often implemented using Apache’s mod_rewrite. Unfortunately, mod_expires doesn’t apply headers to RewriteRules, but mod_header can be used instead:
RewriteRule ^.* http://www.mythic-beasts.com/ [L,R=301,E=limitcache:1]
Header always set Cache-Control "max-age=3600" env=limitcache
The RewriteRule is used to sent an environment variable which is used to conditionally add a Cache-Control header. Thanks to Mark Kolich’s blog for the inspiration.
Escaping 301 hell
Fortunately, if you’re unlucky enough to get caught by a broken 301 redirect, such as the one issued by Three, there is an easy way to get to the page you actually wanted: simply append a query string to the end of the URL. For example, http://www.xkcd.com/?foo=bar. Browsers won’t assume that the cached redirect is valid for this new URL and websites will almost always ignore unexpected query parameters.
2015-07-03 – Updated to add mod_rewrite example