Seamless Localization with Static Files on Apache 2.0.47+

The Benefits

I belong to the school of thought that holds that speakers of languages other than English shouldn't be given a second-rate website experience when the translations and the technology for a first-rate one already exist. The setup described here lets you provide a localized experience that is as seamless as possible, even if only part of the website is translated. It was used successfully with English, French, and Japanese on an early version of the Mozilla Developer Center before the site switched to a wiki.

There are numerous advantages to this scheme:

All in all, it's better than the localization support in any content management system I've encountered, which is a really sad state of affairs in the CMS world because it's so easy with static files. :)

What this scheme doesn't do:

The Method

Basic Language Negotiation with Fallbacks

You will need to set the following options:

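# Label text/html and text/plain responses as UTF-8.
AddDefaultCharset UTF-8
# Language to serve when the visitor's preferences don't decide.
LanguagePriority en
# Prefer: break ties with LanguagePriority. Fallback: use it instead of a 406 error.
ForceLanguagePriority Prefer Fallback
# Let a request for "foo" match foo.en.html, foo.ja.html, and so on.
Options +MultiViews
# Resolve directory requests to "index", negotiated like any other file.
DirectoryIndex index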

You may also need to set some AddLanguage directives, although Apache ships with the common ones preset.
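As a rough sketch, assuming the three languages mentioned earlier (English, French, and Japanese) and that each language is marked by its two-letter code as a file extension, the directives might look like this. Which ones you actually need depends on what your Apache distribution already defines.

```apache
# Map each file extension to the language it marks.
# The choice of languages here is only an illustration.
AddLanguage en .en
AddLanguage fr .fr
AddLanguage ja .ja
```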

When creating your files you will also need to observe the following rules:

These settings direct the server to pick the most appropriate language available, based on the visitor's Accept-Language header. It will fall back to English if no version in any of the visitor's preferred languages is found. (You can change the fallback language, or even specify multiple ones, by changing the value of the LanguagePriority directive listed above.)

The first three rules are not optional. Only break the fourth if you know what you're doing and have adjusted the server settings accordingly.
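For instance, to specify more than one fallback language, list them in order of preference. The following is a minimal sketch that assumes you want English first and French second; the choice of languages is purely illustrative.

```apache
# When nothing in the visitor's Accept-Language list is available,
# try English first, then French.
LanguagePriority en fr
ForceLanguagePriority Prefer Fallback
```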

Language Overrides with Fallbacks

Sometimes a visitor will want to view articles in a language different from their browser preferences. Maybe they're borrowing a computer, or working on a translation. Apache (2.0.47 and later) reads an environment variable, prefer-language, that can capture this preference, and with mod_rewrite you can set it based on a string in the request URI.

Create a top-level directory named after the language code for each language you plan to support and place in each a copy of the following .htaccess file, with LANGCODE changed to match the language code of that directory:

RewriteEngine On
RewriteRule   ^(.*)    ../$1 [E=prefer-language:LANGCODE]

For example, if I'm providing Japanese, I would create a top-level directory named ja and place in it the following .htaccess file:

RewriteEngine On
RewriteRule   ^(.*)    ../$1 [E=prefer-language:ja]

This will make a request to http://www.example.com/ja/foo/ internally fetch http://www.example.com/foo/index.ja.html, overriding the visitor's Accept-Language preferences. Unlike with a mechanically split site, if a page under the /ja/ virtual directory does not exist in Japanese, the server falls back to language negotiation. But (and here's where the relative links requirement comes in) all the links still point to the Japanese versions of other pages on the site. So when the user clicks a link on an English fallback page returned through the Japanese section, the browser still requests the Japanese version of the linked page.

So again, never link with the file extension and always use relative links. If you're targeting a specific language, use the ../../ja/foo/bar form, not the /foo/bar.ja.html form.
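One practical note on the per-language .htaccess files: mod_rewrite directives only take effect in .htaccess when the main server configuration allows them. The following is a sketch of what that might look like, assuming a document root of /var/www/example (a hypothetical path; substitute your own). Your host may already have equivalent settings in place.

```apache
# Hypothetical document root; adjust to your own setup.
<Directory "/var/www/example">
    # Allow .htaccess files to use RewriteEngine and RewriteRule.
    AllowOverride FileInfo
    # Per-directory rewriting requires FollowSymLinks;
    # MultiViews enables the language negotiation itself.
    Options +FollowSymLinks +MultiViews
</Directory>
```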

Tips for an International Website