I belong to the school of thought that speakers of languages other than English shouldn't be given a second-rate website experience when the translations and the technology for a first-rate one already exist. The setup I'm describing here lets you provide a localized experience that is as seamless as possible even if only part of the website is translated. It was used successfully with English, French, and Japanese on an early version of the Mozilla Developer Center before they switched to a wiki.
There are numerous advantages to this scheme:
Accept-Language headers.
All in all, it's better than the localization support in any content management system I've encountered. Which is a really sad state of affairs in the CMS world because it's so easy with static files. :)
What this scheme doesn't do:
You will need to set the following options:
    AddDefaultCharset UTF-8
    LanguagePriority en
    ForceLanguagePriority Prefer Fallback
    Options +MultiViews
    DirectoryIndex index
Apache ships with AddLanguage mappings for the common languages pre-set, but you may need to add some AddLanguage directives of your own.
When creating your files you will also need to observe the following rules:
Never link to /foo, only ../../foo.
Name each file with a .LANGCODE.html extension instead of a simple .html extension, where LANGCODE is en, fr, ja, etc.
Link to ../foo/bar instead of ../foo/bar.en.html. This will direct the server to pick the most appropriate language available, based on the visitor's Accept-Language headers. It will fall back to English if no version in the visitor's preferred languages is found. (You can change the fallback language, or even specify multiple ones, by changing the value of the LanguagePriority directive listed above.)
The first three rules are not optional. Only break the fourth if you know what you're doing and have adjusted the server settings accordingly.
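To make the selection and fallback behaviour concrete, here is a rough Python model of it. This is only a sketch under simplifying assumptions, not Apache's actual negotiation algorithm: the function names are made up for illustration, only the language dimension is considered, and a regional tag such as fr-CA is simply allowed to match a plain fr variant.

```python
def parse_accept_language(header):
    """Return the language tags of an Accept-Language header, best first."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        q = 1.0  # a tag without an explicit q-value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0
        if q > 0:
            prefs.append((q, tag.strip().lower()))
    # The sort is stable, so header order is kept among equal q-values.
    prefs.sort(key=lambda item: -item[0])
    return [tag for _, tag in prefs]


def pick_variant(available, accept_language, language_priority=("en",)):
    """Pick one language code from `available` for a single resource.

    `available` holds the LANGCODEs that exist for the page, e.g. {"en", "ja"}.
    `language_priority` stands in for the LanguagePriority directive.
    """
    for tag in parse_accept_language(accept_language):
        if tag == "*":
            continue
        if tag in available:
            return tag
        # Assumption: "fr-CA" may be served the plain "fr" variant.
        primary = tag.split("-")[0]
        if primary in available:
            return primary
    # Nothing the visitor asked for exists: use the configured fallback.
    for tag in language_priority:
        if tag in available:
            return tag
    return None
```

For a page that exists only as bar.en.html and bar.fr.html, a visitor sending "ja,fr;q=0.8,en;q=0.5" would get the French file in this model, and a visitor sending only "de" would get the English fallback.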
Sometimes a visitor will want to view articles in a language different from their browser preferences. Maybe they're borrowing a computer, or working on a translation. Apache has an environment variable that can capture this preference, and with the RewriteEngine you can set it based on a string in the request URI.
Create a top-level directory named after the language code for each language you plan to support and place in each a copy of the following .htaccess file, with LANGCODE changed to match the language code of that directory:
    RewriteEngine On
    RewriteRule ^(.*) ../$1 [E=prefer-language:LANGCODE]
For example, if I'm providing Japanese, I would create a top-level directory named ja and place in it the following .htaccess file:
    RewriteEngine On
    RewriteRule ^(.*) ../$1 [E=prefer-language:ja]
This will make a request to http://www.example.com/ja/foo/ internally fetch http://www.example.com/foo/index.ja.html, overriding the visitor's Accept-Language header preferences.
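Since the per-language .htaccess files differ only in the language code, creating them is easy to script. The following is a minimal sketch; the function name and the idea of passing the document root as an argument are my own assumptions, and the file contents are exactly the two directives shown above.

```python
from pathlib import Path

# The two directives from the article, with a placeholder for the language code.
HTACCESS_TEMPLATE = (
    "RewriteEngine On\n"
    "RewriteRule ^(.*) ../$1 [E=prefer-language:{code}]\n"
)


def write_language_dirs(docroot, codes):
    """Create one top-level directory per language code, each with its .htaccess.

    Returns the paths of the .htaccess files that were written.
    """
    written = []
    for code in codes:
        lang_dir = Path(docroot) / code
        lang_dir.mkdir(parents=True, exist_ok=True)
        htaccess = lang_dir / ".htaccess"
        htaccess.write_text(HTACCESS_TEMPLATE.format(code=code), encoding="utf-8")
        written.append(htaccess)
    return written
```

Calling write_language_dirs with your document root and the list ["en", "fr", "ja"] would create the en, fr, and ja directories described above. Note that it overwrites any .htaccess file already present in those directories.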
Unlike with a mechanically split site, if a page under the /ja/ virtual directory does not exist in Japanese, the server will fall back to language negotiation again. But (and here's where the relative links requirement comes in) all the links still point to the Japanese versions of other pages on the site. So when the user clicks a link on an English fallback page returned through the Japanese section, it will still request the Japanese version of the linked page.
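The reason relative links keep the visitor inside the language section is ordinary URL resolution, which happens in the browser before the request is ever sent. A small illustration using Python's standard urljoin (the helper name and the /bar/ page are only for the example):

```python
from urllib.parse import urljoin


def resolve(page_url, href):
    """Resolve a link the way a browser does, relative to the current page."""
    return urljoin(page_url, href)


page = "http://www.example.com/ja/foo/"

# A relative link stays under the /ja/ virtual directory.
print(resolve(page, "../bar/"))  # http://www.example.com/ja/bar/

# A root-relative link drops the /ja/ prefix, and the language override with it.
print(resolve(page, "/bar/"))    # http://www.example.com/bar/
```

The same page served from /foo/ instead of /ja/foo/ resolves the relative link to /bar/, so one set of files works in every language section.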
So again, never link with the file extension and always use relative links. If you're targeting a specific language, use the ../../ja/foo/bar form, not the /foo/bar.ja.html form.
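These linking rules are easy to break by accident, so it can help to check pages automatically. Below is a rough sketch of such a check using only Python's standard library. The names are hypothetical, it only looks at href attributes on a elements, and it assumes two-letter language codes like the ones used in this article.

```python
import re
from html.parser import HTMLParser

# Assumption: language codes are two lowercase letters, as in bar.en.html.
LANG_EXT = re.compile(r"\.[a-z]{2}\.html$")


class LinkChecker(HTMLParser):
    """Collect links that break the relative-link or no-extension rules."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Skip missing hrefs, external URLs, fragments, and mail links.
        if not href or "://" in href or href.startswith(("#", "mailto:", "//")):
            return
        path = href.split("#")[0].split("?")[0]
        if path.startswith("/"):
            self.problems.append((href, "absolute link"))
        elif LANG_EXT.search(path):
            self.problems.append((href, "language extension in link"))


def check_links(html_text):
    """Return a list of (href, reason) pairs for one page's markup."""
    checker = LinkChecker()
    checker.feed(html_text)
    return checker.problems
```

Running check_links over the text of each page gives a list of offending links and the rule each one breaks; an empty list means the page follows both rules.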
lang attribute on <html> and then made the templates switch on that.
alt text is the corresponding Unicode.