i18n: A Brief Primer of Web Internationalization
by Elika J. Etemad aka fantasai
W3C CSS Working Group Invited Expert
i18n = internationalization
the process of enabling software to support
multiple languages and locales
translation ≠ localization
Bienvenue! Welcome! Willkommen!
$ € £ M/D/Y D/M/Y °C °F
Localization
- language(s)
- currency & payment systems
- laws
- geography
- culture
language ≠ locale
- Translation guided, not limited, by locale
- Give explicit language choices
- Negotiation using HTTP Accept-Language
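A minimal sketch of server-side language negotiation, assuming the client sends an `Accept-Language` request header with optional `q` weights. The function names and the list of supported languages are hypothetical, and the matching is deliberately simple (exact tag, then primary subtag).

```python
def parse_accept_language(header: str) -> list[tuple[str, float]]:
    """Parse an Accept-Language value into (tag, q) pairs, best first."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        q = 1.0  # a tag without an explicit weight counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed weight as "not acceptable"
        prefs.append((tag.strip().lower(), q))
    # sorted() is stable, so equal weights keep their header order
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def negotiate(header: str, supported: list[str], default: str) -> str:
    """Pick the best supported language the client accepts, else default."""
    available = {s.lower(): s for s in supported}
    for tag, q in parse_accept_language(header):
        if q <= 0:
            continue
        if tag in available:
            return available[tag]
        primary = tag.split("-")[0]  # fall back from "fr-CA" to "fr"
        if primary in available:
            return available[primary]
    return default


print(negotiate("fr-CA,fr;q=0.9,en;q=0.8", ["en", "fr"], "en"))
```

The result is only a starting guess: the page should still offer an explicit language choice, since the browser's preference list may not match what the reader wants.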
Use UTF-8
- every language
- every locale
HTML <meta charset=utf-8>
or HTTP Content-Type: text/html;charset=utf-8
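As a rough illustration of the HTTP option, the sketch below encodes a page as UTF-8 and builds matching response headers. `build_response` is a hypothetical helper, not part of any framework.

```python
def build_response(html: str) -> tuple[dict[str, str], bytes]:
    """Encode a page as UTF-8 and declare that encoding in the headers."""
    body = html.encode("utf-8")
    headers = {
        "Content-Type": "text/html;charset=utf-8",
        # Content-Length counts bytes, not characters
        "Content-Length": str(len(body)),
    }
    return headers, body


headers, body = build_response("<p>Bienvenue! ようこそ! 欢迎!</p>")
print(headers["Content-Type"], len(body))
```

Non-ASCII characters take more than one byte in UTF-8, so byte length and character count differ for most languages.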
Styling an Internationalized Website
- Typography
- Bidirectionality
- Sizing
Declare the Language
- Using HTML lang attributes
- Using HTTP Content-Language
Automatic Language Tailorings in CSS
- Line breaking
- Hyphenation
- OpenType glyph choice
- Underline position
- Ruby annotation position
Language-based Selectors
:lang(fr) { ... CSS for elements en français ... }
:lang(\*-CH) { ... CSS for elements in Swiss dialects ... }
:lang(\*-Latn) { ... CSS for elements in Latin transcription ... }
[lang]:lang(fr) { ... CSS to inherit from language root ... }
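The wildcard ranges above (`*-CH`, `*-Latn`) behave like the "extended filtering" of RFC 4647. Here is a sketch of that matching in Python, assuming well-formed tags. `lang_matches` is a name made up for this example, and a browser's `:lang()` implementation may differ in detail.

```python
def lang_matches(lang_range: str, tag: str) -> bool:
    """Extended filtering in the style of RFC 4647, section 3.3.2."""
    r = lang_range.lower().split("-")
    t = tag.lower().split("-")
    # The first subtags must match, unless the range starts with a wildcard
    if r[0] != "*" and r[0] != t[0]:
        return False
    ri, ti = 1, 1
    while ri < len(r):
        if r[ri] == "*":
            ri += 1            # a wildcard matches any run of subtags
        elif ti >= len(t):
            return False       # the range asks for more than the tag has
        elif r[ri] == t[ti]:
            ri += 1
            ti += 1
        elif len(t[ti]) == 1:
            return False       # never skip past a singleton such as "x"
        else:
            ti += 1            # skip a tag subtag that is not in the range
    return True


print(lang_matches("*-CH", "de-CH"), lang_matches("*-Latn", "sr-Latn-RS"))
```

Because intermediate subtags can be skipped, `*-CH` matches both `de-CH` and `fr-Latn-CH`.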
Bidirectionality
→ What Happens When Writing Directions Collide?
← چه میشود هنگامیکه روشهای نگارش با هم برخورد دارند؟
- Mixing LTR/RTL text
Today we watched a presentation
1—————————————————————————————►
by Elika Etemad (الیکا اعتماد) on
2—————————————► ◄——————————3 4►
the bidi algorithm.
5—————————————————►
- RTL with numbers
ال ۱۳۹۱ مان ایران نرفتم.
◄——————————————3 2——► ◄1
As soon as you expect to have any RTL, you must deal with bidi.
Introduction to Unicode Bidirectional Algorithm
Text is stored in logical order, not visual order.
by Elika Etemad (الیکا اعتماد) on bidi.
1 2 3 5 4 6 7
by Elika Etemad (الیکا
1 2 3 4
اعتماد) on bidi.
5 6 7
Visual order depends on line-breaking.
by Elika Etemad (الیکا اعتماد) on bidi.
1 2 3 5 4 6 7
by Elika Etemad (الیکا
1 2 3 4
اعتماد) on bidi.
5 6 7
The (Vastly Simplified) Unicode Bidirectional Algorithm
- Classify neutrals
- Calculate contiguous runs
- Reorder runs
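The last two steps can be sketched as follows, assuming every character has already been resolved to L or R. Uppercase Latin letters stand in for RTL characters purely for readability. Numbers, nested embeddings, and mirrored punctuation are left out, so this is an illustration and not the full Unicode algorithm.

```python
from itertools import groupby


def visual_order(text: str, classes: str, base: str) -> str:
    """Return the characters in left-to-right display order.

    classes holds one 'L' or 'R' per character (neutrals already resolved);
    base is the paragraph's base direction, 'L' or 'R'.
    """
    if len(text) != len(classes):
        raise ValueError("need exactly one class per character")
    runs = []
    pos = 0
    # Step 2: split the text into contiguous runs of the same direction
    for direction, group in groupby(classes):
        length = len(list(group))
        chunk = text[pos:pos + length]
        # Characters inside an RTL run are displayed right to left
        runs.append(chunk[::-1] if direction == "R" else chunk)
        pos += length
    # Step 3: in an RTL paragraph the runs themselves flow right to left
    if base == "R":
        runs.reverse()
    return "".join(runs)


print(visual_order("abcDEF", "LLLRRR", "L"))
print(visual_order("abcDEF", "LLLRRR", "R"))
```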
Resolving Neutrals: Intrinsic Directionality
All Unicode characters are either:
- strong left-to-right (L), e.g. Latin, 中文
- strong right-to-left (R), e.g. العربية, עִבְרִית
- neutral (N), e.g. spaces, punctuation
Note: Numbers are special. We'll pretend they're L for now.
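The three categories can be approximated with Python's `unicodedata.bidirectional()`, which returns the Unicode bidi class of a character. The mapping below is a simplification for this primer: numbers are treated as L, as in the note above, and everything that is not strong is lumped into N.

```python
import unicodedata


def bidi_class(ch: str) -> str:
    """Collapse the Unicode bidi class of one character to L, R, or N."""
    cls = unicodedata.bidirectional(ch)
    if cls == "L":
        return "L"
    if cls in ("R", "AL"):   # Hebrew-style R and Arabic letters (AL)
        return "R"
    if cls in ("EN", "AN"):  # simplification: pretend numbers are L
        return "L"
    return "N"               # spaces, punctuation, and everything else


def classify(text: str) -> str:
    """Return one L/R/N letter per character of text."""
    return "".join(bidi_class(ch) for ch in text)


print(classify("by (1)"))
```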
Resolving Neutrals: Sandwich Rule
Surrounded neutrals take surrounding directionality.
L N N N L
⟱
L L L L L
Resolving Neutrals: Sandwich Rule Failures
- If the two sides conflict?
L N N N R
⟱
L ? ? ? R
- Neutrals at the start/end of a paragraph?
N R R L N
⟱
? R R L ?
Resolving Neutrals: Base Direction
Each paragraph has a base direction.
- base direction = L
L N N N R
⟱
L L L L R
N R R L N
⟱
L R R L L
- base direction = R
L N N N R
⟱
L R R R R
N R R L N
⟱
R R R L R
Use the base direction for initial/final/conflicted neutrals.
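The sandwich rule and its base-direction fallback can be sketched in a few lines. The input is a string of L/R/N classes as in the diagrams above. `resolve_neutrals` is a hypothetical name, and the paragraph edges are treated as if they had the base direction.

```python
def resolve_neutrals(classes: str, base: str) -> str:
    """Replace every N with L or R using the sandwich rule.

    classes is a string of 'L', 'R', 'N'; base is 'L' or 'R'.
    """
    out = list(classes)
    n = len(out)
    i = 0
    while i < n:
        if out[i] != "N":
            i += 1
            continue
        # Find the end of this run of neutrals
        j = i
        while j < n and classes[j] == "N":
            j += 1
        # The paragraph edges count as the base direction
        before = classes[i - 1] if i > 0 else base
        after = classes[j] if j < n else base
        # Same direction on both sides: take it; otherwise use the base
        fill = before if before == after else base
        out[i:j] = fill * (j - i)
        i = j
    return "".join(out)


print(resolve_neutrals("LNNNR", "L"), resolve_neutrals("LNNNR", "R"))
```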
Wrong Base Direction | | Correct Base Direction
.This is my sentence | ⮕ | This is my sentence.
این جمله است. | ⮕ | این جمله است.
Declaring Directionality
- HTML dir attribute (not CSS direction property)
- Defaults to ltr; set rtl on <html> for Hebrew, Arabic, etc.
- dir value inherits
- Elements with dir create isolation
Bidi Isolation
- Inside unaffected by outside
(acts like an independent paragraph inside)
- Outside unaffected by inside
(acts like a neutral character from outside)
Using Bidi Isolation
- Isolate user input
- Isolate number-punctuation sequences
- Strip trailing spaces in each element
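In HTML, isolation comes from markup. For plain-text contexts where markup is unavailable, Unicode provides isolate control characters. The sketch below wraps user input in FIRST STRONG ISOLATE (U+2068) and POP DIRECTIONAL ISOLATE (U+2069). `isolate` is a hypothetical helper name.

```python
FSI = "\u2068"  # FIRST STRONG ISOLATE: direction taken from the content
PDI = "\u2069"  # POP DIRECTIONAL ISOLATE: ends the isolate


def isolate(user_text: str) -> str:
    """Wrap untrusted text so it cannot reorder the text around it."""
    # Trailing spaces are stripped so they do not end up inside the isolate
    return f"{FSI}{user_text.rstrip()}{PDI}"


name = "الیکا اعتماد "
message = f"Presentation by {isolate(name)} on the bidi algorithm."
print(message)
```

From the outside, the wrapped name behaves like a single neutral character, so the surrounding English sentence keeps its order whatever the user typed.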
dir=auto and <bdi>
- Detects direction from first strong character
- Dumb but controllable heuristic
W3C writes the CSS specifications.
W3C مشخصات CSS را می نویسد.
W3C مشخصات CSS را می نویسد.
- Use better info if you have it
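The first-strong heuristic is easy to sketch with `unicodedata.bidirectional()`. This is only an approximation of what `dir=auto` does (the HTML rules also skip certain elements), and `first_strong_direction` is a name made up for this example.

```python
import unicodedata


def first_strong_direction(text: str, default: str = "ltr") -> str:
    """Guess a direction from the first strongly directional character."""
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            return "ltr"
        if cls in ("R", "AL"):
            return "rtl"
    return default  # no strong character found


print(first_strong_direction("W3C writes the CSS specifications."))
print(first_strong_direction("W3C مشخصات CSS را می نویسد."))
```

The second call shows why the heuristic is "dumb": the Persian sentence starts with the Latin "W3C", so it is detected as ltr. If the application already knows the content language or direction, that information is the better source.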
Automatic Layout Effects of Direction
- Default alignment
- Ordering of columns (table, multi-col, grid, flex)
- Scrolling direction
- Ordering of text!
Logical Properties & Values
Physical | Logical
text-align: left | text-align: start
margin-left: 2em | margin-inline-start: 2em
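To make the mapping concrete, here is a sketch of how the two logical declarations in the table resolve to physical ones for a given direction. It assumes a horizontal writing mode and covers only these two properties. `to_physical` is a hypothetical helper and does not model how a browser works internally.

```python
SIDES = {
    "ltr": {"start": "left", "end": "right"},
    "rtl": {"start": "right", "end": "left"},
}


def to_physical(prop: str, value: str, direction: str) -> tuple[str, str]:
    """Resolve one logical declaration to its physical equivalent."""
    if prop == "text-align" and value in ("start", "end"):
        return prop, SIDES[direction][value]
    if prop in ("margin-inline-start", "margin-inline-end"):
        logical_side = prop.rsplit("-", 1)[1]  # "start" or "end"
        return f"margin-{SIDES[direction][logical_side]}", value
    return prop, value  # anything else passes through unchanged


print(to_physical("margin-inline-start", "2em", "ltr"))
print(to_physical("margin-inline-start", "2em", "rtl"))
```

One logical declaration covers both directions, so the stylesheet needs no separate RTL overrides.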
Directionality-based Selectors
:dir(rtl) { ... CSS for rtl elements ... }
:dir(ltr) { ... CSS for ltr elements ... }
Font Size, Line Height
- Density varies ∴ readable font size varies
爱 A
- Stacking varies ∴ readable line height varies
E Ế
Content Length Varies
- Bienvenue!
- Welcome!
- ようこそ!
- 欢迎!
Intrinsic Sizing
auto = range formula between min-content/max-content
min-content = largest unbreakable object
max-content = smallest size without wrapping
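A toy model of the two keywords, using character counts as a stand-in for measured widths and spaces as the only break opportunities. Both are simplifying assumptions: real layout measures glyphs, and Chinese or Japanese text can also break between characters.

```python
def min_content(text: str) -> int:
    """Width of the largest unbreakable piece (here: the longest word)."""
    return max((len(word) for word in text.split()), default=0)


def max_content(text: str) -> int:
    """Width needed to fit the whole text on one line, without wrapping."""
    return len(" ".join(text.split()))


label = "Welcome to the site"
print(min_content(label), max_content(label))
```

A box sized with `auto` ends up somewhere between these two values depending on the available space, which is what lets a layout adapt when a translation is longer or shorter than the original.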
Intrinsic Sizing
grid-template-columns: max-content max-content minmax(min-content,1fr) min-content
Automatic Sizes
Intrinsic Sizing Adapts to Chinese
CSS i18n Tools
- HTML dir and lang
- “Intrinsic Web Design”: content-out × viewport-in
- font-relative units: em, rem, ch, etc.
- logical properties & values
- :dir() and :lang() selectors
Further Resources
https://www.w3.org/International/
Get Involved: i18n Task Forces @ W3C
- Arabic
- Chinese
- Ethiopic
- Hebrew
- Indic
- Japanese
- Mongolian
- Southeast Asian
- Tibetan
- ???