WD-css3-text-20050627

CSS3 Text Module

W3C Working Draft 14 February 2006

This version: http://www.w3.org/TR/2005/WD-css3-text-20050627/
Latest version: http://www.w3.org/TR/css3-text/
Previous version: http://www.w3.org/TR/2003/CR-css3-text-20030514/
Editor: Elika J. Etemad; Paul Nelson (Microsoft)
Previous Editor: Michel Suignard (Microsoft)

Abstract

This CSS3 module defines properties for text manipulation and specifies their processing model. It covers line breaking, justification and alignment, white space handling, text decoration and text transformation.

Status of this document aka Please Don't Panic

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This Text Effects module and a separate (upcoming) Text Layout module replace and obsolete the May 2003 CSS3 Text Module Candidate Recommendation. Since this is a thorough overhaul of the previous version, a list of changes has been provided instead of a diff.

This document is a Working Draft, and it is still very incomplete. In fact, the majority of its sections have not been added in. Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than a work in progress. Feedback on this draft should be posted to the www-style@w3.org mailing list with [CSS3 Text] in the subject line. You are strongly encouraged to complain if you see something stupid in this draft. I will do my best to respond to all feedback.

If you have implemented properties from CSS3 Text CR please let me know so I can take that into account as I redraft the spec. You can post to www-style (public), post to the CSS WG mailing list (Member-restricted), or email fantasai directly (personal).

This document has been produced as a combined effort of the W3C Internationalization Activity and the Style Activity, and is maintained by the CSS Working Group. It also includes contributions made by participants in the XSL Working Group (members only). Patent disclosures relevant to CSS may be found on the Working Group's public patent disclosure page.

The following features are at risk and may be cut from the spec during its CR period: multiple text shadows, the 'tibetan' text justification mode, the 'text-outline' property, and the 'skip-spaces' text-line-mode.

1. Introduction
2. Conformance
- 2.1. Partial and Experimental Implementations
3. White Space Processing
4. Line Breaking and Word Boundaries
- 4.1. Line Breaking Restrictions: the 'word-break' property
- 4.2. Hyphenation: the 'hyphenate' property
5. Text Wrapping
- 5.1. Text Wrap Settings: the 'text-wrap' property
  - 5.1.1. Example of using 'text-wrap: suppress' in presenting a footer
- 5.2. Force Wrapping: the 'word-wrap' property
6. Alignment and Justification
7. Spacing
8. Text Decoration
9. Edge Effects
- 9.1. First Line Indentation: the 'text-indent' property
Changes from the May 2003 CSS3 Text CR
Changes from the June 2005 CSS3 Text WD
Acknowledgements
12. References
- 12.1. Normative References
- 12.2. Informative References

1. Introduction

[document here]

2. Conformance

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 (see [RFC2119]). However, for readability, these words do not typically appear in all uppercase letters in this specification.

Additional key words, e.g. "User agent (UA)", are defined by CSS 2.1 ([CSS21], section 3.1).

2.1. Partial and Experimental Implementations

UAs must treat as invalid any properties or values they do not support. Experimental implementations of a feature should support only a vendor-prefixed syntax for the property/value.

3. White Space Processing

White space processing in CSS interprets white space characters for rendering: it has no effect on the underlying document data. In the context of CSS, the document white space set is defined to be any space characters (Unicode value U+0020), tab characters (U+0009), or line break characters (defined by the document format: typically line feed, U+000A). Control characters besides the white space characters and the bidi formatting characters (U+202x) are treated as normal characters and rendered according to the same rules.

The document parser must normalize line break character sequences according to its own format rules before CSS processing takes effect. However, in generated content strings the line feed character (U+000A) and only the line feed character is considered a line break sequence. For CSS white space processing all line breaks must be normalized to a single character representation—usually the line feed character (U+000A)—here called a "line break". This way, all recognized line breaks are treated the same and style rules behave consistently across systems.

The document parser may have not only normalized line break characters, but also collapsed other space characters or otherwise processed white space according to markup rules. Because CSS processing occurs after the parsing stage, it is not possible to restore these characters for styling. Therefore, some of the behavior specified below can be affected by these limitations and may be user agent dependent.

3.1. White Space Collapsing: the 'white-space-collapse' property

Name:	white-space-collapse
Value:	preserve | collapse | preserve-breaks | discard
Initial:	collapse
Applies to:	all elements
Inherited:	yes
Percentages:	N/A
Media:	visual
Computed value:	specified value

This property declares whether and how white space inside the element is collapsed. Values have the following meanings, which must be interpreted according to the white space processing rules:

collapse: This value directs user agents to collapse sequences of white space into a single character (or in some cases, no character).
preserve: This value prevents user agents from collapsing sequences of white space. Line breaks are preserved.
preserve-breaks: This value collapses white space as for 'collapse', but preserves line breaks.
discard: This value directs user agents to "discard" all white space in the element.

3.2. The White Space Processing Rules

Any text that is directly contained inside a block (not inside an inline) is treated as being inside an anonymous inline element.

For each inline (including anonymous inlines), white space characters are handled as follows, ignoring bidi formatting characters as if they were not there:

If 'white-space-collapse' is set to 'collapse' or 'preserve-breaks', white space characters are considered collapsible and are processed by performing the following steps:
1. All non-line-break white space characters immediately following a line break character are removed. (This has the effect of discarding all white space at the start of a line but preserving a trailing space if one exists at the end.)
2. If 'white-space-collapse' is not 'preserve-breaks', line break characters are transformed for rendering according to the line break transformation rules.
3. Every tab (U+0009) is converted to a space (U+0020).
4. Any space (U+0020) following another space (U+0020)—even a space before the inline, if that space is also collapsible—is removed.
If 'white-space-collapse' is set to 'preserve', any sequence of spaces (U+0020) unbroken by an element boundary is treated as a sequence of non-breaking spaces. However, a line breaking opportunity exists at the end of the sequence.
If 'white-space-collapse' is set to 'discard', the first white space character in every white space sequence is converted to a zero width non-joiner (U+200C) and the rest of the sequence is removed.
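The four collapsing steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a normative algorithm: the function name and parameters are hypothetical, line breaks are assumed to be already normalized to U+000A, bidi formatting characters are assumed to have been skipped by the caller, and step 2 is simplified so that every collapsible line break becomes a space (the script-dependent rules of section 3.2.2 are not applied here).

```python
import re


def collapse_white_space(text, mode="collapse", after_collapsible_space=False):
    """Apply collapsing steps 1-4 to the text of one inline.

    Hypothetical helper: assumes line breaks are already normalized to
    U+000A and that bidi formatting characters have been removed.
    """
    if mode not in ("collapse", "preserve-breaks"):
        raise ValueError("steps 1-4 apply only to 'collapse' and 'preserve-breaks'")

    # Step 1: remove spaces and tabs that immediately follow a line break.
    text = re.sub(r"\n[ \t]+", "\n", text)

    # Step 2 (simplified assumption): each collapsible line break becomes a
    # space; the script-context rules of section 3.2.2 are ignored here.
    if mode == "collapse":
        text = text.replace("\n", " ")

    # Step 3: every tab is converted to a space.
    text = text.replace("\t", " ")

    # Step 4: a space following another space is removed ...
    text = re.sub(r" {2,}", " ", text)
    # ... including a leading space that follows a collapsible space
    # located just before this inline.
    if after_collapsible_space and text.startswith(" "):
        text = text[1:]
    return text
```

For instance, under these assumptions the input "a", space, line feed, tab, space, "b", two tabs, "c" collapses to "a b c" in 'collapse' mode, while 'preserve-breaks' keeps the line feed between "a " and "b c".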

Then, the entire block is rendered. Inlines are laid out, taking bidi reordering into account, and wrapping as specified by the 'text-wrap' property.

As each line is laid out,

A sequence of collapsible spaces (U+0020) at the beginning of a line is removed.
A tab (U+0009) is rendered as a horizontal shift that lines up the start edge of the next glyph with the next tab stop. Tab stops occur at points that are multiples of 8 times the width of a space (U+0020) rendered in the block's font from the block's starting content edge.
A sequence of collapsible spaces (U+0020) at the end of a line is removed.
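The tab stop rule can be written as a small calculation. The sketch below assumes positions are measured from the block's starting content edge in the same unit as the space width; the function name and the `interval` parameter are illustrative and not part of the specification.

```python
import math


def next_tab_stop(x, space_width, interval=8):
    """Return the position of the next tab stop after position x.

    x is the distance from the block's starting content edge; tab stops
    fall at multiples of `interval` space widths (8 per the rule above).
    """
    stop_distance = interval * space_width
    # A glyph sitting exactly on a tab stop advances to the following stop.
    return (math.floor(x / stop_distance) + 1) * stop_distance
```

With a 5-unit space, tab stops fall at 40, 80, 120 and so on, so a tab starting at position 13 moves the next glyph to 40, and a tab starting exactly at 40 moves it to 80.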

3.2.1. Example of bidirectionality with white space collapsing

Consider the following markup fragment, taking special note of the spaces:

<ltr>A <rtl> B </rtl> C</ltr>

where the <ltr> element represents a left-to-right embedding and the <rtl> element represents a right-to-left embedding. If the 'white-space-collapse' property is set to 'collapse', the above processing model would result in the following:

The space before the B would collapse with the space after the A.
The space before the C would collapse with the space after the B.

This would leave two spaces, one after the A in the left-to-right embedding level, and one after the B in the right-to-left embedding level. This is then ordered according to the Unicode bidirectional algorithm, with the end result being:

A  BC

Note that there are two spaces between A and B, and none between B and C. This is best avoided by putting spaces outside the element instead of just inside the opening and closing tags and, where practical, by relying on implicit bidirectionality instead of explicit embedding levels.

3.2.2. Line Break Transformation Rules

When line breaks are collapsible, they are either transformed into a space (U+0020) or removed depending on the script context before and after the line break.

The script context is determined by the Unicode-given script value [UAX24] of the first character on that side of the line break. However, characters such as punctuation that belong to the COMMON and INHERITED scripts are ignored in this check; the next character is examined instead. The UA must not examine characters outside the block and may limit its examination to as few as four characters on each side of the line break. If the check fails to find an acceptable script value (i.e. it has hit the check limits), then the script context is neutral.

If the character immediately before or immediately after the line break is the zero width space character (U+200B), then the line break is removed.
Otherwise, if the script context on one side of the line break is Han, Yi, Hiragana, or Katakana and the context on the other side is Han, Yi, Hiragana, Katakana, or neutral, then the line break is removed.
Otherwise, the line break is converted to a space (U+0020).
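The three rules above can be sketched as follows. Python's standard library does not expose the Unicode Script property, so this sketch approximates it, and that approximation is an assumption of the example rather than anything the specification defines: characters whose general category is punctuation, symbol, separator, number, mark, or control are treated as COMMON/INHERITED, and a character counts as Han, Yi, Hiragana, or Katakana if its Unicode name starts with one of a few known prefixes. All function and constant names are hypothetical.

```python
import unicodedata

# Rough stand-in for the Han, Yi, Hiragana and Katakana script values
# (assumption: name prefixes approximate the real Script property).
IDEOGRAPHIC_NAME_PREFIXES = (
    "CJK UNIFIED IDEOGRAPH",
    "CJK COMPATIBILITY IDEOGRAPH",
    "HIRAGANA",
    "KATAKANA",
    "YI ",
)


def script_context(chars, limit=4):
    """Classify the text on one side of a line break.

    `chars` must be ordered moving away from the break. At most `limit`
    characters are examined, matching the minimum the text above allows.
    """
    for ch in chars[:limit]:
        # Approximation of "belongs to COMMON or INHERITED": skip it.
        if unicodedata.category(ch)[0] in "PSZNMC":
            continue
        name = unicodedata.name(ch, "")
        return "ideographic" if name.startswith(IDEOGRAPHIC_NAME_PREFIXES) else "other"
    return "neutral"


def transform_line_break(before, after):
    """Return the replacement ("" or " ") for a collapsible line break."""
    # Rule 1: a zero width space adjacent to the break removes the break.
    if before.endswith("\u200b") or after.startswith("\u200b"):
        return ""
    left = script_context(before[::-1])
    right = script_context(after)
    # Rule 2: ideographic on one side, ideographic or neutral on the other.
    allowed = ("ideographic", "neutral")
    if (left == "ideographic" and right in allowed) or (
        right == "ideographic" and left in allowed
    ):
        return ""
    # Rule 3: everything else becomes a space.
    return " "
```

A production implementation would consult real script data [UAX24] instead of name prefixes; the control flow of the three rules would stay the same.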

Comments on how well this would work in practice would be very much appreciated, particularly from people who work with Thai and similar scripts.

3.2.3. Informative Summary of White Space Collapsing Effects

Consecutive white space collapses into a single space.
A sequence of line breaks and other white space between two ideographic characters collapses into nothing unless there is a space before the first line break in the sequence.
A zero width space immediately before or anywhere after a line break causes the entire sequence of white space beginning with the line break to collapse into a zero width space.

3.3. White Space and Text Wrapping Shorthand: the 'white-space' property

Name:	white-space
Value:	normal | pre | nowrap | pre-wrap | pre-line
Initial:	not defined for shorthand properties
Applies to:	all elements
Inherited:	yes
Percentages:	N/A
Media:	visual
Computed value:	see individual properties

The 'white-space' property is a shorthand for the 'white-space-collapse' and 'text-wrap' properties. Not all combinations are represented. Values have the following meanings:

normal: Sets 'white-space-collapse' to 'collapse' and 'text-wrap' to 'normal'
pre: Sets 'white-space-collapse' to 'preserve' and 'text-wrap' to 'none'
nowrap: Sets 'white-space-collapse' to 'collapse' and 'text-wrap' to 'none'
pre-wrap: Sets 'white-space-collapse' to 'preserve' and 'text-wrap' to 'normal'
pre-line: Sets 'white-space-collapse' to 'preserve-breaks' and 'text-wrap' to 'normal'

The following informative table summarizes the behavior of various 'white-space' values:

	New Lines	Spaces and Tabs	Text Wrapping
normal	Collapse	Collapse	Wrap
pre	Preserve	Preserve	No wrap
nowrap	Collapse	Collapse	No wrap
pre-wrap	Preserve	Preserve	Wrap
pre-line	Preserve	Collapse	Wrap
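The shorthand expansion is a plain lookup, which the following sketch makes explicit. The table contents come directly from the value definitions above; the dictionary and function names are illustrative only.

```python
# Shorthand value -> ('white-space-collapse' value, 'text-wrap' value).
WHITE_SPACE_SHORTHAND = {
    "normal": ("collapse", "normal"),
    "pre": ("preserve", "none"),
    "nowrap": ("collapse", "none"),
    "pre-wrap": ("preserve", "normal"),
    "pre-line": ("preserve-breaks", "normal"),
}


def expand_white_space(value):
    """Expand a 'white-space' value into its two longhand declarations."""
    collapse, wrap = WHITE_SPACE_SHORTHAND[value]
    return {"white-space-collapse": collapse, "text-wrap": wrap}
```

Combinations that the shorthand cannot express, such as 'discard' with any wrapping mode, have no entry in the table and can only be set through the individual properties.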

4. Line Breaking and Word Boundaries

In many writing systems, words are always separated by spaces or punctuation. In the absence of a hyphenation dictionary, a line break can occur only at these explicit word boundaries. In Chinese and Japanese typography, however, neither spaces nor any other word-separating characters are used. In these systems a line can break anywhere except between certain character combinations. Additionally, the level of strictness in these restrictions can vary with the typesetting style.

Scripts like Thai, which use spaces to separate clauses rather than words, present another type of line breaking case. The lack of visible word delimiters makes them similar to the CJK systems. However, like English in the absence of a hyphenation dictionary, Thai never breaks inside words. As a result, knowledge of the vocabulary is necessary to be able to correctly break a line of Thai text. To explicitly mark word boundaries, the zero width space (U+200B) can be used as a word delimiter in Thai and similar scripts.

4.1. Line Breaking Restrictions: the 'word-break' property

CSS distinguishes between two levels of strictness in the rules for implicit line breaking in CJK text. The precise set of rules in effect for the strict and loose levels is up to the UA and should follow language conventions. However, this specification does recommend that the following breaks be forbidden in strict line breaking and allowed in loose:

breaks before Japanese small kana
breaks before Japanese iteration marks
???

Information on line breaking conventions can be found in [JIS4051] for Japanese, [标点符号] for Chinese, and [?] for Korean, and in [UAX14].

Name:	word-break
Value:	normal | keep-all | loose | break-strict | break-all
Initial:	normal
Applies to:	all elements
Inherited:	yes
Percentages:	N/A
Media:	visual
Computed value:	specified value

This property specifies what set of line breaking restrictions are in effect within the element. Values have the following meanings:

normal: Breaks non-CJK scripts according to their own rules while using a strict set of line breaking restrictions for CJK scripts (Hangul, Japanese Kana, and CJK ideographs).
keep-all: Same as 'normal' for all non-CJK scripts. However, sequences of CJK characters can no longer break on implied break points. This option should only be used where the presence of white space characters still creates line-breaking opportunities, as in Korean.
loose: As for 'normal', but CJK scripts use a less restrictive set of line-breaking restrictions.
break-strict: Same as 'normal' for CJK scripts, but non-CJK scripts can break anywhere. This option is used mostly when the text is predominantly CJK characters with few non-CJK excerpts and it is desired that the text be more evenly distributed on each line.
break-all: As for 'break-strict', except CJK scripts break according to the rules for 'loose'.

When shaping scripts such as Arabic are allowed to break within words due to 'break-all' or 'break-strict', the characters must still be shaped as if the word were not broken.
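The five values combine two independent choices: which rule set applies to CJK text, and whether non-CJK text may break anywhere. The sketch below tabulates that reading of the definitions above. The labels such as "own-rules" and "no-implied-breaks" are shorthand invented for this example, not terms from the specification.

```python
# 'word-break' value -> (CJK behavior, non-CJK behavior).
# The string labels are illustrative names, not specification keywords.
WORD_BREAK = {
    "normal": ("strict", "own-rules"),
    "keep-all": ("no-implied-breaks", "own-rules"),
    "loose": ("loose", "own-rules"),
    "break-strict": ("strict", "anywhere"),
    "break-all": ("loose", "anywhere"),
}


def breaks_within_non_cjk_words(value):
    """True if the value lets non-CJK scripts break anywhere.

    In that case shaping scripts such as Arabic must still be shaped as
    if the word were not broken.
    """
    return WORD_BREAK[value][1] == "anywhere"
```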

4.2. Hyphenation: the 'hyphenate' property

Name:	hyphenate
Value:	none | auto
Initial:	none
Applies to:	all elements
Inherited:	yes
Percentages:	N/A
Media:	visual
Computed value:	specified value

This property determines whether the line-breaking algorithm is allowed to use a hyphenation engine to break within words. Intra-word breaking restrictions have no effect when 'word-break' is 'break-all'. Possible values:

none: No intra-word breaking.
auto: Words may be broken at an appropriate hyphenation point. This requires that the user agent have a hyphenation dictionary for the language of the text being broken.

If hyphenation is applied to a shaped script such as Arabic then the shaping must be done ignoring the hyphenation.

5. Text Wrapping

Text wrapping is controlled by the 'text-wrap' and 'word-wrap' properties:

5.1. Text Wrap Settings: the 'text-wrap' property

Name:	text-wrap
Value:	normal | unrestricted | none | suppress
Initial:	normal
Applies to:	all elements
Inherited:	yes
Percentages:	N/A
Media:	visual
Computed value:	specified value

This property specifies the mode for text wrapping. Possible values:

normal: Lines may break at allowed break points, as determined by the line-breaking rules in effect.
none: Lines may not break; text that does not fit within the block box overflows it.
unrestricted: Lines may break between any two grapheme clusters. Line-breaking restrictions have no effect and hyphenation does not take place. Character shaping must ignore the break.
suppress: Line breaking is suppressed within the element: the UA may only break within the element if there are no other acceptable break points in the line. If the text breaks, line-breaking restrictions are honored as for 'normal'.

When 'text-wrap' is set to 'normal' or 'suppress', UAs that allow breaks at punctuation other than spaces should prioritize breakpoints. For example, if breaks after slashes have a lower priority than spaces, the sequence "check /etc" will never break between the '/' and the 'e'. The UA may use the width of the containing block, the text's language, and other factors in assigning priorities.

5.1.1. Example of using 'text-wrap: suppress' in presenting a footer

The priority of breakpoints can be set to reflect the intended grouping of text.

Given the rules

footer { text-wrap: suppress; /* inherits to all descendants */ }

and the following markup:

<footer>
  <venue>27th Internationalization and Unicode Conference</venue>
  &#8226; <date>April 7, 2005</date> &#8226;
  <place>Berlin, Germany</place>
</footer>

In a narrow window the footer could be broken as

27th Internationalization and Unicode Conference •
April 7, 2005 • Berlin, Germany

or in a narrower window as

27th Internationalization and Unicode
Conference • April 7, 2005 •
Berlin, Germany

but not as

27th Internationalization and Unicode Conference • April
7, 2005 • Berlin, Germany
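One way a UA could arrive at the breaks shown in this example is a greedy line filler that prefers unsuppressed break points. The sketch below is one possible approach and not the algorithm the specification mandates: widths are measured in characters, each word carries a hypothetical priority for the break that follows it (0 for a break between the child elements, 1 for a break inside a child element where wrapping is suppressed more deeply), and the end of the text always counts as a preferred place to end a line.

```python
BULLET = "\u2022"

# (word, priority of the break after it); 0 = preferred, 1 = suppressed.
FOOTER = [
    ("27th", 1), ("Internationalization", 1), ("and", 1), ("Unicode", 1),
    ("Conference", 0), (BULLET, 0),
    ("April", 1), ("7,", 1), ("2005", 0), (BULLET, 0),
    ("Berlin,", 1), ("Germany", 0),
]


def wrap_with_priorities(tokens, width):
    """Greedy wrapping that uses a suppressed break only as a last resort."""
    lines = []
    start = 0
    while start < len(tokens):
        length = 0
        furthest = {}  # priority -> index just past the furthest fitting break
        for i in range(start, len(tokens)):
            word, priority = tokens[i]
            length += len(word) + (1 if i > start else 0)
            if length > width and i > start:
                break  # this word no longer fits on the current line
            if i == len(tokens) - 1:
                priority = 0  # ending at the end of the text is always fine
            furthest[priority] = i + 1
        # Take the furthest break among the best available priority level.
        end = furthest[min(furthest)]
        lines.append(" ".join(word for word, _ in tokens[start:end]))
        start = end
    return lines
```

At a width of 56 characters a plain greedy breaker would produce the rejected third layout, with "April" stranded at the end of the first line. This sketch instead yields the first layout, and at a width of 40 it yields the second.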

5.2. Force Wrapping: the 'word-wrap' property

Name:	word-wrap
Value:	normal | break-word
Initial:	normal
Applies to:	all elements
Inherited:	yes
Percentages:	N/A
Media:	visual
Computed value:	specified value

This property specifies whether the UA may break within a word to prevent overflow when an otherwise-unbreakable string is too long to fit within the containing block. It only has an effect when 'text-wrap' is either 'normal' or 'suppress'. Possible values:

normal: Lines may break only at allowed break points.
break-word: An unbreakable 'word' may be broken at an arbitrary point if there are no otherwise-acceptable break points in the line. Shaping characters are still shaped as if the word were not broken, and grapheme clusters must stay together as one unit.

6. Alignment and Justification

6.1. Text Alignment: the 'text-align' property

Name:	text-align
Value:	start | end | left | right | center | justify | <string>
Initial:	start
Applies to:	all elements
Inherited:	yes
Percentages:	N/A
Media:	visual
Computed value:	specified value

This property describes how inline contents of a block are horizontally aligned. Values have the following meanings:

start: The inline contents are aligned to the start edge of the line box.
end: The inline contents are aligned to the end edge of the line box.
left: The inline contents are aligned to the left edge of the line box. In vertical text, "left" is interpreted with respect to the beginning of the line stack rather than the top of the page.
right: The inline contents are aligned to the right edge of the line box. In vertical text, "right" is interpreted with respect to the beginning of the line stack rather than the top of the page.
center: The inline contents are centered within the line box.
justify: The text is justified according to the method specified by the 'text-justify' property.
<string>: When applied to a table cell, specifies a string on which all cells in its table column that also have a string value for 'text-align' will align (see the section on horizontal alignment in a column for details and an example). When applied to any other element, it is treated as 'start'.

A block of text is a stack of line boxes. In the case of 'start', 'end', 'left', 'right' and 'center', this property specifies how the inline boxes within each line box align with respect to the line box's sides: alignment is not with respect to the viewport. In the case of 'justify', the UA may stretch the inline boxes in addition to adjusting their positions. (See also the 'text-justify', 'text-justify-trim', 'text-kashida-space', 'letter-spacing' and 'word-spacing' properties.)

6.2. Last Line Alignment: the 'text-align-last' property

Name:	text-align-last
Value:	start | end | left | right | center | justify
Initial:	start
Applies to:	all elements
Inherited:	yes
Percentages:	N/A
Media:	visual
Computed value:	specified value

This property describes how the last line of a block or a line right before a forced line break is aligned when 'text-align' is set to 'justify'. Values have the same meaning as for 'text-align'.

6.3. Justification Method: the 'text-justify' property

Name:	text-justify
Value:	auto | inter-word | inter-ideograph | inter-character | inter-cluster | kashida | tibetan
Initial:	auto
Applies to:	all elements
Inherited:	yes
Percentages:	N/A
Media:	visual
Computed value:	specified value

This property selects the justification method used when 'text-align' is set to 'justify'. It takes the following values:

auto: The UA determines the justification algorithm to follow, based on a balance between performance and adequate presentation quality.
inter-word: Justification primarily changes spacing at word separators.
inter-ideograph: Justification primarily changes spacing at word separators and at inter-graphemic boundaries in scripts that use no word spaces.
inter-character: Justification primarily changes spacing both at word separators and at grapheme cluster boundaries in all scripts except those in the connected and cursive groups.
inter-cluster: Justification primarily changes spacing at word separators and at grapheme cluster boundaries in cluster scripts.
kashida: Justification primarily stretches Arabic and related scripts through the use of kashida or other calligraphic elongation.
tibetan: Justification primarily stretches spaces after shad if the line contains any and/or pads the end of the line with tsek marks if the line already ends in one.

When justifying text, the user agent takes the remaining space between the ends of a line's contents and the edges of its line box, and distributes that space throughout its contents so that the contents exactly fill the line box. If the 'letter-spacing' and 'word-spacing' property values allow it, the user agent may also distribute negative space, putting more content on the line than would otherwise fit under normal spacing conditions. The exact justification algorithm is UA-dependent; however, CSS defines some general guidelines which must be followed when any justification method other than 'auto' is specified.

Justification affects different types of writing systems in different ways. For justification purposes, writing systems are grouped as follows:

block: CJK (including Hangul and half-width kana) and by extension all "wide" characters. (See [UAX11])
clustered: South-East Asian scripts that have discrete units but do not use space between words (such as Thai, Lao, Khmer, Myanmar)
connected: Devanagari and other scripts, such as Bengali and Gurmukhi, that use spaces between words and baseline connectors within words. The Ogham script also falls into this category.
cursive: Arabic and similar cursive scripts
discrete: Scripts that use spaces between words and have discrete, unconnected (in print) units within words, such as Latin, Greek, Cyrillic, Hebrew; this category also includes symbols and punctuation.
tibetan: Tibetan has clusters similar to South and Southeast Asian scripts but also has its own punctuation system. Its traditional justification does not match any of the other scripts, so this category represents the Tibetan script.

Where do scripts like Tamil fit in?

The UA may enable or break optional ligatures or use other font features such as alternate glyphs to help justify the text under any method. This behavior is not defined by CSS.

CSS defines flex points as points where the justification algorithm may alter spacing within the text. Flex points occur at word separators and between grapheme clusters. These flex points fall into priority levels as defined by the justification method. Within a line, higher priority flex points must be expanded or compressed to their limits before lower priority flex points may be adjusted. These limits are given by the 'letter-spacing' and 'word-spacing' properties. How any remaining space is distributed once all flex points reach their limits is up to the UA. If the inline contents of a line cannot be stretched to the full width of the line box, then they must be aligned as specified by the 'text-align-last' property (or as 'start' if 'text-align-last' is 'justify').
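The priority rule can be illustrated with a small sketch that handles expansion only (compression is left out). It assumes each flex point is described by a hypothetical pair of priority level and maximum expansion, with the maximum standing in for the limit derived from 'letter-spacing' or 'word-spacing'. Within a priority level the extra space is split evenly, and a point that reaches its limit passes its unused share on to the others at that level. Any space returned as left over is what the UA may distribute as it sees fit.

```python
def distribute_space(extra, flex_points):
    """Distribute `extra` space over flex points by priority.

    flex_points: list of (priority, max_expansion); priority 1 is filled
    first. Returns (expansion per flex point, space left over).
    """
    expansions = [0.0] * len(flex_points)
    for priority in sorted({p for p, _ in flex_points}):
        active = [i for i, (p, _) in enumerate(flex_points) if p == priority]
        while extra > 0 and active:
            share = extra / len(active)
            # Points that cannot absorb a full even share hit their limit.
            capped = [i for i in active if flex_points[i][1] - expansions[i] <= share]
            if not capped:
                # Every remaining point can take an even share: done.
                for i in active:
                    expansions[i] += share
                extra = 0.0
                break
            for i in capped:
                extra -= flex_points[i][1] - expansions[i]
                expansions[i] = float(flex_points[i][1])
                active.remove(i)
    return expansions, extra
```

For example, with 10 units of extra space, two first-priority points limited to 2 units each and one second-priority point limited to 10, the first-priority points are filled to their limits and the remaining 6 units go to the second-priority point.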

The flex point priorities for values of 'text-justify' are given in the table below. Space must be distributed evenly among all types of flex points in a given prioritization group, but may vary within a line due to changes in the font or letter-spacing and word-spacing values. The different types of flex points are defined as follows:

spaces: A flex point exists at spaces and other visible word separators. Expand as for 'word-spacing'.
block, clustered, connected, discrete, tibetan: A flex point exists between two grapheme clusters when at least one of them belongs to the affected script group and the spacing at that point has not already been altered at a higher priority.
I'm not sure grapheme clusters are the right unit to use for some of these complex scripts...
cursive: If the UA is capable of extending the graphical connection between cursively connected grapheme clusters, a flex point exists between any two cursively connected grapheme clusters belonging to cursive scripts, but not between disjoint grapheme clusters. The UA must not break the graphical connection when changing the spacing between cursively connected grapheme clusters.

method:	inter-word		inter-ideograph		inter-character		inter-cluster		kashida			tibetan			auto
priority:	1st	2nd	1st	2nd	1st	2nd	1st	2nd	1st	2nd	3rd	1st	2nd	3rd	-
special									•			•			?
spaces	•		•		•		•			•			•
discrete		•		•	•			•			•			•
block		•	•		•			•			•		•
clustered		•		•	•		•			•			•
connected		•		•		•		•			•			•
cursive		•		•		•		•						•
tibetan		•		•	•		•				•			•

The two values kashida and tibetan trigger special justification behavior as specified below. This special behavior takes priority over the flex point adjustment described above.

kashida: apply kashida elongation. This may be done in discrete kashida units, and the prioritization of kashida points is UA-dependent: for example, the UA may apply more at the end of the line. The UA should not apply kashida to fonts for which it is inappropriate. It may instead rely on other justification methods that lengthen or shorten Arabic segments (e.g. by substituting in swash forms or optional ligatures). Because elongation rules depend on the typeface style, the UA should rely on the font whenever possible rather than inserting kashida based on a font-independent ruleset. The UA should limit elongation so that in multi-script lines a short stretch of Arabic will not be forced to soak up too much of the extra space by itself.
tibetan: apply Tibetan justification. Tibetan justification stretches the spaces after shads (TIBETAN MARK SHAD U+0F0D, TIBETAN MARK NYIS SHAD U+0F0E, TIBETAN MARK TSHEG SHAD U+0F0F, TIBETAN MARK NYIS TSHEG SHAD U+0F10, TIBETAN MARK RIN CHEN SPUNGS SHAD U+0F11, TIBETAN MARK RGYA GRAM SHAD U+0F12) if the line contains any. Otherwise, if the line ends in a tsek mark (TIBETAN MARK INTERSYLLABIC TSHEG U+0F0B, TIBETAN MARK DELIMITER TSHEG BSTAR U+0F0C) it pads the end of the line with tsek (U+0F0B) to fill the remaining space. The UA may use a more sophisticated algorithm than the simple two-step one above, but must still prioritize flexing a space after a shad over padding the line with tseks. For example, one possible algorithm stretches spaces up to twice their width, then falls back to padding the line with up to six tseks. If there is still more space to soak up, it goes back to stretching the spaces beyond that limit, or, if there are none, adding tseks past the six-tsek limit. Balancing the two methods in such a way results in more even-looking justified text.

7. Spacing

The next two properties refer to the <spacing-limit> value type, which is defined as follows:

<spacing-limit>

[normal | <length> | <percentage>] {1,3}

If only two values are specified, the third is assumed to be the same as the second. If only one value is specified, all three values are the same. The first spacing value specifies the desired (optimum) spacing. The second value specifies the desired minimum spacing limit, and the third specifies the desired maximum spacing limit.

If the minimum spacing value is greater than the optimum spacing value, then the used minimum spacing value is the optimum spacing value. If the maximum spacing value is less than the optimum spacing value, then the used maximum spacing value is the optimum spacing value. This substitution occurs after inheritance.
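The defaulting and substitution rules can be sketched as follows. This example assumes the one to three values have already been parsed into numbers in a common unit, so the 'normal' keyword and mixed length and percentage values are out of scope. The function name is illustrative.

```python
def resolve_spacing_limit(values):
    """Return the used (optimum, minimum, maximum) for 1 to 3 spacing values."""
    if not 1 <= len(values) <= 3:
        raise ValueError("<spacing-limit> takes one to three values")
    optimum = values[0]
    # One value: all three are the same.
    minimum = values[1] if len(values) >= 2 else optimum
    # Two values: the third is the same as the second.
    maximum = values[2] if len(values) == 3 else minimum
    # The minimum may not exceed the optimum, nor may the maximum fall below it.
    return optimum, min(minimum, optimum), max(maximum, optimum)
```

Note the consequence for two-value declarations: given an optimum of 4 and a minimum of 2, the maximum is first assumed to be 2 and then raised to the optimum, so the used values are 4, 2 and 4.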

normal

Specifies the normal optimum/minimum/maximum spacing, as defined by the current font and/or the user agent. Normal spacing should be percentage-based and normal minimum and maximum spacing should vary according to some measure of the amount of text on a line (e.g. block width divided by font size): larger measures can accommodate tighter spacing constraints. Normal spacing may also vary based on the value of the 'text-justify' property.

<length> or <percentage>

Specifies extra spacing in addition to the normal spacing. Percentages are with respect to the width of a space (U+0020). Values may be negative, but there may be implementation-dependent limits.

7.1. Word Spacing: the 'word-spacing' property

Name:	word-spacing
Value:	<spacing-limit>
Initial:	normal
Applies to:	all elements
Inherited:	yes
Percentages:	refers to width of space (U+0020) glyph
Media:	visual
Computed value:	'normal' or computed value or percentage

This property specifies the minimum, maximum, and optimal spacing between words. In the absence of justification the optimal spacing must be used. The text justification process may alter the spacing from its optimum (see the 'text-justify' property, above) but must not violate the minimum spacing limit and should also avoid exceeding the maximum.

Spacing is applied to each word-separator character left in the text after the white space processing rules have been applied and should be applied half on each side of the character. This is correct for Ethiopic and doesn't matter for invisible spaces, but is it correct for Tibetan? Most publications seem to add space after the tsek mark during justification. Word-separator characters include the space (U+0020), the no-break space (U+00A0), the Ethiopic word space (U+1361), the ideographic space (U+3000), the Aegean word separators (U+10100, U+10101), the Ugaritic word divider (U+1039F), and the Tibetan tsek (U+0F0B, U+0F0C). Is this list correct? If there are no word-separator characters, or if the word-separating character has a zero advance width (such as the zero width space U+200B), then the user agent must not create any additional spacing between words. General punctuation and fixed-width spaces are not considered word-separators.

7.2. Tracking: the 'letter-spacing' property

Name:	letter-spacing
Value:	<spacing-limit>
Initial:	normal
Applies to:	all elements
Inherited:	yes
Percentages:	refers to width of space (U+0020) glyph
Media:	visual
Computed value:	'normal' or computed value or percentage

This property specifies the minimum, maximum, and optimal spacing between grapheme clusters. In the absence of justification the optimal spacing must be used. The text justification process may alter the spacing from its optimum (see the 'text-justify' property, above) but must not violate the minimum spacing limit and should also avoid exceeding the maximum. Letter-spacing is applied in addition to any word-spacing.

A grapheme cluster is what a language user considers to be a character or a basic unit of the script. The term is described in detail in Unicode Standard Annex #29: Text Boundaries [UAX29]. This specification relies on the default (not tailored) rules only.

Spacing must not be applied at the beginning or at the end of a line. At element boundaries, the letter spacing is given by and rendered within the innermost element that contains the boundary. For example, given the markup

<P>a<LS>b<Z>cd</Z><Y>ef</Y></LS>g</P>

and the style sheet

LS { letter-spacing: 1em; }
Z { letter-spacing: 0.3em; }
Y { letter-spacing: 0.4em; }

the spacing would be

a[0]b[1em]c[0.3em]d[1em]e[0.4em]f[0]g

UAs may apply letter-spacing to cursive scripts. In this case, UAs must extend the space between disjoint graphemes as specified above and extend the visible connection between cursively connected graphemes by the same amount (rather than leaving a gap). If the UA cannot expand a cursive script this way, it must not apply letter-spacing between grapheme clusters of that script at all.

UAs must not apply letter-spacing to connected scripts. Or should they?

When the resulting space between two characters is not the same as the default space, user agents should not use optional ligatures.
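A minimal sketch of tracking in a style sheet follows. The selectors are hypothetical, and it assumes that a single <length> or <percentage> is an acceptable <spacing-limit> value for 'letter-spacing'.

```css
/* Adds 0.1em between grapheme clusters. Because the resulting space
   differs from the default, a UA should not use optional ligatures here. */
h1 { letter-spacing: 0.1em; }

/* Percentages refer to the width of the space (U+0020) glyph. */
.smallcaps { letter-spacing: 20%; }

/* Letter spacing is added on top of word spacing. */
.display { word-spacing: 0.2em; letter-spacing: 0.05em; }
```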

7.3. Kashida Elongation: the 'text-kashida-space' property

Put something here. What sort of settings are needed?

8. Text Decoration

8.1 Line Decoration

Put text from CSS2.1 here once http://lists.w3.org/Archives/Public/www-style/2003Nov/0021.html has been addressed.

8.1.1 Underline, Overline, Strike-through, and Blink: the 'text-decoration' property

Name:	text-decoration
Value:	none | [ underline || overline || line-through || blink ]
Initial:	none
Applies to:	all elements and generated content
Inherited:	no (see prose)
Percentages:	N/A
Media:	visual
Computed value:	as specified

This property specifies what decorations are added to the text of an element.

none: Produces no text decoration.
underline: Each line of text is underlined.
overline: Each line of text has a line above it.
line-through: Each line of text has a line through the middle.
blink: Text blinks (alternates between visible and invisible). Conforming user agents may simply not blink the text.
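As a rough illustration of the value syntax, the following rules use hypothetical selectors. Only the keywords listed above are taken from this draft.

```css
/* A single decoration. */
a:link { text-decoration: underline; }
del { text-decoration: line-through; }

/* Keywords may be combined in any order. */
.ruled { text-decoration: underline overline; }

/* 'none' cannot be combined with other keywords. */
a.plain { text-decoration: none; }
```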

8.1.2 Line Decoration Color: the 'text-line-color' property

Name:	text-line-color
Value:	<color>
Initial:	currentColor
Applies to:	all elements and generated content
Inherited:	no
Percentages:	N/A
Media:	visual
Computed value:	a color

This property specifies the line color for underline, line-through and overline text decorations applied to the element. The color of the decoration must remain the same across descendants even if descendant elements have different 'color' or 'text-line-color' values.
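The rule that the decoration color stays constant across descendants can be sketched as follows. The selectors are hypothetical; the properties and values come from the definitions in this draft.

```css
/* The underline is red, whatever the color of the text. */
blockquote { color: black; text-decoration: underline; text-line-color: red; }

/* The emphasized text turns blue, but the underline drawn by the
   blockquote stays red across this descendant. */
blockquote em { color: blue; }
```

If 'text-line-color' were left at its initial value, the underline would take the blockquote's own 'color' (black) and would still not change under the em element.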

8.1.3 Line Decoration Style: the 'text-line-style' property

Name:	text-line-style
Value:	[ solid | double | dotted | dashed | dot-dash | dot-dot-dash | wave ] || thick
Initial:	solid
Applies to:	all elements and generated content
Inherited:	no
Percentages:	N/A
Media:	visual
Computed value:	as specified

This property specifies the line style for underline, line-through and overline text decorations applied to the element. The style of the decoration must remain the same across descendants even if descendant elements have different 'text-line-style' values. Values have the following meanings:

solid: Produces a solid line.
double: Produces a double line.
dotted: Produces a dotted line.
dashed: Produces a dashed line.
dot-dash: Produces a line whose repeating pattern is a dot followed by a dash.
dot-dot-dash: Produces a line whose repeating pattern is two dots followed by a dash.
wave: Produces a wavy line.

The thick keyword specifies an underline thickness that is thicker (typically one-third to twice as thick) than the normal style. If it appears without a style keyword, 'solid' is assumed.
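A few sample declarations may help in reading the value syntax. The selectors are hypothetical, and the rules are a sketch based only on the grammar given above.

```css
/* A style keyword on its own. */
ins { text-decoration: underline; text-line-style: double; }
.misspelled { text-decoration: underline; text-line-style: wave; }

/* 'thick' on its own is treated as 'solid thick'. */
.heavy { text-decoration: underline; text-line-style: thick; }

/* 'thick' combined with a style keyword, in either order. */
.cut { text-decoration: line-through; text-line-style: dashed thick; }
```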

The following figure demonstrates the appearance of these various line styles.

8.1.4 Line Continuity Mode: the 'text-line-mode' property

Name:	text-line-mode
Value:	continuous | [ skip-spaces || skip-images ]
Initial:	continuous
Applies to:	all elements and generated content
Inherited:	no
Percentages:	N/A
Media:	visual
Computed value:	as specified

This property determines whether text decoration is drawn through white space and replaced content. Values have the following meanings:

continuous: The element's text decoration is drawn through all content as specified above.
skip-spaces: The element's text decoration skips white space in decorated text.
skip-images: The element's text decoration skips descendant replaced elements.
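The values above might be used as in this sketch, where the selectors are hypothetical.

```css
/* Underline the words of a title but not the spaces between them. */
.title { text-decoration: underline; text-line-mode: skip-spaces; }

/* Do not draw the underline through images inside links. */
a:link { text-decoration: underline; text-line-mode: skip-images; }

/* Both keywords may be given together. */
.caption { text-decoration: underline; text-line-mode: skip-spaces skip-images; }
```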

8.1.5 Underline Position: the 'text-underline-position' property

8.2 Emphasis Marks: the 'text-emphasis' property

Name:	text-emphasis
Value:	none | [ [ accent | dot | circle | disc ] [ before | after ]? ]
Initial:	none
Applies to:	all elements and generated content
Inherited:	yes
Percentages:	N/A
Media:	visual
Computed value:	as specified

East Asian documents use small symbols on top of each glyph to emphasize a run of text. For example:

[Figure: Accent emphasis (shown in blue for clarity) applied to Japanese text]

This property applies emphasis marks to text. Unlike 'text-decoration', emphasis marks can affect the line height. Values have the following meanings:

none: No emphasis marks.
accent: Draw calligraphic accent strokes as marks.
dot: Draw calligraphic dots as marks.
circle: Draw hollow circles as marks.
disc: Draw filled circles as marks.
before: Draw marks above the text in horizontal layout, to the right in vertical layout. This is the default position.
after: Draw marks below the text in horizontal layout, to the left in vertical layout.

The list of shapes here is copied from the CSS3 Fonts module drafts, and it is not correct or at least not complete. Any input on what shapes are needed, what usage patterns are found in real texts, etc. would be much appreciated. Send them off to www-style@w3.org or www-international@w3.org with [CSS3 Text] in the subject line.

The preferred position of emphasis marks depends on the language. In Japanese for example, the preferred position is 'before'. In Chinese used in the PRC, on the other hand, the preferred position is 'after'. The informative table below summarizes the preferred emphasis mark position for Chinese and Japanese:

Preferred emphasis mark and ruby position
Language	Preferred mark position	Illustration
Japanese	before
Chinese (Traditional)	before
Chinese (Simplified)	after
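One way an author might follow the preferred positions in the table is sketched here. The use of the ':lang()' selector, the language tags, and the choice of mark shapes are assumptions made for illustration; only the 'before' and 'after' preferences come from the table.

```css
/* With no position keyword, marks are drawn in the 'before' position. */
em { text-emphasis: accent; }

/* Japanese and Traditional Chinese: marks before (above in horizontal text). */
:lang(ja) em { text-emphasis: accent before; }
:lang(zh-Hant) em { text-emphasis: dot before; }

/* Simplified Chinese: marks after (below in horizontal text). */
:lang(zh-Hans) em { text-emphasis: dot after; }
```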

8.3 Text Shadows: the 'text-shadow' property

Name:	text-shadow
Value:	none | [ <shadow>, ]* <shadow>
Initial:	none
Applies to:	all elements and generated content
Inherited:	yes
Percentages:	N/A
Media:	visual
Computed value:	a color plus three absolute <length>s

This property accepts a comma-separated list of shadow effects to be applied to the text of the element. <shadow> is defined as [ <color>? <length> <length> <length>? | <length> <length> <length>? <color>? ], where the first two lengths represent the offset and the third an optional blur radius. The shadow is applied to all of the element's text as well as any text decoration applied to it. When a text outline is specified, the shadow shadows the outlined shape rather than the glyph shape.

Would it be better to apply shadows together with text decoration? That is, a descendant of an underlined element would not apply its shadow to the underline, but the underlining element, if it has shadows, would apply them to the underline of all the text it underlines.

The shadow offset is specified with two <length> values that indicate an offset from direct alignment with the text. The first length value specifies the horizontal distance to the right of the text. A negative horizontal length value places the shadow to the left of the text. The second length value specifies the vertical distance below the text. A negative vertical length value places the shadow above the text.

A blur radius may optionally be specified after the shadow offset. The blur radius is a length value that indicates the boundaries of the blur effect. The exact algorithm for computing the blur effect is not specified. If the blur radius is not specified, it is equal to zero.

A color value may optionally be specified before or after the length values of the shadow effect. The color value will be used as the color of the shadow effect. If the color is not specified, a UA-chosen color will be used.
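The offset, blur, and color rules above can be read off a few sample declarations. This is a sketch with hypothetical selectors; the values follow the <shadow> syntax given in this section.

```css
/* Offsets only: 0.1em to the right and 0.1em below,
   blur radius zero, UA-chosen color. */
h1 { text-shadow: 0.1em 0.1em; }

/* 3px right, 3px below, 5px blur radius, red. */
h2 { text-shadow: 3px 3px 5px red; }

/* The color may come first. Negative offsets place the shadow
   to the left of and above the text. */
h3 { text-shadow: gray -2px -2px; }

/* Several shadow effects as a comma-separated list. */
h4 { text-shadow: 1px 1px 2px black, 0 0 1em blue; }
```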

The shadow effects are applied in the order specified and may thus overlay each other, but they will never overlay the text itself. Shadow effects do not alter the size of a box, but may extend beyond its boundaries. The stack level of the shadow effects is the same as for the element itself.

Should the order be changed so that shadows layer the same way multiple backgrounds do (earlier on top)?

Does this definition of the stack level cause problems with the shadow of one element painting over the text of the previous element? How would we solve that?

Text shadows may be used with the :first-letter and :first-line pseudo-elements.

8.4 Text Outlines: the 'text-outline' property

Name:	text-outline
Value:	none | [ <color> <length> <length>? | <length> <length>? <color> ]
Initial:	none
Applies to:	all elements and generated content
Inherited:	yes
Percentages:	N/A
Media:	visual
Computed value:	a color plus two absolute <length>s

This property specifies a text outline where the first length represents the outline's thickness and the second represents an optional blur radius. The outline never overlays the text itself. Its effect is the same as that obtained by applying text shadows in every radial direction, i.e. all text shadows whose offsets satisfy the equation x² + y² = thickness².

The Timed-Text WG had suggestions for some keywords (text-outline: normal|heavy|light;) as well as a <length> thickness. Should these be added? How would they be defined? (Maybe use (thin|medium|thick) as in border-width?)

The blur radius is a length value that indicates the boundaries of the blur effect. The exact algorithm for computing the blur effect is not specified, but it is only applied to the outer edge of the outline. If the blur radius is not specified, it is equal to zero. Implementations may choose to ignore the blur radius when text outline is combined with a text shadow.

Is a second blur radius needed for the inner edge? Or should the blur apply to both edges?

A color value must be specified before or after the length values of the outline effect. The color value will be used as the color of the outline.
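The following declarations sketch the two orderings allowed by the value syntax. The selectors are hypothetical.

```css
/* Thickness 2px, no blur radius (treated as zero), black outline. */
h1 { text-outline: 2px black; }

/* Color first: thickness 1px, blur radius 3px, blue outline. */
h2 { text-outline: blue 1px 3px; }

/* No outline. */
p { text-outline: none; }
```

Unlike 'text-shadow', the color is required, so a declaration such as 'text-outline: 2px' would not match the syntax above.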

Text outlines may be used with the :first-letter and :first-line pseudo-elements.

9. Edge Effects

9.1 First Line Indentation: the 'text-indent' property

Name:	text-indent
Value:	[ <length> | <percentage> ] hanging?
Initial:	0
Applies to:	block-level, inline-block elements and table cells
Inherited:	yes
Percentages:	refers to width of containing block
Media:	visual
Computed value:	the percentage as specified or the absolute length

This property specifies the indentation applied to lines of inline content in a block. The indentation only affects the first line of inline content in the block unless the 'hanging' keyword is specified, in which case it affects all lines except the first.

The indent is treated as a margin applied to the start edge of the line box. The amount of indentation is given by the length or percentage value. Percentages are relative to the containing block, even in the presence of floats. They are inherited as percentages, not as absolute lengths.
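The forms allowed by the value syntax can be sketched as follows. The selectors are hypothetical.

```css
/* Indent the first line of each paragraph by 2em. */
p { text-indent: 2em; }

/* A percentage refers to the width of the containing block
   and is inherited as a percentage. */
.intro { text-indent: 10%; }

/* With 'hanging', every line except the first is indented. */
dd { text-indent: 3em hanging; }
```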

If 'text-align' is 'start' and 'text-indent' is '5em' in left-to-right text with no floats present, then the first line of text will start 5em into the block:

     Since CSS1 it has been possible
to indent the first line of a block
element using the 'text-indent'
property.

The original proposal for this syntax included a dependence on alignment, written as follows and with the following extra example:

The indent is treated as a margin applied to the alignment edge of the line box. The alignment edge depends on the value of 'text-align'. It is:

the start edge of the line box if 'text-align' is a string value, 'start', or 'justify'.
the end edge of the line box if 'text-align' is 'end'.
the left edge of the line box if 'text-align' is 'left'.
the right edge of the line box if 'text-align' is 'right'.

If 'text-align' is 'center', then the indentation is split evenly between both ends of the line box.

Example. The following rule would cause text to be centered in its block, with the first line allowed to fill the block's width but subsequent lines restricted to only half of it:
p { text-align: center; text-indent: 50% hanging; border: dashed; }
 Since CSS1 it has been possible to
          indent the first
          line of a block
         element using the
           'text-indent'
             property.

None of the web browsers tested at the time (WinIE6, Nav7 and Opera7) displayed this behavior, so Michel decided to keep the indentation on the start edge always. However, perhaps it would be useful to adopt the centered indentation for text-align: center; a start-edge indentation is pretty useless on centered text (it just looks weird), and I don't imagine many authors would be relying on that behavior.

Since the 'text-indent' property inherits, when specified on a block element it will also affect descendant inline-block elements. For this reason, it is often wise to specify 'text-indent: 0' on elements that have 'display: inline-block'.
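The advice above might look like this in a style sheet. The class name is hypothetical.

```css
/* Paragraphs get a first-line indent, which inherits. */
p { text-indent: 2em; }

/* Without the reset, the first line inside each inline-block
   would also be indented by the inherited 2em. */
p .badge { display: inline-block; text-indent: 0; }
```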

9.2 Hanging Punctuation: the 'hanging-punctuation' property

Name:	hanging-punctuation
Value:	none | [ start || end || end-edge ]
Initial:	none
Applies to:	block-level, inline-block elements and table cells
Inherited:	yes
Percentages:	N/A
Media:	visual
Computed value:	as specified

This property determines whether a punctuation mark, if one is present, may be placed outside the line box at the start or at the end of a full line of text.

If a justified line can fit the punctuation, will it expand to push it outside the content area? No. What if the line ends in multiple punctuation marks? Which punctuation marks are affected?

Values have the following meanings:

start: Punctuation may hang outside the start edge of the first line.
end: Punctuation may hang outside the end edge of the last line.
end-edge: Punctuation may hang outside the end edge of all lines.
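A brief sketch of the value syntax follows. The selectors are hypothetical, and since the description of this property is still unsettled, the comments only restate the value definitions above.

```css
/* An opening quotation mark may hang outside the start edge of the first line. */
blockquote { hanging-punctuation: start; }

/* Punctuation may hang outside the end edge of every line. */
p { hanging-punctuation: end-edge; }

/* Keywords may be combined. */
.verse { hanging-punctuation: start end; }
```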

Need to work on the description. Cover indentation as well.

To be continued...

10. Changes from the May 2003 CSS3 Text CR

Much of the text has been rewritten or heavily revised, so not all changes are listed here. Highlights include:

The 'line-break' and 'word-break-cjk' properties have been replaced by the 'word-break' property.
The 'word-break-inside' property has been replaced by the 'hyphenate' property.
The 'wrap-option' property has been replaced by the 'text-wrap' and 'word-break' properties.
The 'linefeed-treatment', 'white-space-treatment', and 'all-space-treatment' properties have been replaced by the 'white-space-collapse' property.
The 'min-font-size' and 'max-font-size' properties have been delegated to the next revision of the CSS3 Fonts module.
The 'size' value has been moved from the 'text-align-last' property to the 'text-justify' property.
The 'distribute' value for 'text-justify' has been renamed to 'inter-character'.
The 'newspaper' value for 'text-justify' has been dropped in favor of using minimum and maximum limits set on the 'word-spacing' and 'letter-spacing' properties to guide justification.
The 'word-spacing' and 'letter-spacing' properties now take percentage values.
The new 'text-wrap' property's 'suppress' value allows authors to suppress text wrapping within an element with respect to its surrounding text without forbidding correctly-restricted breaks when they are needed.
The automatic line break conversions specified in the white space processing rules have been changed to try not to break existing East and Southeast Asian content. Explicit control of these rules through the 'linefeed-treatment' property has been removed because style sheets should not be expected to adapt to the source code formatting style.
The 'size' value for text-align-last has been removed.
Text justification has been much more explicitly specified with prioritized flex points.
The 'tibetan' value for 'text-justify' has been added to specify traditional Tibetan justification (which is still in use in modern books and newspapers) and details have been added for handling Tibetan text in other justification schemes.
The 'text-shadow' property now inherits, which makes more sense and is consistent with recent implementation. The default color is now UA-specified, which is more reasonable than defaulting to the current text color. The definition now also mentions that it applies to text decoration and makes a few minor clarifications.
The 'text-outline' property has been added in response to feedback from the Timed-Text WG and Ada Chan. Its design derives from a discussion of requirements within the CSS Working Group.
The definition for 'text-indent' has been replaced with the precise text sent in by Ian Hickson and fantasai on 8 March 2003 (minus the 'text-align' dependence rules).

Many sections intended for this module are not yet represented in this draft. In particular, the 'text-justify-trim', 'text-overflow', 'text-decoration', 'text-transformation', 'punctuation-trim', 'text-autospace', 'text-shadow', 'hanging-punctuation', 'kerning-mode', and related properties have not yet been evaluated.

Sections relating to text layout (vertical text, grids, 'text-combine') will be moved to a separate Text Layout module. These features may change greatly from the last revision, but they have not been dropped. The vertical text feature, for example, will likely be based on the methods described in Unicode Technical Note #22.

11. Changes from the June 2005 CSS3 Text WD

Update references
Fix syntax definition for <spacing-limit> and move min/max wording
Allow and specify letter-spacing expansion for cursive scripts
Recommend that 'normal' min/max limits for letter-spacing and word-spacing vary based on the line length; also allow them to change based on text-justify.
Re-add definition for 'white-space' shorthand property
Add Tibetan justification
Remove 'size' justification value
Improve justification text
Add back definition for 'text-indent', with more precise (and concise) text
Add back definition for 'text-shadow', make it inherit and let UA pick default color.
Add 'text-outline' property.

12. Acknowledgements

This specification would not have been possible without the help from: Ayman Aldahleh, Bert Bos, Tantek Çelik, Stephen Deach, Martin Dürst, Laurie Anna Edlund, Ben Errez, Yaniv Feinberg, Arye Gittelman, Ian Hickson, Martin Heijdra, Richard Ishida, Koji Ishii, Masayasu Ishikawa, Michael Jochimsen, Eric LeVine, Chris Lilley, Paul Nelson, Chris Pratley, Martin Sawicki, Rahul Sonnad, Frank Tang, Chris Thrasher, Etan Wexler, Chris Wilson, Masafumi Yabe and Steve Zilles.

13. References

13.1. Normative References

[CSS21]: Bert Bos; et al. Cascading Style Sheets, level 2 revision 1. 25 February 2004. W3C Candidate Recommendation. (Work in progress.) URL: http://www.w3.org/TR/2004/CR-CSS21-20040225
[RFC2119]: S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. Internet RFC 2119. URL: http://www.ietf.org/rfc/rfc2119.txt
[UAX11]: Asmus Freytag. East Asian Width. 28 March 2005. Unicode Standard Annex #11. URL: http://www.unicode.org/unicode/reports/tr11/tr11-14.html
[UAX24]: Mark Davis. Script Names. 28 March 2005. Unicode Standard Annex #24. URL: http://www.unicode.org/unicode/reports/tr24/tr24-7.html
[UAX29]: Mark Davis. Text Boundaries. 25 March 2005. Unicode Standard Annex #29. URL: http://www.unicode.org/unicode/reports/tr24/tr24-7.html

13.2. Informative References

[标点符号]: 标点符号用法 (Punctuation Mark Usage). 中华人民共和国国家标准. 1995.
[JIS4051]: JIS X 4051:2004. Formatting rules for Japanese documents. (『日本語文書の組版方法』) Japanese Standards Association. 2004.
[UAX14]: Asmus Freytag. Line Breaking Properties. 29 March 2005. Unicode Standard Annex #14. URL: http://www.unicode.org/unicode/reports/tr14/tr14-17.html