Some of the articles are brand new; others were originally part of a tutorial but have been updated and expanded to bring HTML5 to the fore and to incorporate feedback from readers. The articles are:
- Character encodings: Essential concepts
- Choosing & applying a character encoding
- Declaring character encodings in HTML
- The byte-order mark (BOM) in HTML
- Normalization in HTML and CSS
- Characters or markup?
Together with several existing articles that were updated at the same time, these articles provide practical advice to content authors on how to handle character encodings in HTML and CSS. [search keys: article-definitions-characters qa-choosing-encodings qa-html-encoding-declarations qa-byte-order-mark qa-html-css-normalization qa-chars-vs-markup]
Numerous changes were made to this article to address feedback and to incorporate material on CSS escapes from the character encoding tutorial. This and other changes are described below.
German, Spanish, Brazilian Portuguese, and Iberian Portuguese translators should consider updating their translations.
The article Who uses Unicode? was rewritten to reflect the fact that Unicode-encoded web pages now account for over 50% of the Web, as determined by Google.
Spanish, Polish, and Brazilian Portuguese translators should consider retranslating the article. [search keys: qa-who-uses-unicode]
Answers the question: Should I use b and i elements?
The HTML5 specification redefines the b and i elements to have some semantic function, rather than a purely presentational one. However, the simple fact that the tag names are 'b' for bold and 'i' for italic means that people are likely to continue using them as a quick presentational fix.
This article explains why that can be problematic for localization (and indeed for restyling of pages in a single language), and echoes the advice in the specification intended to address those issues.
By Richard Ishida, W3C. [search key: qa-b-and-i-tags]
FAQ-based article: Which language tag is right for me? How do I choose language and other subtags?
Following the publication of RFC 5646 earlier this year (replacing RFC 4646 as part of BCP 47), the IANA Subtag Registry now contains almost 8,000 subtags, and the list of subtag types was increased with the introduction of extended language subtags. This article tries to simplify the choice of an appropriate language tag for your needs by outlining the necessary decisions in a step-wise fashion.
By Richard Ishida, W3C. [search key: qa-choosing-language-tags]
This tutorial was updated to incorporate changes made to BCP 47 by the recent publication of RFC 5646. Changes to BCP 47 include the introduction of extended language subtags, and the addition of ISO 639-3 language subtags, bringing the total number of subtags in the registry to almost 8,000.
Translators should consider retranslating the whole tutorial. [search keys: article-language-tags]
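As a rough illustration of the subtag types mentioned in the two language-tag items above (language, extended language, script, region, variant), the following Python sketch splits a tag into its parts. The function name `split_language_tag` and the simplified pattern are assumptions made for this example; it checks only the shape of a tag, does not consult the IANA Subtag Registry, and ignores extensions, private-use subtags, and grandfathered tags.

```python
import re

# Simplified shape of a BCP 47 language tag (an assumption for illustration):
# language, then optional extended language, script, region, and variants.
TAG_PATTERN = re.compile(
    r"^(?P<language>[a-z]{2,3})"
    r"(?:-(?P<extlang>[a-z]{3}))?"
    r"(?:-(?P<script>[a-z]{4}))?"
    r"(?:-(?P<region>[a-z]{2}|[0-9]{3}))?"
    r"(?P<variants>(?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*)$",
    re.IGNORECASE,
)


def split_language_tag(tag):
    """Return the subtags of `tag` by type; absent subtags are None."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not a recognised tag shape: {tag!r}")
    parts = match.groupdict()
    # The variants group looks like "-rozaj-biske"; turn it into a list.
    parts["variants"] = [v for v in parts["variants"].split("-") if v]
    return parts


print(split_language_tag("zh-yue-Hant-HK"))
print(split_language_tag("es-419"))
print(split_language_tag("sl-rozaj"))
```

Because the sketch only looks at the length and character class of each subtag, a well-shaped but unregistered subtag is accepted; choosing a tag that is actually valid still means checking the registry, as the article describes.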
On 15th September, the Internationalization Core Working Group published Requirements for String Identity Matching and String Indexing as a Working Group Note.
This document is being published as a Working Group Note in order to capture and preserve historical information. It contains requirements elaborated in 1998 for aspects of the character model for W3C specifications. It was developed and extensively reviewed by the Internationalization Working Group, but never progressed beyond Working Draft status. For this publication, the wording of the 1998 version remains unchanged (except for the correction of a small number of typographic errors), but the links to references have been updated.
The document describes requirements for some important aspects of the character model for W3C specifications. The two aspects discussed are string identity matching and string indexing.
Editor: Martin Dürst. [search keys: tr-charreq]
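To give a feel for why string identity matching and string indexing need requirements at all, here is a minimal Python sketch. It is not taken from the Note: the helper name `identical_after_nfc` and the choice of Unicode Normalization Form C are assumptions made for this example.

```python
import unicodedata

# Two ways to write the same visible text "é":
precomposed = "\u00e9"   # one code point: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # two code points: "e" + COMBINING ACUTE ACCENT


def identical_after_nfc(a, b):
    """Compare two strings after normalizing both to NFC (an assumed policy)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A naive code-point comparison treats the two strings as different.
print(precomposed == decomposed)

# After normalization they compare as identical.
print(identical_after_nfc(precomposed, decomposed))

# Indexing is affected too: the same visible text has different lengths,
# so an index into one form does not point at the same place in the other.
print(len(precomposed), len(decomposed))
```

Python counts code points here; a system that indexes by bytes or by UTF-16 code units would report different lengths again, which is the kind of ambiguity such requirements aim to remove.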