This snapshot, taken on 27/09/2010, shows web content acquired for preservation by The National Archives. External links, forms and search may not work in archived websites and contact details are likely to be out of date.
 
 

W3C

Web Architecture

Web Architecture focuses on the foundation technologies and principles which sustain the Web, including URIs and HTTP.

Architecture Principles

Web Architecture principles help to design technologies by providing guidance and articulating the issues around some specific choices.

Identifiers

We share things by their names. URLs, URIs, and IRIs are the way to name things on the Web and manipulate them. Additional addressing needs in the Web Services stack motivated some additional layers.

Protocols

Protocols are the vehicle for exchanging our ideas. HTTP is the core protocol of the Web. W3C is also working on XML Protocols and SOAP in relation to Web Services.

Meta Formats

XML, the Extensible Markup Language, is used to build new formats at low cost (due to widely available tools to manipulate content in those new formats). RDF and OWL allow people to define vocabularies (“ontologies”) of terms as part of the Semantic Web.

Protocol and Meta Format Considerations

Documents on the Web are pieces loosely joined by identifiers. This creates a maze of rich interactions between protocols and formats.

Internationalization

W3C has worked with the community on the internationalization of identifiers (IRIs) and a general character model for the Web.

News

Now ready for your review: Last Call Working Draft of WAI-ARIA, the Accessible Rich Internet Applications technical specification for making dynamic, interactive Web content accessible to people with disabilities. Working Drafts of 4 other documents in the WAI-ARIA suite are also updated. See: Call for Review: WAI-ARIA, the Accessible Rich Internet Applications technical specification, WAI-ARIA Overview, How WAI Develops Accessibility Guidelines through the W3C Process: Milestones and Opportunities to Contribute. Please send any comments on this Last Call Working Draft by 29 October 2010.    (2010-09-16)

The MultilingualWeb project, funded by the European Commission and coordinated by the W3C, is looking at best practices and standards related to all aspects of creating, localizing and deploying the multilingual Web. The project will raise visibility of what's available and identify gaps via a series of four events, over two years.

The first workshop takes place in Madrid on 26-27 October 2010.

Many interesting speaker proposals have already been submitted, and the program committee has also now confirmed lead speakers for each of the main workshop sessions from the following organizations:

Internationalizers: W3C
Creators: BBC World Service
Localizers: SAP
Users: Facebook
Machines: DFKI
Policy makers: Localisation Research Centre

See the Call for Participation for details about how to register for the workshop.

In particular, if you wish to speak at this event, and haven't yet submitted a proposal, please send an expression of interest (see the CFP) by September 17th.

The article Who uses Unicode? was rewritten to reflect the fact that Unicode-encoded web pages now account for over 50% of the Web, as determined by Google.

Spanish, Polish, and Brazilian Portuguese translators should consider retranslating the article. [search keys: qa-who-uses-unicode]


For those of you interested in deploying RDF on the Web, I'd like to draw your attention to three new proposed standards from IETF, "Web Linking", "Defining Well-Known URIs", and "Web Host Metadata", that create new follow-your-nose tricks that could be used by semantic web clients to obtain RDF connected to a URI - RDF that presumably defines what the URI 'means' and/or describes the thing that the URI is supposed to refer to.

Most semantic web application developers are probably familiar with three ways to nose-follow from a URI:

  1. For # URIs - for X#F, the document X tells you about <X#F>
  2. When the response to GET X is a 303 - the redirect target tells you about <X>
  3. When the response to GET X is a 200 - the content may tell you about <X>
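
As a rough sketch of the three cases above, the following Python function picks the document a client would read for RDF about a URI, given the status code and Location header of a GET that has already been performed. The function name and its arguments are illustrative assumptions, not part of any specification, and no network access takes place.

```python
from urllib.parse import urldefrag, urljoin


def describing_document(uri, status=None, location=None):
    """Return the URI of the document expected to describe `uri`, or None.

    `status` and `location` are the status code and Location header of a
    GET on `uri` (hypothetical inputs supplied by the caller).
    """
    base, fragment = urldefrag(uri)
    if fragment:
        # Case 1: for X#F, the document X tells you about <X#F>.
        return base
    if status == 303 and location:
        # Case 2: the redirect target tells you about <X>.
        return urljoin(uri, location)
    if status == 200:
        # Case 3: the content at X itself may tell you about <X>.
        return uri
    return None
```

For a hash URI the answer is known before any request is made, which is why the function checks the fragment first.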

In case 3, X refers to what I'll call a "web page" (a more technical term is used in the TAG's httpRange-14 resolution). One of the new RFCs extends case 3 to situations where the RDF can't be embedded in the content, either because the content-type doesn't provide a place to put it (e.g. text/plain) or because for administrative reasons the content can't be modified to include it (e.g. a web archive that has to deliver the original bytes faithfully). The others cover this case as well as offering improved performance in case 2.

Web pages as RDF subjects

Before getting into the new nose-following protocols, I'll amplify case 3 above by listing a few applications of RDF in which a web page occurs as a subject. I'll rather imprecisely call such RDF "metadata".

  1. Bibliographic metadata - tools such as Zotero might be interested in obtaining Dublin Core, BIBO, or other citation data for the web page.
  2. Stability metadata - for annotation and archiving purposes it may be useful to know whether the page's content is committed to be stable over time (e.g. "this has changing content" versus "this has unchanging content"). See TimBL's Generic Resources note.
  3. Historical and archival metadata - it is useful to have links to other versions of a document - including future versions.

All sorts of other statements can be made about a web page, such as a type (wiki page, blog post, etc.), SKOS concepts, links to comments and reviews, duration of a recording, how to edit, who controls it administratively, etc. Anything you might want to say about a web page can be said in RDF.

Embedded metadata is easy to deploy and to access, and should be used when possible. But while embedded metadata has the advantage of traveling around with the content, a protocol that allows the server responsible for the URI to provide metadata over a separate "channel" has two advantages over embedded metadata: First, the metadata doesn't have to be put into the content; and second, it doesn't have to be parsed out of the content. And it's not either/or: There is no reason not to provide metadata through both channels when possible.

Link: header

The 'Web Linking' proposed standard defines the HTTP Link: header, which provides a way to communicate links rooted at the requested resource. These links can either encode interesting information directly in the HTTP response, or provide a link to a document that packages metadata relevant to the resource.

In the former case, one might have:

Link: <http://xmlns.com/foaf/0.1/Document>;
  rel="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

meaning that the request URI refers to something of type foaf:Document. In the latter case one might have:

Link: <http://example.com/about/foo.rdf>;
  rel="describedby"; type=application/rdf+xml

meaning that metadata can be found in <http://example.com/about/foo.rdf>, and hinting that the latter resource might have a 'representation' with media type application/rdf+xml.
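
To make the two Link: examples concrete, here is a minimal Python sketch of how a client might pull the target URI and parameters out of a Link: header value. It is a simplified reading of the header syntax that handles only the simple `name=value` and `name="value"` parameter forms shown in the examples; the function name `parse_link_header` is an assumption made for illustration.

```python
import re

# One link: <target> followed by zero or more ;name=value parameters.
_LINK = re.compile(r'<([^>]*)>((?:\s*;\s*[\w*-]+\s*=\s*(?:"[^"]*"|[^;,\s]+))*)')
# One parameter: ;name="quoted value" or ;name=bare-value
_PARAM = re.compile(r';\s*([\w*-]+)\s*=\s*(?:"([^"]*)"|([^;,\s]+))')


def parse_link_header(value):
    """Return a list of (target, params) pairs from a Link: header value."""
    links = []
    for target, params in _LINK.findall(value):
        attrs = {}
        for name, quoted, bare in _PARAM.findall(params):
            # Keep the first occurrence of each parameter name.
            attrs.setdefault(name.lower(), quoted or bare)
        links.append((target, attrs))
    return links


header = '<http://example.com/about/foo.rdf>;\n  rel="describedby"; type=application/rdf+xml'
links = parse_link_header(header)
described_by = [t for t, p in links if p.get("rel") == "describedby"]
```

A client looking for metadata would then fetch each URI in `described_by`, treating the `type` parameter only as a hint.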

Host-wide nose-following rules

The motivation for the "well-known URIs" RFC is to collect all "well-known URIs" (analogous to "robots.txt") in a single place, a root-level ".well-known" directory, and create a registry of them to avoid collisions. The most pressing need comes from protocols such as webfinger and OpenID; see Eran Hammer-Lahav's blog post for the whole story.

For linked data, .well-known provides an opportunity for providing metadata for web pages, as well as improving the efficiency of obtaining RDF associated with other "slash URIs", which is currently done using 303 responses.

Ever since the TAG's httpRange-14 decision in 2005, there have been concerns that it takes two round trips to collect RDF associated with a slash URI. While some might question why those complaining aren't using hash URIs, in any case the "well-known URIs" mechanism gives a way to reduce the number of round trips in many cases, eliminating many GET/303 exchanges.

The trick is to obtain, for each host, a generic rule that will transform the URI at that host that you want RDF for into the URI of a document that carries that RDF. This generic rule is stored in a file residing in the .well-known space at a path that is fixed across all hosts. That is: to find RDF for http://example.com/foo, follow these steps:

  1. obtain the host name, "example.com"
  2. form the URI with that host name and path "/.well-known/host-meta", i.e. "http://example.com/.well-known/host-meta" (see here)
  3. if not already cached, fetch the document at that URI
  4. in that document find a rule generically transforming original-URI -> about-URI
  5. apply the rule to "http://example.com/foo" obtaining (say) "http://example.com/about/foo"
  6. find RDF about "http://example.com/foo" in document "http://example.com/about/foo"

The form of the about-URI is chosen by the particular host, e.g. "http://example.com/foo,about" or "http://about.example.com/foo" or whatever works best.
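
Steps 1, 2 and 5 above can be sketched in Python as follows. The rule is represented here as a plain string with a `{path}` placeholder; that representation is purely an assumption made for illustration and is not the syntax of the host-meta document, and fetching and parsing that document (steps 3 and 4) is left out.

```python
from urllib.parse import urlsplit


def host_meta_uri(uri):
    """Steps 1-2: build the fixed host-meta location for the URI's host."""
    parts = urlsplit(uri)
    return f"{parts.scheme}://{parts.netloc}/.well-known/host-meta"


def apply_rule(rule, uri):
    """Step 5: transform the original URI into the about-URI.

    `rule` is a hypothetical template in which "{path}" stands for the
    path of the original URI; each host chooses its own rule.
    """
    return rule.replace("{path}", urlsplit(uri).path)


original = "http://example.com/foo"
meta = host_meta_uri(original)
about = apply_rule("http://example.com/about{path}", original)
```

Under the same assumption, the other forms mentioned above would be written as the rules `http://example.com{path},about` and `http://about.example.com{path}`.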

Why is this fewer round trips than using 303? Because you can fetch and cache the generic rule once per site. The first use of the rule still costs an extra round trip, but subsequent URIs for a given site can be nose-followed without any extra web accesses.
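
The arithmetic can be spelled out with a small Python sketch. It assumes all URIs are on the same host, that the client starts with an empty cache, and that the generic rule stays cached for the whole session; the function names are illustrative only.

```python
def round_trips_with_303(n_uris):
    """Each URI costs a GET answered by 303, then a GET of the redirect target."""
    return 2 * n_uris


def round_trips_with_host_rule(n_uris):
    """One fetch of the host's generic rule, then one GET per about-URI."""
    return 1 + n_uris if n_uris else 0


# Compare the two approaches for 1 and for 10 URIs on the same host.
comparison = {n: (round_trips_with_303(n), round_trips_with_host_rule(n)) for n in (1, 10)}
```

For a single URI the two approaches cost the same; the saving only appears from the second URI onward.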

A worked example can be found here.

Next steps

As with any new protocol, figuring out exactly how to apply the new proposed standards will require coordination and consensus-building. For example, the choice of the "describedby" link relation and "host-meta" well-known URI needs to be confirmed for linked data, and agreement reached on whether multiple Link: headers are in good taste or poor taste. (Link: and .well-known put interesting content in a peculiarly obscure place and it might be a good idea to limit their use.) Consideration should be given to Larry Masinter's suggestion to use multiple relations reflecting different attitudes the server might have regarding the various metadata sources: For example the server may choose to announce that it wants the Link: metadata to override any embedded metadata, or vice versa. Agreement should be reached on the use of Link: and host-meta with redirects (302 and so on) - personally I think it would be a great thing as you could then use a value-added forwarding service to provide metadata that the target host doesn't or can't provide.

This is not a particularly heavy coordination burden; the design odds-and-ends and implementations are all simple. The impetus might come from inside W3C (e.g. via SWIG) or bottom-up. All we really need to get this going are a bit of community discussion, a server, and a cooperating client, and if the protocols actually fill a need, they will take off.

For past TAG work on this topic, please see TAG issue 62 and the "Uniform Access to Metadata" memo.

26-27 October 2010, Madrid. Hosted by the Universidad Politécnica de Madrid.

The MultilingualWeb project is looking at best practices and standards related to all aspects of creating, localizing and deploying the Web multilingually. The project aims to raise the visibility of existing best practices and standards and identify gaps. The core vehicle for this is a series of four events which are planned for the coming two years.

As the first of the four events, this workshop will survey currently available best practices and standards aimed at helping content creators, localizers, tools developers, and others meet the challenges of the multilingual Web.

Participation is free. We welcome participation from both speakers and non-speaking attendees. For more information, see the Call for Participation.

In addition to providing the basis for language identification on the Web, BCP 47 language tags are also used to control language and culturally specific APIs on many systems. Based on work done by the Unicode Consortium, the proposed Language Tag Extension 'U' provides additional subtags that can be used to refine locale-based details such as calendar, sort order, and other locale details.
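
As an illustration of how such subtags might be read, the Python sketch below extracts the keywords that follow the 'u' singleton in a language tag. It assumes a layout of two-letter keys each followed by one or more longer type subtags, which is a simplified reading of the proposed extension; the function name and the example tags are chosen for illustration.

```python
def unicode_locale_keywords(tag):
    """Return {key: type} for the subtags after the 'u' singleton, if any."""
    subtags = tag.lower().split("-")
    if "u" not in subtags:
        return {}
    keywords = {}
    key = None
    for subtag in subtags[subtags.index("u") + 1:]:
        if len(subtag) == 1:
            break  # another singleton starts a different extension
        if len(subtag) == 2:
            key = subtag  # assumed: two-letter subtags are keys
            keywords[key] = []
        elif key is not None:
            keywords[key].append(subtag)  # longer subtags belong to the key
    return {k: "-".join(v) for k, v in keywords.items()}


sort_order = unicode_locale_keywords("de-DE-u-co-phonebk")
calendar = unicode_locale_keywords("ja-JP-u-ca-japanese")
```

Here `co` is taken to select a sort order and `ca` a calendar, in line with the refinements the paragraph above describes.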

More information on Unicode Locales is available at the Unicode CLDR website or in UTS #35, LDML.

WAI announces a Call for Review of draft updates to supporting documents for WCAG 2.0: Techniques for WCAG 2.0 (Editors' Draft) and Understanding WCAG 2.0 (Editors' Draft). (This is not an update to WCAG 2.0, which is a stable document.) To learn more about the updates, see Call for Review: WCAG 2.0 Techniques Draft Updates e-mail. Please submit comments by 9 August 2010.    (2010-07-08)

If you develop web authoring tools (content management systems, HTML editors, websites that let users add content, and more), now is the time to take a good look at the ATAG 2.0 Working Draft. It's in Last Call Working Draft stage, and we need you to use it in developing your tools and let us know how it works for you. People with disabilities and accessibility specialists are also encouraged to review it now. See: Call for Review: ATAG *Last Call* Working Draft, ATAG Overview, How WAI Develops Accessibility Guidelines through the W3C Process: Milestones and Opportunities to Contribute. Please send any comments on this Last Call Working Draft by 2 September 2010. Thanks!    (2010-07-08)

This checker performs various tests on a Web Page to determine its level of internationalisation-friendliness. It also lists key internationalization settings related to character encoding, language declarations, text direction and class/id names. This information includes HTTP headers, which can be particularly useful for troubleshooting problems.

The checker is still only a prototype, so there are guaranteed to be bugs and missing features. It will slowly improve over the coming months, but it has been made available for use now since it is likely to be helpful to many people already.

Use the checker

[search key:  tools-i18n-checker]

WAI has published updated Working Drafts of User Agent Accessibility Guidelines (UAAG) 2.0 and the Implementing UAAG 2.0 supporting Note. UAAG defines how browsers, media players, and other "user agents" should support accessibility for people with disabilities and work with assistive technologies. WAI encourages you to review UAAG 2.0 and submit comments now, as the Working Group is preparing for Last Call. See: Call for Review: UAAG 2.0 and Implementing UAAG 2.0 Working Drafts e-mail, User Agent Accessibility Guidelines (UAAG) Overview, How WAI Develops Accessibility Guidelines through the W3C Process: Milestones and Opportunities to Contribute. Please send comments by 29 July 2010. (2010-06-17)