The technologies covered so far represent the fundamental building blocks of XML. We have seen how they may be applied in a web publishing context and how they are supported by government standards. However, this is just the tip of the iceberg. Since XML, XSL and XML Schema were developed, a whole new set of related technologies has emerged to deal with a different set of problems. This section introduces some of the most important ones.
A web service is an application that exposes its functionality over the internet. This goes beyond the simple data exchange model: a consumer can invoke operations, not just retrieve documents. For example, to retrieve a stock quote, we send the service the code for a company and it returns the current share price. This kind of transactional model is made possible by the family of XML-based technologies called Web Services.
Simple Object Access Protocol (SOAP)
SOAP is an XML vocabulary for invoking web services. It is a packaging protocol that tells us how to structure requests that we send to a service. HTTP is the most common transport protocol for sending and receiving SOAP messages. The following diagram shows the request-response model for SOAP over HTTP:
Figure 4.1: SOAP message exchange over HTTP
SOAP relies on XML Schema Part 2: Datatypes for the definition of basic datatypes. Furthermore, SOAP defines an encoding for arrays, structures and other compound data types for use in the applications being accessed. For more information on SOAP, refer to the W3C specification www.w3.org/TR/soap12/ [External Website]
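As a rough illustration of the packaging role described above, the following Python sketch builds a SOAP 1.2 request for the stock quote example. The envelope namespace is the one defined by SOAP 1.2; the service namespace `http://example.org/stockquote` and the `GetQuote` and `Symbol` element names are hypothetical, since a real service defines its own.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace
QUOTE_NS = "http://example.org/stockquote"            # hypothetical service namespace

# Use readable prefixes when the message is serialised.
ET.register_namespace("env", SOAP_ENV)
ET.register_namespace("q", QUOTE_NS)


def build_quote_request(symbol):
    """Return a SOAP 1.2 envelope (as text) asking for one company's share price."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The Body carries the application-specific request (assumed element names).
    request = ET.SubElement(body, f"{{{QUOTE_NS}}}GetQuote")
    ET.SubElement(request, f"{{{QUOTE_NS}}}Symbol").text = symbol
    return ET.tostring(envelope, encoding="unicode")


print(build_quote_request("ACME"))
```

When sent over HTTP, a message like this would typically be the body of a POST request with the content type `application/soap+xml`, and the service would reply with another envelope containing the price.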
Web Services Description Language (WSDL)
WSDL is an XML vocabulary for describing web services. A service provider typically publishes a WSDL file that tells a service consumer how to invoke the service in question. This is a description of the interface to the underlying application and includes method names, which variables to pass in, which variables are passed out and the associated data types. As with SOAP, WSDL relies on the data model provided by XML Schema to define variables. WSDL 1.1 is a W3C Note www.w3.org/TR/wsdl [External Website] and has been adopted by UK Government in Table 2 of the Technical Standards Catalogue 6.1 www.govtalk.gov.uk/egif/interconnection.asp#table2 [External Website]
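To make the idea of an interface description more concrete, the sketch below reads a small WSDL 1.1 fragment and lists each operation with its input and output variables. The WSDL namespace is the standard one for WSDL 1.1; the `StockQuote` service, its message names and the `describe_operations` helper are hypothetical, and the fragment omits the binding and service elements that a complete WSDL file would contain.

```python
import xml.etree.ElementTree as ET

WSDL_NS = {"wsdl": "http://schemas.xmlsoap.org/wsdl/"}  # WSDL 1.1 namespace

# Hypothetical WSDL 1.1 fragment: messages and one abstract operation only.
WSDL_DOC = """\
<definitions name="StockQuote"
    targetNamespace="http://example.org/stockquote"
    xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:tns="http://example.org/stockquote"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <message name="GetQuoteRequest">
    <part name="symbol" type="xsd:string"/>
  </message>
  <message name="GetQuoteResponse">
    <part name="price" type="xsd:decimal"/>
  </message>
  <portType name="StockQuotePortType">
    <operation name="GetQuote">
      <input message="tns:GetQuoteRequest"/>
      <output message="tns:GetQuoteResponse"/>
    </operation>
  </portType>
</definitions>
"""


def message_parts(root):
    """Map each message name to its list of (variable name, XML Schema type)."""
    return {
        msg.get("name"): [
            (part.get("name"), part.get("type"))
            for part in msg.findall("wsdl:part", WSDL_NS)
        ]
        for msg in root.findall("wsdl:message", WSDL_NS)
    }


def describe_operations(wsdl_text):
    """Return {operation name: {"input": [...], "output": [...]}}."""
    root = ET.fromstring(wsdl_text)
    messages = message_parts(root)
    operations = {}
    for op in root.findall("wsdl:portType/wsdl:operation", WSDL_NS):
        described = {}
        for direction in ("input", "output"):
            # The message attribute is a prefixed name such as "tns:GetQuoteRequest".
            ref = op.find(f"wsdl:{direction}", WSDL_NS).get("message")
            described[direction] = messages[ref.split(":")[-1]]
        operations[op.get("name")] = described
    return operations


print(describe_operations(WSDL_DOC))
```

The types `xsd:string` and `xsd:decimal` come from XML Schema, which is how the fragment reflects the reliance on the XML Schema data model noted above.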
As Web Services expose applications via the internet, the question of security inevitably arises. How can we ensure that consumers are who they say they are and how can we communicate privately with them once authenticated? These questions are addressed by a large number of standards and profiles. The following sub-sections look at the more established efforts.
XML Digital Signature
XML Digital Signature is an XML syntax for digitally signing objects such as XML documents. It defines how to associate a signature, and optionally the key needed to verify it, with a piece of referenced data, and how to identify the algorithms used to create the signature. For more information see XML-Signature Syntax and Processing www.w3.org/TR/xmldsig-core/ [External Website]
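One step in producing an XML signature is computing a digest of the referenced data after it has been put into a canonical form, so that insignificant differences in serialisation do not change the result. The sketch below illustrates only that digest step and does not create a signature. It assumes Python 3.8 or later, whose standard library implements Canonical XML 2.0; signature profiles in practice often name a different canonicalization algorithm. The `digest_value` helper and the order documents are hypothetical.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET


def digest_value(xml_text):
    """Canonicalize the XML, hash it with SHA-256 and return the Base64 digest."""
    canonical = ET.canonicalize(xml_text)  # Canonical XML 2.0 (Python 3.8+)
    raw = hashlib.sha256(canonical.encode("utf-8")).digest()
    return base64.b64encode(raw).decode("ascii")


# Two serialisations of the same hypothetical document: the attribute order
# and quoting style differ, but the information content is identical.
doc_a = '<order id="42" currency="GBP"><total>10.00</total></order>'
doc_b = "<order currency='GBP' id='42'><total>10.00</total></order>"

# A document whose content really has changed.
doc_c = '<order id="42" currency="GBP"><total>99.00</total></order>'

print(digest_value(doc_a) == digest_value(doc_b))
print(digest_value(doc_a) == digest_value(doc_c))
```

The first comparison shows that equivalent serialisations yield the same digest; the second shows that altering the content changes it, which is what allows a verifier to detect tampering.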
XML Encryption
XML Encryption is an XML syntax for encrypting data including all or part of an XML document. It also describes the process for encrypting and decrypting the data. For more information see XML Encryption Syntax and Processing www.w3.org/TR/xmlenc-core/ [External Website]
XML Key Management
The XML Key Management Specification sets out the protocols for registering and distributing public keys and is used in conjunction with XML Digital Signature and XML Encryption. For more information see XML Key Management Specification 2.0 www.w3.org/TR/xkms2/ [External Website]
The Semantic Web is a vision of what the World Wide Web will evolve into: a web of data as opposed to a web of documents. It is seen as an extension of the current Web, geared towards machines being able to understand data, not just display it. XML provides the syntax for the languages of the Semantic Web, which represent data and the rules for reasoning about that data. Attaching meaning to data in this way is often referred to as knowledge representation.
Resource Description Framework (RDF)
XML alone cannot provide semantics. In the Semantic Web, metadata, in the form of RDF, adds a layer of meaning to the underlying structure. RDF is a language for asserting that things have properties with values. Many statements about the real world can be represented as triples in this way. For example, (deciduous trees, keeps leaves, false) or (evergreen trees, keeps leaves, true). The idea is to build up a vast network of triples – (thing, property, value) – where each thing, property and value is identified on the Web via a URL (Uniform Resource Locator). Using this model, the vast majority of data processed by computers can be represented in a standard way and identified uniquely on the Web. For more information on RDF, refer to the RDF/XML Syntax Specification www.w3.org/TR/rdf-syntax-grammar/ [External Website]
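The triple model can be sketched in a few lines of Python. This is plain Python rather than RDF/XML syntax, and the `http://example.org/trees#` namespace, the term names and the `match` helper are all hypothetical; the two triples are the tree examples given above.

```python
EX = "http://example.org/trees#"  # hypothetical namespace for the example terms

# Each statement is a (thing, property, value) triple.
TRIPLES = [
    (EX + "DeciduousTree", EX + "keepsLeaves", False),
    (EX + "EvergreenTree", EX + "keepsLeaves", True),
]


def match(triples, thing=None, prop=None, value=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (thing is None or t[0] == thing)
        and (prop is None or t[1] == prop)
        and (value is None or t[2] == value)
    ]


# Which things keep their leaves?
for thing, _, _ in match(TRIPLES, prop=EX + "keepsLeaves", value=True):
    print(thing)
```

Because every thing and property is a globally unique identifier, triples published by different sources can be merged into one list and queried in the same way.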
Web Ontology Language (OWL)
Even with the layer of metadata provided by RDF, problems arise due to differences in terminology. For example, an address might refer to a physical location, a virtual location or a public speech. If different meanings are identified in different places on the Web, how is a machine to select the appropriate version? The answer lies with ontologies. An ontology is a taxonomy (or classification) of terms and a set of rules that governs the relationships between those terms. For example, a taxonomy of trees might include the following entries:
Figure 4.2: A taxonomy of trees
An ontology gives meaning to this by describing the lines that join the terms in the taxonomy.
Figure 4.3: A tree ontology
From this we can make statements such as ‘a conifer is a type of evergreen’ and ‘deciduous is a type of tree’. In addition, we can infer statements such as ‘a conifer is a tree that keeps its leaves’. This might be obvious to a human being but a machine needs a well-defined ontology to deduce such things.
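The kind of inference described above can be sketched as follows. This is plain Python rather than OWL, and the hierarchy is an assumption reconstructed from the statements in the text (conifer is a type of evergreen, evergreen and deciduous are types of tree); the dictionaries and function names are hypothetical.

```python
# 'is a type of' links, assumed from the statements in the text.
SUBCLASS_OF = {
    "Conifer": "Evergreen",
    "Evergreen": "Tree",
    "Deciduous": "Tree",
}

# Properties asserted directly on a term.
KEEPS_LEAVES = {"Evergreen": True, "Deciduous": False}


def ancestors(term):
    """Return every broader term reached by following 'is a type of' links."""
    chain = []
    while term in SUBCLASS_OF:
        term = SUBCLASS_OF[term]
        chain.append(term)
    return chain


def keeps_leaves(term):
    """Look for the property on the term itself, then on each broader term."""
    for candidate in [term, *ancestors(term)]:
        if candidate in KEEPS_LEAVES:
            return KEEPS_LEAVES[candidate]
    return None  # nothing is known about this term


print(ancestors("Conifer"))
print(keeps_leaves("Conifer"))
```

Nothing states directly that a conifer is a tree or that it keeps its leaves; both facts are derived by following the links, which is the step a machine cannot take without a defined ontology.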
Note: Ontologies can also help to clear up semantic issues around terminology. For example, in Figure 4.3 above, a machine might confuse the concept leaves with pages in a book or the verb to leave. By defining equivalence relationships on concepts, ontologies can help to avoid this confusion.
The standard language for representing ontologies is the Web Ontology Language (OWL). For more information, refer to the OWL Web Ontology Language Reference www.w3.org/TR/owl-ref/ [External Website]
For your assistance – resources
W3C SOAP 1.2 Recommendation
www.w3.org/TR/soap12/ [External Website]
W3C WSDL 1.1 Note
www.w3.org/TR/wsdl [External Website]
Technical Standards Catalogue Table 2 – Specifications for Web Services
www.govtalk.gov.uk/egif/interconnection.asp#table2 [External Website]
W3C XML-Signature Syntax and Processing Recommendation
www.w3.org/TR/xmldsig-core/ [External Website]
W3C XML Encryption Syntax and Processing Recommendation
www.w3.org/TR/xmlenc-core/ [External Website]
W3C XML Key Management Specification 2.0 Recommendation
www.w3.org/TR/xkms2/ [External Website]
W3C RDF/XML Syntax Specification
www.w3.org/TR/rdf-syntax-grammar/ [External Website]
W3C OWL Web Ontology Language Reference
www.w3.org/TR/owl-ref/ [External Website]