
Adaptable and learnable user interfaces for research tools: The Word Tree Corpus Interface

Summary

A corpus is ‘a collection of pieces of language text in electronic form, selected according to external criteria to represent, as far as possible, a language or language variety as a source of data for linguistic research’ (Sinclair 2005). In recent years a number of relatively small, highly targeted corpora have been created to support the investigation of specified types of speech and writing, for the benefit of researchers, teachers and language students learning about the language requirements of specific domains. Generally the small corpora that are most used and cited are those which enable the easiest and fastest extraction of relevant data; potentially useful corpora are often underused because they lack an online interface, or offer an interface which is not completely fit for purpose because it was originally designed for lexicographers and information scientists working with much larger general corpora.

The most important information contained in a corpus concerns patterns of language use, but these patterns are often hard to discern when corpus data is presented in the standard way, using Key Words in Context (KWIC) concordance lines. Our project will build a new kind of ‘Word Tree’ interface which will present these patterns visually, enabling users to interact with the surface layer of data, but also to enter increasingly complex digital environments where they can examine language patterns in wider contexts and gather statistical evidence to support research hunches.
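To make the contrast between KWIC lines and a word tree concrete, here is a minimal Python sketch. It is not the project's implementation: the function names (`tokenize`, `kwic`, `word_tree`, `render`), the whitespace tokenizer and the three-sentence sample text are all assumptions made for illustration. It lists each hit for a keyword as a KWIC line, then merges the words that follow the keyword into a tree with counts, so that shared continuations appear once.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase, split on whitespace and strip surrounding punctuation."""
    tokens = (word.strip(".,;:!?()'\"").lower() for word in text.split())
    return [t for t in tokens if t]


def kwic(tokens, keyword, width=3):
    """Return one (left context, keyword, right context) tuple per hit."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines


def word_tree(tokens, keyword, depth=3):
    """Merge the words following each hit into a nested tree of counts."""
    def new_node():
        return {"count": 0, "children": defaultdict(new_node)}

    root = new_node()
    for i, tok in enumerate(tokens):
        if tok != keyword:
            continue
        root["count"] += 1
        node = root
        for following in tokens[i + 1:i + 1 + depth]:
            node = node["children"][following]
            node["count"] += 1
    return root


def render(node, label, indent=0):
    """Return the tree as indented text lines, most frequent branch first."""
    lines = [f"{'  ' * indent}{label} ({node['count']})"]
    ordered = sorted(node["children"].items(),
                     key=lambda item: (-item[1]["count"], item[0]))
    for word, child in ordered:
        lines.extend(render(child, word, indent + 1))
    return lines


# Invented sample text, used only to illustrate the two views.
sample = ("The results suggest that the model is robust. "
          "The results suggest that further work is needed. "
          "The results show a clear trend.")
tokens = tokenize(sample)

for left, key, right in kwic(tokens, "results", width=2):
    print(f"{left:>12} | {key} | {right}")

tree = word_tree(tokens, "results", depth=3)
print("\n".join(render(tree, "results")))
```

In the KWIC output the reader has to notice for themselves that two of the three lines continue with "suggest that"; in the tree that shared continuation is a single branch with a count of 2. This sketch only follows the right-hand context and does no statistical testing.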

During the project we will engage with different kinds of potential users, to find out what they want from the interface, and to ensure that any usability problems are identified and corrected. We will hold face-to-face meetings with groups of corpus linguists, novice researchers, and language teachers and learners. Additionally, we will make each new version of the interface available online, so that a much larger group of stakeholders can try it out while we conduct remote usability testing. We will reach stakeholders in the UK and also in a variety of higher education (HE) institutions overseas. We will keep a blog diary of our progress at http://cuba.coventry.ac.uk/wordtree/.

Objectives

The aim of this project is to develop a multi-dimensional Word Tree interface which will allow users to search and browse within documents and across a corpus, and access an instant visual representation of the language patterns surrounding any given word or phrase. Our goal is to increase access to and usage of corpus resources, both by corpus linguists and by language teachers and learners. In order to reach our goal the interface will have to be accessible and fun for non-experts, whilst providing useful pattern information for all levels of stakeholder.

Anticipated Outputs and Outcomes

  • Visualisation applets, with reference implementations for the BAWE corpus.
  • An open REST API to the corpus to significantly improve the usability of the resource, and to promote further visualisation development.
  • Clear, easily modifiable pattern library documentation for the Word Tree, explaining suitable use-cases and potential customisations for other corpora.
  • A publicly accessible, open source-code repository with feature discussion area.
  • A project blog that documents the progress of the project.
  • A technical report detailing design, testing and implementation, and describing the lessons learned in terms of learnability and usability design.
  • A completion report.

Project Staff

Torsten Reimer
t.reimer@jisc.ac.uk
Programme Manager
JISC Executive

Documents & Multimedia


  • Portable Document Format (pdf) File [ 177 Kb ]
Summary

Start date: 9 May 2011
End date: 9 November 2011
Funding programme: Research infrastructure programme
Project website:
Lead institutions: Coventry University (http://wwwm.coventry.ac.uk/Pages/index.aspx)
Topic: