COSMOS: Supporting Empirical Social Scientific Research with a Virtual Research Environment


Our empirical research programme is contextualised in terms of the ‘coming crisis of empirical sociology’ (Savage and Burrows, 2007), which is located in the increasing asymmetry between traditional social scientific methods and the power of transactional data generated through the internet. This has led some commentators to question the extent to which university-based sociology and social science can compete with the data-rich resources built into the marketing and data generation strategies of the large multinational corporations that hold and marshal much of this transactional data.

The schools of Social Sciences (SOCSI) and Computer Science & Informatics (COMSCI) at Cardiff University have, over the past 18 months, established the SOCSI/COMSCI research network, an interdisciplinary research group in which academic staff from both schools collaborate and share best practice in research and teaching. The SOCSI/COMSCI research network has already secured a funded four-year ESRC Wales DTC postgraduate studentship, and an ESRC research grant to develop data harvesting and analysis methods and tools to detect tension and cohesion in online social networks. The ESRC grant has supported the network in developing the Cardiff Online Social Media Observatory (COSMOS), an information collection, archival and analysis engine that harvests freely available, socially significant data from sources such as social networking sites, blogs, micro-blogs, RSS feeds and Open Data (e.g. crime rates), and analyzes the harvested dataset to detect indicators of community tension and cohesion.

We propose to enhance COSMOS and engage the wider social scientific research community by extending it to provide an innovative virtual research environment (VRE). Our ESRC project has developed the social data harvesting engine and a rule engine to detect social tension within the aggregated dataset; we now need to build on this with usable and adaptable user interfaces that allow researchers to compose and orchestrate research processes that produce empirical answers to research questions. Researchers need to be able to use COSMOS data to pose hypothetical “what-if” questions, trying different combinations of social data analysis methods to confirm or refute an informal hypothesis, and then stress-testing it further until a coherent and arguable position emerges. We plan to include sentiment, tension, network and geospatial data analysis functionality in the COSMOS platform during the project.

The VRE will allow researchers to:


• Define keyword and timestamp-specific queries to extract subsets of the COSMOS social dataset, or use real-time COSMOS data feeds (e.g. real-time Twitter feeds), as research data;
• Orchestrate and marshal research data through a selection of digital research tools such as sentiment and tension analysis, social network analysis, and geospatial plotting;
• Visualize the results of their selected data analysis by invoking open source data analysis and visualization tools such as ‘R’;
• Run more interactive sessions with social media data, exploring a variety of “what-if” scenarios;
• Archive and share the analysis processes and results with other researchers for interoperable reuse and reproducible experiments.
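
As a rough illustration of the first two capabilities in this list, the sketch below selects a keyword- and timestamp-bounded subset of posts and passes it through a set of named analysis steps. It is a minimal Python sketch under stated assumptions, not the COSMOS implementation: the names `Post`, `select_posts`, `run_pipeline`, `toy_tension_score` and `post_count` are hypothetical, and the word-list score is only a stand-in for real sentiment or tension analysis.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Dict, Iterable, List


@dataclass
class Post:
    """Hypothetical record for one harvested social media post."""
    text: str
    created_at: datetime


def select_posts(posts: Iterable[Post], keywords: List[str],
                 start: datetime, end: datetime) -> List[Post]:
    """Keep posts created in [start, end) whose text mentions any keyword.

    Matching is a case-insensitive substring test.
    """
    lowered = [k.lower() for k in keywords]
    return [
        p for p in posts
        if start <= p.created_at < end
        and any(k in p.text.lower() for k in lowered)
    ]


def run_pipeline(posts: List[Post],
                 steps: Dict[str, Callable[[List[Post]], object]]) -> Dict[str, object]:
    """Apply every named analysis step to the same subset; collect results by name."""
    return {name: step(posts) for name, step in steps.items()}


# Assumption: a toy word list standing in for a real tension or sentiment model.
TENSION_WORDS = {"angry", "riot"}


def toy_tension_score(posts: List[Post]) -> float:
    """Fraction of posts containing at least one word from TENSION_WORDS."""
    if not posts:
        return 0.0
    hits = sum(
        1 for p in posts
        if any(w in TENSION_WORDS for w in p.text.lower().split())
    )
    return hits / len(posts)


def post_count(posts: List[Post]) -> int:
    """Number of posts in the subset."""
    return len(posts)


if __name__ == "__main__":
    sample = [
        Post("Angry crowd near the stadium", datetime(2012, 8, 1, 10, 0)),
        Post("Lovely day at the stadium", datetime(2012, 8, 1, 12, 0)),
    ]
    subset = select_posts(sample, ["stadium"],
                          datetime(2012, 8, 1), datetime(2012, 8, 2))
    print(run_pipeline(subset, {"count": post_count, "tension": toy_tension_score}))
```

Because each step is an ordinary function keyed by name, a “what-if” session amounts to re-running `run_pipeline` with a different keyword set, time window or combination of steps, and the dictionary of steps is itself a record of the process that could be archived and shared.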

Anticipated Outputs and Outcomes

• An integrated VRE that broadens the accessibility of such research tools and simplifies the introduction to them
• A programme of institutional and national dissemination and training to underpin the technology, including a sustainable national programme to train HEI social researchers in social data analysis using our VRE
• Academic publications in leading international Social Science and Computer Science journals

Project Staff

Start date
1 August 2012
End date
31 March 2013
Funding programme
Digital infrastructure: Research programme
Research programme: Research tools
