Our ontologies (defined sets of concepts and relationships), such as the Music Ontology, have formed the basis of the BBC Programmes Ontology, used in the BBC “/programmes” website (2.5 million users per week). We have worked with the MusicBrainz open source music encyclopaedia to publish its information as linked data. Our ontologies have also been used by Libre.fm/GNU FM, the semantic web Upper Mapping and Binding Exchange Layer (UMBEL), and the Press Association.
Our research and the ontologies we have developed have had an economic impact, an impact on public policy and services, and an impact on society, culture and creativity. One of the main commercial assets of large-scale media content providers like the BBC is their massive and growing library of content. Semantic web technologies are key to enabling fine-grained, real-time access to this content. Our work is assisting the BBC in developing this access, in turn providing enhanced information to the public about television and sports. It has also enabled other music-based services. The Event Ontology has been taken outside the music domain and used for news, sports and other applications.
C4DM has been one of the pioneers in developing Linked Open Data technologies. Within the SIMAC and OMRAS2 projects, we designed and published some of the first large-scale Semantic Web resources. Since 2007, our dbtune.org server has been publishing several key sets of music-related Linked Open Data derived from: the last.fm music catalogue; the BBC (playcounts and John Peel sessions); the MusicBrainz open music encyclopaedia; MySpace; and the Magnatune label. It delivers over 30 GB of data traffic per month.
The C4DM Music Ontology (including the Event and Timeline ontologies) has supported improvements to web presence, navigation between broadcast brands, content management, information sharing with communities, and journalism. Previously, websites of specific programmes and broadcast brands that do not naturally link would be entirely separate, but these ontologies allow links to be created easily for cross-domain navigation. The ontologies also allow data to be made available to third-party developers, e.g. via “BBC backstage”. They also allow community-curated data to be linked to BBC information, enhancing programme websites with additional relevant information. For example, the BBC Music website uses these ontologies to link data from the MusicBrainz community encyclopaedia (see below), providing additional information about artists currently played on BBC media outlets, and providing recommendations using cultural information available on the Semantic Web. This BBC Music service was announced in late 2008 [Im7].
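The linking of community-curated data to broadcaster data described above can be sketched roughly as follows. This is a minimal illustration, not the BBC's implementation: every URI, name and value below is hypothetical, including the `enrich` helper and the `example.org` biography property. The only real vocabulary terms are `owl:sameAs` and `foaf:name`. The sketch merges properties about one artist from two separate sets of triples by following an identity link.

```python
# Minimal sketch of cross-dataset linking. All data here is hypothetical;
# none of these URIs or values come from the BBC or MusicBrainz.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
BIOGRAPHY = "http://example.org/vocab/biography"  # hypothetical property

# Triples a broadcaster might hold about an artist (subject, predicate, object).
broadcaster_triples = [
    ("http://example.org/bbc/artist/a1", FOAF_NAME, "Example Artist"),
    ("http://example.org/bbc/artist/a1", OWL_SAME_AS,
     "http://example.org/mb/artist/x9"),
]

# Triples a community encyclopaedia might hold.
community_triples = [
    ("http://example.org/mb/artist/x9", BIOGRAPHY, "Formed in a hypothetical city."),
    ("http://example.org/mb/artist/zz", BIOGRAPHY, "Unrelated artist."),
]


def enrich(subject, local, remote):
    """Collect properties of `subject`, following owl:sameAs links into `remote`."""
    props = {}
    aliases = {subject}
    for s, p, o in local:
        if s != subject:
            continue
        if p == OWL_SAME_AS:
            aliases.add(o)  # remember the equivalent identifier
        else:
            props.setdefault(p, []).append(o)
    for s, p, o in remote:
        if s in aliases:
            props.setdefault(p, []).append(o)
    return props


artist_page = enrich(
    "http://example.org/bbc/artist/a1", broadcaster_triples, community_triples
)
```

Because both datasets use shared, web-addressable identifiers, the merge needs no schema negotiation between the two sources: the identity link alone is enough to pull the extra information onto the artist page.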
The new “BBC Programmes” website (Fig 1) describes and links the content of 1,000 to 1,500 programmes in the BBC’s daily output, and has some 2.5 million users per week. To support this, the BBC created its Programmes Ontology [Im8], based on the Music Ontology and using the Event and Timeline ontologies [Im1, Im9].
The “Event Ontology” underpinned the experimental “Mythology Engine”, which allows people to explore BBC dramas. It has also been used for a range of sporting events: in the BBC World Cup 2010 website, to improve journalism and coverage of the 2010 Winter Olympics, and as the ontological backbone for the Olympics 2012 website, which averaged 7.1M daily online unique browsers [Im1].
MusicBrainz is a community-maintained open source encyclopaedia of music information. Through a JISC-funded project (Linked Music Metadata, PI: Dixon, 2010-2011, £94,894), we worked with MusicBrainz to publish its database as Linked Data [Im2]. MusicBrainz now provides its information as RDF linked data on each of its pages, a very large dataset containing some 23.8GB of N-Triples and about 180M assertions. This enables third-party developers to access music metadata in a machine-readable format from the browser. To illustrate the scale of usage, MusicBrainz enforces a global limit of 300 requests per second and declines requests beyond that.
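To give a sense of what one such assertion looks like in N-Triples, the sketch below parses two sample lines into (subject, predicate, object) tuples. It is a deliberately simplified reader written for illustration: it handles only IRIs and plain string literals, and the `parse_ntriples` helper and the `example.org` artist URI are hypothetical, not MusicBrainz data. The `mo:MusicArtist` class and `foaf:name` property are real vocabulary terms.

```python
import re

# Simplified N-Triples pattern: <subject> <predicate> (<iri> | "literal") .
# Blank nodes, language tags, datatypes and escape sequences are out of
# scope for this sketch; a production reader would use a full RDF library.
TRIPLE_RE = re.compile(
    r'^<([^>]*)>\s+<([^>]*)>\s+(?:<([^>]*)>|"([^"]*)")\s*\.\s*$'
)


def parse_ntriples(text):
    """Return a list of (subject, predicate, object) tuples from N-Triples text."""
    triples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = TRIPLE_RE.match(line)
        if match is None:
            raise ValueError(f"unsupported N-Triples line: {line!r}")
        subject, predicate, iri_obj, literal_obj = match.groups()
        obj = iri_obj if iri_obj is not None else literal_obj
        triples.append((subject, predicate, obj))
    return triples


# Hypothetical sample: the artist URI and name are invented for illustration.
SAMPLE = """\
# one artist described by two assertions
<http://example.org/artist/1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.org/ontology/mo/MusicArtist> .
<http://example.org/artist/1> <http://xmlns.com/foaf/0.1/name> "Example Artist" .
"""

assertions = parse_ntriples(SAMPLE)
```

Each line is one self-contained assertion, which is why the format suits very large dumps: a file can be split, streamed or counted line by line without loading the whole graph.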
This Case Study focuses on research undertaken in the Centre for Digital Music (C4DM) into Semantic Web technologies for Music Informatics. C4DM research into Music Informatics is grounded in the use of digital signal processing (DSP) for extracting features from musical audio. In 1998 we published one of the field’s earliest papers, on automatic music genre analysis [R1]. This led to a JISC/NSF Digital Libraries co-funded project, OMRAS (On-line Music Recognition and Search, www.omras.org, 1999-2003) with Sandler, Oxford University and University of Massachusetts (Amherst). Sandler (Professor of Signal Processing) and other key researchers (Bello, Reiss) and academics (Plumbley) moved to Queen Mary in 2001. The OMRAS project (now at Queen Mary and Goldsmiths) pioneered the use of audio queries to search symbolic music databases, summarized in [R2]. It also established the ISMIR (International Symposium on Music Information Retrieval) conference series (ismir2000.ismir.net), which grew to over 260 delegates in 2012, itself leading to the establishment of an international society.
The EU FP6 SIMAC project (mtg.upf.edu/static/semanticaudio, 2004-2006) was one of the first major European Music Informatics projects. C4DM was responsible for developing and defining feature extraction algorithms, including rigidly defined structured semantics for audio features. This led to a major EPSRC ICT “Large Grant” project, OMRAS2 (http://www.omras2.org, 2006-2010, £2.2M). The SIMAC and OMRAS2 projects developed and pioneered the use of Semantic Web technologies for music and other media content, both through the development and release of “Ontologies” (defined sets of concepts and relationships) such as the Music Ontology and Event Ontology [R3], and through the provision of some of the first Linked Open Data servers in the field [R4].
Led by Sandler and Plumbley, a wide range of ontologies has been developed at C4DM by group members since 2007, including: Music Ontology (Raimond, Abdallah, Jacobson, Fazekas); Fundamental ontologies such as Event and Timeline Ontologies (Abdallah, Raimond); Audio Features Ontology (Raimond, Pastor Escuerdo, Cannam, Jacobson, Fazekas et al); Similarity Ontology (Jacobson, Raimond); Studio Ontology (Fazekas); Temperament Ontology (Fazekas, Tidhar); Audio Effects Ontology [R5] (Wilmering, Fazekas); Instrument Ontology [R6] (in progress: Kolozali, Fazekas, Barthet).
Tools developed by C4DM group members utilising Semantic Web technologies in Music Informatics research include: Sonic Visualiser (Cannam), which reads and writes Music Ontology RDF (Resource Description Framework); Sonic Annotator (Levy, Cannam), a batch annotation tool which produces RDF output using the Music Ontology; SAWA (Fazekas), a Web-based demonstrator of Semantic Web technologies; Hotttabs (Anglade, Barthet, Fazekas, Kolozali, Macrae), a guitar tab and video tool which uses the Music Ontology.
Isophonics.net is the home for software and data resources provided publicly by C4DM as part of its open source research policy. For the Semantic Web, these resources include DBTune (http://dbtune.org/), which provides access to music-related structured data in a Linked Data fashion. It hosts SPARQL end-points exposing interlinked music-related data from Magnatune, Jamendo, the BBC John Peel sessions, last.fm, MySpace and MusicBrainz. Our list of SPARQL end-points is at http://ismir2009.grasstunes.net/taxonomy/term/5. The “Reference Annotations” of ground truth data from C4DM at http://isophonics.net/datasets are also available in Music Ontology RDF.
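As a rough sketch of how a client might address a SPARQL end-point of the kind listed above, the code below builds a query for artists and their names and encodes it into a request URL. The end-point address is a placeholder assumption, not one of the DBTune end-points, and `build_query_url` is a hypothetical helper. The query itself uses the real `mo:MusicArtist` class from the Music Ontology and `foaf:name`, and the URL form follows the SPARQL Protocol convention of passing the query text in a `query` parameter. No network request is made.

```python
from urllib.parse import urlencode

# Assumption: placeholder address. Substitute a real SPARQL end-point URL.
ENDPOINT = "http://example.org/sparql"

# Ask for up to ten resources typed as mo:MusicArtist, with their names.
QUERY = """\
PREFIX mo: <http://purl.org/ontology/mo/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?artist ?name
WHERE {
  ?artist a mo:MusicArtist ;
          foaf:name ?name .
}
LIMIT 10
"""


def build_query_url(endpoint, query):
    """Build a GET URL that carries the SPARQL query in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


request_url = build_query_url(ENDPOINT, QUERY)
```

The resulting URL could then be fetched with any HTTP client, typically with an `Accept` header such as `application/sparql-results+json` to request machine-readable results, assuming the end-point supports that format.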