School of Electronic Engineering and Computer Science

PhD Studentship in Artificial Intelligence and Music

Application closing date: 21/06/2019
Start date: Sept 2019
Research group: Centre for Doctoral Training in Artificial Intelligence and Music (AIM)

Duration: 4 years
Funding available

4-year PhD Studentship in Artificial Intelligence and Music

A fully funded PhD position is available in the UKRI Centre for Doctoral Training in Artificial Intelligence and Music (AIM) in collaboration with Steinberg Media Technologies GmbH, focussing on Optical Music Recognition (OMR) and Deep Learning.

Why apply to the AIM Programme?

  • 4-year fully-funded PhD studentship
  • Access to cutting-edge facilities and expertise in artificial intelligence (AI) and music/audio technology
  • Comprehensive technical training at the intersection of AI and music through a personalised programme
  • Partnerships with over 20 companies and cultural institutions in the music, audio and creative sectors

More information on the AIM Programme can be found at:


PhD Topic: Optical Music Recognition using Deep Learning
in collaboration with Steinberg Media Technologies GmbH
The proposed PhD focuses on developing novel techniques for optical music recognition (OMR) using Deep Neural Networks (DNN). The research will be carried out in collaboration with Steinberg Media Technologies, opening the opportunity to work with and test the research outcomes in leading music notation software such as Dorico.

Musicians, composers, arrangers, orchestrators and other users of music notation have long dreamed of simply taking a photo or a scan of sheet music and bringing it into a music notation application, where they can make changes, rearrange, transpose, or simply listen to it being played by the computer. The PhD aims to investigate and demonstrate a novel approach to converting images of sheet music into a semantic representation such as MusicXML and/or MEI.

The research will be carried out in the context of designing a music recognition engine capable of ingesting, optically correcting, processing and recognising multiple pages of handwritten or printed music from images captured by mobile phone, or from low-resolution copyright-free scans from the International Music Score Library Project (IMSLP). The main objective is to output semantic mark-up identifying as many notational elements and as much text as possible, along with how each relates to its position in the original image. Prior solutions have layered algorithmic rules on top of traditional feature detection techniques such as edge detection. An opportunity exists to develop and evaluate new approaches based on DNN and other machine learning techniques.

State-of-the-art Optical Music Recognition (OMR) is already able to recognise clean sheet music with very high accuracy, but fixing the remaining errors may take just as long as, if not longer than, transcribing the music into notation software by hand. A new method that improves recognition rates will allow users who are less adept at inputting notes into a music notation application to get better results more quickly. Another challenge is the variability in input quality (particularly in images captured with smartphones) and how best to preprocess the images to improve recognition quality in subsequent stages of the pipeline.

The application of cutting-edge techniques in data science, including machine learning and particularly convolutional neural networks (CNN), may yield better results than traditional methods. To this end, research will start by testing VGG-like architectures and residual networks (e.g. ResNet) for the recognition of handwritten and/or low-resolution printed sheet music. The same techniques may also prove useful in earlier stages of the pipeline, such as document detection and feature detection. It would be desirable to recognise close to all individual objects in the score.

One of the first objectives will be to establish the methodology for determining the differences between the reference data and the recognised data. Furthermore, data augmentation can be supported by existing Steinberg software. The ideal candidate would have previous experience of training machine learning models and would be familiar with Western music notation. Being well versed in image acquisition, image processing techniques, and computer vision would be a significant advantage.

AIM Programme structure
Our Centre for Doctoral Training (CDT) offers a four-year training programme in which students carry out a PhD at the intersection of AI and music, supported by taught specialist modules, industrial placements, and skills training. Find out more about the programme structure at:

Who can apply?
We are on the lookout for the best and brightest students interested in the intersection of music/audio technology and AI. Successful applicants will have the following profile:

  • Hold or be completing a Master's degree at distinction or first-class level, or equivalent, in Computer Science, Electronic Engineering, Music/Audio Technology, Physics, Mathematics, or Psychology.
  • Programming skills are strongly desirable; however, we do not consider this to be an essential criterion if candidates have complementary strengths.
  • Formal music training is desirable, but not a prerequisite.

For the above PhD topic, the scholarship is open both to students ordinarily resident in the UK and to international students.

This is a fully-funded 4-year PhD studentship starting in September 2019, which will cover the cost of tuition fees and provide an annual tax-free stipend of £17,009. The CDT will also provide funding for conference travel, equipment, and attendance at other CDT-related events.

Apply Now
Information on applications and PhD topics can be found at:

Application deadline: 21 June 2019

For further information on eligibility, funding and the application process please visit our website. Please email any questions to