School of Electronic Engineering and Computer Science

Maria Pilataki-Manika


PhD Student



Project Title:

Polyphonic Music Transcription using Deep Learning


Introduction

Automatic Music Transcription (AMT) is the process of converting an audio music signal into a form of music notation. This could be MIDI, a piano roll, sheet music or a combination of these.
Monophonic music transcription is considered to be a solved problem.
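
As a rough illustration of one of the target representations mentioned above, the sketch below converts a list of note events into a binary piano roll (pitch by time frame). The function name `notes_to_piano_roll`, the `(onset, offset, pitch)` tuple layout and the frame rate are assumptions made for this example only; they are not taken from the project.

```python
def notes_to_piano_roll(notes, n_frames, frame_rate=100, n_pitches=128):
    """Build a binary piano roll from (onset_sec, offset_sec, midi_pitch) tuples.

    roll[pitch][frame] is 1 while the note is sounding, 0 otherwise.
    """
    roll = [[0] * n_frames for _ in range(n_pitches)]
    for onset, offset, pitch in notes:
        start = int(round(onset * frame_rate))
        end = min(int(round(offset * frame_rate)), n_frames)
        for t in range(start, end):
            roll[pitch][t] = 1
    return roll


# Hypothetical polyphonic fragment: C4 and E4 start together, G4 enters later.
notes = [(0.0, 0.5, 60), (0.0, 0.5, 64), (0.3, 1.0, 67)]
roll = notes_to_piano_roll(notes, n_frames=10, frame_rate=10)

# Number of pitches sounding in each frame; values above 1 mark polyphony.
polyphony = [sum(roll[p][t] for p in range(128)) for t in range(10)]
```

In a monophonic signal every entry of `polyphony` would be at most 1; the polyphonic case is exactly the one where several rows of the roll are active in the same frame.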

However, polyphonic music transcription is still an active research area and one of the fundamental problems within Music Signal Processing and Music Information Retrieval (MIR). Existing transcription systems remain limited and cannot yet match the performance of human experts.

Polyphonic AMT is a complex task due to the nature of music signals and the many subtasks required. In brief, because many notes are active concurrently, their harmonics overlap and interfere with each other. For a complete and accurate transcription, essential subtasks include multi-pitch estimation, note onset and offset detection, loudness estimation, instrument recognition, tempo tracking and time quantization, key estimation, detection of dynamics and expression, and typesetting.
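
The harmonic interference described above can be made concrete with a small sketch. It lists the partials of two notes, assuming idealised harmonic tones whose partials are exact integer multiples of the fundamental, and reports which partials fall within a few hertz of each other. The function names, the 5 Hz tolerance and the equal-tempered frequencies for C4 and G4 are assumptions chosen for illustration.

```python
def harmonics(f0, n_partials):
    """Frequencies of the first n_partials harmonics of a fundamental f0 (Hz)."""
    return [f0 * k for k in range(1, n_partials + 1)]


def overlapping_partials(f0_a, f0_b, n_partials=8, tolerance_hz=5.0):
    """Return (i, j, freq_a, freq_b) for partials of two notes that nearly coincide."""
    overlaps = []
    for i, fa in enumerate(harmonics(f0_a, n_partials), start=1):
        for j, fb in enumerate(harmonics(f0_b, n_partials), start=1):
            if abs(fa - fb) <= tolerance_hz:
                overlaps.append((i, j, fa, fb))
    return overlaps


C4 = 261.63  # equal-tempered C4, A4 = 440 Hz
G4 = 392.00  # equal-tempered G4, a perfect fifth above C4

overlaps = overlapping_partials(C4, G4)
pairs = [(i, j) for i, j, _, _ in overlaps]
```

For this interval the 3rd partial of C4 lands almost on the 2nd partial of G4, and the 6th on the 4th, so the energy at those frequencies cannot be attributed to a single note from the spectrum alone.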

Various techniques have been used to approach the AMT problem, including both supervised and unsupervised learning methods. The analysis can be split into frame-level (analysing pitches in each frame), note-level (note-tracking) and stream-level (analysing pitches from each instrument/voice source). Example techniques include traditional signal processing methods, probabilistic modelling including Probabilistic Latent Component Analysis (PLCA), Bayesian methods, Hidden Markov Models (HMM), Nonnegative Matrix Factorization (NMF), Support Vector Machines (SVM) and Deep Learning, which is the main focus of this PhD project.
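
To illustrate the step from frame-level to note-level analysis, the sketch below turns per-frame activations for a single pitch into note events by simple thresholding. This is only one basic form of note tracking, not the method used in the project; the function name, the threshold, the minimum duration and the activation values are all hypothetical.

```python
def frames_to_notes(activations, threshold=0.5, frame_rate=100, min_frames=2):
    """Convert per-frame activations for one pitch into (onset_sec, offset_sec) notes.

    A note is a run of consecutive frames at or above the threshold; runs
    shorter than min_frames are discarded as spurious detections.
    Assumes a positive threshold.
    """
    notes = []
    start = None
    # A trailing 0.0 acts as a sentinel so a note active in the last frame is closed.
    for t, a in enumerate(list(activations) + [0.0]):
        active = a >= threshold
        if active and start is None:
            start = t
        elif not active and start is not None:
            if t - start >= min_frames:
                notes.append((start / frame_rate, t / frame_rate))
            start = None
    return notes


# Hypothetical frame-level output for one pitch at 10 frames per second.
acts = [0.1, 0.8, 0.9, 0.7, 0.2, 0.6, 0.1, 0.9, 0.9]
tracked = frames_to_notes(acts, frame_rate=10)
```

The single frame at index 5 exceeds the threshold but is dropped by the minimum-duration rule, which is the kind of temporal smoothing that note-level methods add on top of frame-level estimates.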

Applications of AMT include MIR, interactive music systems, musicological analysis, music education, music creation and production, music search, transcription tools and software, as well as music-related equipment and effects.

C4DM theme affiliation:

Music Informatics