Automatic Music Transcription with End-to-End Deep Neural Networks
Automatic music transcription (AMT) is the task of automatically converting a music recording into some form of music notation, and is a fundamental topic in Music Information Retrieval. Applications of AMT include music content analysis, music production, music education and computational musicology. This PhD project will focus on deep learning methods for AMT. It will include the design of an end-to-end model that takes an audio spectrogram or raw audio as input, with no hand-crafted feature extraction in the intermediate stages. The project will compare the performance of different neural network architectures, design neural network-based representations of music data and explore optimization methods for AMT systems. It will first focus on classical polyphonic piano transcription and then adapt the system to a Chinese instrument. The final system should read audio recordings and output the corresponding quantized notes in a format that can be easily read by notation software (e.g. LilyPond or MuseScore), generating human-readable sheet music. A user study will be included to evaluate the performance of the designed AMT system.
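To illustrate the final step described above, here is a minimal sketch (a hypothetical post-processing stage, not the project's actual pipeline; the frame rate, quantization grid and piano-roll format are all assumptions) that converts a frame-level piano roll, the typical output of a neural transcription model, into quantized note events and LilyPond note names:

```python
# Hypothetical post-processing sketch: piano roll -> quantized notes -> LilyPond names.
# Assumed conventions: 100 frames/s, sixteenth-note grid at 60 BPM,
# roll given as {midi_pitch: [0/1 frame activations]}.

FRAME_RATE = 100   # frames per second (assumed)
QUANT = 0.25       # grid step in seconds: a sixteenth note at 60 BPM

PITCH_CLASSES = ["c", "cis", "d", "dis", "e", "f",
                 "fis", "g", "gis", "a", "ais", "b"]

def midi_to_lilypond(midi: int) -> str:
    """Map a MIDI pitch to a LilyPond note name with octave marks."""
    name = PITCH_CLASSES[midi % 12]
    octave = midi // 12 - 4      # LilyPond's unmarked octave starts at c (MIDI 48)
    marks = "'" * octave if octave > 0 else "," * -octave
    return name + marks

def roll_to_notes(roll):
    """Extract (onset_s, offset_s, midi) events from a piano roll."""
    events = []
    for midi, frames in roll.items():
        start = None
        for i, active in enumerate(frames + [0]):  # sentinel closes open notes
            if active and start is None:
                start = i
            elif not active and start is not None:
                events.append((start / FRAME_RATE, i / FRAME_RATE, midi))
                start = None
    return sorted(events)

def quantize(events):
    """Snap onsets and offsets to the QUANT grid, keeping a minimum duration."""
    snap = lambda t: round(t / QUANT) * QUANT
    return [(snap(on), max(snap(off), snap(on) + QUANT), m)
            for on, off, m in events]
```

For example, a middle C held for 26 frames followed by an E becomes two sixteenth-note events, and `midi_to_lilypond(60)` yields `c'`. A real system would additionally have to infer tempo and handle rests, chords and voices before producing a full score.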
C4DM theme affiliation:
Music Informatics, Machine Listening