IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Regular Section
Robust Singing Transcription System Using Local Homogeneity in the Harmonic Structure
Hoon HEO, Kyogu LEE
JOURNAL FREE ACCESS

2017 Volume E100.D Issue 5 Pages 1114-1123

Abstract

Automatic music transcription from audio has long been one of the most intriguing and challenging problems in the field of music information retrieval, because it requires a series of low-level tasks such as onset/offset detection and F0 estimation, followed by high-level post-processing for symbolic representation. In this paper, a comprehensive transcription system for monophonic singing voice based on harmonic structure analysis is proposed. Given a precise tracking of the fundamental frequency, a novel acoustic feature is derived to signify the harmonic structure in singing voice signals, regardless of loudness and pitch. This feature is then used to fit a parametric mixture model based on the von Mises-Fisher distribution, so that the model represents the intrinsic harmonic structures within a region of smoothly connected notes. To identify the note boundaries, the local homogeneity in the harmonic structure is exploited by two different methods: self-similarity analysis and a hidden Markov model. The proposed system identifies the note attributes, including the onset time, duration, and pitch. Evaluations are conducted from various aspects to verify the performance improvement of the proposed system and its robustness, using the latest evaluation methodology for singing transcription. The results show that the proposed system significantly outperforms other systems, including state-of-the-art ones.
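
The idea of locating note boundaries through local homogeneity in the harmonic structure can be illustrated with a minimal sketch. The abstract does not specify the acoustic feature, the similarity measure, or the boundary detector, so everything below is an assumption made for illustration: per-frame harmonic amplitude vectors are normalized to unit length (which removes loudness and places them on the unit sphere, the domain of the von Mises-Fisher distribution), cosine similarity stands in for the paper's similarity measure, and a checkerboard-kernel novelty curve stands in for its self-similarity analysis. The names `unit_normalize` and `novelty_curve` are hypothetical.

```python
import numpy as np


def unit_normalize(H, eps=1e-12):
    """Scale each row (one frame of harmonic amplitudes) to unit L2 norm.

    Assumption: H has shape (n_frames, n_harmonics) with non-negative entries.
    Unit normalization discards overall loudness.
    """
    norms = np.linalg.norm(H, axis=1, keepdims=True)
    return H / np.maximum(norms, eps)


def novelty_curve(U, half_width=4):
    """Checkerboard-kernel novelty along the diagonal of a self-similarity matrix.

    U: unit-normalized frames, shape (n_frames, n_harmonics).
    Returns an array of length n_frames; large values suggest that the
    frames before and after that index are each homogeneous but differ
    from one another (a candidate note boundary).
    """
    S = U @ U.T  # cosine similarity, since rows have unit norm
    n = S.shape[0]
    w = half_width

    # +1 on the two within-segment blocks, -1 on the two cross blocks.
    sign = np.ones(2 * w)
    sign[:w] = -1.0
    kernel = np.outer(sign, sign)

    novelty = np.zeros(n)
    for t in range(w, n - w + 1):
        window = S[t - w:t + w, t - w:t + w]
        novelty[t] = np.sum(kernel * window)
    return novelty
```

On real recordings the frames would come from a harmonic analysis guided by the tracked F0, and peaks in the novelty curve would be candidate boundaries only; the hidden Markov model mentioned in the abstract is a separate method and is not sketched here.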

© 2017 The Institute of Electronics, Information and Communication Engineers