Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
38th (2024)
Session ID : 3E1-GS-10-05

Production of MusicXML from Locally Inclined Sheetmusic Photo Image by Using Measure-based Multi-modal Deep-learning-driven Assembly Method
img2Mxml App for playing music from smartphone sheetmusic photo images
*Tomoyuki SHISHIDO, Fehmiju FATI, Daisuke TOKUSHIGE, Yasuhiro ONO, Itsuo KUMAZAWA
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

Deep learning has been applied to optical music recognition (OMR). However, OMR on varied sheet-music images is still not precise enough to be widely applicable. We propose a measure-based multimodal deep-learning-driven assembly (MMdA) method that enables end-to-end OMR processing of various images, including inclined photo images. In this method, a deep-learning model extracts the measures, which are then aligned and resized so that multiple deep-learning models, run in sequence or in parallel, can infer the musical-symbol components they contain. Working on standardized measures enables efficient training of the deep-learning models and accurate adjustment of the five staff lines in each measure, so that locally inclined sheet-music images can be positioned precisely. As a result, the proposed MMdA method can reproduce a score from an inclined image, which current OMR applications cannot do. Multiple feature-category deep-learning models for musical-symbol components, each with a small number of feature types, can represent a diverse set of notes and other musical symbols, including chords. The proposed MMdA method provides a solution for end-to-end OMR processing and enhances the utility of OMR for sheet-music photo images taken with mobile phones.
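
The alignment and resizing of a single measure described in the abstract might be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `standardize_measure`, the 64 x 128 target size, and the use of a vertical shear with nearest-neighbour resizing are all hypothetical choices, and the local staff-line slope is assumed to have been estimated already by some earlier step.

```python
import numpy as np

def standardize_measure(measure, slope, out_h=64, out_w=128):
    """Deskew one cropped measure with a vertical shear, then resize it.

    measure: 2-D grayscale array (rows, cols) holding one cropped measure.
    slope:   assumed staff-line slope, in pixels of vertical drift per column
             (positive means the lines drift downward to the right).
    out_h, out_w: hypothetical fixed size expected by downstream models.
    """
    h, w = measure.shape
    rows = np.arange(h)
    deskewed = np.empty_like(measure)
    for x in range(w):
        # Undo the drift accumulated at column x so staff lines become horizontal.
        shift = int(round(slope * x))
        src = np.clip(rows + shift, 0, h - 1)
        deskewed[:, x] = measure[src, x]
    # Nearest-neighbour resize to the fixed target size.
    ys = np.arange(out_h) * h // out_h
    xs = np.arange(out_w) * w // out_w
    return deskewed[np.ix_(ys, xs)]
```

Because the slope is supplied per measure, each measure can be corrected with its own local inclination, which is the property the abstract attributes to measure-based processing.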

© 2024 The Japanese Society for Artificial Intelligence