Information Extraction Technology for Digital Archives: Extracting Text, Figures, and Tables from Images

青池 亨

doi:10.24506/jsda.8.3_115

Abstract

Automatically extracting information such as text and illustrations from images in digital archives and providing it to users has attracted attention as machine learning grows more sophisticated, because it supports full-text search and improves accessibility. The National Diet Library (NDL) operates a demonstration site called NDL Lab, where it implements and releases experimental functions built on machine-learning-based information extraction methods. The findings and user responses obtained there have been fed back into the development of the NDL Digital Collections and other services. This paper describes the information extraction technology and presents the findings obtained by operating experimental services at NDL Lab.

This article is provided under a Creative Commons [Attribution 4.0 International] license.
https://creativecommons.org/licenses/by/4.0/deed.ja
