Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
General Paper
Metric-Type Identification for Multilevel Header Numerical Tables in Scientific Papers
Lya Hulliyyatus Suadaa, Hidetaka Kamigaito, Manabu Okumura, Hiroya Takamura
JOURNAL FREE ACCESS

2021 Volume 28 Issue 4 Pages 1247-1269

Abstract

Numerical tables are widely used to present experimental results in scientific papers. For table understanding, the metric-type is essential for distinguishing the numbers in a table. Herein, we introduce a new information extraction task, metric-type identification from multilevel header numerical tables, and provide a dataset extracted from scientific papers comprising header tables, captions, and metric-types. We propose joint-learning neural classification and generation schemes featuring pointer-generator-based and pretrained-based models. Our results show that the joint models can handle both in-header and out-of-header metric-type identification problems. Furthermore, transfer learning with fine-tuned pretrained-based models improves performance. SciBERT, a domain-specific BERT-based model, achieves the best performance. Results achieved by a fine-tuned T5-based model are comparable to those obtained using our BERT-based model under a multitask setting.
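
As a rough illustration of the in-header versus out-of-header distinction, the sketch below assumes that "in-header" means the metric-type string appears somewhere in the multilevel header, and "out-of-header" means it does not and must be produced some other way (for example, generated from the caption). The `TableExample` class, its field names, the `is_in_header` helper, and the toy tables are all hypothetical; they are not taken from the authors' dataset or code.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TableExample:
    """Hypothetical record: a caption, one header path per numeric column,
    and the gold metric-type for the table's numbers."""
    caption: str
    header_paths: List[List[str]]  # each path runs from top-level to leaf header
    metric_type: str


def is_in_header(example: TableExample) -> bool:
    """Return True if the metric-type string occurs in any header cell
    (case-insensitive substring match; an assumption made for this sketch)."""
    target = example.metric_type.lower()
    return any(
        target in cell.lower()
        for path in example.header_paths
        for cell in path
    )


# Toy case 1: the metric-type "F1" appears in the leaf headers.
in_header_case = TableExample(
    caption="Results on the development and test sets.",
    header_paths=[["Dev", "F1"], ["Test", "F1"]],
    metric_type="F1",
)

# Toy case 2: the headers only name settings; the metric-type is
# mentioned in the caption, not in the header.
out_of_header_case = TableExample(
    caption="Accuracy of each model under few-shot settings.",
    header_paths=[["Model A", "5-shot"], ["Model A", "10-shot"]],
    metric_type="accuracy",
)

print(is_in_header(in_header_case))
print(is_in_header(out_of_header_case))
```

Under this reading, a classifier over header cells could only cover the first case, which is one way to see why the abstract pairs classification with generation.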

© 2021 The Association for Natural Language Processing