Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
General Paper
Metric-Type Identification for Multilevel Header Numerical Tables in Scientific Papers
Lya Hulliyyatus Suadaa, Hidetaka Kamigaito, Manabu Okumura, Hiroya Takamura

2021, Volume 28, Issue 4, pp. 1247-1269

Abstract

Numerical tables are widely used to present experimental results in scientific papers. For table understanding, the metric-type is essential for distinguishing the numbers in a table. Herein, we introduce a new information extraction task, metric-type identification from multilevel header numerical tables, and provide a dataset extracted from scientific papers comprising header tables, captions, and metric-types. We propose joint-learning neural classification and generation schemes featuring models based on pointer-generator networks and on pretrained models. Our results show that the joint models can handle both in-header and out-of-header metric-type identification. Furthermore, transfer learning with fine-tuned pretrained models improves performance. SciBERT, a domain-specific BERT-based model, achieves the best performance. Results achieved by a fine-tuned T5-based model are comparable to those obtained using our BERT-based model under a multitask setting.

© 2021 The Association for Natural Language Processing