Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
26th (2012)
Session ID : 3M2-IOS-3b-2

Utilising Bilingual Lexical Resources for Technical Term Extraction
*Chaimongkol PANOT, Pontus STENETORP, Akiko AIZAWA
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

Technical Term Extraction (TTE) is the task of detecting mentions of technical terms in scientific texts, and can thus be framed as a special case of Named Entity Recognition (NER). TTE is a stepping stone towards semantic analysis of scientific texts and is essential for information extraction and knowledge retrieval. For NER, annotated resources are commonly coupled with supervised learning methods to produce and evaluate state-of-the-art systems. However, the current lack of annotated resources for TTE hampers further research. As a preliminary study, we induce annotations by exploiting the author keywords assigned to scientific texts. We construct a baseline system by training a Conditional Random Field model with a set of well-established NER features. Furthermore, we examine the potential benefits of incorporating additional linguistic resources for TTE, specifically bilingual dictionaries. Dictionaries alone, however, are not enough to identify technical terms; notational variation, polysemy, homography, and other ambiguities must be resolved using word co-occurrence or context. Our hypothesis is that bilingual dictionaries can help disambiguate meanings through cross-language information. We incorporate features from bilingual dictionaries, evaluate the resulting model against our baseline, and find potential benefits for the proposed model.
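
To make the two main steps of the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes whitespace-tokenised text, a case-insensitive longest-match rule for projecting author keywords onto tokens as BIO tags, and a toy bilingual dictionary represented as a mapping from a lowercased word to a set of translations. The function names `induce_bio_tags` and `token_features`, the `TERM` label, and the particular feature set are all hypothetical; the abstract does not specify them.

```python
from typing import Dict, List, Sequence, Set


def induce_bio_tags(tokens: Sequence[str], keywords: Sequence[str]) -> List[str]:
    """Project author keywords onto tokens as B-TERM / I-TERM / O tags.

    Assumption: case-insensitive exact matching, longer keywords first,
    and no overlapping annotations.
    """
    lowered = [t.lower() for t in tokens]
    # Split each keyword on whitespace; longer phrases are tried first.
    phrases = sorted(
        (k.lower().split() for k in keywords if k.strip()),
        key=len,
        reverse=True,
    )
    tags = ["O"] * len(tokens)
    for phrase in phrases:
        n = len(phrase)
        for i in range(len(lowered) - n + 1):
            span_is_free = all(t == "O" for t in tags[i:i + n])
            if lowered[i:i + n] == phrase and span_is_free:
                tags[i] = "B-TERM"
                for j in range(i + 1, i + n):
                    tags[j] = "I-TERM"
    return tags


def token_features(
    tokens: Sequence[str],
    i: int,
    bilingual_dict: Dict[str, Set[str]],
) -> Dict[str, object]:
    """Build a feature dict for token i, as commonly fed to a CRF toolkit.

    The surface features are generic NER-style features; the two dictionary
    features are an illustrative guess at how cross-language information
    might be exposed to the model.
    """
    word = tokens[i]
    translations = bilingual_dict.get(word.lower(), set())
    return {
        "lower": word.lower(),
        "is_title": word.istitle(),
        "has_digit": any(c.isdigit() for c in word),
        "suffix3": word[-3:].lower(),
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
        # Hypothetical bilingual-dictionary features.
        "in_dict": bool(translations),
        "n_translations": len(translations),
    }


if __name__ == "__main__":
    sentence = "We train a conditional random field for named entity recognition".split()
    author_keywords = ["Conditional Random Field", "named entity recognition"]
    toy_dict = {"field": {"場", "分野"}}  # toy English-to-Japanese entries
    for tok, tag in zip(sentence, induce_bio_tags(sentence, author_keywords)):
        print(tok, tag)
    print(token_features(sentence, 5, toy_dict))
```

The per-token feature dicts and induced tag sequences have the shape that typical CRF toolkits accept as training input. The number of translations is used here only as a rough proxy for ambiguity; other encodings of the dictionary information are equally plausible.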

© 2012 The Japanese Society for Artificial Intelligence