IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Special Section on Data Engineering and Information Management
A Survey of Thai Knowledge Extraction for the Semantic Web Research and Tools
Ponrudee NETISOPAKUL, Gerhard WOHLGENANNT

2018 Volume E101.D Issue 4 Pages 986-1002

Abstract

Because the manual creation of domain models and of linked data is very costly, the extraction of knowledge from structured and unstructured data has been one of the central research areas in the Semantic Web field over the last two decades. Here, we look specifically at the extraction of formalized knowledge from natural language text, which is the most abundant source of human knowledge available. While many tools are on hand for information and knowledge extraction from English text, the situation is different for written Thai. The goal of this work is to assess the state of the art of research on formal knowledge extraction specifically from Thai language text, and then to give suggestions and practical research ideas on how to improve it. To address this goal, we first distinguish nine tasks of knowledge extraction for the Semantic Web, as defined in the literature on knowledge extraction from English text, for example taxonomy extraction, relation extraction, and named entity recognition. For each of the nine tasks, we analyze the publications and tools available for Thai text in the form of a comprehensive literature survey. In addition to our own assessment, we measure the self-assessment of the Thai research community with the help of a questionnaire-based survey on each of the tasks. Furthermore, the structure and size of the Thai community are analyzed using complex literature database queries. Combining all the collected information, we finally identify research gaps in knowledge extraction from Thai language text. An extensive list of practical research ideas is presented, focusing on concrete suggestions for every knowledge extraction task that can be implemented and evaluated with reasonable effort. Besides the task-specific hints for improving the state of the art, we also include general recommendations on how to raise the efficiency of the respective research community.

© 2018 The Institute of Electronics, Information and Communication Engineers