The Journal of Japan Institute of Navigation (日本航海学会論文集)
Online ISSN : 2187-3275
Print ISSN : 0388-7405
ISSN-L : 0388-7405
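A Study on the Automatic Classification of the Reports of Judgement of the Marine Accident Inquiry Agency-III: A Method for Creating Dictionary Terms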
松村 尚志, 田中 穂積
Journal, free access

1997, Volume 97, pp. 259-267

Abstract

In our previous studies, we developed a method for categorizing the causes of collision accidents based on documents in the Reports of Judgement of the Marine Accident Inquiry Agency. We first constructed a dictionary of technical terms manually. Terms for categorization were then extracted from the Reports by longest-match lookup against this dictionary, and the frequencies of these terms in the documents were weighted for use in categorization. In the present study, we constructed the dictionary automatically and evaluated its validity by cross-validation. Three kinds of terms for categorization were generated from a dictionary of navigation technical terms: (1) combinations of navigation technical terms; (2) prefixes + navigation technical terms, or navigation technical terms + suffixes; and (3) kanji characters + the longest matching part of two navigation technical terms. Navigation technical terms that occur in the Reports were also added to the generated terms, and a dictionary of 628 terms was constructed by computer processing. The rate of successful categorization using this dictionary increased by 5.7% compared with that in our previous research. It was also found that low-frequency terms contribute little to categorization.
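
The longest-match extraction step mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the actual implementation or the weighting scheme, so this is only one plausible reading: a greedy left-to-right scan that prefers the longest dictionary entry at each position, followed by raw frequency counts. The function names, the toy dictionary, and the sample sentence are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def extract_terms(text, dictionary):
    """Greedy left-to-right longest-match extraction of dictionary terms.

    Assumption: at each position the longest dictionary entry wins, and
    the scan resumes after the matched term (no overlapping matches).
    """
    max_len = max((len(term) for term in dictionary), default=0)
    found = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first so that a long term is
        # preferred over any shorter term that is its prefix.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in dictionary:
                match = candidate
                break
        if match is not None:
            found.append(match)
            i += len(match)
        else:
            i += 1  # no term starts here; move on by one character
    return found


def term_frequencies(text, dictionary):
    """Count how often each extracted term occurs (unweighted)."""
    return Counter(extract_terms(text, dictionary))


# Hypothetical toy dictionary and sentence, for illustration only.
dictionary = {"見張り", "見張り不十分", "衝突", "針路"}
text = "見張り不十分のため衝突した"
freqs = term_frequencies(text, dictionary)
```

In this toy case the longer entry 見張り不十分 is extracted instead of its prefix 見張り, which is the behavior that distinguishes longest-match lookup from a plain substring search. Any weighting of the resulting frequencies would be applied on top of `freqs`.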


This article is provided under a Creative Commons [Attribution - NonCommercial - NoDerivatives 4.0 International] license.
https://creativecommons.org/licenses/by-nc-nd/4.0/deed.ja