1997, Vol. 97, pp. 259-267
In our previous studies, we developed a method for categorizing the causes of collision accidents based on documents in the Reports of Judgement of the Marine Accident Inquiry Agency. We first constructed a dictionary of technical terms manually. Terms for categorization were extracted from the Reports by the longest-match method using this dictionary. The frequencies of these terms in the documents were then weighted and used for categorization. In the present study, we constructed the dictionary automatically and evaluated its validity by cross-validation. Three kinds of terms for categorization were generated from a dictionary of navigation technical terms: (1) combinations of navigation technical terms; (2) prefixes + navigation technical terms, or navigation technical terms + suffixes; and (3) Kanji characters + the longest matching part of two navigation technical terms. Navigation technical terms that occur in the Reports were also added to the generated terms, and a dictionary of 628 terms was constructed by computer processing. The rate of successful categorization using this dictionary increased by 5.7% compared with that found in our previous research. It was also found that low-frequency terms contribute little to categorization.
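The longest-match extraction step mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `extract_terms`, the toy dictionary, and the sample sentence are all hypothetical, and the raw counts stand in for the frequencies that the abstract says are subsequently weighted (the weighting scheme is not specified here and is not reproduced).

```python
from collections import Counter


def extract_terms(text, dictionary):
    """Greedy longest-match extraction.

    At each position, take the longest dictionary term starting there;
    if none matches, advance by one character.
    """
    max_len = max((len(term) for term in dictionary), default=0)
    terms = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first, then progressively shorter ones.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in dictionary:
                match = candidate
                break
        if match is not None:
            terms.append(match)
            i += len(match)
        else:
            i += 1
    return terms


# Hypothetical toy dictionary (not taken from the paper):
# "見張り" (lookout), "見張り不十分" (insufficient lookout),
# "衝突" (collision), "避航" (giving way).
dictionary = {"見張り", "見張り不十分", "衝突", "避航"}

# Hypothetical sentence: "a collision occurred due to insufficient lookout".
text = "見張り不十分のため衝突した"

terms = extract_terms(text, dictionary)
freq = Counter(terms)  # raw term frequencies, before any weighting

print(terms)  # ['見張り不十分', '衝突']
```

Because the longer entry is tried first, "見張り不十分" is extracted as a single term rather than the shorter "見張り", which is the behavior the longest-match rule is meant to guarantee for unsegmented Japanese text.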