A Study on Characteristic-Word Extraction Using Statistical Measures (統計的指標を利用した特徴語抽出に関する研究)

中條 清美 (Kiyomi Chujo); 内山 将夫 (Masao Utiyama)

doi:10.20806/katejo.18.0_99

Abstract

Earlier studies have established that frequency of occurrence is effective for extracting specialized vocabulary from a corpus. What would happen if, rather than relying solely on frequency, a range of statistical measures were used? In this study, eight individual statistical measures and one combined measure, F_cum, were evaluated for their effectiveness in producing specialized vocabulary by comparing the extracted lists with existing specialized vocabulary control lists. The combined measure F_cum produced the lists most comparable to the control lists, followed in effectiveness by the Dice coefficient. All of the measures proved effective in producing useful specialized vocabulary, and each created a distinct list with regard to frequency, word length, type of word, and coverage of school textbook vocabulary. While frequency alone is an effective determiner of specialized vocabulary in a corpus, applying statistical measures is even more effective for extracting various types of specialized lists that can be targeted to students' vocabulary or proficiency levels.
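As an illustration of the kind of measure discussed in the abstract, the following is a minimal sketch of scoring words with a Dice coefficient and checking an extracted list against a control list. It assumes one common formulation, in which the coefficient measures the overlap between "tokens that are this word" and "tokens in the target corpus"; the abstract does not give the authors' exact formula. The function names `dice_scores` and `coverage`, and the toy token lists, are hypothetical.

```python
from collections import Counter


def dice_scores(target_tokens, reference_tokens):
    """Score each target-corpus word with a Dice coefficient (assumed formulation).

    a = occurrences of the word in the target corpus
    b = occurrences of the word in the reference corpus
    Dice = 2a / ((a + b) + n_target), where n_target is the target corpus size.
    """
    target = Counter(target_tokens)
    reference = Counter(reference_tokens)
    n_target = sum(target.values())
    scores = {}
    for word, a in target.items():
        b = reference.get(word, 0)
        scores[word] = 2 * a / (a + b + n_target)
    return scores


def coverage(extracted, control):
    """Fraction of control-list words that also appear in the extracted list."""
    control = set(control)
    return len(control & set(extracted)) / len(control)


# Toy data, purely illustrative.
target = ["enzyme", "the", "enzyme", "cell", "the"]
reference = ["the", "the", "the", "dog"]

scores = dice_scores(target, reference)
ranked = sorted(scores, key=scores.get, reverse=True)
top_two = ranked[:2]
print(top_two, coverage(top_two, ["enzyme", "cell"]))
```

Even in this toy case a frequent general word such as "the" outranks a rarer domain word, which fits the abstract's observation that each measure yields a list with its own frequency profile.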

Successor journal: 関東甲信越英語教育学会誌