Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
22nd (2008)
Session ID: 3B1-1

Classification of Biomedical Text Articles Using Multi-Class Support Vector Machine
*Dollah Rozilawati
Conference proceedings / abstracts, free access

Abstract

Biomedical paper abstracts accumulate in overwhelming numbers week after week on the PubMed Web site. The site is thus a rich source of life-science and biomedical textual information; at the same time, its scale makes it challenging to retrieve and classify conceptually similar abstracts solely by their content, rather than by their pre-defined categories or their linguistic similarities. We have observed that quite a few abstracts carry two or more kinds of categorical information; for instance, a single abstract may describe both HIV/AIDS and cancer. We therefore cannot rely completely on categorical information based on linguistic similarity that could be extracted from the abstracts alone. In this paper, we describe a method for classifying biomedical paper abstracts not by their linguistic similarity but by their content-based similarity, using a multi-class SVM and taking four differently categorized diseases as examples. Specifically, we collected abstracts that originally belong to the HIV/AIDS, cancer, hepatitis, and thyroid categories; we then merge them and re-classify them with the proposed method. Finally, we compare our results with the well-known MeSH terms, a pre-defined vocabulary available on the PubMed Web site that relates different terminology for the same concepts.
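
As a rough illustration of the multi-class SVM setup described above, the following minimal sketch trains one bias-free linear SVM per disease category (one-vs-rest) with hinge-loss subgradient descent and assigns a text to the highest-scoring category. Everything specific here is an assumption rather than the paper's method: the abstract does not state the feature representation, the multi-class scheme, or the solver. The bag-of-words counts in particular are only a stand-in, and arguably closer to the linguistic similarity the paper aims to go beyond. The toy sentences are hypothetical placeholders, not PubMed data.

```python
from collections import Counter

# Hypothetical toy corpus standing in for PubMed abstracts (not real data).
TRAIN = [
    ("hiv", "antiretroviral therapy reduces hiv viral load"),
    ("hiv", "hiv infection depletes cd4 cells"),
    ("cancer", "tumor growth and metastasis in cancer"),
    ("cancer", "chemotherapy shrinks tumor mass"),
    ("hepatitis", "hepatitis virus causes liver inflammation"),
    ("hepatitis", "liver fibrosis follows chronic hepatitis infection"),
    ("thyroid", "thyroid hormone regulates metabolism"),
    ("thyroid", "thyroid nodule and hormone imbalance"),
]


def tokenize(text):
    """Lowercase whitespace tokenizer (assumed; the paper's preprocessing is unknown)."""
    return text.lower().split()


# Vocabulary built from the training texts only.
VOCAB = sorted({tok for _, text in TRAIN for tok in tokenize(text)})


def vectorize(text):
    """Term-count vector over VOCAB; out-of-vocabulary tokens are ignored."""
    counts = Counter(tokenize(text))
    return [counts[term] for term in VOCAB]


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train_binary_svm(X, y, epochs=50, lr=0.1, lam=0.01):
    """Linear SVM without a bias term: subgradient descent on L2-regularized hinge loss."""
    w = [0.0] * len(VOCAB)
    for _ in range(epochs):
        for x, target in zip(X, y):
            violated = target * dot(w, x) < 1      # sample inside the margin?
            w = [wi * (1 - lr * lam) for wi in w]  # L2 shrinkage step
            if violated:                           # hinge-loss subgradient step
                w = [wi + lr * target * xi for wi, xi in zip(w, x)]
    return w


def train_one_vs_rest(samples):
    """One binary SVM per category: that category (+1) against all others (-1)."""
    X = [vectorize(text) for _, text in samples]
    labels = [label for label, _ in samples]
    return {
        c: train_binary_svm(X, [1 if label == c else -1 for label in labels])
        for c in sorted(set(labels))
    }


def classify(models, text):
    """Return the category whose SVM gives the highest decision score."""
    x = vectorize(text)
    return max(models, key=lambda c: dot(models[c], x))


MODELS = train_one_vs_rest(TRAIN)
print(classify(MODELS, "tumor metastasis"))
```

One-vs-rest is chosen here only because it keeps the per-category decision scores separate. For an abstract that touches two diseases, such as HIV/AIDS and cancer, those scores could be inspected individually instead of keeping only the top category.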

© 2008 The Japanese Society for Artificial Intelligence