2008 Volume 5 Pages 216-226
A method was developed to extract important topics from clusters of text documents covering many subjects, retrieved from the Nikkei newspaper. The hierarchical clustering algorithm UPGMA was used to generate a tree structure of clusters according to the similarity of document vectors defined by the nouns appearing in the documents. Document clustering proved to be closely related to the process of detecting societal problems: it classifies similar documents into topical groups and structures the groups according to their contents. The method was evaluated by applying it to the subject of organizational hazards caused by Japanese industries during 1990-2005.
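The clustering step can be illustrated with a minimal sketch. It assumes that nouns have already been extracted from each document, that document vectors are raw noun counts, and that dissimilarity is cosine distance; the abstract does not state the weighting or similarity measure actually used, and the function names `cosine_distance` and `upgma` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two noun-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumption: an empty document is maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def upgma(docs):
    """Cluster documents (lists of nouns) by average linkage (UPGMA).

    Returns (tree, merges): tree is a nested tuple of document indices,
    merges lists (left_subtree, right_subtree, distance) in merge order.
    """
    if not docs:
        raise ValueError("at least one document is required")
    vectors = [Counter(d) for d in docs]
    # Each cluster id maps to (subtree, number of documents in it).
    clusters = {i: (i, 1) for i in range(len(docs))}
    dist = {
        frozenset((i, j)): cosine_distance(vectors[i], vectors[j])
        for i, j in combinations(range(len(docs)), 2)
    }
    next_id = len(docs)
    merges = []
    while len(clusters) > 1:
        # Pick the closest pair of current clusters.
        pair = min(dist, key=dist.get)
        a, b = sorted(pair)
        d_ab = dist.pop(pair)
        tree_a, n_a = clusters.pop(a)
        tree_b, n_b = clusters.pop(b)
        # UPGMA update: the distance from the merged cluster to any other
        # cluster is the size-weighted mean of the two old distances.
        for c in clusters:
            d_ac = dist.pop(frozenset((a, c)))
            d_bc = dist.pop(frozenset((b, c)))
            dist[frozenset((next_id, c))] = (n_a * d_ac + n_b * d_bc) / (n_a + n_b)
        clusters[next_id] = ((tree_a, tree_b), n_a + n_b)
        merges.append((tree_a, tree_b, d_ab))
        next_id += 1
    (tree, _), = clusters.values()
    return tree, merges
```

On a toy input of four noun lists, two about product recalls and two about bank fraud, the sketch groups each topical pair first and joins the two groups last, which mirrors the classification and structuring of topical groups described above.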