Genome Informatics
Online ISSN : 2185-842X
Print ISSN : 0919-9454
ISSN-L : 0919-9454
A Graph-Based Clustering Method for a Large Set of Sequences Using a Graph Partitioning Algorithm
Hideya Kawaji, Yosuke Yamaguchi, Hideo Matsuda, Akihiro Hashimoto

2001 Volume 12 Pages 93-102

Abstract

A graph-based clustering method is proposed for clustering protein sequences into families; it automatically improves on the clusters produced by the conventional single linkage clustering method. Our approach formulates the sequence clustering problem as a graph partitioning problem on a weighted linkage graph, in which vertices correspond to sequences and edges connect pairs of sequences whose similarity exceeds a given threshold, each edge being weighted by that similarity. The effectiveness of our method is shown by comparison with InterPro families for all mouse proteins in SWISS-PROT. The resulting clusters match InterPro families much better than those of the single linkage clustering method: 77% of proteins in InterPro families are classified into appropriate clusters.
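
The weighted linkage graph and the single linkage baseline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy similarity scores, and the thresholds are all hypothetical, and the abstract does not say which partitioning algorithm refines the clusters, so that step is left out. The sketch relies only on the standard observation that single linkage clusters at a given threshold are the connected components of the thresholded graph.

```python
def build_linkage_graph(sequences, similarities, threshold):
    """Build an undirected weighted graph as a dict of dicts.

    Every sequence is a vertex. An edge is kept only when the pairwise
    similarity is higher than the threshold, and it is weighted by that
    similarity. (Illustrative helper, not from the paper.)
    """
    graph = {s: {} for s in sequences}
    for (a, b), score in similarities.items():
        if score > threshold:
            graph[a][b] = score
            graph[b][a] = score
    return graph


def single_linkage_clusters(graph):
    """Return the connected components of the graph as a list of sets.

    At a fixed threshold these components are the single linkage clusters.
    """
    seen = set()
    clusters = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        component = set()
        while stack:  # iterative depth-first traversal
            vertex = stack.pop()
            component.add(vertex)
            for neighbour in graph[vertex]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        clusters.append(component)
    return clusters


# Hypothetical toy data: two tight groups joined by one weak similarity.
sequences = ["P1", "P2", "P3", "P4", "P5"]
similarities = {
    ("P1", "P2"): 0.90,
    ("P2", "P3"): 0.80,
    ("P3", "P4"): 0.35,  # weak link between the two groups
    ("P4", "P5"): 0.85,
    ("P1", "P5"): 0.10,
}

graph = build_linkage_graph(sequences, similarities, threshold=0.3)
clusters = single_linkage_clusters(graph)
print(clusters)
```

In this toy input the single weak edge between P3 and P4 is enough for single linkage to merge both groups into one cluster at a threshold of 0.3. Because the edge weights are retained in the graph, a partitioning step could in principle cut such a low-weight edge, which is the kind of refinement the abstract appears to aim for.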

© Japanese Society for Bioinformatics