Journal of Information Processing
Online ISSN : 1882-6652
ISSN-L : 1882-6652
Fast and Parallel RankClus Algorithm based on Dynamic Rank Score Tracking
Kotaro YamazakiShohei MatsuguHiroaki ShiokawaHiroyuki Kitagawa
Author information
JOURNAL FREE ACCESS

2020 Volume 28 Pages 453-461

Details
Abstract

Given a bi-type information network, which is an extended model of well-known bipartite graphs, how can clusters be efficiently found in graphs? Graph clustering is now a fundamental tool to understand overviews from graph-structured data. The RankClus framework accurately performs clustering for bi-type information networks using ranking-based graph clustering techniques. It integrates graph ranking algorithms such as PageRank or HITS into graph clustering procedures to improve the clustering quality. However, this integration incurs a high computational cost to handle large bi-type information networks since RankClus repeatedly computes the ranking algorithm for all nodes and edges until the clustering procedure converges. To overcome this runtime limitation, we present a novel RankClus algorithm that reduces the running time for large bi-type information networks. Our proposed method employs dynamic graph processing techniques into the iterative ranking procedures included in RankClus. By dynamically updating ranking results, our proposal reduces the number of computed nodes and edges during repeated ranking procedures. For further improving the efficiency, we also present a parallel implementation of our proposed algorithm by using thread-based parallelization on a modern manycore CPU. We experimentally verify using real-world datasets that our fast and parallel proposed methods successfully reduces the running time while maintaining the clustering quality of RankClus.

Content from these authors
© 2020 by the Information Processing Society of Japan
Previous article Next article
feedback
Top