Abstract
This study has two objectives. The first is to introduce a data-preprocessing method, particularly applicable to large-scale bibliographic databases, that reduces the number of misidentified entries. The method uses authors as points of reference to link error-prone affiliation IDs with theoretically error-free IDs, using carefully constructed inclusive and exclusive search terms. The results show that the method effectively reduces the number of improper affiliation IDs. The second objective is to explore research productivity across Japanese research organizations using the corrected database. Together, our preprocessing and article-classification methods offer fine-grained and representative descriptions of the productivity of research organizations in Japan.
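
As a rough illustration of the linking idea only, the following Python sketch shows one way inclusive and exclusive search terms might be applied to author-affiliation records. It is not the procedure used in this study: the record layout, the function names (`matches_terms`, `link_ids`), the sample strings, and the identifier `ORG-EXAMPLE` are all hypothetical, and counting distinct supporting authors is an assumption made for the example.

```python
from collections import defaultdict


def matches_terms(text, include_terms, exclude_terms):
    """Return True if text contains any inclusive term and no exclusive term."""
    lowered = text.lower()
    included = any(term.lower() in lowered for term in include_terms)
    excluded = any(term.lower() in lowered for term in exclude_terms)
    return included and not excluded


def link_ids(records, include_terms, exclude_terms, canonical_id):
    """Map error-prone affiliation IDs to one canonical ID (hypothetical scheme).

    Each record is (author_id, affiliation_id, affiliation_text). An
    affiliation ID is linked when its text passes the term filter; the number
    of distinct authors behind the link is kept as a simple measure of support.
    """
    supporting_authors = defaultdict(set)
    for author_id, affiliation_id, affiliation_text in records:
        if matches_terms(affiliation_text, include_terms, exclude_terms):
            supporting_authors[affiliation_id].add(author_id)
    return {
        affiliation_id: (canonical_id, len(authors))
        for affiliation_id, authors in supporting_authors.items()
    }


# Invented sample data, for illustration only.
records = [
    ("author-1", "aff-100", "Dept. of Physics, Example University, Tokyo"),
    ("author-2", "aff-101", "Example Univ., Graduate School of Science"),
    ("author-3", "aff-102", "Example University Hospital"),
    ("author-1", "aff-101", "Example University"),
]
links = link_ids(
    records,
    include_terms=["example univ"],
    exclude_terms=["hospital"],
    canonical_id="ORG-EXAMPLE",
)
```

In this sketch the exclusive term keeps the hospital record from being merged into the university, while the two spelling variants of the university name are linked to the same canonical ID.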