Journal of Japan Society of Library and Information Science
Online ISSN : 2432-4027
Print ISSN : 1344-8668
ISSN-L : 1344-8668
Article
Automatic Identification of Duplicate Records and "Works" in Japanese Union Catalogs : An Experiment on UNICANET Bibliographic Records
Shoichi TANIGUCHI
JOURNAL OPEN ACCESS

2011 Volume 57 Issue 4 Pages 124-140

Abstract

Automatic identification of duplicate records and "works" was attempted on bibliographic records in UNICANET, a union catalog operated by the National Diet Library. Identifying duplicates means grouping records that represent the same resource, while identifying "works" means grouping records that share the same work as defined in FRBR. This paper reports the extent to which records can be automatically identified as members of a particular resource and of a particular work, and which of the possible alternatives are effective. The method used in this study is to extract data values from certain fields in records encoded in the DC-NDL schema, normalize those values, and then generate identification keys that are matched against a database that incrementally stores the identified records. Several ways of choosing fields and values for title and author name, of combining the generated identification keys, and other choices were examined, and record grouping was executed for each way. The automatically built record groups were evaluated by comparing them with sample correct sets built manually. The results of the experiment show that automatic identification of duplicates and works is fully achieved. They also show that it is effective (a) to use the proposed normalization, (b) regarding the choices in titles, to adopt titles and their transcriptions comprehensively, except series titles, and to apply decomposition and recombination of titles when generating the title identification keys, and (c) as for authors, to adopt author names and their transcriptions comprehensively, and to use publishers when no author is found.
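
The extract, normalize, generate-key, and match pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the record keys `id`, `title`, `creator`, and `publisher`, the function names, and the normalization rule (NFKC folding, case folding, and dropping non-alphanumeric characters) are assumptions made for illustration, and an in-memory dictionary stands in for the database of identified records. The sketch covers only a work-level key built from title and author; it does not reproduce the title decomposition and recombination or the use of transcriptions that the abstract mentions.

```python
import unicodedata


def normalize(value: str) -> str:
    """Hypothetical normalization: fold width and case, keep letters and digits."""
    # NFKC maps full-width Latin letters and ideographic spaces to plain forms.
    text = unicodedata.normalize("NFKC", value).casefold()
    # Drop punctuation and whitespace so spacing differences do not matter.
    return "".join(ch for ch in text if ch.isalnum())


def make_work_key(record: dict) -> str:
    """Build an identification key from title and author (assumed field names)."""
    title = normalize(record.get("title", ""))
    # Fall back to the publisher when no author is recorded.
    agent = normalize(record.get("creator") or record.get("publisher", ""))
    return f"{title}|{agent}"


def group_records(records: list) -> dict:
    """Match each record's key against the groups identified so far."""
    groups = {}  # key -> list of record ids; stands in for the database
    for record in records:
        groups.setdefault(make_work_key(record), []).append(record["id"])
    return groups
```

Processing records one at a time against the accumulated groups mirrors the incremental matching described above. Identifying duplicates of the same resource, as opposed to the same work, would presumably require a stricter key built from additional fields, which this sketch leaves out.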

© 2011 Japan Society of Library and Information Science