Abstract
This paper investigates the task of document similarity judgment for interactive document clustering. We consider incorporating a user feedback mechanism into constrained clustering to be one of the promising approaches for developing the next generation of web search engines. As a basis for designing such search engines, it is important to study interface designs that can reduce the user's burden of giving feedback to a system. This paper focuses on the task of judging the similarity of two documents as the primitive task for user feedback, and compares three types of information to be presented to users: snippets, topic terms, and original text. In particular, we propose snippets suitable for document similarity judgment, which are of two kinds: common snippets, which show the part the documents have in common, and specific snippets, which show the differences between the documents. An experiment was conducted with 21 test participants, who were asked to judge the similarity of document pairs under the three conditions. The conditions are compared in terms of judgment time and accuracy using ANOVA and chi-square analysis. The typical judging behavior of the participants is also investigated using an eye-tracking system.