Abstract
To consolidate the varied name expressions that refer to the same entity, and thereby make tweet topic retrieval and clustering more efficient, a method for constructing Twitter dictionaries based on tweet extraction and the time-correlation of keywords is proposed. In the proposed method, similarities between keywords are calculated from the time-correlation between words and their co-occurrence probability. Furthermore, the proposed method divides the target timeline into segments to reduce the computational cost of dictionary construction. Experiments on 101,714 tweets carrying hashtags related to ``NHK kohaku-utagassen'' confirm the effectiveness of the proposed division method compared with computation over the entire timeline.
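
As a rough illustration of the two similarity signals named above, the following minimal Python sketch computes a time-correlation and a co-occurrence score for a pair of keywords, and also evaluates the correlation per timeline segment. It is not the authors' formulation: the function names, the use of the Pearson coefficient as the time-correlation, the Jaccard-style ratio as the co-occurrence probability, and the fixed-length segmentation are all assumptions made for illustration. Tweets are assumed to be given as `(timestamp_in_seconds, set_of_tokens)` pairs.

```python
from math import sqrt


def time_series(tweets, keyword, bin_seconds, n_bins):
    """Count the tweets containing `keyword` in each time bin."""
    counts = [0] * n_bins
    for timestamp, tokens in tweets:
        b = int(timestamp // bin_seconds)
        if 0 <= b < n_bins and keyword in tokens:
            counts[b] += 1
    return counts


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def cooccurrence(tweets, a, b):
    """Assumed definition: tweets containing both keywords / tweets containing either."""
    both = sum(1 for _, tokens in tweets if a in tokens and b in tokens)
    either = sum(1 for _, tokens in tweets if a in tokens or b in tokens)
    return both / either if either else 0.0


def segment_correlations(x, y, seg_len):
    """Hypothetical division step: correlate the two series segment by segment."""
    return [
        pearson(x[i:i + seg_len], y[i:i + seg_len])
        for i in range(0, len(x), seg_len)
    ]
```

Given the per-bin counts of two keywords from `time_series`, `pearson` yields a whole-timeline score, while `segment_correlations` yields one score per segment; how the segment scores and the co-occurrence value are combined into a single similarity is left open here, since the abstract does not specify it.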