2006 Volume 13 Issue 2 Pages 3-26
This paper proposes a method for detecting topic boundaries using a topical cohesion profile measured by term repetition distances. A set of term repetitions composes a topical potential. The sum of the topical potentials composes a topical cohesion profile, in which hills correspond to dominant topics and valleys correspond to segment boundaries. When newspaper articles are connected sequentially, the ends of the articles indicate topical segment boundaries. In the experiment, the method is tested on how many segment boundaries it detects at article ends. The results showed 67.8% recall and 61.8% precision. The method is applicable to long essays and is also effective for short texts.
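The idea of a profile built from term repetitions can be illustrated with a minimal sketch. This is not the paper's formulation: the abstract does not give the exact definition of a topical potential, so the 1/distance weighting, the `max_distance` cutoff, and the names `cohesion_profile` and `valleys` are all assumptions made for illustration. Each pair of consecutive occurrences of a term adds weight to every gap between them, and gaps where the summed profile reaches a local minimum are reported as candidate boundaries.

```python
from collections import defaultdict


def cohesion_profile(tokens, max_distance=50):
    """Sum, at each gap between tokens, the weights of repetitions spanning it.

    profile[k] describes the gap between tokens[k] and tokens[k + 1].
    The 1/distance weight and max_distance cutoff are illustrative assumptions.
    """
    positions = defaultdict(list)
    for index, token in enumerate(tokens):
        positions[token].append(index)

    profile = [0.0] * max(len(tokens) - 1, 0)
    for occurrences in positions.values():
        # Consider each pair of consecutive occurrences of the same term.
        for start, end in zip(occurrences, occurrences[1:]):
            distance = end - start
            if distance > max_distance:
                continue  # repetitions too far apart are ignored (assumption)
            weight = 1.0 / distance  # closer repetitions count more (assumption)
            for gap in range(start, end):
                profile[gap] += weight
    return profile


def valleys(profile):
    """Return interior gap indices where the profile has a local minimum."""
    return [
        k
        for k in range(1, len(profile) - 1)
        if profile[k] < profile[k - 1] and profile[k] <= profile[k + 1]
    ]


tokens = "a b a b c d c d".split()
profile = cohesion_profile(tokens)
print(profile)           # [0.5, 1.0, 0.5, 0.0, 0.5, 1.0, 0.5]
print(valleys(profile))  # [3]
```

In this toy input the only valley is gap 3, between the last `b` and the first `c`, which is where the vocabulary changes. Comparing such candidate gaps against known article ends is how recall and precision figures like those reported above could be computed.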