Abstract
Transitions in an audio signal known as audio-scene cuts are widely used to segment audio-visual material, and several methods for detecting them have been proposed. However, because most of these methods divide the audio signal into fixed-length time intervals before indexing, users cannot obtain the exact time of an audio-scene cut. To detect audio-scene cuts more accurately, we utilize the fuzzy c-means algorithm. The proposed method works as follows: (1) it detects all possible transitions in the audio signal without missed detections; (2) it divides the audio signal into segments bounded by the transitions detected in (1); (3) it classifies the segments into speech, music, and silence classes; (4) it merges semantically similar adjacent segments by comparing the audio classes obtained in (3) and detects the audio-scene cuts. Furthermore, because every step operates directly on the sub-band data of the MPEG audio signal, the proposed method can be applied to audio-visual indexing without any decoding procedure.