Abstract
Video segmentation is an important step in many video processing applications. Observing that a video shot boundary manifests as a multi-(temporal)-resolution edge in the feature space, we develop a general framework that handles all types of transitions in a consistent manner. We employ the wavelet transform to obtain the first-order derivatives of the video feature signals in the frequency domain, and use this information across multiple temporal resolutions to detect, classify, and locate both abrupt (CUT) and gradual (GT) transitions. We test our method on the MPEG-7 video data set, which comprises about 13 hours of video. The results demonstrate that our framework is effective and exhibits good noise tolerance. As part of this work, we propose the MPEG-7 data set, together with its ground-truth annotations, as a standard test set for video segmentation research.
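The multi-resolution edge idea can be illustrated on a synthetic per-frame feature signal: a hard cut appears as a step edge that is strong at every temporal scale, while a gradual transition appears as a ramp that only registers at coarse scales. The sketch below is illustrative only; the Haar wavelet, window sizes, and the synthetic signal are assumptions for demonstration, not the paper's exact configuration.

```python
import numpy as np

def haar_detail(x, scale):
    """Haar-style wavelet detail at dyadic `scale`: the difference of the
    means of adjacent windows of length 2**scale, i.e. a smoothed
    first-order derivative of the feature signal."""
    w = 2 ** scale
    d = np.empty(len(x) - 2 * w + 1)
    for t in range(len(d)):
        d[t] = x[t + w : t + 2 * w].mean() - x[t : t + w].mean()
    return d

# Synthetic per-frame feature: a CUT (step edge) at frame 50 and a
# gradual transition (GT, ramp edge) over frames 120-149.
feat = np.concatenate([
    np.zeros(50),                 # shot A
    np.ones(70),                  # shot B after an abrupt cut
    np.linspace(1.0, 3.0, 30),    # gradual transition into shot C
    np.full(50, 3.0),             # shot C
])

fine = np.abs(haar_detail(feat, 1))    # 2-frame windows (fine resolution)
coarse = np.abs(haar_detail(feat, 4))  # 16-frame windows (coarse resolution)

# The CUT produces a sharp, dominant peak at the fine scale...
cut_frame = int(np.argmax(fine)) + 2   # offset by window length w = 2

# ...while the GT is nearly invisible at the fine scale but clearly
# visible at the coarse scale, which is how the two are distinguished.
gt_fine_response = fine[125:145].max()
gt_coarse_response = coarse[110:150].max()
```

Here `cut_frame` recovers the step edge at frame 50, `gt_fine_response` stays small (the ramp's per-frame change is tiny), and `gt_coarse_response` is large, so comparing detail magnitudes across scales both detects and classifies the transition.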