Abstract
We propose a video completion method that automatically removes unwanted objects from video. The user selects an object in a single frame through our interface, and the corresponding object is then extracted from every frame of the video. Using estimated camera and object motion, the method achieves visually plausible completion that accounts for spatio-temporal features by repeatedly exploiting the temporal correlation between adjacent frames and the spatial correlation between adjacent pixels within a frame. We evaluated processing time and visual plausibility in experiments on removing captions from video.