Abstract
This paper presents experiments conducted to evaluate an automatic video editing system, based on vision-based head tracking, that clearly conveys face-to-face multiparty conversations, such as meetings, to viewers. Systems that archive meetings and teleconferences to facilitate human communication are attracting considerable interest. Conventional systems use a fixed-viewpoint camera and simple camera selection based on participants' utterances. Unfortunately, they fail to adequately convey to viewers information such as who is talking to whom and the participants' nonverbal behavior. To solve this problem, we previously proposed an automatic video editing system using vision-based head tracking. This paper describes subjective evaluation experiments in which videos of entire conversations among three participants were presented to viewers; the results confirm the effectiveness of our system.