Abstract
A new method for the real-time detection of facial expressions from time-sequential images is proposed. Unlike the current implementation for virtual space teleconferencing, the proposed method does not require tape marks pasted on the face to detect expressions in real time. In the proposed method, four windows are applied to four areas of a facial image: the left eye, the right eye, the mouth, and the forehead. Each window is divided into blocks of 8 by 8 pixels. The discrete cosine transform (DCT) is applied to each block, and the feature vector of each window is obtained by summing the DCT energies in the horizontal, vertical, and diagonal directions. To convert the DCT features into virtual tape mark movements, we represent the displacement of a virtual tape mark by a polynomial of the DCT features for the three directions. We apply a genetic algorithm to training facial expression image sequences to find the optimal set of coefficients that minimizes the difference between the real and converted displacements of the virtual tape marks. Experimental results show the effectiveness of the proposed method.
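To make the feature-extraction and conversion steps concrete, the following is a minimal Python sketch, not the paper's implementation. The scipy-based `dctn` call, the grouping of coefficients into horizontal, vertical, and diagonal sets, and the second-order form of the displacement polynomial are all assumptions made for illustration; the abstract specifies only 8-by-8 blocks, directional energy sums, a polynomial mapping, and a genetic algorithm minimizing the displacement error.

```python
import numpy as np
from scipy.fft import dctn

def window_dct_features(window: np.ndarray) -> np.ndarray:
    """Summed DCT energies of one facial window (eye, mouth, or forehead).

    The window is split into 8x8 blocks; for each block the 2-D DCT is
    taken and the squared coefficients are accumulated into three
    directional sums. The grouping below (first row -> horizontal,
    first column -> vertical, remainder -> diagonal) is an assumption;
    the abstract only states that energies are summed per direction.
    """
    h, w = window.shape
    e_h = e_v = e_d = 0.0
    for y in range(0, h - h % 8, 8):
        for x in range(0, w - w % 8, 8):
            c = dctn(window[y:y + 8, x:x + 8].astype(float), norm="ortho")
            energy = c ** 2
            e_h += energy[0, 1:].sum()   # horizontal-direction energy
            e_v += energy[1:, 0].sum()   # vertical-direction energy
            e_d += energy[1:, 1:].sum()  # diagonal-direction energy
    return np.array([e_h, e_v, e_d])

def tape_mark_displacement(features: np.ndarray, coeffs: np.ndarray) -> float:
    """Hypothetical second-order polynomial mapping the three DCT
    features (E_h, E_v, E_d) to one displacement component of a
    virtual tape mark; the genetic algorithm searches for `coeffs`.
    """
    eh, ev, ed = features
    terms = np.array([1.0, eh, ev, ed,
                      eh * eh, ev * ev, ed * ed,
                      eh * ev, eh * ed, ev * ed])
    return float(coeffs @ terms)

def ga_fitness(coeffs: np.ndarray, feats: np.ndarray, real_disp: np.ndarray) -> float:
    """Fitness for the genetic algorithm: summed squared error between
    the real tape-mark displacements measured from training sequences and
    the displacements converted from the DCT features (to be minimized)."""
    pred = np.array([tape_mark_displacement(f, coeffs) for f in feats])
    return float(((pred - real_disp) ** 2).sum())
```

Under these assumptions, each GA chromosome would encode one coefficient vector per virtual tape mark, and the population would evolve toward the vector with the smallest accumulated error over the training sequences.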