Abstract
We use color and shape information to detect and locate human faces in two-dimensional natural scene images. A color input image is first segmented using prespecified domains of hue and saturation that describe the color of human skin. After thresholding in hue and saturation, we group the regions of the binarized image that have been classified as potential face candidates into a small number of clusters of connected pixels. We then perform a shape analysis by computing moments for each cluster that are invariant to translation, scale, and in-plane rotation. To distinguish faces from distractors, a multilayer perceptron neural network is used with the invariant moments as the input vector. Supervised learning of the network is implemented with the backpropagation algorithm, initially for frontal views of faces.
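The following is a minimal sketch of the candidate-extraction stage described above, not the authors' implementation. It assumes OpenCV is available and uses hypothetical hue/saturation bounds, a hypothetical minimum cluster size, and Hu moments as one standard choice of translation-, scale-, and rotation-invariant moments.

```python
import cv2
import numpy as np


def find_face_candidates(bgr_image,
                         hue_range=(0, 25),      # hypothetical skin-hue bounds
                         sat_range=(40, 180)):   # hypothetical saturation bounds
    """Segment skin-colored regions and return invariant-moment features
    for each connected cluster of candidate pixels."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)

    # Threshold in hue and saturation only; value (brightness) is left unconstrained.
    lower = np.array([hue_range[0], sat_range[0], 0], dtype=np.uint8)
    upper = np.array([hue_range[1], sat_range[1], 255], dtype=np.uint8)
    mask = cv2.inRange(hsv, lower, upper)

    # Group the binarized skin pixels into connected clusters.
    num_labels, labels = cv2.connectedComponents(mask)

    features = []
    for label in range(1, num_labels):           # label 0 is the background
        cluster = (labels == label).astype(np.uint8)
        if cluster.sum() < 200:                  # discard tiny clusters (arbitrary cutoff)
            continue
        # Hu moments are invariant to translation, scale, and in-plane rotation.
        hu = cv2.HuMoments(cv2.moments(cluster, binaryImage=True)).flatten()
        features.append((label, hu))
    return features
```

In the abstract's pipeline, each cluster's invariant-moment vector would then be passed to a multilayer perceptron trained with backpropagation to decide whether the cluster is a face or a distractor; any off-the-shelf MLP implementation could play that role in this sketch.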