2003 Volume 123 Issue 7 Pages 1243-1252
In human-robot interaction, the ability to detect and share the user's attention is a minimal requirement for an intelligent robot, since it is important for the robot to know the human's internal state. Here we present an algorithm, based on face posture estimation and spatiotemporal image processing, that calculates a saliency map in order to form shared attention. After estimating the face posture, we introduce an elliptic cone to approximate the user's visual field, whose axis is fitted to the user's gaze line; the gaze line itself does not need to be detected beforehand. A visual acuity map on the user's retina can then be derived from a formulation of human visual acuity. We compute the saliency map as a recency-weighted average of the visual acuity maps along the time axis, so that dynamic scenes (for example, when the user's gaze line shifts to a new object, or when the gazed object moves) affect the saliency map; moving image regions are tracked so that visual acuity values propagate from the current frame to the next. Finally, we use the saliency map to form shared attention in human-robot interaction, and we also show that it is possible to detect the user's attention from face orientation alone, even when the eyes cannot be observed clearly.
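The recency-weighted temporal averaging described above can be sketched as an exponential moving average over per-frame acuity maps. The following is a minimal illustration, not the paper's exact formulation: the function name `update_saliency`, the blending weight `alpha`, and the toy gaze-shift scenario are all assumptions made for the example.

```python
import numpy as np

def update_saliency(saliency, acuity, alpha=0.3):
    """Recency-weighted (exponential moving) average of visual acuity maps.

    saliency : running saliency map (H x W), or None at the first frame
    acuity   : visual acuity map for the current frame (H x W)
    alpha    : weight of the newest frame; larger alpha forgets the past faster
    """
    if saliency is None:
        return acuity.copy()
    return alpha * acuity + (1.0 - alpha) * saliency

# Toy example of a gaze shift: the acuity peak moves from one image
# region to another halfway through a short sequence of frames.
h, w = 4, 4
frames = [np.zeros((h, w)) for _ in range(6)]
for t, f in enumerate(frames):
    f[0, 0] = 1.0 if t < 3 else 0.0   # initially gazed region
    f[3, 3] = 0.0 if t < 3 else 1.0   # newly gazed region

saliency = None
for f in frames:
    saliency = update_saliency(saliency, f)

# After the shift, the new region dominates the saliency map, while the
# previously gazed region has decayed but not yet vanished -- this is the
# "dynamic scene" behavior the averaging is meant to capture.
```

With `alpha=0.3` the old peak decays by a factor of 0.7 per frame, so a few frames after the shift the new region outweighs the old one without the history being discarded outright.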