Abstract
We describe a software framework in which gesture detection components are parametrically defined and applied to depth images with a human in the field of view. Using depth data, the framework can quickly switch between a wide view and a narrow view, depending on whether the person in view is performing the gesture in the xy-plane of the body or in front of the body. Gesture recognition is based on extracting local maxima and local minima of the border pixels of the region of interest. Example gestures include finger counting, waving, "come here", and "It's hot!", and the framework can be applied to robot systems that require human-robot interaction through robot vision.
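To make the border-extrema idea concrete, the following is a minimal sketch (not the paper's implementation) of one way to segment a region of interest from a depth frame, walk its border pixels, and count local maxima of the centroid-distance profile as fingertip candidates. The depth band, smoothing window, and the 1.2x-of-mean-radius threshold are illustrative assumptions, as is the function name `count_fingertips`.

```python
import numpy as np
import cv2

def count_fingertips(depth, near_mm=400, far_mm=900, window=15):
    """Count fingertip candidates in a depth frame (uint16, millimetres)."""
    # Segment the region of interest (assumed to be the hand) by depth banding.
    mask = ((depth > near_mm) & (depth < far_mm)).astype(np.uint8) * 255
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_NONE)
    if not contours:
        return 0
    border = max(contours, key=cv2.contourArea).reshape(-1, 2).astype(float)

    # Distance of each border pixel from the region centroid.
    centroid = border.mean(axis=0)
    dist = np.linalg.norm(border - centroid, axis=1)

    # Smooth the circular distance profile so pixel noise does not
    # create spurious extrema (wrap-around padding keeps it circular).
    kernel = np.ones(window) / window
    padded = np.concatenate([dist[-window:], dist, dist[:window]])
    smooth = np.convolve(padded, kernel, mode="same")[window:-window]

    # A border pixel is a local maximum (fingertip candidate) if it
    # exceeds both circular neighbours and stands clearly above the
    # mean radius; local minima (valleys between fingers) could be
    # found symmetrically with the comparisons reversed.
    prev_ = np.roll(smooth, 1)
    next_ = np.roll(smooth, -1)
    maxima = (smooth > prev_) & (smooth > next_) & (smooth > 1.2 * smooth.mean())
    return int(maxima.sum())
```

In the same spirit, the wide/narrow view switch described above would amount to choosing the depth band (and hence the region of interest) before this contour analysis runs.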