This paper describes a neural network approach intended to improve the performance of a voice recognition device by using not only voice-sound features but also image features of the mouth shape. The FFT power spectrum is used as the voice feature. In addition, three kinds of mouth features are tested for comparison: the gray-level image, the binary image, and geometrical shape features of the mouth. The aim is to check which kind of feature is most effective for voice recognition by a neural network. For unrestricted speakers, we obtained a recognition rate of about 80% using voice features alone and 92% using voice features plus the binary image. The approach is also applicable to communication aids for hearing-impaired people.
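As an illustration of the kind of input described above, the following minimal sketch builds one feature vector from a voice frame and a mouth image. It is not the authors' implementation: the function names, the naive DFT used in place of an FFT routine, the threshold of 128, and the simple concatenation of the two feature sets are all assumptions made for this example.

```python
import cmath


def power_spectrum(frame):
    """Power spectrum |X[k]|^2 of a real-valued frame for k = 0 .. N//2.

    A naive O(N^2) DFT is used here for self-containment; an FFT
    routine would give the same values faster.
    """
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(frame)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def binarize(image, threshold):
    """Flatten a gray-level image (rows of pixel values) into 0/1 values.

    The threshold is a hypothetical parameter, not taken from the paper.
    """
    return [1.0 if pixel >= threshold else 0.0 for row in image for pixel in row]


def combined_features(frame, mouth_image, threshold=128):
    """Concatenate voice and mouth-shape features into one input vector.

    Concatenation is an assumption; the abstract does not state how the
    two feature sets are presented to the network.
    """
    return power_spectrum(frame) + binarize(mouth_image, threshold)
```

For a 4-sample frame and a 2x2 mouth image, `combined_features` returns 3 spectral values followed by 4 binary pixel values, which could then serve as the input layer of a classifier.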