2015 Volume 36 Issue 6 Pages 467-477
This paper describes speech processing work in which articulator movements are used in conjunction with the acoustic speech signal and/or linguistic information. By ``articulator movements,'' we mean the changing positions of human speech articulators such as the tongue and lips, which may be recorded by electromagnetic articulography (EMA), amongst other articulography techniques. Specifically, we provide an overview of: i) inversion mapping techniques, where we estimate articulator movements from a given new speech waveform automatically; ii) statistical voice conversion and speech synthesis techniques which use articulator movements as part of the process to generate synthetic speech, and also make it intuitively controllable via articulation; and iii) automatic prediction (or synthesis) of articulator movements from any given new text input.