Anthropological Science
Online ISSN : 1348-8570
Print ISSN : 0918-7960
ISSN-L : 0918-7960
Reviews
Excised larynx experimentation: history, current developments, and prospects for bioacoustic research
MAXIME GARCIA, CHRISTIAN T. HERBST

2018 Volume 126 Issue 1 Pages 9-17

Abstract

The study of sound production mechanisms is a crucial, yet understudied, aspect of vocal communication research in vertebrates. In excised larynx experimentation (ELE), phonation is simulated ex vivo by forcing air through a larynx specimen mounted on a laboratory bench. The method provides unique insights into vocal production and allows inference of in vivo conditions. Here, we provide a historical overview of how this technique has been implemented, from antiquity to current state-of-the-art setups. We review the advances made by applying ELE to human voice and biophysics research. We then highlight the promising research output resulting from ELE in animal bioacoustics, a research field that has largely overlooked the use of this method until very recently, but is now increasingly relying on this tool. We continue by discussing the limitations of ELE, depending on the focus of investigation. Finally, we suggest how this approach should be implemented and can be applied to various research questions. We conclude by underlining the value that ELE contributes to the comprehension of human voice as well as mammalian and avian vocal communication within an interdisciplinary approach.

Introduction

An important aspect of primate research is the investigation of vocal communication within and across species of this taxon (Seyfarth and Cheney, 2010). On a physical level, vocal signaling is established via sound production, propagation, and reception (Bradbury and Vehrencamp, 2011). In humans, non-human primates, and most other mammals, sound production is governed by a universal physical principle, described in the myoelastic aerodynamic (MEAD) theory (van den Berg, 1958; Titze, 2006; Herbst et al., 2012): steady airflow, coming from the lungs, is converted into a sequence of airflow pulses by the passively vibrating vocal folds (or other laryngeal tissue), resulting in self-sustained oscillation. The acoustic pressure waveform generated by this sequence of flow pulses excites the vocal tract, which filters it acoustically, and the result is radiated from the mouth (and/or the nose) (see Herbst (2016) for a review). The latter phenomenon, involving individual contributions from both the laryngeal sound source and the vocal tract to the quality of the emitted sound, has been described by the source–filter theory of sound production (Chiba and Kajiyama, 1941; Fant, 1960) and its non-linear extension (Flanagan, 1968; Titze, 2008). Two major parameters of the human voice are explained by this theory: the fundamental frequency (fo), which corresponds to the rate of vibration of the vocal folds in the larynx, and the formants, which result from acoustic filtering by the vocal tract.

Acoustic recordings typically constitute the simplest approach for documenting and assessing sound production. However, this investigation paradigm suffers from several disadvantages: for one, acoustic (field) recordings of non-human mammals can be polluted by ambient background noise and vocalizations from conspecifics (Fischer et al., 2013). More importantly, as outlined by the source–filter theory, the radiated sound is a convolution of the sound generated within the larynx and the frequency-dependent filtering effects of the vocal tract. Consequently, it is challenging to isolate the individual contributions of these two voice subsystems (Drugman et al., 2014), particularly when the recording has not been made in a sound-treated room devoid of background noise. Finally, the radiated acoustic sound is only the end product of the sound generation process. It is thus difficult (and often impossible) to reliably infer the physiological and physical sound production parameters such as subglottal pressure (Schutte, 1980), contraction states of intrinsic laryngeal muscles (Ohala et al., 1968), or the laryngeal phonatory configuration (Herbst et al., 2011) from the emitted sound alone.
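The convolution argument can be made concrete with a small numerical sketch. The example below is only an illustration under simplifying assumptions (an idealized impulse train as the laryngeal source, single damped resonances standing in for vocal tract filtering, and arbitrary, hypothetical parameter values); it is not taken from any of the cited studies. It shows that the same radiated signal is obtained whether an additional resonance is attributed to the source or to the filter, which is one reason why the two subsystems are hard to separate from the radiated sound alone.

```python
import math


def convolve(x, h):
    """Discrete linear convolution of two sequences given as plain lists."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        if xi == 0.0:
            continue  # skip silent samples of a sparse input
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def pulse_train(fo_hz, n_samples, fs_hz):
    """Idealized source: one unit impulse per vibratory cycle."""
    period = round(fs_hz / fo_hz)
    return [1.0 if k % period == 0 else 0.0 for k in range(n_samples)]


def damped_resonance(freq_hz, bandwidth_hz, n_samples, fs_hz):
    """Impulse response of a single damped resonance (a stand-in for one formant)."""
    return [
        math.exp(-math.pi * bandwidth_hz * k / fs_hz)
        * math.sin(2.0 * math.pi * freq_hz * k / fs_hz)
        for k in range(n_samples)
    ]


FS = 8000  # sampling rate in Hz (hypothetical)
source = pulse_train(100.0, 400, FS)          # fo = 100 Hz, 50 ms
tract = damped_resonance(500.0, 80.0, 160, FS)
radiated = convolve(source, tract)            # radiated sound = source * filter

# The same output results whether an extra resonance is assigned to the
# source or to the filter, because convolution is associative.
extra = damped_resonance(1500.0, 120.0, 80, FS)
radiated_a = convolve(convolve(source, extra), tract)  # resonance "in the source"
radiated_b = convolve(source, convolve(extra, tract))  # resonance "in the filter"
max_difference = max(abs(a - b) for a, b in zip(radiated_a, radiated_b))
```

In this sketch `max_difference` is at the level of floating-point rounding, so a listener or an analysis algorithm working only on the radiated signal has no way to decide between the two decompositions without further assumptions.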

For these reasons, the dynamic physical and physiological phenomena occurring in the larynx need to be observed directly, in order to better understand the peripheral aspects of sound production. In humans, the sound source can be directly assessed with, for example, videoendoscopic recordings, either using stroboscopy (Bless et al., 1987), kymography (Švec and Schutte, 2012), or techniques relying on high-speed video (Farnsworth, 1940; Honda et al., 1985; Hertegard, 2005; Deliyski and Hillman, 2010). Other approaches are constituted by measurement of the time-varying glottal air flow (Rothenberg, 1977), by indirect monitoring of vocal fold vibration with electroglottography (Baken, 1992), photoglottography (Sonesson, 1959), or by measurement of skin acceleration at the level of the larynx (Švec et al., 2005).

While application of these measurement methods is challenging in humans, there are even greater difficulties when investigating living animals: unlike humans, animals can hardly be instructed to produce vocal signals following specific rules. In addition, during in vivo voice production, several physiological voice parameters are typically varied simultaneously. Controlled investigation of the effects of individual factors ceteris paribus is thus typically not possible in vivo. For these reasons, researchers have resorted to ex vivo investigation of voice production with excised larynges, i.e. the investigative approach that will be discussed in the remainder of this paper.

Excised larynx setup

A schematic illustration of an excised larynx experimental setup is provided in Figure 1. In brief, a larynx is harvested (ideally immediately post mortem) from a deceased animal or human. It is mounted at the extremity of a tube supplying heated and humidified air. The vocal folds are adducted (in order to approximately reconfigure the glottis, i.e., the air space between the vocal folds) either with prongs (Durham et al., 1987) or with strings and pulleys (van den Berg et al., 1960) to a position suitable for self-sustaining (passive) oscillation of the vocal folds. Typically, the magnitude of the supplied air pressure and/or the longitudinal tension of the vocal folds are varied, simulating the two chief control parameters for laryngeal voice production (Titze, 2000). The ensuing mechanical and acoustical phenomena can then be documented by several means, typically involving acoustic and electroglottographic data acquisition and high-speed video recording. The strength of this technique lies in the comparability of sounds produced ex vivo (with a fine control of the aforementioned parameters) with those produced in vivo, because in both cases sound production is governed by a similar physical principle, i.e. MEAD. Detailed instructions for establishing an excised larynx setup are provided in Durham et al. (1987) and Titze (2006).

Figure 1

Schematic illustration of an excised larynx setup, as utilized in Herbst et al. (2012). The system for synchronizing the different signal streams is described in detail in the supplementary materials in Herbst et al. (2014).

Excised larynx experimentation: historical background

It was Galen in antiquity who, by sectioning the laryngeal nerve in a piglet, first demonstrated that the larynx is responsible for generating the voice (Gross, 1998; Kaplan et al., 2009). Since then, various scientists have shown an interest in using human and animal larynges to investigate voice production. A review by Cooper (1986) summarizes the key steps that have led excised larynx experimentation (ELE) to reach the standards found in modern setups.

Without going into as much detail as Cooper did, some milestones in the history of ELE are worth reiterating. Galen’s work was continued by Leonardo da Vinci (Harrison, 1995), who is assumed to have phonated human larynges by manipulating the lungs, trachea, and larynx (Brown and Riede, 2017). The first clear evidence that sound results from vocal fold vibration came from Ferrein in his memoir ‘De la formation de la voix chez l’homme’ (Ferrein, 1741). The method he used at the time consisted of blowing air through a tube connected to the trachea either directly or using bellows, depending on the species specimen and the force needed to set the vocal folds into motion. He observed and characterized vocal fold vibration either with the naked eye or using a magnifying glass. While this level of control was only qualitative, it allowed key discoveries, such as the crucial insight that vocal fold vibration was the mechanism responsible for sound generation; that shifts in fo could be obtained by adjusting the proportions of the lengths of the vocal folds in motion; and more specifically that vocal fold tension was a major determinant of fo variation (Ferrein, 1741). This research paved the way for Müller, who was the first to initiate a truly comparative approach to sound production mechanisms (Müller, 1840).

It is with Müller’s work that ELE emerged as a new research tool for quantitative investigation in voice science. The key differences from Ferrein’s approach were the construction of an experimental setup where a larynx could be mounted, as well as calibration of vocal fold adduction and elongation (via a system of weights and pulleys) and calibration of air pressure through a manometer placed between the experimenter’s mouth and the larynx (Müller, 1840; Brown and Riede, 2017). While some work followed Müller’s in the late 19th and early 20th centuries (Cooper, 1986), it was not until van den Berg (1958), who established the MEAD theory (refuting Husson’s neurochronaxic theory), that the next major advance in ELE was made. He used a setup that benefited from technological improvements in directly monitoring vocal fold vibration (van den Berg and Tan, 1959). In his research, van den Berg strongly emphasized the biophysical processes (in particular, basic concepts of fluid mechanics (Cooper, 1986)) underlying voice production; his work and ideas set the stage for the use of modern ELE in various research fields and towards different objectives.

Application of excised larynx experimentation to human voice and biophysics

Over the past six decades, most research involving excised larynges has been aimed specifically at the investigation of voice production in humans. Surprisingly, in a large number of these experiments, larynges of various mammalian species (e.g. dogs, pigs, sheep, cows, rabbits, or rats (Alipour and Jaiswal, 2008; Welham et al., 2009; Regner et al., 2010; Maytag et al., 2013)) were utilized as a proxy for the human vocal organ, probably for ethical reasons. It is sometimes assumed that the vocal fold vibratory characteristics and sound output data from these species are largely comparable with those of their human counterpart (Jiang et al., 2001a; Regner et al., 2010; Alipour et al., 2013). It should, however, be taken into consideration that the anatomical layout of the vocal folds (particularly that of the lamina propria) differs between these species and humans (Kurita et al., 1983; Garrett et al., 2000), thus potentially leading to different biomechanical properties of the vocal folds and consequently resulting in diverse vibratory characteristics and sound output.

Given the fine control over experimental conditions allowed by ELE, an important focus has been placed on quantifying the relationships between characteristic laryngeal voice production features. ELE has thus provided valuable insights into the causal relationship between physical boundary conditions of voice production, resulting vocal fold vibration dynamics, and the acoustic features of the produced output sound (Döllinger et al., 2011). Amongst others, work has been carried out with a strong inclination towards fluid dynamics. This includes studies investigating (a) pressure–flow relationships (Alipour et al., 1997; Hottinger et al., 2007); (b) glottal resistance (Alipour et al., 2007; Alipour and Jaiswal, 2009) and glottal efficiency (Titze, 1988); (c) velocity fields within the glottis using particle imaging velocimetry (which in simple terms corresponds to turbulence analysis) (Khosla et al., 2014; Oren et al., 2014); (d) the dependency of fo on subglottal pressure (Solomon et al., 1994; Alipour and Scherer, 2007; Alipour and Jaiswal, 2008); and (e) the relationship between subglottal pressure and non-linear dynamics of laryngeal voice production (defining the phonation instability pressure, PIP) (Jiang and Titze, 1993; Jiang et al., 2003). Further research emphasis has been directed towards connecting vocal fold motion and geometry with the acoustic signal produced: this includes, amongst others, studies investigating (a) nonlinear dynamics in relation to glottal geometry (vocal fold adduction and/or elongation) (Berry et al., 1996; Jiang et al., 2003) and vocal fold asymmetry (Giovanni et al., 1999); (b) the importance of tissue properties on phonatory characteristics (Chan and Titze, 1999; Alipour et al., 2011); (c) mucosal wave propagation on the vocal folds (Kusuyama et al., 2001; Jiang et al., 2008); and (d) the mechanical forces applying on the vocal folds during phonation (Verdolini et al., 1998; Jiang et al., 2001b; Bakhshaee et al., 2013). 
This biophysical approach has often also proved useful for physical modeling of voice production (Titze, 1984, 1989; Alipour-Haghighi and Titze, 1991; Tokuda et al., 2007). In line with this view, hemi-larynx setups have been designed, where the larynx is cut medially, with one half of the larynx removed, and where vocal fold oscillation is recorded both from the sagittal plane and from above (Baer, 1981; Durham et al., 1987; Jiang and Titze, 1993)—see Herbst et al. (2017) for a review and video instructions. Introduction of the hemi-larynx setup has facilitated significant advances in the three-dimensional reconstruction of vocal fold dynamics (Döllinger and Berry, 2006a, 2006b).
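Two of the fluid-dynamic quantities listed above, glottal resistance and glottal efficiency, follow directly from bench measurements. The sketch below uses the standard definitions (resistance as pressure divided by mean flow, efficiency as radiated acoustic power divided by aerodynamic power). The function names and the numerical readings are hypothetical and serve only to illustrate the arithmetic; the sketch also assumes that, with the vocal tract removed, the transglottal pressure is approximately equal to the subglottal pressure.

```python
def glottal_resistance(transglottal_pressure_pa, mean_flow_m3_s):
    """Glottal resistance in Pa·s/m^3: pressure drop divided by mean airflow."""
    return transglottal_pressure_pa / mean_flow_m3_s


def aerodynamic_power(subglottal_pressure_pa, mean_flow_m3_s):
    """Aerodynamic input power in W: subglottal pressure times mean airflow."""
    return subglottal_pressure_pa * mean_flow_m3_s


def glottal_efficiency(acoustic_power_w, subglottal_pressure_pa, mean_flow_m3_s):
    """Dimensionless ratio of radiated acoustic power to aerodynamic power."""
    return acoustic_power_w / aerodynamic_power(subglottal_pressure_pa, mean_flow_m3_s)


# Hypothetical bench readings (illustrative values, not measured data):
pressure_pa = 800.0        # roughly 8 cmH2O
flow_m3_s = 2.0e-4         # 0.2 litres per second
acoustic_power_w = 1.6e-5  # radiated acoustic power

resistance = glottal_resistance(pressure_pa, flow_m3_s)
efficiency = glottal_efficiency(acoustic_power_w, pressure_pa, flow_m3_s)
print(f"glottal resistance: {resistance:.3g} Pa·s/m^3")
print(f"glottal efficiency: {efficiency:.3g}")
```

Keeping all inputs in SI units (Pa, m^3/s, W) avoids the unit conversions that are otherwise needed when pressure is logged in cmH2O and flow in litres per second.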

From a more applied perspective, ELE has led to advances in the understanding of the human singing voice through studies investigating register breaks and octave jumps (frequently occurring when singers perform) (Švec et al., 1999; Alipour et al., 2007). ELE has also aimed at improving medical phonatory research (Baer, 1981), given the specific voice profile associated with particular voice disorders. Some studies have therefore oriented ELE towards applications to investigate mechanical stress or specific pathologies such as nodules and scarred vocal folds (Jiang and Titze, 1994; Titze, 1994; Jiang et al., 2003). This bridge towards biomedical research has a crucial importance as it lays the ground for developing clinically valuable methodologies for the diagnosis of voice disorders (Giovanni et al., 1999; Jiang et al., 2003). A very recent study illustrating this point was carried out by Latifi et al. (2016), in which the authors designed, built, and validated a vocal fold bioreactor system. This constitutes a significant improvement in the investigation of voice production ex vivo, as well as for developing tissue engineering and testing treatments to be applied to voice disorders (Fishman et al., 2016).

Progressive use of excised larynx experimentation in a parallel area of research: bioacoustics

As mentioned above, the research described in the previous section highlights the fact that non-human mammal larynges have mainly been used as a model to represent human larynges and infer the mechanisms at work in our own species, without giving much attention to sound production in non-human mammals. This approach dates back to the early days of ELE. Ferrein, for instance, included pig and dog larynges in his landmark research (Ferrein, 1741). It is surprising that comprehensive application of ELE to bioacoustics, i.e. the field of research dedicated to understanding animal vocal communication, started only very recently. The following examples illustrate the variety of research questions that can greatly benefit from ELE; while this list is not exhaustive, it gives an overview of most of the bioacoustics research carried out using ELE to date, highlighting the need to promote and encourage the use of ELE in this research area.

In the Introduction, we outlined the biological relevance of using ELE to infer in vivo sound production, based on the MEAD principle. In a recent study by Elemans and colleagues (Elemans et al., 2015), the authors carried out ex vivo experiments on the avian vocal organ (in which the syrinx is the sound-producing organ, in place of the larynx in mammals). Using seven species of this clade, they consistently showed that MEAD was the mechanism employed to produce sounds in birds, demonstrating for the first time that a universal sound production mechanism exists in birds and mammals. This study illustrates how ELE can bring insight into fundamental research questions and thus improve our general understanding of animal vocal communication systems.

On a more specific level, similar investigation has proven useful to elucidate the mechanisms governing the production of (literally) extraordinary animal vocalizations. Some species communicate at frequencies that lie beyond our human perceptual abilities, namely by using infrasonic (i.e. frequencies too low for human hearing) or ultrasonic (i.e. frequencies too high for human hearing) vocalizations. Focusing on species that vocalize at these two extremes of the frequency range relevant in the animal realm, researchers found evidence that the production of infrasonic and ultrasonic vocalizations could be explained by the same mechanisms as those used in the human voice. For instance, Herbst and colleagues showed that infrasonic elephant rumbles result from MEAD-generated vocal fold oscillation, and that the infrasonic character of these oscillations is causally linked to the size and three-dimensional movements of these vocal folds (Herbst et al., 2012, 2013b). A similar conclusion was drawn regarding the production of ultrasonic vocalizations in rats by using ELE (Johnson et al., 2010) and intratracheal pressure and airflow measurements in anesthetized animals (Riede, 2011). Recently, an alternative production mechanism for ultrasounds in mice (namely a glottal jet impinging onto the laryngeal inner planar wall) has been identified (Mahrt et al., 2016). This study calls for a comparative examination of the production of ultrasonic vocalizations in rodents (as very recently initiated by Riede et al. (2017)), and also illustrates the benefit of using ELE to uncover new sound production mechanisms.

Theoretical investigation of the relationship between fo and vocal fold length predicts that the greater the resting length of the vocal fold, the lower the fo should be (Titze, 2000). This concept is particularly useful for research investigating acoustic allometry, i.e. the relationship between an individual’s body size and the acoustical characteristics of its vocalizations (Fitch, 2000; Garcia et al., 2017). Focusing on fo and following this principle, it is expected that larger individuals should have larger larynges, and hence longer vocal folds, which should produce lower fo. In mammals, this negative relationship between fo and body size has been verified when considering a broad range of species (Fletcher, 2005; Herbst et al., 2012). However, some species appear as outliers to this general trend. For instance, howler monkeys (Dunn et al., 2015) and koalas (Charlton et al., 2013) produce vocalizations with an unexpectedly low fo given their body size. While the existence of a novel vocal organ explained this observation in koalas (Charlton et al., 2013), in the case of howler monkeys, recent work showed that the low fo of their calls was in theory in accordance with the length of their vocal folds (Dunn et al., 2015). ELE with howler species confirmed this assumption, showing that the fo of vocalizations produced ex vivo matched that of vocalizations produced in nature (Garcia et al., 2017). In this context, ELE appears an indispensable tool to evaluate the validity of hypotheses raised from behavioral observations.
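The inverse relationship between vocal fold length and fo can be illustrated with the ideal string model, in which fo = (1 / 2L) · sqrt(σ / ρ), with L the vibrating length, σ the longitudinal tissue stress, and ρ the tissue density. The sketch below is a first-order approximation only: the density and stress values are assumed for illustration, the stress is held constant across lengths, and real vocal folds deviate from an ideal string.

```python
import math

TISSUE_DENSITY_KG_M3 = 1040.0  # assumed soft-tissue density (illustrative value)


def string_model_fo(vibrating_length_m, tissue_stress_pa,
                    density_kg_m3=TISSUE_DENSITY_KG_M3):
    """Fundamental frequency (Hz) of an ideal string fixed at both ends."""
    wave_speed = math.sqrt(tissue_stress_pa / density_kg_m3)  # m/s
    return wave_speed / (2.0 * vibrating_length_m)


# Hypothetical comparison: identical tissue stress, different vocal fold lengths.
stress_pa = 20_000.0
for length_mm in (5.0, 10.0, 20.0, 40.0):
    fo_hz = string_model_fo(length_mm / 1000.0, stress_pa)
    print(f"L = {length_mm:4.1f} mm -> fo = {fo_hz:6.1f} Hz")
```

Under these assumptions, doubling the vibrating length halves fo, which is the scaling that underlies the allometric expectation described above. Outlier species then point either to unusually long vocal folds for a given body size or to a different vibrating structure.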

While the low frequencies typical of growls and roars in big cats of the genus Panthera do not necessarily appear to be deviating from acoustic allometry rules, they nevertheless sound unusually loud compared to sounds produced in other taxa. Titze et al. (2010) phonated larynges from various tiger subspecies and demonstrated that such vocalizations are made possible by the considerable size and geometry of the bulky vocal folds in these larynges, which enable remarkable radiated vocal power. It should be noted that some ‘growl-like’ vocalizations such as human throat singing may also result from vibrations of the ventricular or aryepiglottic folds (Sakakibara et al., 2004). To investigate such cases through ELE, special attention should be paid to avoid removing inner laryngeal tissue, in order to identify potential vibrating sources (e.g. Herbst et al. (2013b) identified a 1:1 phase-locked vibration of vocal folds and ventricular folds in an elephant larynx). Another study using ELE on two red deer larynges documented increased glottal efficiency for non-regular vocal fold vibration regimes at high subglottal pressures, hypothetically suggesting an energetic advantage in animal vocal communication when converting metabolic to acoustic energy in this species (Herbst, 2014). Here again, ELE showed its potential use in understanding the acoustic characteristics and potentially the cause of vocalizations produced in nature.

Using ELE, recent research investigating vocal production in squirrel monkeys showed the value of this approach not only to learn about biomechanics (i.e. ‘how vocalizations are produced’) but also about behavior (i.e. ‘how vocalizations are used’). Brown and colleagues suggested that the rapidly varying sequences found in the natural vocal repertoire of the squirrel monkey could result from simple non-linear dynamics of vocal fold oscillatory regimes (Brown et al., 2003). Their work proposes that the complexity (in terms of variety and transition speed) of acoustic signals produced could be explained relatively simply by the non-linear properties intrinsic to the vocal organ, rather than by advanced neural control. Such non-linear properties are not uncommon, as seen, for example, in sika deer, where application of a continuously varying subglottal pressure leads to abrupt changes in vocal fold vibratory regimes (see Figure 6 in Herbst et al. (2013a)). Likewise, Fee and colleagues suggested, through excised syrinx experimentation in zebra finches, that the variety of temporal and spectral arrangements characterizing songs produced in nature could be explained by the changes in oscillatory states of the vocal membranes (Fee et al., 1998). These studies using ELE are likely to have a major impact on research focusing on the cognitive processes underlying ‘complex’ forms of communication (‘complex’ here being defined as comprising a large variety of sounds or as rapid transitions between acoustic signals). We could hypothesize here that such communication systems could result from behavioral usage of the nonlinear properties intrinsic to the bird or mammalian vocal organ, rather than acute control over their sound-producing organ (Fitch et al., 2002).

The comparative approach started by Müller (1840) also has great potential to improve our comprehension of the evolutionary pressures acting upon sound production. This implies a need to carry out ELE systematically across a range of species more or less closely related depending on the research question investigated. A recent study pursued this objective using larynges from 11 primate species (Garcia et al., 2017). While establishing that fo reflected the length of the vocal folds, considering the entire sample available, the authors observed remarkable differences in the minimum fo produced in species with comparable vocal fold lengths. They built upon these observations to raise hypotheses regarding the potential evolutionary causes responsible for these differences, and suggested possible influences of species-specific habitat, socioecology, and the sexual-selection process on vocal apparatus, thus leading to species-dependent variation in sound production in this taxon. This work calls for follow-up studies with a special focus on evolutionary questions, as well as similar investigations in other groups likely to have evolved different communication systems based on various selection pressures (e.g. in marine mammals, where sound production is likely to be affected by sound propagation in a different medium).

Benefits and pitfalls of excised larynx experimentation

As outlined by Cooper (1986), ELE has played a major role in the study of voice physiology, by using a naturally vibrating structure with a degree of experimental control that cannot be achieved in living organisms. The above discussion illustrates the variety of benefits that can be obtained by conducting ELE. Bringing together concepts of biophysics, state-of-the-art technologies, computational modeling and comparison with in vivo vocalizations, this approach has provided fundamental insights into the mechanisms underlying human and non-human vocal production. This has been valuable to a large variety of research foci as diverse as musicology, voice science (pathology, therapy, physiology), bioacoustics, and biology (cognition, evolutionary pressures, core sound production concepts).

Unfortunately, ELE necessitates that the larynx is no longer in its original physiological condition. Considering excised larynges as a proxy, or model, of the organ in vivo, there are thus limitations when comparing ELE to in vivo sound production. Some aspects that we describe below can be highly relevant to the external validity of ELE and should therefore be kept in mind when conducting such experiments.

Unless biological material is accessible directly after death, ELE typically involves freezing and storing of the larynx for a certain amount of time prior to running experiments. In this context, studies showed that the freezing process could affect the geometry (Stevens et al., 2016) and the biomechanical properties of vocal fold tissues (Chan and Titze, 2003) and thus the resulting vibratory characteristics and the generated sound. Chan and Titze’s results suggest the application of rapid ‘flash-freezing’ using liquid nitrogen, in order to avoid significant changes in the biomechanical properties of vocal fold tissue. These findings are highly relevant for modeling purposes. However, extending them to bioacoustics and voice science seems to require additional investigation because the frequency range examined to assess the effect of the freezing process (0.01–15 Hz) was well below the range of fo of most mammalian vocalizations (see, for example, the results for some primates and carnivores reported by Bowling et al. (2017)).

Most crucially, preparation of an excised larynx (unless performed on a fully perfused larynx which is kept in physiological conditions, allowing for muscular contractions; see Berke et al. (2013) and Mendelsohn et al. (2015) for humans and Elemans et al. (2015) for birds) removes the possibility for contractions of the intrinsic laryngeal muscles. In vivo, these muscles (Zemlin, 2001) are crucial for the (pre)phonatory configuration of vocal folds and the glottis, thus causally determining the vocal fold vibration dynamics and the spectral composition of the generated sound. Some of the functions pertaining to the activity of the lateral cricoarytenoid (LCA) and the interarytenoid (IA) muscles for adduction, and the cricothyroid (CT) muscle for vocal fold elongation, can be approximated with prongs, weights, and pulleys, or other devices. The contraction of the thyroarytenoid (TA) muscle, however, which is among other things responsible for shortening and thickening the vocal folds, is virtually impossible to simulate (apart from over-adducting the vocal folds at the position of the vocal processes, which might constitute an unphysiological maneuver). This highlights a limit to the use of ELE, because contraction of the TA is assumed to be mainly responsible for modal voice (‘chest register’) (Choi et al., 1993), i.e. the main glottal configuration for human speech. Direct comparison using a single larynx specimen ex vivo (i.e. with a fully perfused setup), where the simulation of intrinsic laryngeal muscle activity is rigorously assessed, is still pending.

It is thus quite easy to produce ‘non-physiological’ sound outputs from ELE, either by simulating non-physiological muscular contractions (as suggested above), or by artificially introducing non-linear phenomena (NLP) via the subglottal system employed (Zhang et al., 2006). Setups should thus maximally control for laryngeal interactions with the acoustical resonances of the subglottal system (Zhang et al., 2006).

As suggested by its name, the last and maybe the most obvious limitation of ELE is that ‘excised larynx experimentation’ is used to study sound production at the ‘laryngeal’ level. We mentioned in the Introduction that in humans and mammals in general, according to the ‘source–filter theory’ (Fant, 1960), sound production is actually the result of a two-step process involving the larynx (the sound source, mainly characterized by fo) and the upper vocal tract (mainly characterized by its resonance properties). While source- and filter-related acoustic features can be considered as independent from one another in most cases (Fant, 1960), non-linear extensions of the source–filter theory (Flanagan, 1968; Titze, 2008) indicate that source–filter interactions can occur, e.g. if the rate of vibration of the vocal folds (or an integer multiple thereof) is close to a vocal tract resonance frequency. Another, even more relevant effect is the increase of sound radiation efficiency that is achieved with an attached vocal tract, which is typically of the order of 5–10 dB (Titze, 2006). Because ELE typically involves removing the vocal tract, such interactions, as well as the filter-related characteristics of the vocal tract, are difficult to investigate thoroughly with this technique.

Future research directions for excised larynx experimentation

When considering human and animal ELE investigation as a whole, the foremost challenge that remains to be addressed is to obtain baseline laryngeal sound production data from the entire range of mammalian species in order to carry out extensive comparative work. This will make an invaluable contribution to our understanding of the proximate (biophysical, physical, physiological) and ultimate (developmental and evolutionary) mechanisms shaping vocal communication systems.

A common feature in mammal vocalizations is the presence of NLP (Herzel et al., 1995; Wilden et al., 1998). We mentioned above that these non-linear dynamics can be intrinsic to the vocal organ, which implies that advanced neural control is not necessary to produce a variety of different acoustic signals. We suggested that, in this sense, ELE could substantially benefit research in cognitive biology. While the production mechanisms of NLP have been investigated to a certain extent, how NLP are perceived and used remains largely understudied and poorly understood. Some work has evaluated the functional relevance of NLP in bioacoustic research, with potential purposes such as preventing habituation, inducing perceptual changes, indicating status, individuality, motivation, and physical health (Wilden et al., 1998; Fitch et al., 2002). Exploring this avenue, studies have shown that NLP may indeed indicate physical condition (Riede et al., 2007), arousal state (Blumstein et al., 2008; Garcia et al., 2014), or individuality (Volodina et al., 2006), or draw increased attention from conspecifics (Blumstein and Récapet, 2009; Townsend and Manser, 2011; Reby and Charlton, 2012; Charlton et al., 2017). Nevertheless, the connection between the potential function and the production mechanisms associated with NLP has not been investigated to date. Because ELE offers the opportunity to control laryngeal sound production experimentally, it provides a unique window for the in-depth investigation of NLP production in mammalian vocal communication systems.

We highlighted above the limitation posed by ELE in examining source–filter interactions due to the removal of the vocal tract. However, various anatomical adaptations have evolved in mammals, including the formation of vocal sacs (or air sacs) found in artiodactyls and primates, for instance (Negus, 1949). These sacs are tissue chambers extruding from the larynx, which can inflate and are thought to serve acoustic functions in sound transmission (acting like impedance-matching systems) and/or resonance (acting like Helmholtz resonators) (Fitch and Hauser, 1995) when the vocal folds are vibrating. In this specific context, ELE is a very promising tool since it has the potential to investigate source–filter interactions by using air-driven vs. heliox-driven phonation. A publication in which these concepts are applied is forthcoming from the present authors.
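
As a rough illustration of why the driving gas matters for a sac-like resonator, the following Python sketch compares the resonance of an idealized Helmholtz resonator filled with air and with heliox. It is only a sketch under stated assumptions: ideal-gas behaviour, a rigid-walled cavity with no end correction, air approximated as 79% nitrogen and 21% oxygen, a heliox mixture of 80% helium and 20% oxygen, a gas temperature of 310.15 K, and purely hypothetical sac dimensions. None of these values are taken from the studies cited above, and real air sacs have compliant walls that this model ignores.

```python
import math

R = 8.314462618  # universal gas constant, J/(mol*K)

def mixture_properties(components):
    """Return (gamma, molar_mass) of an ideal-gas mixture.

    components: list of (mole_fraction, molar_mass_kg_per_mol, cv_over_R).
    cv_over_R is 1.5 for a monatomic gas and 2.5 for a diatomic gas.
    """
    molar_mass = sum(x * m for x, m, _ in components)
    cv = sum(x * c for x, _, c in components)  # in units of R
    gamma = (cv + 1.0) / cv                    # ideal gas: cp = cv + R
    return gamma, molar_mass

def speed_of_sound(gamma, molar_mass, temperature_k):
    """Ideal-gas speed of sound in m/s."""
    return math.sqrt(gamma * R * temperature_k / molar_mass)

def helmholtz_frequency(c, neck_area, neck_length, volume):
    """Resonance (Hz) of an idealized Helmholtz resonator, no end correction."""
    return c / (2.0 * math.pi) * math.sqrt(neck_area / (volume * neck_length))

# Assumed gas compositions (mole fraction, molar mass in kg/mol, cv/R).
AIR = [(0.79, 0.028014, 2.5), (0.21, 0.031998, 2.5)]
HELIOX = [(0.80, 0.0040026, 1.5), (0.20, 0.031998, 2.5)]  # assumed 80/20 mix

TEMPERATURE_K = 310.15  # assumed gas temperature (about 37 degrees C)

# Hypothetical sac geometry, for illustration only.
NECK_AREA = 1.0e-4    # m^2
NECK_LENGTH = 0.02    # m
SAC_VOLUME = 5.0e-4   # m^3

c_air = speed_of_sound(*mixture_properties(AIR), TEMPERATURE_K)
c_heliox = speed_of_sound(*mixture_properties(HELIOX), TEMPERATURE_K)

f_air = helmholtz_frequency(c_air, NECK_AREA, NECK_LENGTH, SAC_VOLUME)
f_heliox = helmholtz_frequency(c_heliox, NECK_AREA, NECK_LENGTH, SAC_VOLUME)

print(f"air:    c = {c_air:.1f} m/s, resonance = {f_air:.1f} Hz")
print(f"heliox: c = {c_heliox:.1f} m/s, resonance = {f_heliox:.1f} Hz")
print(f"resonance ratio heliox/air = {f_heliox / f_air:.2f}")
```

In this idealized model the resonance scales linearly with the speed of sound, so changing the gas shifts the resonator frequency without any change in geometry. Whether the vibrating tissue follows that shift is the empirical question that an air vs. heliox comparison would address.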

Biophysical and biomedical applications should and certainly will remain a key focus of ELE. While diagnosing voice disorders and evaluating clinical treatments has typically relied on assessment of voice quality (Bertino et al., 2001; Carding et al., 2004; Lee et al., 2016; Aichinger et al., 2017), much insight has yet to be gained and ELE has a valuable contribution to make in this respect. However, this approach, inspired by human medical research, has yet to be carried over into other fields. Assessment of voice quality has the potential to indicate voice disorders and higher pathologies in humans (e.g. cancer-related vocal fold paralysis (Billante et al., 2001)). A similar approach applied to animals would significantly help to address welfare issues, e.g. in farmed species bred for agribusiness. ELE is, in this context, a primary tool that should be used to connect voice disorders and voice quality, so that it can then serve veterinary applications in the animal-breeding industry.

While successful ELE has been conducted using fully perfused syringes in birds (Elemans et al., 2015) and fully perfused larynges in humans (Berke et al., 2013; Mendelsohn et al., 2015), much work remains to be done to achieve similar success in other mammalian species and other taxa (e.g. amphibians). This is essential to strengthen the conclusions drawn from ELE and applied to vocal production in vivo. In addition, using setups that include the larynx as well as the vocal tract (as was originally done by Müller in the 19th century (Cooper, 1986)) is vital to draw more representative conclusions on the effect of source–filter interactions on sound production. Ultimately, fully perfused setups that encompass the entire vocal apparatus might be used, thereby combining the two methodologies just described. Although methodologically challenging, this would account for both the significant role of the physiological conditions and the effect of the vocal tract, and is necessary to obtain the best possible accuracy when inferring in vivo sound production mechanisms.

Our work outlines and emphasizes the value of ELE in various research fields; however, it should be kept in mind that integrating this technique within a multidisciplinary approach is key to achieving maximum scientific impact and advancing our comprehension of human voice and animal vocal communication systems.

Acknowledgments

This research was supported by a postdoctoral fellowship from the Fyssen Foundation (M.G.), by an ‘APART’ grant received from the Austrian Academy of Sciences (C.T.H.), and by the Research Units for Exploring Future Horizons of Kyoto University (C.T.H.).

References
 
© 2018 The Anthropological Society of Nippon