日本音響学会誌 (The Journal of the Acoustical Society of Japan)
Online ISSN : 2432-2040
Print ISSN : 0369-4232
Volume 26, Issue 5
  • 中津井 護, 鈴木 誠史
    Article type: Main text
    1970, Vol. 26, No. 5, pp. 211-221
    Published: 1970/05/10
    Released online: 2017/06/02
    Journal, free access
    A new method of formant frequency extraction that exploits characteristic features of vowel-type spectra is proposed and implemented as a FORTRAN program. The method is evaluated experimentally using synthetic speech sounds that simulate various troublesome conditions encountered in formant frequency extraction from natural speech. Inverse filtering is performed in the spectral domain so that only a simple resonance spectrum of one formant remains from the input spectrum; schematically, H^+_2 in Fig. 4(b) remains from P in Fig. 3(b). The formant frequency is then calculated as the first-order moment. Repeating these two processes, as shown in Fig. 7, gives fairly accurate formant frequencies. Extraction is carried out on five Japanese vowels uttered by five adult males and on the non-nasal voiced portions of continuous speech by two male announcers. Some of the results are shown in Table 4 and Fig. 8. Factors that may cause considerable trouble in formant frequency extraction are then discussed. The factors arising from the source characteristics are the source harmonic structure, zeros of the source spectral envelope, and gross differences in the shape of the source spectral envelope. The factors arising from the transfer characteristics are rapid formant transitions and the contiguity of formants. In this paper, four excitation waveforms and six source fundamental frequencies (100-200 Hz) are used in the synthesis, combined with the formant frequency pattern of Fig. 9. Three of the excitation waveforms are triangular, as shown in Fig. 1, with K = 0.5, 0.7, and 1.0; the remaining one is impulse-type. The error distribution of the formant frequencies extracted from these synthetic sounds is shown in Fig. 11.
The results of the extraction are examined in relation to the factors described above, and the following conclusions are reached: (1) Under many troublesome conditions the proposed method provides fairly good accuracy, and extraction errors do not exceed half the source fundamental frequency in most cases. (2) The extraction program is relatively simple. The average extraction time is about 0.23 sec. for each 10 ms short-time spectrum on the general-purpose computer NEAC 2200/500 (add., 5.2 μsec.). This is remarkably fast compared with conventional methods. (3) The results of the experiments with synthetic sounds generated under various excitation conditions and with natural sounds uttered by many speakers suggest that this method can be applied reliably to a variety of speech sounds.
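The abstract states that, once a single-formant resonance spectrum has been isolated, the formant frequency is calculated as the first-order moment. The following is only a minimal sketch of that moment step, not the authors' FORTRAN program and not their inverse filtering. The Lorentzian resonance shape, the power weighting, the band limits, and the names `resonance_power` and `first_order_moment` are all assumptions made for illustration.

```python
def resonance_power(f, center, bandwidth):
    """Power of a single resonance at frequency f (Lorentzian shape, an assumed model)."""
    return 1.0 / ((f - center) ** 2 + (bandwidth / 2.0) ** 2)

def first_order_moment(freqs, power):
    """Centroid of a spectrum: sum(f * P(f)) / sum(P(f))."""
    return sum(f * p for f, p in zip(freqs, power)) / sum(power)

# Densely sampled, symmetric band around a 1000 Hz resonance.
dense_freqs = [10.0 * k for k in range(201)]           # 0 to 2000 Hz
dense_power = [resonance_power(f, 1000.0, 80.0) for f in dense_freqs]
centroid_dense = first_order_moment(dense_freqs, dense_power)

# Same idea, but the spectrum is known only at harmonics of F0 = 125 Hz.
f0 = 125.0
harmonics = [f0 * k for k in range(2, 10)]             # 250 to 1125 Hz
harmonic_power = [resonance_power(f, 700.0, 80.0) for f in harmonics]
centroid_harmonic = first_order_moment(harmonics, harmonic_power)

print(round(centroid_dense, 3), round(centroid_harmonic, 1))
```

In the dense, symmetric case the moment coincides with the resonance center by symmetry. When only harmonics of the fundamental are available, the moment shifts slightly away from the assumed 700 Hz center, which illustrates why the source harmonic structure is listed among the troublesome factors.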
  • 太田 光雄, 中村 順一
    Article type: Main text
    1970, Vol. 26, No. 5, pp. 222-228
    Published: 1970/05/10
    Released online: 2017/06/02
    Journal, free access

    To quantify the acuity of the fluctuation of an arbitrary random street noise, it is necessary to consider the first and higher order correlations between its level and its slope. The expected number of level-crossings is known as one of the physical quantities determined by only two values, the level and the slope, so investigating the number of level-crossings at each level gives significant information about the acuity of the street noise fluctuation. We have shown previously that the probability density function (hereafter abbreviated as p.d.f.) of level-crossing can be expressed as an orthonormal expansion in the statistical Laguerre series. However, the problem of how to choose the origin of the level axis remains unsettled in that method. Namely, because null phon, which is never measured in practice in any street noise, must be taken as the level origin, the expansion in the statistical Laguerre series may lack reliability. If instead we choose as the level origin the mean level calculated from many sample points of the street noise (measured in phon), the reliability will be raised. We must then treat a street noise that fluctuates only in the positive region as fluctuating on both the positive and negative sides of the origin thus chosen. From a theoretical viewpoint, the statistical treatment shown previously in the field of surface roughness can be used. In this paper, the application of that theory to street noise, its experimental consideration, and the comparison between experiment and theory are newly reported. First, a joint p.d.f. P(X, Y) of the instantaneous value X and the slope Y of an arbitrary random wave fluctuating in the positive and negative regions is expressed by Eqs. (1) and (2) in terms of the statistical Hermite orthonormal expansion series.
If we take ⟨X⟩ = ⟨Y⟩ = 0, σ_X^2 = ⟨X^2⟩ and σ_Y^2 = ⟨Y^2⟩, it is shown that β(0, 0) = β(0, 1) = β(2, 0) = β(0, 2) = 0 and that β(1, 1) corresponds to the linear correlation coefficient ρ(X, Y). The higher order correlations are reflected in the expansion coefficients as shown in Eq. (4). The relations (1) and (2) can be applied to a street noise whose mean value is chosen as the origin and which therefore fluctuates on both the positive and negative sides of that origin. The expected number M(X) of crossings per second through a level X with positive and negative slope is given by M(X) = ∫_{-∞}^{∞} |Y| P(X, Y) dY. The p.d.f. M_0(Z) of level-crossing at level Z is expressed by Eq. (5), where Z = X/σ_X and the expansion coefficients are given in Eq. (6). In the same manner, the expected number N(X) of level-crossings with only positive slope is given by N(X) = ∫_{0}^{∞} Y P(X, Y) dY and, with the aid of the integral formulas (7), leads to the expressions (8) and (9) for the p.d.f. N_0(Z) of level-crossing at level Z. On the other hand, the p.d.f. P(Z) of the instantaneous value of the level fluctuation can be obtained from Eq. (10). Thus, the difference ε_M(Z) or ε_N(Z) between the cumulative probability distribution of the instantaneous level fluctuation and that of level-crossing is expressed as Eq. (11) or Eq. (12) in a similar expansion form. It is worth noting that the expansion coefficients of Eq. (11) or Eq. (12) do not contain β(3, 0) (denoting the skewness) or β(4, 0) (denoting the kurtosis), but contain only the linear and nonlinear correlations between X and Y. Information about the acuity of the street noise fluctuation is given in each expansion coefficient. The expansion coefficients can then be estimated experimentally by the method of moments as shown in Eqs. (14) and (16), where ⟨ ⟩, ⟨ ⟩′ and ⟨ ⟩″ denote averaging with respect to P(Z), M_0(Z) and N_0(Z), respectively. Finally, it is proved from experiment as shown in Figs. 1 to 3 that the consideration on the first and

    (View PDF for the rest of the abstract.)
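The quantities M(X) (crossings of a level with either slope) and N(X) (crossings with positive slope only) have simple empirical counterparts that can be counted directly from a sampled level record. The following is a minimal sketch of such a count after moving the origin to the mean level and normalizing as Z = X/σ_X. It is not the authors' procedure: the function name `level_crossings`, the sampling rate, and the synthetic cosine record standing in for measured street noise are all assumptions made for illustration.

```python
import math

def level_crossings(samples, level):
    """Count crossings of `level`: returns (both_directions, upward_only)."""
    both = 0
    upward = 0
    for prev, cur in zip(samples, samples[1:]):
        was_above = prev >= level
        is_above = cur >= level
        if was_above != is_above:       # the record changed sides of the level
            both += 1
            if is_above:                # changed from below to above: positive slope
                upward += 1
    return both, upward

# Synthetic stand-in for a level record (hypothetical, in phon):
# 60 + 5*cos(2*pi*2*t), one second sampled at 1 kHz.
fs = 1000
raw = [60.0 + 5.0 * math.cos(2.0 * math.pi * 2.0 * k / fs) for k in range(fs)]

mean = sum(raw) / len(raw)
x = [v - mean for v in raw]                        # origin moved to the mean level
sigma = math.sqrt(sum(v * v for v in x) / len(x))  # sigma_X
z = [v / sigma for v in x]                         # Z = X / sigma_X

m_count, n_count = level_crossings(z, 0.0)         # crossings at Z = 0
print(m_count, n_count)
```

Because the record lasts exactly one second, the two counts are per-second values and play the role of empirical estimates of M and N at Z = 0. Repeating the count over a grid of Z values and normalizing would give an empirical analogue of the level-crossing distributions discussed in the abstract.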

  • 要 祐一
    Article type: Main text
    1970, Vol. 26, No. 5, pp. 229-234
    Published: 1970/05/10
    Released online: 2017/06/02
    Journal, free access
  • 菊池 喜充
    Article type: Main text
    1970, Vol. 26, No. 5, pp. 235-239
    Published: 1970/05/10
    Released online: 2017/06/02
    Journal, free access
  • 石井 泰
    Article type: Main text
    1970, Vol. 26, No. 5, pp. 240-245
    Published: 1970/05/10
    Released online: 2017/06/02
    Journal, free access