The Journal of the Acoustical Society of Japan (日本音響学会誌)
Online ISSN : 2432-2040
Print ISSN : 0369-4232
Vol. 14, No. 2
Showing 1-15 of the 15 articles in the selected issue
  • 吉久 信幸
    Article type: Main text
    1958, Vol. 14, No. 2, pp. 97-101
    Published: 1958/06/30
    Released online: 2017/06/02
    Journal, free access
    The jump phenomena in loudspeakers have been discussed in only a few publications. When a jump occurs, the amplitude of the loudspeaker suddenly becomes very large and the distortion is at its greatest, so these phenomena are very important in practical use. In this paper, the jump phenomena in loudspeakers are treated theoretically and experimentally. If a voltage $V\cos\omega t$ is applied across the voice coil, the resulting motion is assumed to be described by the equations $M\ddot{x} + R\dot{x} + sx + qx^3 = Bli$ (1) and $R_c i + Bl\dot{x} = V\cos\omega t$ (2), where M = mass of the cone, coil, and air load; R = mechanical resistance; $s + qx^2$ = stiffness of the suspension system; B = magnetic flux density in the air gap; l = length of the voice coil; i = current in the voice coil; and $R_c$ = d.c. resistance of the voice coil. The maximum applied voltage free from the jump phenomenon has been calculated from these equations. The result is $V^2 = \frac{16r^3}{27ck^2}\left\{9ar + 9r^3 + (a + 3r^2)\sqrt{12a + 9r^2}\right\}$ (3), where $r = R/M + B^2l^2/(MR_c)$, $a = s/M$, $c = q/M$, and $k = Bl/(MR_c)$. The following conclusions have been obtained from equation (3) and some experiments. In order to reduce the occurrence of the jump phenomena, (1) it is absolutely necessary that the linearity of the cone suspension system be very good, that is, that the value of q in the stiffness $s + qx^2$ be minimized; (2) it is necessary that the mechanical resistance R and the linear stiffness term s be large; and (3) it is desirable to reduce the mass M.
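As a rough illustration of how equation (3) might be evaluated, the following Python sketch computes the normalized parameters r, a, c, and k from the physical quantities and returns the largest jump-free voltage amplitude V. The function names and all numeric values in the usage section are hypothetical and are not taken from the paper.

```python
import math


def normalized_params(M, R, s, q, B, l, Rc):
    """Return (r, a, c, k) as defined after equation (3) of the abstract."""
    r = R / M + (B * l) ** 2 / (M * Rc)
    a = s / M
    c = q / M
    k = B * l / (M * Rc)
    return r, a, c, k


def max_jump_free_voltage(M, R, s, q, B, l, Rc):
    """Evaluate equation (3) and return V (not V^2).

    Requires q > 0; for a perfectly linear suspension (q = 0) the
    expression diverges, i.e. no jump is predicted at any voltage.
    """
    if q <= 0:
        raise ValueError("q must be positive for equation (3) to apply")
    r, a, c, k = normalized_params(M, R, s, q, B, l, Rc)
    bracket = 9 * a * r + 9 * r ** 3 + (a + 3 * r ** 2) * math.sqrt(12 * a + 9 * r ** 2)
    v_squared = 16 * r ** 3 / (27 * c * k ** 2) * bracket
    return math.sqrt(v_squared)


if __name__ == "__main__":
    # Hypothetical SI values chosen only to exercise the formula.
    v = max_jump_free_voltage(M=0.01, R=1.0, s=1000.0, q=1.0e9, B=1.0, l=5.0, Rc=8.0)
    print(f"largest jump-free amplitude V = {v:.3g} volts")
```

Because V^2 is inversely proportional to c = q/M with the other normalized parameters fixed, halving q doubles V^2 in this sketch, which matches conclusion (1) of the abstract.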
  • 池田 拓郎
    Article type: Main text
    1958, Vol. 14, No. 2, pp. 102-106
    Published: 1958/06/30
    Released online: 2017/06/02
    Journal, free access
    The electric current through the motional admittance of a piezoelectric vibrator is observed by compensating the damped admittance with a condenser and a differential transformer. This current reaches a maximum at the resonant frequency $f_0$, and its value is held by another circuit with a reference resistance. Next, the voltage from a carrier-suppressed balanced modulator with carrier frequency $f_0$ and modulating frequency $f_A$ is applied to the motional admittance. When $f_A$ is adjusted so that the current becomes $1/\sqrt{2}$ times the current through the reference resistance, $f_A$ equals half of the quadrantal frequency difference. The Q value is therefore computed as $f_0/(2f_A)$. The waveform of the current is identical to that of the output voltage of the balanced modulator, but it becomes very sensitive to any deviation of the carrier frequency from the resonant frequency. The motional admittance at resonance and the damped admittance are also obtained by this procedure, so the electromechanical coupling coefficient can be calculated. Further studies are made on several other points.
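The relation Q = f_0/(2 f_A) can be illustrated with a short Python sketch. The helper `relative_admittance` is an assumption added here: it models the motional branch as a simple series resonance to check that sidebands at f_0 ± f_A see roughly 1/√2 of the peak admittance. The frequencies in the usage section are hypothetical.

```python
import math


def q_from_modulation(f0_hz, fa_hz):
    """Q value from resonant frequency f0 and modulating frequency fA at the 1/sqrt(2) point."""
    if f0_hz <= 0 or fa_hz <= 0:
        raise ValueError("frequencies must be positive")
    return f0_hz / (2.0 * fa_hz)


def relative_admittance(f_hz, f0_hz, q):
    """|Y(f)| / |Y(f0)| for a series-resonant motional branch (assumed model)."""
    detuning = f_hz / f0_hz - f0_hz / f_hz
    return 1.0 / math.sqrt(1.0 + (q * detuning) ** 2)


if __name__ == "__main__":
    f0, fa = 100_000.0, 50.0  # hypothetical values in Hz
    q = q_from_modulation(f0, fa)
    print("Q =", q)
    print("relative admittance at upper sideband:", relative_admittance(f0 + fa, f0, q))
```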
  • 梅田 規子
    Article type: Main text
    1958, Vol. 14, No. 2, pp. 106-111
    Published: 1958/06/30
    Released online: 2017/06/02
    Journal, free access
    Some sentences were read by two NHK announcers, one male and one female, and their voices were analyzed with the Sonagraph of Kay Electric Co. The analysis chiefly concerned the influence of the preceding and following consonants and syllables on the vowels. The results show that (1) the vowels of the female voice show a much greater deviation than those of the male voice; (2) in spite of that, parallel phenomena occur between the male and female vowels; (3) tongue height does not always determine the frequency of the first formant, and some other factors are also involved; and (4) most importantly, each of the five vowels has its own features with regard to the Gestalt of its spectrum, although a few overlaps in frequency among them are seen.
  • 斎藤 収三, 加藤 勝洋, 寺西 昇
    Article type: Main text
    1958, Vol. 14, No. 2, pp. 111-116
    Published: 1958/06/30
    Released online: 2017/06/02
    Journal, free access
    Temporal patterns of the fundamental frequency of Japanese speech are measured, and from them the distributions, power spectrum, and transitional probability of the fundamental frequency are calculated. The results are as follows: (1) as the mean value of the fundamental frequency becomes higher, the variance increases, and the variances of skilled speakers such as announcers are generally larger than those of ordinary speakers; (2) 98% of the energy of the pitch signal can be transmitted in a bandwidth of 5 c.p.s., provided the gaps between pitch patterns are filled in smoothly; (3) simple relations exist between utterance speed and power spectrum; (4) the differences between successive pitch signals (transitional probability) can be approximated by an exponential distribution, and negative differences predominate over positive ones. The optimum characteristics of the filter for pitch signal transmission are then estimated, considering the effect of the filter's linear distortion on perception, and the probability of the duration of a single fundamental frequency is also discussed.
  • 佐藤 利男
    Article type: Main text
    1958, Vol. 14, No. 2, pp. 117-122
    Published: 1958/06/30
    Released online: 2017/06/02
    Journal, free access
    Differences in the time structures of voiced and unvoiced stop consonants contained in Japanese monosyllables were investigated. Perceptual tests were conducted with several modified stop consonants formed by partially cutting and rearranging tape recordings of stop consonants. The conclusions drawn from the test results are as follows: the primary cues that distinguish |b|, |d|, |g| from |p|, |t|, |k| are (1) the presence of a preceding buzz bar, (2) the rising characteristic of the pitch frequency of the adjacent vowel, and (3) the transition of the first formant; the secondary cues are weaker burst intensity and absence of aspiration.
  • 馬淵 邦子, 八木 昌子, 遠藤 恵子, 大泉 充郎
    Article type: Main text
    1958, Vol. 14, No. 2, pp. 123-128
    Published: 1958/06/30
    Released online: 2017/06/02
    Journal, free access
    Noises with different frequency bands, durations, and intensity levels relative to the following vowels were combined with artificial vowels to synthesize artificial voices. The regions of Japanese voiceless consonants were roughly determined through hearing tests of these artificial voices by 9 skilled and 86 unskilled listeners. In general, the regions of [ts], [tʃ] and [t] were distributed in higher frequency bands than those of [k] and [p]. As to the product of intensity level and duration, [t] and [p] had smaller values than [ts], [tʃ] and [k]. Fricatives were heard when the noises were made to build up gradually and the durations were made long; in that case they were heard like [s] at higher frequency bands and like [h] at lower bands. In addition, the effect of the following vowel on the region of each consonant, the correlation between intensity and duration, the tendency toward confusion, and the conditions for sounding like sonants and fricatives were studied from the results. Individual differences among listeners were divided into 3 types, and their distribution was also studied.
  • 藤村 靖, 小川 智哉, 比企 静雄
    Article type: Main text
    1958, Vol. 14, No. 2, pp. 129-137
    Published: 1958/06/30
    Released online: 2017/06/02
    Journal, free access
    A new pattern playback system has been devised, and a preliminary experiment with a tentative synthesizer has shown very promising results. The function of the frequency-dividing networks (filter bank) is achieved simply through the mechanical resonances of the crystals, which at the same time serve as the electro-optic converters (light valves). The present simple set has 9 main channels (timbre control) and a subsidiary source (pitch and hiss) control channel. Simple Japanese sentences have been synthesized. A design for an analyser-synthesizer (coded speech recorder) is given, and an application to a vocoder without an electrical filter bank is suggested.
  • 関 英男
    Article type: Main text
    1958, Vol. 14, No. 2, pp. 138-142
    Published: 1958/06/30
    Released online: 2017/06/02
    Journal, free access
    The new method of narrow-band speech transmission discussed here is very similar to the so-called VOBANC of Bell Telephone Laboratories and was devised by the author in 1955. It depends on the principle of dividing each formant frequency by an exact integral ratio at the sending end and multiplying by the same ratio at the receiving end to recover the original frequencies completely. A particular type of variable filter for the synthesizer is also proposed. Some difficulties in realizing an actual communication system are discussed and some ideas for improvement are given. Finally, we propose a system called "PARVOC", or partial voice compressor.
  • 田宮 潤, 平松 啓二
    Article type: Main text
    1958, Vol. 14, No. 2, pp. 143-150
    Published: 1958/06/30
    Released online: 2017/06/02
    Journal, free access
    The infinitely clipped speech investigated by Licklider and others shows that the information in speech is sufficiently contained in the zero-crossing points of the speech wave. On this basis, a new voice communication system is proposed in which the position of each zero crossing is transmitted by a pulse. However, infinitely clipped speech itself cannot be used for this purpose: it is necessary to suppress the noise in the pauses of speech and to convert the upward and downward crossing points of the clipped speech unambiguously into a unipolar pulse train suitable for transmission in a band-limited channel. The noise suppression in pauses may be achieved by slicing the speech wave at a certain off-zero level that is higher than the noise amplitude. The articulation score of sliced speech at various levels (Fig. 9) shows that -30 to -40 dB (dB below the average peak amplitude) is the optimum slice level for keeping sufficient intelligibility while avoiding the noise. A high-frequency bias has, in a sense, a similar effect to this off-zero slicing and helps the function of the slicer. Ambiguity of the crossing points is another troublesome source of noise when reproducing speech from the unipolar pulse train. When a maximum or minimum of the wave just reaches the slicer level, the output of a normal slicer is an imperfect or very narrow rectangular wave, from which position pulses suitable for transmission through a band-limited channel cannot be obtained. To overcome this effect, we use a slicer consisting of a modified monostable multivibrator with a time constant τ, so that the speech wave is converted into a rectangular wave train in which the minimum duration between flip-overs is limited to τ. It should be noted that the maximum τ without deterioration of intelligibility is approximately 0.18 ms, which corresponds to the third formant frequency of speech sound.
In this paper we discuss the above considerations of sliced speech and their applications to various voice communication systems.
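The slicing scheme described above can be sketched in Python as follows. This is a software approximation under stated assumptions, not the authors' circuit: the monostable multivibrator is modeled as a two-state slicer that refuses to flip until at least τ has elapsed since the previous flip, and the function names and the 48 kHz sampling rate are hypothetical.

```python
def slice_level(average_peak, db_below):
    """Slice level placed db_below decibels under the average peak amplitude."""
    return average_peak * 10.0 ** (-db_below / 20.0)


def tau_to_samples(tau_s, fs_hz):
    """Convert the minimum hold time tau (seconds) to a whole number of samples."""
    return max(1, round(tau_s * fs_hz))


def slice_with_min_hold(samples, level, min_hold_samples):
    """Convert a waveform into a 0/1 rectangular wave.

    The output is 1 while the input exceeds `level`, except that a change of
    state is postponed until `min_hold_samples` have passed since the last flip.
    """
    out = []
    state = 0
    since_flip = min_hold_samples  # permit the very first flip immediately
    for x in samples:
        wanted = 1 if x > level else 0
        if wanted != state and since_flip >= min_hold_samples:
            state = wanted
            since_flip = 0
        out.append(state)
        since_flip += 1
    return out


def flip_positions(rect):
    """Sample indices at which the rectangular wave changes state."""
    return [n for n in range(1, len(rect)) if rect[n] != rect[n - 1]]


if __name__ == "__main__":
    hold = tau_to_samples(0.18e-3, 48_000)  # tau = 0.18 ms at an assumed 48 kHz
    level = slice_level(1.0, 30.0)          # -30 dB relative to a peak of 1.0
    print("hold samples:", hold, "slice level:", level)
```

The flip positions returned by `flip_positions` play the role of the position pulses; by construction, consecutive flips are never closer together than the hold time.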
  • 福村 晃夫
    Article type: Main text
    1958, Vol. 14, No. 2, pp. 151-158
    Published: 1958/06/30
    Released online: 2017/06/02
    Journal, free access
    Having completed our study of the quality of speech sounds, we take up here the quality problem of synthesized sounds having formant-like spectra. The immediate aim of the present experiment is divided into two parts: one is to determine the discriminatory bounds of the quality response with regard to changes in formant position and formant bandwidth; the other is to illuminate the mechanism by which the so-called mental map is formed, through which several sounds that differ in both formant position and formant width are identified. To synthesize formant-like complex sounds, damped sinusoids with a repetition rate of 200 per second are generated by passing a narrow pulse train through a single resonant circuit. The following points can be safely concluded as basic properties of timbre judgement. (1) The ratio of the insensitivity $\sigma_{f_0}$ to the half-power bandwidth B, where the insensitivity to formant frequency is defined as the standard deviation of the distribution of equal judgements in an AB listening test, is approximately constant regardless of the resonant frequency $f_0$. (2) The ratio of the insensitivity $\sigma_B$ to the formant bandwidth B is generally constant. (3) When the confusion matrix is considered from the discriminatory viewpoint rather than from the viewpoint of identification, the insensitivities can be obtained from the confusion data in two directions, i.e., outgoing and incoming. These are far greater than those observed in the discrimination test of the pair-comparison type. A member of a sound group does not necessarily have the same insensitivity in the two directions. (4) The definiteness of the quality of formant-like sounds that are correctly identified is not reflected directly in the magnitude of the insensitivity defined from the discriminatory viewpoint.
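The stimulus generation described in this abstract (a narrow pulse train exciting a single resonant circuit) might be approximated digitally as in the Python sketch below. It relies on the standard relation that a single resonance with half-power bandwidth B has an impulse response whose envelope decays as exp(-πBt). The sampling rate, the truncation of each response after a few periods, and all function names are assumptions made for illustration.

```python
import math


def damped_sinusoid(f0_hz, bandwidth_hz, n_samples, fs_hz):
    """Impulse response of a single resonance: exp(-pi*B*t) * sin(2*pi*f0*t)."""
    return [
        math.exp(-math.pi * bandwidth_hz * n / fs_hz)
        * math.sin(2.0 * math.pi * f0_hz * n / fs_hz)
        for n in range(n_samples)
    ]


def formant_like_tone(f0_hz, bandwidth_hz, n_periods,
                      fs_hz=16000, rate_hz=200, tail_periods=4):
    """Sum of damped sinusoids restarted rate_hz times per second.

    Each response is truncated after tail_periods repetition periods, which is
    an approximation that is adequate when the bandwidth is not very small.
    """
    period = round(fs_hz / rate_hz)
    response = damped_sinusoid(f0_hz, bandwidth_hz, tail_periods * period, fs_hz)
    out = [0.0] * (n_periods * period)
    for p in range(n_periods):
        start = p * period
        for n, value in enumerate(response):
            if start + n >= len(out):
                break
            out[start + n] += value
    return out


if __name__ == "__main__":
    # Hypothetical stimulus: formant at 1000 Hz with 100 Hz half-power bandwidth.
    tone = formant_like_tone(1000.0, 100.0, n_periods=20)
    print(len(tone), "samples, peak", max(abs(v) for v in tone))
```

Varying `f0_hz` and `bandwidth_hz` independently corresponds to the two stimulus dimensions (formant position and formant bandwidth) examined in the listening tests.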
  • 佐藤 利男
    Article type: Main text
    1958, Vol. 14, No. 2, pp. 159-164
    Published: 1958/06/30
    Released online: 2017/06/02
    Journal, free access
    This paper discusses speech sound articulation, the articulation of each vowel, and confusions between consonants, obtained from articulation tests on speech transmission systems using two or three bandpass filters whose passbands correspond approximately to the formant bands of Japanese vowels. (1) Sound articulation: The maximum sound articulations of the systems were calculated using the frequency importance function of articulation, and the results were compared with the measured values. In a system possessing an isolated passband in the high-frequency region, such as the system corresponding to the vowel /i/, a significant disagreement is observed. This disagreement seems to indicate that the contribution of the frequency band in question to the articulation is greater when it coexists with the intermediate frequency band than when it is isolated. (2) Vowel articulation: For a transmission system possessing passbands corresponding to the formants of a vowel, the articulation of that vowel does not show much improvement over the articulation of other vowels passed through the same system. For systems corresponding to vowels other than /i/, the articulations of all the vowels passed through each system are above 0.8, whereas the values obtained for the system corresponding to /i/ are all much lower. This seems to indicate that discrimination between vowels does not depend on recognizing the positions of the formants but on detecting differences in the frequency spectrum within a limited frequency band. (3) Confusion between consonants: When the consonants are grouped as (1) k, p, t, s, h, etc., (2) g, b, d, r, z, etc., and (3) m, n, w, y, etc., confusion between consonants of the same group is great, whereas confusion between consonants belonging to different groups is rare.
This seems to indicate that the consonants belonging to each group possess some property characteristic of that group which is not lost when the passband is limited.
  • 越川 常治
    Article type: Main text
    1958, Vol. 14, No. 2, pp. 164-169
    Published: 1958/06/30
    Released online: 2017/06/02
    Journal, free access
    In order to measure telephone transmission quality, a scale of naturalness, which differs from that of articulation or intelligibility, was considered. Using this scale, the effects of transmission-system factors, including filtering distortion and non-linear distortion, were evaluated. In this report, the measuring scale for the naturalness of a given talker was defined as the degree of difference between the talker's voice and the undistorted reference state of his original voice. To scale such qualities, we used Thurstone's distance scale, which is known in psychometrics. Experiments were performed with two kinds of filtering distortion systems, low-pass and high-pass, and with two kinds of non-linear distortion systems, 2nd and 3rd harmonic. The results of these experiments were compared from the viewpoint of naturalness by means of the common distance scale.
  • 吉田 登美男
    Article type: Main text
    1958, Vol. 14, No. 2, pp. 170-174
    Published: 1958/06/30
    Released online: 2017/06/02
    Journal, free access
    This report treats the scaling of the higher qualities of stereophonic sound. The concept of quality treated here comprises seven different characteristics: vividness, clearness, separation of each sound from numerous sound sources, separation of the signal from the noise, feeling of offensive reverberation, feeling of presence, and feeling of the distance of the sound source. At the beginning, the author had the listeners clarify, unify, and fix the concept of each quality in their minds following a careful procedure, then presented stimulus sounds to them and asked them to make comparative judgements of the sounds. The stimulus sounds comprised seven varieties: a 2-channel reproduction, a 1-channel 2-speaker reproduction, and five mixed 2-speaker reproductions. "Mixed reproduction" here means that the reproduced signals are obtained by artificially making the 2 channels cross-talk into one another by the same amount. The mixing level is defined as the power ratio of the cross-talk signal to the original signal; the mixing levels employed here were 13, 8, 5, 4, and 3 dB. The method of paired comparison is employed in the experiment, and the results are calculated by Thurstone's method (case III). As a result, a scale is obtained for each of the seven qualities. These scales indicate, as distances on the psychological scales, the degree to which stereophonic sound has an advantage over the usual 1-channel reproduction. For example, on the "vividness scale", the "2 channel" reproduction is ranked as the most vivid and the "3 dB mixing" as the least vivid, with the "1 channel" reproduction intermediate between them. The psychologically evaluated distance from "1 channel" to "2 channel" is about 1.3.
These scales are useful not only for evaluating the qualities of reproduction systems by the distances between the systems on each quality scale, but also for obtaining quantitative relations between the qualities by computing the correlation factors between the scales. The details will be given in succeeding reports.
  • 大西 雅雄
    Article type: Main text
    1958, Vol. 14, No. 2, pp. 175-180
    Published: 1958/06/30
    Released online: 2017/06/02
    Journal, free access
    The present writer maintains that, as a fundamental matter, "noise" belongs to the category of "natural sounds" and "voice and phoneme" to the category of "speech sounds"; in other words, the former are concrete, outward, physical phenomena while the latter are abstract, inward, psychological images. The main reasons for this distinction are: 1) the existence or non-existence of "contents", i.e., "linguistic meaning", in the background of the sounds; 2) the establishment or non-establishment of "auditory conventions", which often permit some distortions of sounds, the so-called phonetic changes; 3) "natural sounds" are by nature universal or international, whereas "speech sounds" are limited to a national or individual language. In any case, "sound" itself does not exist in the world apart from the existence of man, or more exactly the existence of ears. Even the most scientific experiment of acousticians has to rely on the judgement of the auditory organs at its last stage.
  • 関 英男, 切替 一郎, 五十嵐 寿一, 佐藤 英男, 実吉 純一, 丹羽 登, 能本 乙彦, 佐多 直康, 奥山 政高, 田淵 大作, 伊 ...
    Article type: Main text
    1958, Vol. 14, No. 2, pp. 181-205
    Published: 1958/06/30
    Released online: 2017/06/02
    Journal, free access