THE JOURNAL OF THE ACOUSTICAL SOCIETY OF JAPAN
Online ISSN : 2432-2040
Print ISSN : 0369-4232
Volume 14, Issue 2
  • Nobuyuki Yoshihisa
    Article type: Article
    1958 Volume 14 Issue 2 Pages 97-101
    Published: June 30, 1958
    Released on J-STAGE: June 02, 2017
    JOURNAL FREE ACCESS
    The jump phenomena in loudspeakers have been discussed in only a few publications. When a jump occurs, the amplitude of the loudspeaker suddenly becomes very large and the distortion is at its greatest, so these phenomena are very important in practical use. In this paper, the jump phenomena in loudspeakers are treated theoretically and experimentally. If a voltage $V\cos\omega t$ is applied across the voice coil, the resulting motion is assumed to be described by the equations $M\ddot{x} + R\dot{x} + sx + qx^3 = Bli$ (1) and $R_c i + Bl\dot{x} = V\cos\omega t$ (2), where $M$ = mass of the cone, coil and air load, $R$ = mechanical resistance, $s + qx^2$ = stiffness of the suspension system, $B$ = magnetic flux density in the air gap, $l$ = length of the voice coil, $i$ = current in the voice coil, and $R_c$ = d.c. resistance of the voice coil. The maximum applied voltage free from the jump phenomenon has been calculated from these equations. The result is $V^2 = \frac{16r^3}{27ck^2}\left\{9ar + 9r^3 + (a + 3r^2)\sqrt{12a + 9r^2}\right\}$ (3), where $r = R/M + B^2l^2/(MR_c)$, $a = s/M$, $c = q/M$, and $k = Bl/(MR_c)$. The following conclusions have been obtained from equation (3) and some experiments. In order to reduce the occurrence of the jump phenomena, (1) it is absolutely necessary that the linearity of the cone suspension system be very good, that is, that the value of $q$ in the stiffness $s + qx^2$ be minimized; (2) it is necessary that the mechanical resistance $R$ and the stiffness term $s$ be large; and (3) it is desirable to reduce the mass $M$.
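As a rough illustration of equation (3), the following minimal Python sketch evaluates the jump-free voltage limit from the physical parameters defined in the abstract. The function name and the numerical values in the usage lines are hypothetical and chosen only for illustration; the formula itself is taken from the abstract as stated, not re-derived.

```python
import math

def max_jump_free_voltage(M, R, s, q, B, l, R_c):
    """Evaluate equation (3): the largest drive amplitude V free from jumps.

    All arguments follow the abstract's definitions; consistent units
    are the caller's responsibility (assumption: SI units).
    """
    # Normalised coefficients as defined after equation (3)
    r = R / M + (B * l) ** 2 / (M * R_c)
    a = s / M
    c = q / M
    k = B * l / (M * R_c)
    v_squared = 16 * r**3 / (27 * c * k**2) * (
        9 * a * r + 9 * r**3 + (a + 3 * r**2) * math.sqrt(12 * a + 9 * r**2)
    )
    return math.sqrt(v_squared)

# Hypothetical parameter values, for illustration only
v_nonlinear = max_jump_free_voltage(M=0.01, R=1.0, s=2000.0, q=4.0e7, B=1.0, l=5.0, R_c=6.0)
v_more_linear = max_jump_free_voltage(M=0.01, R=1.0, s=2000.0, q=1.0e7, B=1.0, l=5.0, R_c=6.0)
print(v_nonlinear, v_more_linear)
```

Because $V^2$ is inversely proportional to $c = q/M$ with the other coefficients held fixed, quartering $q$ in this sketch doubles the limit, which agrees with conclusion (1) on suspension linearity.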
  • Takuro Ikeda
    Article type: Article
    1958 Volume 14 Issue 2 Pages 102-106
    Published: June 30, 1958
    Released on J-STAGE: June 02, 2017
    JOURNAL FREE ACCESS
    The electric current through the motional admittance of a piezoelectric vibrator is observed by compensating the damped admittance by means of a condenser with a differential transformer. This current reaches a maximum at the resonant frequency $f_0$, and its value is stored by another circuit with a reference resistance. Next, the voltage from a carrier-suppressed balanced modulator with carrier frequency $f_0$ and modulating frequency $f_A$ is applied to the motional admittance. When $f_A$ is adjusted so that the current becomes $1/\sqrt{2}$ times that through the reference resistance, $f_A$ equals half of the quadrantal frequency difference. The Q value is therefore computed as $f_0/(2f_A)$. The waveform of the current is identical to that of the output voltage of the balanced modulator, but its variation becomes very sensitive when the carrier frequency deviates from the resonant frequency. The motional admittance at resonance and the damped admittance are also obtained by this procedure, so the electromechanical coupling coefficient can be calculated. Further studies are made on several other points.
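The relation Q = f_0/(2f_A) can be checked numerically with a minimal sketch. It assumes the motional branch behaves as a simple series R-L-C circuit and looks only at the admittance magnitude at the upper sideband f_0 + f_A; the function names and component values are hypothetical and are not taken from the paper.

```python
import math

def admittance_ratio(f, f0, R, L):
    """|Y(f)| / |Y(f0)| for a series R-L-C branch resonant at f0 (assumed model)."""
    w, w0 = 2 * math.pi * f, 2 * math.pi * f0
    C = 1.0 / (w0**2 * L)           # capacitance that places resonance at f0
    X = w * L - 1.0 / (w * C)       # net reactance at frequency f
    return R / math.hypot(R, X)

def find_f_a(f0, R, L, iterations=200):
    """Bisect for the offset f_A at which the current falls to 1/sqrt(2) of its peak."""
    lo, hi = 0.0, f0
    target = 1.0 / math.sqrt(2.0)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if admittance_ratio(f0 + mid, f0, R, L) > target:
            lo = mid                # still above the target: move outwards
        else:
            hi = mid
    return 0.5 * (lo + hi)

def q_from_f_a(f0, f_a):
    """Q value as given in the abstract: f0 / (2 * f_A)."""
    return f0 / (2.0 * f_a)

# Hypothetical vibrator: f0 = 100 kHz, L = 10 mH, R chosen so that the true Q is 1000
f0, L, q_true = 100e3, 10e-3, 1000.0
R = 2 * math.pi * f0 * L / q_true
f_a = find_f_a(f0, R, L)
print(f_a, q_from_f_a(f0, f_a))
```

For a high-Q branch the resonance curve is nearly symmetric about f_0, so the estimate from the upper sideband alone lies very close to the Q used to construct the circuit.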
  • Noriko Umeda
    Article type: Article
    1958 Volume 14 Issue 2 Pages 106-111
    Published: June 30, 1958
    Released on J-STAGE: June 02, 2017
    JOURNAL FREE ACCESS
    Some sentences were read by two NHK announcers, one male and one female, and their voices were analyzed with the Sonagraph of Kay Electric Co. The analysis chiefly concerned the influence of the preceding and following consonants and syllables on the vowels. The results show that (1) the vowels of the female voice show a much greater deviation than those of the male voice; (2) in spite of that, parallel phenomena occur between the male and female vowels; (3) the height of the tongue does not always determine the frequency of the first formant, and some other factors are also involved; and (4) most importantly, each of the five vowels has its own features with regard to the Gestalt of its spectrum, although a few overlaps in frequency among them are seen.
  • Shuzo Saito, Katsuhiro Kato, Noboru Teranishi
    Article type: Article
    1958 Volume 14 Issue 2 Pages 111-116
    Published: June 30, 1958
    Released on J-STAGE: June 02, 2017
    JOURNAL FREE ACCESS
    Temporal patterns of the fundamental frequency of Japanese speech are measured, and from them the distribution, power spectrum and transitional probability of the fundamental frequency are calculated. The results are as follows: (1) as the mean value of the fundamental frequency becomes higher, the variance increases, and the variances for skilled speakers such as announcers are generally larger than those for normal speakers; (2) 98% of the energy of the pitch signal can be transmitted in a 5 c.p.s. bandwidth, provided gaps between pitch patterns are filled in smoothly; (3) simple relations exist between utterance speed and power spectrum; (4) differences between succeeding pitch signals (transitional probability) can be approximated by an exponential distribution, and negative differences predominate over positive ones. The estimation of the optimum characteristics of the filter for pitch signal transmission, considering the effect of the linear distortion of the filter upon perception, and the probability of the duration of the singular fundamental frequency are also discussed.
  • Toshio Sato
    Article type: Article
    1958 Volume 14 Issue 2 Pages 117-122
    Published: June 30, 1958
    Released on J-STAGE: June 02, 2017
    JOURNAL FREE ACCESS
    Differences in the time structures of voiced and unvoiced stop consonants in Japanese monosyllables were investigated. Perceptual tests were conducted with several modified stop consonants formed by partially cutting and rearranging tape recordings of stop consonants. The conclusion obtained from the test results is as follows: the primary cues distinguishing |b|, |d|, |g| from |p|, |t|, |k| are (1) the presence of a preceding buzz bar, (2) the rising characteristic of the pitch frequency of the adjacent vowel, and (3) the transition of the first formant; the secondary cues are weaker burst intensity and absence of aspiration.
  • Kuniko Mabuchi, Masako Yagi, Keiko Endo, Juro Oizumi
    Article type: Article
    1958 Volume 14 Issue 2 Pages 123-128
    Published: June 30, 1958
    Released on J-STAGE: June 02, 2017
    JOURNAL FREE ACCESS
    Noises with different frequency bands, durations and intensity levels relative to the following vowels were combined with artificial vowels to synthesize artificial voices. The regions of Japanese voiceless consonants were roughly determined through hearing tests of these artificial voices with 9 skilled and 86 unskilled listeners. In general, the regions of [ts], [tʃ] and [t] were distributed in higher frequency bands than those of [k] and [p]. As to the product of intensity level and duration, [t] and [p] had smaller values than [ts], [tʃ] and [k]. Fricatives were heard when the noises were made to build up gradually and the durations were made long; in that case they were heard like [s] at higher frequency bands and like [h] at lower bands. In addition, the effect of the following vowel on the region of each consonant, the correlation between intensity and duration, the tendency of confusion, and the conditions for sounding like sonants and fricatives were studied from the results. As to individual differences, the listeners were divided into 3 types, and their distribution was also studied.
  • Osamu Fujimura, Tomoya Ogawa, Shizuo Hiki
    Article type: Article
    1958 Volume 14 Issue 2 Pages 129-137
    Published: June 30, 1958
    Released on J-STAGE: June 02, 2017
    JOURNAL FREE ACCESS
    A new pattern playback system has been devised, and a preliminary experiment with a tentative synthesizer has shown very promising results. The function of the frequency-dividing networks (filter bank) is achieved simply through the mechanical resonances of the crystals, which at the same time serve as the electro-optic converters (light valves). The present simple set has 9 main channels (timbre control) and a subsidiary source (pitch and hiss) control channel. Simple Japanese sentences have been synthesized. A design for the analyser-synthesizer (coded speech recorder) is given, and an application to a vocoder without an electrical filter bank is suggested.
  • Hideo Seki
    Article type: Article
    1958 Volume 14 Issue 2 Pages 138-142
    Published: June 30, 1958
    Released on J-STAGE: June 02, 2017
    JOURNAL FREE ACCESS
    The new method of narrow-band speech transmission discussed here is very similar to the so-called VOBANC of Bell Telephone Laboratories and was devised by the author in 1955. It depends on the principle of dividing each formant frequency by an exact integral ratio at the sending end and multiplying them at the receiving end to recover the original frequencies completely. A particular type of variable filter for the synthesizer is also proposed. Some difficulties in realizing an actual communication system are discussed and some ideas for improvement are given. Finally, we propose a system called "PARVOC", or partial voice compressor.
  • Jun Tamiya, Keiji Hiramatsu
    Article type: Article
    1958 Volume 14 Issue 2 Pages 143-150
    Published: June 30, 1958
    Released on J-STAGE: June 02, 2017
    JOURNAL FREE ACCESS
    The infinitely clipped speech investigated by Licklider and others shows that the information in speech is sufficiently contained in the zero-crossing points of the speech wave. From this point of view, a new voice communication system in which the position of each zero-crossing is transmitted by a pulse has been devised. However, the infinitely clipped speech itself cannot be used for this purpose: it is necessary to suppress the noise in speech pauses and to convert the upward and downward crossing points of the clipped speech definitely into a unipolar pulse train suitable for transmission over a band-limited channel. The noise suppression in pauses may be achieved by slicing the speech wave at a certain off-zero level that is higher than the noise amplitude. The articulation score of sliced speech at various levels (Fig. 9) shows that -30 to -40 dB (dB below the average peak amplitude) is optimum for the slice level in order to keep sufficient intelligibility and to avoid the noise. The high-frequency bias has, in a sense, a similar effect to this off-zero slicing and helps the function of the slicer. Ambiguity of the crossing points, which arises as follows, is another troublesome source of noise in the reproduction process from the unipolar pulse train. When a maximum or minimum of the wave just reaches the slicer level, the output of a normal slicer is an imperfect or very narrow rectangular wave, from which position pulses suitable for transmission through a band-limited channel cannot be obtained. To overcome this effect, we use a slicer consisting of a modified monostable multivibrator with a time constant τ, so that the speech wave is converted into a rectangular wave train in which the minimum duration between flip-overs is limited to τ. It should be noted that the maximum τ without deterioration of intelligibility is approximately 0.18 ms, which corresponds to the third formant frequency of speech sound.
    In this paper we discuss the above considerations on sliced speech and their application to various voice communication systems.
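The off-zero slicing with a minimum flip-over interval τ described above can be sketched digitally as follows. This is only a discrete-time approximation of the behaviour, not the authors' multivibrator circuit; the function name, the sampling rate and the test signal are hypothetical.

```python
import math

def slice_with_min_duration(samples, sample_rate_hz, slice_level, tau_s):
    """Convert a waveform into a 0/1 rectangular wave by off-zero slicing.

    The output may change state only if at least tau_s seconds have passed
    since the previous change, so very narrow pulses are stretched to tau_s.
    """
    min_gap = max(1, math.ceil(tau_s * sample_rate_hz))  # tau expressed in samples
    state = bool(samples) and samples[0] > slice_level   # initial slicer state
    since_flip = min_gap                                 # allow the first flip at once
    out = []
    for x in samples:
        wanted = x > slice_level
        if wanted != state and since_flip >= min_gap:
            state = wanted
            since_flip = 0
        out.append(1 if state else 0)
        since_flip += 1
    return out

# Hypothetical example: a 200 Hz tone sampled at 48 kHz, sliced at 10% of its peak,
# with tau = 0.18 ms as quoted in the abstract
fs = 48000
tone = [math.sin(2 * math.pi * 200 * n / fs) for n in range(480)]
rect = slice_with_min_duration(tone, fs, slice_level=0.1, tau_s=0.18e-3)
print(sum(rect), len(rect))
```

A single-sample excursion above the slice level is held for τ before the output is allowed to fall back, which mirrors the stated purpose of avoiding very narrow rectangular pulses.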
  • Teruo Fukumura
    Article type: Article
    1958 Volume 14 Issue 2 Pages 151-158
    Published: June 30, 1958
    Released on J-STAGE: June 02, 2017
    JOURNAL FREE ACCESS
    Having completed our study of the quality of speech sounds, we take up here the quality problem of synthesized sounds having formant-like spectra. The immediate aim of the present experiment is divided into two parts: one is to determine the discriminatory bounds in quality response with regard to changes in formant position and formant bandwidth; the other is to illuminate the mechanism of formation of the so-called mental map, by which several sounds differing in both position and width of formant are identified. To synthesize formant-like complex sounds, damped sinusoids with a repetition rate of 200 waves per second are used, obtained by passing a narrow pulse train through a single resonant circuit. The following points can be safely concluded as basic properties of timbre judgement. (1) The ratio of the insensitivity $\sigma_{f_0}$ regarding the formant frequency, defined by the standard deviation of the distribution of equal judgement in an AB listening test, to the half-power bandwidth $B$ is approximately constant regardless of the resonant frequency $f_0$. (2) The ratio of the insensitivity $\sigma_B$ to the formant bandwidth $B$ is generally constant. (3) When the confusion matrix is considered from the discriminatory viewpoint rather than from the viewpoint of identification, the insensitivities can be obtained from the confusion data in two directions, i.e., outgoing and incoming. These are far greater than those observed in the discrimination test of the pair-comparison type. A member of a sound group does not necessarily have the same values of insensitivity in the two directions. (4) The definiteness of the qualities of formant-like sounds that are correctly identified cannot be reflected directly in the magnitude of the insensitivity defined from the discriminatory viewpoint.
  • Toshio Sato
    Article type: Article
    1958 Volume 14 Issue 2 Pages 159-164
    Published: June 30, 1958
    Released on J-STAGE: June 02, 2017
    JOURNAL FREE ACCESS
    Sound articulations, the articulation of each vowel, and confusions between consonants, obtained by articulation tests on speech transmission systems using two or three bandpass filters whose passbands correspond approximately to the formant bands of Japanese vowels, are discussed here. (1) Sound Articulation: Maximum sound articulations of the systems were calculated using the frequency importance function of articulation, and the results were compared with the measured values. In a system possessing an isolated passband in the high-frequency region, such as the system corresponding to the vowel /i/, a significant disagreement is observed. This disagreement seems to indicate that the contribution of the frequency band in question to the articulation is greater when it coexists with the intermediate frequency band than when it is isolated. (2) Vowel Articulation: For a transmission system possessing passbands corresponding to the formants of a vowel, the articulation of that vowel does not show much improvement over the articulation of other vowels passed through the same system. For systems corresponding to vowels other than /i/, the articulations of all the vowels passed through each of the systems are above 0.8, whereas the values obtained for the system corresponding to /i/ are all much lower. This seems to indicate that discrimination between vowels does not depend upon recognition of the position of the formants but on detection of the difference in frequency spectrum within a limited frequency band. (3) Confusion between Consonants: When the consonants are grouped as (1) k, p, t, s, h, etc., (2) g, b, d, r, z, etc., and (3) m, n, w, y, etc., the confusion between consonants of the same group is great, whereas confusion between consonants belonging to different groups is rare.
    This seems to indicate that the consonants belonging to each group possess some property characteristic of that group which is not lost by limitation of the passband.
  • Tsuneji Koshikawa
    Article type: Article
    1958 Volume 14 Issue 2 Pages 164-169
    Published: June 30, 1958
    Released on J-STAGE: June 02, 2017
    JOURNAL FREE ACCESS
    In order to measure telephone transmission quality, a scale of naturalness, which differs from that of articulation or intelligibility, was considered. Using this scale, the effects of factors of the transmission system involving filtering and non-linear distortion were evaluated. In this report, the measuring scale for the naturalness of a certain talker was determined as the degree of difference of the talker's voice from the undistorted reference state of his original voice. For scaling such qualities, we used Thurstone's distance scale, which is known in psychometrics. Experiments were performed with two kinds of filtering distortion systems, low-pass and high-pass, and with two kinds of non-linear distortion systems, 2nd and 3rd harmonic. The results of these experiments were compared from the viewpoint of naturalness by means of the common distance scale.
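For readers unfamiliar with Thurstone scaling, the following minimal sketch shows how paired-comparison proportions can be turned into a distance scale. It uses the simplest variant, commonly called Case V (equal, uncorrelated discriminal dispersions); the abstract does not state which variant was used, so this choice, the function name and the example proportions are all assumptions made for illustration.

```python
from statistics import NormalDist

def thurstone_case_v(proportions, eps=1e-3):
    """Scale values from a paired-comparison matrix (Thurstone Case V, assumed).

    proportions[i][j] is the fraction of judgements in which stimulus i was
    rated above stimulus j. Proportions are clipped to [eps, 1 - eps] so that
    the inverse normal CDF stays finite.
    """
    std_normal = NormalDist()
    n = len(proportions)
    scale = []
    for i in range(n):
        z_sum = 0.0
        for j in range(n):
            if i == j:
                continue  # a stimulus compared with itself contributes z = 0
            p = min(max(proportions[i][j], eps), 1.0 - eps)
            z_sum += std_normal.inv_cdf(p)
        scale.append(z_sum / n)  # row mean of z-scores, centred on the group mean
    return scale

# Hypothetical proportions for three transmission conditions
p = [
    [0.50, 0.30, 0.10],
    [0.70, 0.50, 0.25],
    [0.90, 0.75, 0.50],
]
print(thurstone_case_v(p))
```

Only differences between the resulting scale values are meaningful; the origin is arbitrary and here falls at the mean of the stimuli.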
  • Tomio Yoshida
    Article type: Article
    1958 Volume 14 Issue 2 Pages 170-174
    Published: June 30, 1958
    Released on J-STAGE: June 02, 2017
    JOURNAL FREE ACCESS
    In this report, the scaling of the higher qualities of stereophonic sound is treated. The concept of quality treated here comprises seven different characters: vividness, clearness, separation of each sound from numerous sound sources, separation of the signal from the noise, feeling of offensive reverberation, feeling of presence, and feeling of the distance of the sound source. At the beginning, the author had the listeners clarify, unify, and fix the concept of each quality in their minds following a careful procedure, and then presented stimulus sounds to them and asked them to make comparative judgements of the sounds. The stimulus sounds comprised seven varieties: a 2 channel reproduction, a 1 channel 2 speaker reproduction, and five mixed 2 speaker reproductions. "Mixed reproduction" here means that the reproduced signals are obtained by artificially making the 2 channels cross-talk into one another by the same amount. The mixing level is defined as the power ratio of the cross-talk signal to the original signal, the mixing levels employed here being 13, 8, 5, 4, and 3 dB. The method of paired comparison is employed in the experiment, and the results are calculated by Thurstone's method (case III). As a result, seven scales, one for each quality, are obtained. These scales indicate, by distance on the psychological scale, the degree to which the stereophonic sounds have an advantage over the usual 1 channel reproduction. For example, on the "vividness scale", the "2 channel" reproduction is ranked as the most vivid and the "3 dB mixing" as the least vivid, with the "1 channel" reproduction intermediate between them. The psychologically evaluated distance from "1 channel" to "2 channel" is about 1.3.
    These scales are useful not only for evaluating the qualities of reproduction systems by the distances between the systems on each quality scale, but also for obtaining quantitative relations between the qualities by computing the correlation factors between the scales. The details will be given in succeeding reports.
  • Masao Onishi
    Article type: Article
    1958 Volume 14 Issue 2 Pages 175-180
    Published: June 30, 1958
    Released on J-STAGE: June 02, 2017
    JOURNAL FREE ACCESS
    The present writer maintains that, as a fundamental matter, "noise" belongs to the category of "natural sounds" and "voice and phoneme" to that of "speech sounds"; in other words, the former is a concrete, outward, physical phenomenon while the latter is an abstract, inward, psychological image. The main reasons for this distinction are: 1) the existence or non-existence of "contents", i.e., "linguistic meaning", in the background of the sounds; 2) the establishment or non-establishment of "auditory conventions", which often allow some distortions of sounds, or so-called phonetic changes, to occur; 3) "natural sounds" are by nature universal or international, whereas "speech sounds" are limited to a national or individual language. In any case, "sounds" themselves do not exist in the world apart from the existence of man, or more exactly the existence of ears. Even the most scientific experiment of acousticians has to rely on the judgement of the auditory organs at its last stage.
  • [in Japanese], [in Japanese], [in Japanese], [in Japanese], [in Japane ...
    Article type: Article
    1958 Volume 14 Issue 2 Pages 181-205
    Published: June 30, 1958
    Released on J-STAGE: June 02, 2017
    JOURNAL FREE ACCESS