The basic factors of stereophonic sound that contribute to its higher qualities were investigated by factor analysis, as follows. (1) The higher qualities of stereo, for example vividness, clearness, definition, and richness, were scaled in previous reports. (2) Coefficients of correlation between these scales were calculated by a special method. (3) These coefficients were arranged in a matrix. (4) This matrix was subjected to factor analysis. (5) The result shows that the basic factor of stereophonic sound is the reproduction of the character of space. (6) This factor further separates into two: first, the character of orientation or selectivity, and second, the character of inhibition of offensive reverberation. Both factors correspond to our ordinary hearing experience in a listening space. The superior quality of stereophonic sound can therefore be understood as follows: stereo gives us natural hearing, that is, it is matched, in the true sense, to our hearing.
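Steps (3) to (6) above can be illustrated with a minimal sketch. The abstract does not say which extraction method was used, so the code below assumes the principal-component method of factor extraction (power iteration with deflation on the correlation matrix). The function names and the example correlation values are hypothetical and are not taken from the report.

```python
import math


def dominant_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and unit eigenvector of a symmetric PSD matrix."""
    n = len(matrix)
    v = [float(i + 1) for i in range(n)]  # arbitrary non-uniform start vector
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:  # nothing left to extract
            return 0.0, [0.0] * n
        v = [x / norm for x in w]
    # Rayleigh quotient v^T R v gives the eigenvalue for the unit vector v.
    rv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    return sum(a * b for a, b in zip(v, rv)), v


def principal_factors(corr, n_factors):
    """Factor loadings, one list per factor, by the principal-component method."""
    n = len(corr)
    residual = [row[:] for row in corr]
    loadings = []
    for _ in range(n_factors):
        lam, v = dominant_eigenpair(residual)
        if sum(v) < 0:  # fix the arbitrary sign of the eigenvector
            v = [-x for x in v]
        loadings.append([math.sqrt(max(lam, 0.0)) * x for x in v])
        # Deflate: remove the part of the matrix explained by this factor.
        for i in range(n):
            for j in range(n):
                residual[i][j] -= lam * v[i] * v[j]
    return loadings


# Hypothetical correlations between four scales
# (vividness, clearness, definition, richness); illustrative values only.
example_corr = [
    [1.0, 0.6, 0.5, 0.4],
    [0.6, 1.0, 0.7, 0.3],
    [0.5, 0.7, 1.0, 0.3],
    [0.4, 0.3, 0.3, 1.0],
]
for factor in principal_factors(example_corr, 2):
    print([round(x, 3) for x in factor])
```

In this sketch a large first factor with loadings of the same sign on all scales would play the role of the single basic factor in step (5), and the second factor shows how the scales split further, as in step (6).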
Three new reverberation chambers for sound transmission measurements were constructed beside the reverberation chambers for sound absorption measurements built four years ago. Airborne-sound transmission is measured through a 3.0 m × 3.0 m opening in the wall between the two larger chambers (each of volume 164 m^3). The shape of each chamber is similar to that of the reverberation chamber for sound absorption measurement, but the size is reduced to 2/3. For impact-sound measurement, a 2.0 m × 2.0 m opening in the floor is used. Under this opening the third chamber (irregular hectahedron, 68 m^3) is built. This report outlines the equipment and its fundamental characteristics with respect to the measurements. We studied practical subjects such as the sound pressure distributions in the source and receiving rooms, the size and mounting conditions of samples, and the effect of the absorption power of the receiving room. We are convinced that the equipment will be useful for our future study of sound-insulating structures.
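The role of the absorption power of the receiving room can be illustrated with the standard two-room relation, TL = L1 - L2 + 10 log10(S/A). The abstract does not state which formula the authors used, so this is only a sketch under that assumption. The opening area (3.0 m × 3.0 m) and the chamber volume (164 m^3) come from the text; the sound levels and the reverberation time are hypothetical, and the absorption area is estimated with Sabine's formula.

```python
import math


def sabine_absorption_area(volume_m3, reverberation_time_s):
    """Equivalent absorption area A (m^2) from Sabine's formula A = 0.161 V / T."""
    return 0.161 * volume_m3 / reverberation_time_s


def transmission_loss_db(level_source_db, level_receiving_db,
                         opening_area_m2, absorption_area_m2):
    """Two-room transmission loss: TL = L1 - L2 + 10 log10(S / A)."""
    return (level_source_db - level_receiving_db
            + 10.0 * math.log10(opening_area_m2 / absorption_area_m2))


opening_area = 3.0 * 3.0                       # m^2, from the text
absorption = sabine_absorption_area(164.0, 2.0)  # T = 2.0 s is hypothetical
# The source and receiving room levels (95 dB, 60 dB) are hypothetical.
print(round(transmission_loss_db(95.0, 60.0, opening_area, absorption), 1))
```

The correction term shows why the absorption of the receiving room matters: for the same specimen, a more absorbent receiving room lowers L2, and the term 10 log10(S/A) compensates for it.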
The transmission loss of single partitions was measured in a reverberant sound field for common building materials such as plywood, glass, and concrete block. In plotting the transmission loss curves, mf, that is (mass per unit area) × (frequency), was taken as the abscissa instead of f. This arrangement clarifies the relation between the measured transmission loss and the mass law. The relation is common to the homogeneous materials and consists of three regions in mf. For mf < 10^4 kg/m^2·c/s, the measured transmission loss is larger than the mass law by several decibels. For 10^4 < mf < 10^5, the transmission loss decreases on account of the "coincidence effect". For 10^5 < mf, the transmission loss curves recover from their minima and agree with the mass law. As regards the coincidence effect, for the homogeneous materials the frequency at which the minimum of transmission loss occurs was found to agree with Cremer's critical frequency f_c. In our study f_c was calculated from the static Young's modulus of the material. Several kinds of sandwich-structure panels were treated in the same way. The results show that the transmission loss of single partitions, whether homogeneous or inhomogeneous plates, is roughly predicted by the mass law. In detail, however, the transmission loss characteristics differ with the kind of structure. Differences in the dynamic properties of the structures can be pointed out as one cause of this.
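The two quantities compared above can be sketched numerically. The abstract does not say which form of the mass law was used, so the code below assumes the textbook normal-incidence mass law with a common 5 dB field-incidence correction, and the thin-plate expression for the critical frequency. The air constants and the glass properties in the example are assumed textbook values, not data from the study.

```python
import math

RHO0_C = 415.0  # characteristic impedance of air in kg/(m^2 s), assumed
C_AIR = 343.0   # speed of sound in air in m/s, assumed


def mass_law_normal_db(mf):
    """Normal-incidence mass law as a function of mf in kg/m^2 * c/s."""
    return 10.0 * math.log10(1.0 + (math.pi * mf / RHO0_C) ** 2)


def mass_law_field_db(mf):
    """Field-incidence estimate: normal incidence minus 5 dB (approximation)."""
    return mass_law_normal_db(mf) - 5.0


def critical_frequency_hz(thickness_m, density_kg_m3, young_pa, poisson=0.3):
    """Critical (coincidence) frequency of a thin homogeneous plate:
    f_c = c^2 / (2 pi h) * sqrt(12 rho (1 - nu^2) / E)."""
    stiffness_term = 12.0 * density_kg_m3 * (1.0 - poisson ** 2) / young_pa
    return C_AIR ** 2 / (2.0 * math.pi * thickness_m) * math.sqrt(stiffness_term)


# Mass-law values at the two region boundaries named in the text.
for mf in (1e4, 1e5):
    print(mf, round(mass_law_field_db(mf), 1))

# Assumed properties for a 6 mm glass pane (illustrative only).
print(round(critical_frequency_hz(0.006, 2500.0, 7.0e10, poisson=0.22)))
```

Because the mass law depends only on the product mf, curves for different materials fall on one line when plotted against mf, which is what makes the departure in the coincidence region easy to see.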
As one method of extracting pitch from a commercial telephone line, an improved version of Halsey's pitch extractor gives favorable results. This paper presents experimental results obtained under the condition of no artificial exchange, for both male and female voices, using #4 telephone sets. The principle is the same as Halsey's, but the main feature of our method is the use of more than three electrical filters. Four low-pass filters are used in our laboratory, and the total pitch extraction bandwidth is 80 to 380 cps.
An instrument has been developed as part of a formant vocoder for smoothly detecting voice pitch frequencies over a wide range. Speech signals from a conventional telephone are cross-modulated, low-pass filtered, frequency-inverted, and frequency-analyzed. The analyzed signals are discriminated separately, and their DC output voltages are compared in a simple diode adder to select the most significant voltage, which indicates the pitch frequency. The voltage-versus-frequency characteristic is nearly linear, and the response time is comparatively short. The temperature characteristic, however, is not yet satisfactory.
A new principle is described which permits not only extraction of pitch frequency but also extraction of formants or new parameters of speech information. This is achieved by discriminating the zero-crossings of a frequency-shifted signal. We describe here a pitch extractor with the following notable features: 1) minimum time delay, 2) sensitivity to weak signals, and 3) no need for the lower frequency components.
Usual pitch extractors are designed for use in vocoders and have a relatively large time constant. For the analysis or recognition of speech, however, it is considered necessary to convert pitch frequency into an analog voltage with a small time constant. A pitch extractor with a small time constant (10 ms) was designed and built: pitch is detected from three formant envelopes, and a discriminator converts it into a voltage after SSB modulation and amplitude clipping.
A new method of pitch extraction using a digital computer, which is faster than existing methods, is proposed. A "locally almost periodic function" is adopted as the mathematical model of the voiced speech wave, and the pitch frequency is defined as its "local periodicity". The existence or non-existence of pitch, i.e., of "local periodicity", and its transitions are determined by evaluating a newly defined measure of distance in the time domain which is invariant under amplitude and other transformations. Possible applications of this method include automatic speech recognition, computer-simulated vocoder systems, and general speech research. This work is part of the research carried out under the Laboratory's research project "LOGOS".
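The idea of deciding local periodicity from an amplitude-invariant distance in the time domain can be sketched as follows. The abstract does not define its distance measure, so the code substitutes the cosine distance between a frame and its time-shifted copy, which is unchanged when either frame is scaled by a positive factor. The function names, the threshold, and the test signal are hypothetical.

```python
import math


def cosine_distance(x, y):
    """1 - cos(angle between x and y); invariant to positive amplitude scaling."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        return 1.0  # treat silence as non-periodic
    return 1.0 - dot / (norm_x * norm_y)


def local_period(samples, start, frame_len, min_lag, max_lag, threshold=0.1):
    """Return (lag, distance) for the best lag, or (None, distance) if the
    smallest distance exceeds the threshold (no local periodicity found)."""
    frame = samples[start:start + frame_len]
    best_lag, best_distance = None, float("inf")
    for lag in range(min_lag, max_lag + 1):
        shifted = samples[start + lag:start + lag + frame_len]
        if len(shifted) < frame_len:
            break  # ran out of samples
        distance = cosine_distance(frame, shifted)
        if distance < best_distance:
            best_lag, best_distance = lag, distance
    if best_lag is None or best_distance > threshold:
        return None, best_distance
    return best_lag, best_distance


# Hypothetical test signal: period of 50 samples with a growing amplitude.
signal = [1.01 ** n * (math.sin(2 * math.pi * n / 50)
                       + 0.5 * math.sin(4 * math.pi * n / 50))
          for n in range(400)]
print(local_period(signal, 0, 100, 20, 80)[0])
```

The lag range must exclude multiples of the true period, since a shift by two periods matches equally well; restricting the search to a plausible pitch range is one simple way to handle this.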
With the ordinary pitch indicator it is impossible to find in the melodic curve the exact location corresponding to each spoken syllable. An improvement in this respect is achieved by a new pitch indicator which combines indications of pitch and of zero-crossing interval. The relation between voiced and voiceless sounds and the transitions of tone quality are indicated by this combination. In this system, the indications of both quantities are continuous and inertialess, and if undesired noise is sufficiently suppressed, very weak signals can be distinctly traced. This system may be regarded as a special type of visible speech that complements the sound spectrograph and can be used together with it.
Measurements were made of the time variation of the fundamental frequency of five isolated Japanese vowels spoken by two males and two females. This characteristic appears to be one of the important factors affecting the naturalness of speech sounds. From the measurements and the analysis of the data, it is found that the variation patterns are always convex, that the rising parts of the patterns do not differ significantly among speakers, and that the patterns vary markedly with speaking conditions, especially speaking effort. The spectra of the measured curves were determined. Supposing that the mean fundamental frequency is frequency-modulated by the components of these spectra, the spectrum of the resulting FM wave and the bandwidth covering 99% of the total spectral energy can be computed. The spectrum envelope can be approximated by a normal distribution curve, and the bandwidth is found to be strongly correlated with the mean fundamental frequency; its values are 20 c/s to 50 c/s for male voices and 60 c/s to 90 c/s for female voices.
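The 99% bandwidth computation can be illustrated for the simplest case. The study modulates with all components of the measured spectra; the sketch below assumes a single sinusoidal modulating component, for which the sideband amplitudes are Bessel functions of the modulation index. The function names and the numerical values in the example are hypothetical.

```python
import math


def bessel_j(order, x, steps=2000):
    """J_n(x) from the integral (1/pi) * int_0^pi cos(n t - x sin t) dt,
    evaluated with the trapezoidal rule."""
    h = math.pi / steps
    total = 0.5 * (1.0 + math.cos(order * math.pi))  # endpoint values
    for k in range(1, steps):
        t = k * h
        total += math.cos(order * t - x * math.sin(t))
    return total * h / math.pi


def sidebands_for_energy(beta, fraction=0.99):
    """Smallest N such that the carrier plus N sideband pairs hold at least
    `fraction` of the power of a single-tone FM wave with index beta."""
    power = bessel_j(0, beta) ** 2
    n = 0
    while power < fraction:
        n += 1
        power += 2.0 * bessel_j(n, beta) ** 2
    return n


def fm_bandwidth_hz(peak_deviation_hz, modulating_hz, fraction=0.99):
    """Bandwidth 2 N f_m containing `fraction` of the FM wave's power."""
    beta = peak_deviation_hz / modulating_hz
    return 2 * sidebands_for_energy(beta, fraction) * modulating_hz


# Hypothetical values: 10 c/s peak deviation of the fundamental frequency,
# varying at a rate of 2 c/s.
print(fm_bandwidth_hz(10.0, 2.0))
```

For a multi-component modulating signal, as in the study, the sideband structure is more complicated, and the spectrum would have to be computed from the modulated waveform itself rather than from a single modulation index.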
The pitch of the voice can be extracted more reliably and more easily from the vibration of the outer skin of the trachea than from speech sounds. To pick up this vibration, a microphone (MR 103) about 25 mm in diameter was used with a rubber adapter ring around it, which provides an air gap about 5 mm thick between the microphone and the skin and brings its circumference into contact with the skin. The waveform picked up by this microphone is much simpler and shows the pitch period clearly; moreover, it is little affected by changes in the resonance properties of the vocal tract or in the conditions of pronunciation. The pitch extracted by this method was shown to be superior to that extracted from the voiced sound, both by detection of extraction errors and by listening tests with a vocoder. This method is promising for basic studies of pitch in speech, and it will also be useful as a reference standard in attempts to improve pitch extractors that operate on speech sounds.