Playing musical instruments has recently become very popular in Japan, and a considerable number of learners practice keyed instruments such as the piano or organ. Training methods, however, need improving, and busy teachers have no time to inform learners of the detailed faults in their fingering. The development of a teaching machine that can help the teachers would therefore be extremely useful in music education. From this point of view, this research is concerned with building a computer-based system that gives feedback on learners' playing of keyed instruments so that they may easily recognize the characteristics of their fingering. One form of feedback is the display of the length, volume, and pitch of each tone on an X-Y plotter; the other concerns general characteristics. Consequently, the repertoire is essentially limited to etudes. There are three kinds of methods in the system (see Fig. 1-(a), (b)-A and (b)-B), and the characteristics of each are summarized in Table 1. In our research, method II-A was chosen for the following two reasons. Firstly, it was developed for learners who are unable to go to the computer center. Secondly, method II-B has no facility for the accurate recognition of tone length. As the first step of the research, this paper presents a system that recognizes a series of organ monotones and displays them on an X-Y plotter (see Fig. 5). In this case, neighboring tones are separated from each other (see Fig. 3). In Section 2, the characteristics of organ tones are described (see Fig. 2), and in Section 3, detailed procedures of the recognition are given. Section 4 deals with the characteristics of the learners' fingering and states that the standard deviation of the tone lengths over a short period is an effective factor for characterizing the fingering.
Moreover, we asked musicians to what extent a given deviation in fingering is acceptable to them, and from their answers we determined the thresholds (see Table 3).
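The evenness measure named above can be sketched as follows. This is an illustrative reading only: the window length and the tone-length data are assumptions, and the paper's actual short-period definition and its thresholds (Table 3) are not reproduced here.

```python
import statistics

def tone_length_deviation(tone_lengths, window=8):
    """Standard deviation of tone lengths over a short sliding window.

    tone_lengths: durations (in seconds) of successive recognized tones.
    The window size of 8 tones is an assumption for illustration.
    """
    return [statistics.stdev(tone_lengths[i:i + window])
            for i in range(len(tone_lengths) - window + 1)]

# A perfectly even passage yields zero deviation; uneven fingering raises it.
even = tone_length_deviation([0.25] * 10)
uneven = tone_length_deviation(
    [0.25, 0.30, 0.20, 0.28, 0.22, 0.25, 0.35, 0.18, 0.26, 0.24])
```

Comparing such deviation values against musician-derived thresholds would then classify a passage as acceptably even or not.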
This paper describes a speech analysis method that estimates linear predictor coefficients and formant frequencies and bandwidths using a portion of one pitch period of the speech waveform. The method is based on the fact that estimation errors, together with prediction errors, vanish when excitation-free segments are utilized (Eq. 6). In actual speech analysis, however, prediction errors may remain even when they are minimized (Fig. 1). Therefore, it is necessary to formulate estimation errors in order to evaluate the performance of the proposed method. Theoretical studies on estimation errors are carried out. Estimation errors in the linear predictor coefficients depend on the unknown component of the excitation (see Eqs. 7 through 11). Expected values of the estimation errors are derived under an assumption on the statistical properties of the excitation (see Eqs. 13 through 15). A similar expression for estimation errors in formant frequencies and bandwidths is derived in the same manner (see Eqs. 16 through 18). Finally, two kinds of error estimates expressed in terms of observed values are introduced as criteria for determining the analysis conditions of our method and for experimental comparison between the proposed method and the usual linear predictive analysis (Eq. 19). Simulation studies on these error estimates are performed using several kinds of synthetic speech, whose formant frequencies and bandwidths are given in Table 1. The relevance of these error estimates is shown both for our method and for the usual linear predictive analysis (Figs. 2 through 5 and Tables 2 and 3). The length of the analysis segment must be chosen properly when these error estimates are employed in the error evaluation of the usual linear predictive analysis of periodic speech (Eq. 20 and Fig. 6). The results of the simulation studies indicate both the adequacy and the validity of these error estimates.
Experimental studies have been carried out to evaluate the performance of our method on actual speech. The minimum values of these error estimates for our method are as small as one half to one eighth of those for the usual linear predictive analysis (Figs. 7 through 9 and Tables 4 and 5). In addition, the glottal source waveform estimated from the result of our method appears more plausible than that obtained from the result of the usual linear predictive analysis (Fig. 10). The results of these studies indicate that the proposed method yields more accurate parameter estimates than the usual linear predictive analysis, especially in the case of low-pitched speech.
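The central idea above can be illustrated with a minimal sketch, not the paper's exact formulation (its Eqs. 6 through 20 are not reproduced here): covariance-type linear prediction fitted over an excitation-free segment is exact, so the formant frequency and bandwidth follow directly from the roots of the predictor polynomial. All parameter values below are illustrative assumptions.

```python
import math
import numpy as np

def lpc_covariance(x, order):
    """Covariance-method linear prediction over a short segment.

    Fits x[n] ≈ sum_k a_k * x[n-k] by least squares over the whole segment;
    on an excitation-free segment the fit is exact and the prediction
    error vanishes.
    """
    X = np.array([[x[i - k] for k in range(1, order + 1)]
                  for i in range(order, len(x))])
    y = np.array(x[order:])
    a, *_ = np.linalg.lstsq(X, y, rcond=None)
    return a

def formants(a, fs):
    """Formant frequencies and bandwidths (Hz) from the roots of
    the predictor polynomial A(z) = 1 - sum_k a_k z^{-k}."""
    roots = np.roots(np.concatenate(([1.0], -a)))
    roots = roots[np.imag(roots) > 0]          # one root per conjugate pair
    freqs = np.angle(roots) * fs / (2.0 * math.pi)
    bws = -np.log(np.abs(roots)) * fs / math.pi
    return sorted(zip(freqs, bws))

# Synthetic one-formant "speech": a damped sinusoid excited by a single
# impulse, so every sample after the first two is excitation-free.
fs, f0, bw0 = 8000.0, 1000.0, 100.0
r, th = math.exp(-math.pi * bw0 / fs), 2.0 * math.pi * f0 / fs
x = [1.0, 2.0 * r * math.cos(th)]
for n in range(2, 60):
    x.append(2.0 * r * math.cos(th) * x[-1] - r * r * x[-2])
est = formants(lpc_covariance(x[2:], 2), fs)   # analyze excitation-free part
```

On this idealized signal the analysis recovers the formant frequency and bandwidth essentially exactly; on real speech, residual excitation within the segment is what the paper's error estimates are designed to quantify.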
The sound intensities in a rectangular room are derived theoretically using the image method. In this method, the image sources are divided into three groups, viz., sources along each axis, sources in each plane, and oblique directional sources, and the sound intensities for the respective image groups are calculated. In grouping the image sources, the parallax j, defined by Eqs. (28) and (29), is introduced. It is shown that the results coincide with those of wave theory, and a hybrid equation, Eq. (43), applicable to ordinary rooms, is presented. For practical application, sound decay curves in a cubic chamber and reverberation times in a highway tunnel are calculated.
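The image method itself can be sketched as follows. This is a plain energy summation over the three-dimensional image lattice with a uniform wall reflection coefficient beta, an assumption for illustration; it is not the paper's grouped (axial/planar/oblique) derivation, its parallax j, or its hybrid Eq. (43).

```python
import itertools
import math

def axis_images(s, L, order):
    """1-D image coordinates of a source at s between walls at 0 and L,
    each paired with its number of wall reflections along that axis."""
    out = []
    for m in range(-order, order + 1):
        out.append((s + 2 * m * L, abs(2 * m)))       # even-order images
        out.append((-s + 2 * m * L, abs(2 * m - 1)))  # odd-order images
    return out

def image_intensity(room, src, rcv, beta, order):
    """Steady-state intensity at rcv as a sum over the image lattice:
    each image contributes beta**(total reflections) / (4*pi*r^2)."""
    total = 0.0
    for (ix, nx), (iy, ny), (iz, nz) in itertools.product(
            *(axis_images(s, L, order) for s, L in zip(src, room))):
        r2 = (ix - rcv[0]) ** 2 + (iy - rcv[1]) ** 2 + (iz - rcv[2]) ** 2
        total += beta ** (nx + ny + nz) / (4.0 * math.pi * r2)
    return total

# Illustrative geometry: with beta = 0 only the direct path contributes.
room, src, rcv = (5.0, 4.0, 3.0), (1.0, 1.0, 1.0), (2.0, 2.0, 2.0)
direct = image_intensity(room, src, rcv, 0.0, 2)
total = image_intensity(room, src, rcv, 0.5, 2)
```

Grouping the lattice by how many of the three reflection counts are nonzero reproduces the paper's axial, planar, and oblique source families.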