In this paper, we propose a query-by-example spoken term detection (QbE-STD) method for detecting keywords in speech databases of zero-resource languages. The proposed method represents speech with phonetic posteriorgrams (PPGs) trained on multiple resource-rich languages and combines them into multilingual PPGs. Keywords are detected using dynamic time warping. We examined three ways of combining multiple languages: concatenation of PPGs (PPG_CONC), a combination of language resources to calculate a multilingual PPG (PPG_ALL), and multi-task training of PPGs using multiple languages (PPG_DIV). We conducted a QbE-STD experiment on Kaqchikel speech. The PPG-based methods showed better detection performance than a method based on a conventional speech feature (MFCC), and the use of multiple languages further improved detection.
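As an illustration of the detection step described in the abstract above, the following is a minimal sketch of subsequence dynamic time warping over posteriorgram frames. It is not the authors' implementation: the function names `frame_distance` and `subsequence_dtw`, the negative-log inner-product distance, the step pattern, and the toy three-class posteriors are all assumptions chosen for illustration.

```python
import math


def frame_distance(p, q, eps=1e-10):
    """Assumed local distance: negative log inner product of two posterior vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    return -math.log(max(dot, eps))


def subsequence_dtw(query, utterance):
    """Find the best match of `query` anywhere inside `utterance`.

    Both arguments are lists of posterior vectors (one per frame).
    Returns (accumulated cost divided by query length, end frame index).
    """
    n, m = len(query), len(utterance)
    inf = float("inf")
    # acc[i][j]: minimum accumulated cost of aligning query[0..i]
    # to a span of the utterance that ends at frame j.
    acc = [[inf] * m for _ in range(n)]
    for j in range(m):
        # The match may start at any utterance frame at no extra cost.
        acc[0][j] = frame_distance(query[0], utterance[j])
    for i in range(1, n):
        for j in range(m):
            best_prev = acc[i - 1][j]
            if j > 0:
                best_prev = min(best_prev, acc[i][j - 1], acc[i - 1][j - 1])
            acc[i][j] = frame_distance(query[i], utterance[j]) + best_prev
    # The match may also end at any utterance frame.
    end = min(range(m), key=lambda j: acc[n - 1][j])
    return acc[n - 1][end] / n, end


# Toy example with three hypothetical phone classes.
query = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
utterance = [
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
cost, end = subsequence_dtw(query, utterance)
print(cost, end)
```

In a detection setting, a keyword would be reported when the normalized cost falls below a threshold; the threshold itself would have to be tuned and is not specified here.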
In this study, we addressed the challenge of estimating the importance of texts in scene images. Research on text analysis in scene images has focused on detection and recognition; however, estimating the importance of text has not received much attention. We focused on the possibility that importance can be estimated from visual appearance. Therefore, we constructed datasets of scene images containing texts and assigned an importance to each text via subjective evaluation. Based on the subjective evaluation, we determined the image features that represent the importance of text contents and proposed an importance estimation model. The results of the evaluation experiment indicate that the proposed method estimates importance with higher accuracy than the existing method.
We conducted happiness surveys of around 26,000 respondents across Japan, except Okinawa, in five periods: Dec. 2019, Sep. 2020, Dec. 2020, Mar. 2021, and Jun. 2021. Because the COVID-19 outbreak began in Feb. 2020, the surveys are well suited to evaluating how the pandemic has affected daily life. In this paper, we report the influence of COVID-19 on the subjective well-being of Japanese people, evaluated from surveys conducted before and after the outbreak. We applied a dynamic regression model that describes the joint effects of individual and spatial factors to visualize the space-time behavior of Japanese subjective well-being. Specifically, we quantified the components of happiness driven by individual factors, such as age, gender, and income, and those driven by spatial factors at the prefectural level after controlling for the individual factors. Examining the changes over the five periods, we find that the COVID-19 outbreak in Japan has damaged the subjective well-being of young females most seriously and that serious damage persists, especially in the low-income group.
The quantity of information in the modern world has increased significantly and now exceeds the volume that humans can manage unaided. Although information is necessary for decision making, excessive information is not beneficial for proper decision making. Therefore, data mining conducted using machine learning and artificial intelligence (AI)-assisted decision-making systems are increasingly being used in our society. However, problems such as discriminatory decisions and the promulgation of injustice by AI have recently been exposed. In response, numerous countries and organizations have announced sets of AI principles based on the concept of human-centered AI that fosters human values. The principles call for understanding diversity, ensuring fairness, and eliminating discrimination in the use of AI. To implement these values in AI systems, a philosophical understanding of the structure of injustice in human knowledge production is essential. The problems of injustice and discrimination in knowledge production have recently been categorized as ``epistemic injustice'' in philosophy and epistemology, and the theories explaining these phenomena are becoming more sophisticated. This paper aims to contribute to the understanding of ``human-centered'' AI by connecting the philosophical concept of ``epistemic injustice'' to the discussion of AI ethical principles. It further points out that the issue of injustice and unfairness in AI use is not only a social–ethical concern but also an epistemic one.
With advances in digital technologies, the number of images we are subjected to every day has increased significantly. Predicting and recommending human subjective preferences for images is useful for selecting image data efficiently to avoid the unnecessary use of valuable storage space. In this study, we investigate the use of a machine learning model for estimating human preferences for images from spontaneous facial features extracted from video images of human faces while they are performing a natural preference evaluation task. We use two image categories and compare the results between categories. We also conduct an experiment to assess the performance of human raters in predicting the preferences of others from facial videos. As a standard to compare predictive performance from facial expressions, we also test prediction from high-level image features by training a deep learning model using the obtained experimental data. The results show that the spontaneous facial features produce prediction performance comparable with, and for lunch box images, marginally better than, the image features specifically trained for our dataset, and clearly outperform the human raters. We further examine which facial expression features are important for prediction and show that the important facial features differ between image categories. Our results show that facial expressions can be used to predict the preference for images, to some extent, although we need to be careful when generalizing the learned model to other image categories. Our machine learning approach also provides insights into the differences in the cognitive mechanisms used for preference evaluation for different image categories.
Student feedback is useful for teachers to improve their teaching. Although it is common to receive student ratings in universities, the low frequency of such feedback reduces the utility of the information. Using methods that do not rely on ratings can increase the frequency of feedback. We investigated whether the body posture of students can be used as an indicator of classroom engagement. In this paper, we estimated body posture from videos taken of students in the audience during a presentation and classified the scenes based on the postural similarity. The obtained clusters showed that body posture changed over time and did not return to the original state. A comparison between clusters at the beginning and end of the presentation showed that the standard deviation of head direction becomes large at the end, suggesting that body posture might reflect the degree of distraction. We discussed how body posture information facilitates teachers' reflection.
In The Wealth of Nations, Adam Smith introduced the metaphor of an invisible hand, which alluded to market equilibration. Leon Walras and later Gustav Cassel conjectured the existence of a general equilibrium. Their ``proofs'' were based on counting variables and equations, concluding that a general equilibrium of all markets exists. In the 1930s, Abraham Wald proved the existence of a general equilibrium in Cassel's model. Later, Gerard Debreu proved the existence of a generalized Walrasian equilibrium. Mathematical economists tried in vain to prove the uniqueness and stability of the equilibrium in General Equilibrium Theory (GET). The theory's lack of realism is obvious. Walras acknowledged that GET is metaphysical because it lacks the dimensions of time and space and is thus pre-institutional and lacks trading processes. Mathematicians such as Donald Saari later introduced a dynamic version of GET, while some mathematical economists incorporated space in a dynamic GET. However, a realistic spatial version of GET requires a theory with multiple time scales, because the dynamic processes relevant in markets for goods are much faster than the slow processes of institutional change and network dynamics. We thus reformulate GET as a synergetic theory of fast and slow processes of equilibration and phase transitions.
Mark Granovetter proposed the threshold model of social behavior, in which an individual's acceptance of an action is determined by the proportion of the population that has already accepted it. In this model, an individual embraces an idea once a sufficient number of other people have embraced it. In this paper, we propose a mathematically accurate population dynamics model, based on Granovetter's idea, for the spread of information in a population. Individual threshold values for accepting a piece of information are distributed across the population, ranging from low (accepts information easily) to high (hardly accepts it). Mathematical analysis of our model shows that critical values exist for the initial knower population size and for the mean and variance of the threshold values. Across these critical values, the proportion of the population that ends up knowing the information changes drastically, depending on the features of the population that govern the spread of information.
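The sensitivity to the threshold distribution mentioned in the abstract above can be illustrated with Granovetter's classic discrete example. The sketch below is not the population dynamics model proposed in that paper; it is a simple count-based iteration, and the function name `final_knowers` and the convention that initial knowers are individuals with threshold 0 are assumptions made for illustration.

```python
def final_knowers(thresholds):
    """Iterate a count-based threshold model to its fixed point.

    thresholds[k] is the number of knowers that individual k must
    observe before accepting the information. Individuals with
    threshold 0 act as the initial knowers (an assumption of this sketch).
    Returns the number of individuals who end up knowing the information.
    """
    knowers = 0
    while True:
        # Everyone whose threshold is met by the current count accepts.
        updated = sum(1 for t in thresholds if t <= knowers)
        if updated == knowers:
            return knowers
        knowers = updated


# Thresholds 0, 1, 2, ..., 99: each acceptance triggers the next one.
uniform = list(range(100))
print(final_knowers(uniform))    # 100

# Raising a single threshold from 1 to 2 stops the cascade immediately.
perturbed = [0, 2] + list(range(2, 100))
print(final_knowers(perturbed))  # 1
```

A very small change in the threshold distribution moves the outcome from the whole population to a single knower, which is the kind of drastic difference that critical values describe.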
Thus far, we have experienced three artificial intelligence (AI) booms. In the third one, we succeeded in developing AI that partially surpassed human capabilities. However, we are yet to develop AI that, like humans, can perform a series of cognitive processes. Consciousness built into devices is called machine consciousness. Related research has been conducted from two perspectives: studying machine consciousness as a tool to elucidate human consciousness and achieving the technological goal of furthering AI research with conscious AI. Herein, we survey the research conducted on machine consciousness from the second perspective. For AI to attain machine consciousness, its implementation must be evaluated. Therefore, we only surveyed attempts to implement consciousness as systems on devices. We collected research results in chronological order and found no breakthroughs that could deliver machine consciousness soon. Moreover, there is no method to evaluate whether an implemented machine consciousness system possesses consciousness, thus making it difficult to confirm the certainty of the implementation. This field of research is a new frontier. It is an exciting field with many discoveries expected in the future.
Lin and Yu verified maximality of some Seidel matrices in 2020 by calculating clique numbers with a computer. In this paper, we show that maximality of these matrices follows by investigating their spectra, without using a computer.