Three-beat sounds are widely used in cheering. Five- or seven-syllable phrases are used in traditional Japanese and Chinese poems. Odd numbers of short sounds seem to relate to an interesting affective phenomenon. Our psychological experiment, conducted with various numbers of beats, shows that a three-beat sound is the most exciting sound for people. We propose a neural-network model that relates to this phenomenon. It is realized by combining neural networks with reciprocal-inhibition and cascade structures. We additionally take into account the effect of attention fluctuation. We simulate the neural-network operation and show that the degree of excitement of the model is highest for three beats. This result is consistent with our experimental result. Under the condition that the attention fluctuation is suppressed, the degree of excitement in the simulation is higher for odd numbers of beats than for even numbers of beats. This may relate to the basis of the five- or seven-syllable phrases used in traditional Japanese and Chinese poems.
This study proposes a method for the automatic design of personalized playlists that place a listener in a positive mood (uplifting or relaxing feelings) based on the individual’s impressions of, and pleasurable feelings toward, audio tracks. This study investigated the changes in a listener’s mood before and after listening to personal impression playlists, and to personal impression and personal or common pleasure playlists, using a psychological scale. The results show that the playlists based on personal or common pleasure can shift a person into a positive mood while increasing the person’s friendly emotion.
This study investigated the influence of multiple surface features on the perceived surface softness during rubbing motion. Participants rubbed and ranked 13 types of 3D-printed rigid specimens with different macroscopic shapes and microscopic grains. A regression analysis revealed that the amplitude of the surface sinusoidal shape and the diameter of microscopic grains positively and negatively affected the signal-to-noise ratio calculated by the softness ranks, respectively. Smooth surfaces featuring a sinusoidal macroscopic shape with an amplitude of 0.5 mm felt softer than flat, smooth surfaces. By contrast, surfaces featuring grains with a diameter of 1 mm felt harder than the flat, smooth surfaces. These findings can help design soft-feeling product surfaces.
Our research aims to recognize the leaf-opening stage of tea leaves using deep-learning image analysis. Because the quality of tea depends on the growth stage, it is important to predict the leaf-opening period. The relative amounts of amino acids and theanine have a significant effect on the quality of tea. High-quality plucked tea leaves contain the maximum level of theanine. However, over time and under sunlight, theanine changes to catechin, an astringent ingredient. This means that the content of the “Umami” ingredients is reduced. The hypothesis of this study is that changes in the Umami level over time can be predicted by image analysis. Image analysis is performed using Continuous Wavelet Decomposition (CWD) and deep-learning methods (DCGAN, PCA, SAE, and LSTM). We combine these methods in a certain order for the analysis. The advantage of combining the five methods is that “fuzzy” tea photo images, which are difficult to classify accurately, can be graded better than with a single method such as spectrum analysis or AKAZE. By developing an iPhone application that feeds back the analysis results and predicts the optimal picking time, this work can contribute to tea quality prediction on large-scale tea farms.
Appropriate levels of arousal potential evoke hedonic responses. Yanagisawa, the second author, previously proposed a mathematical model of emotional arousal caused by novelty, uncertainty, and complexity using information-theoretic free energy. In this study, we formulate the information processing flow in the brain based on the dynamics of free energy to represent both emotional arousal and the pleasure caused by the reduction of free energy. We hypothesized that free-energy reduction is proportional to a positive emotion such as pleasure and that valence is an inverted U-shaped function of the free-energy state, which represents arousal potential. We verified the predictions of the proposed model using experimental data from a previous study of musical pleasure. Using a statistical learning model of music, we calculated the free energy of perceiving the progression of harmonies as sensory stimuli. The results partially support our hypotheses and reveal two effects: displeasure due to large arousal potential and pleasure due to large free-energy reduction.
Currently, there is a growing interest in rehabilitation and health care in Japan. Against this background, treatment with electrical stimulation that improves ADL (activities of daily living) and QOL (quality of life) has been attracting attention. The home low-frequency treatment devices in use today are based on a combination of signals that relieve pain and signals that reduce the tingling sensation caused by low frequencies. However, the tingling sensation peculiar to low frequency remains. Therefore, in this study, we developed a device to increase the number of stimulus perception points using phantom sensation by electrical stimulation, to realize pseudo-multipoint stimulation with a small number of stimulus points, and to generate a masking effect to reduce unpleasant prickling sensation. In addition, the frequency, waveform, and amplitude that can be generated by existing methods are limited and the degree of freedom is low, so it is important to develop devices that can safely present electrical stimulation with a high degree of freedom. For electrical stimulation, pulse signals generated by the Arduino are boosted using an operational amplifier and a current mirror circuit to ensure sufficient current to flow to the human body, which has high impedance, and are applied from two electrodes. The placement of the electrodes and the intensity of the stimulation should be determined through prior experiments.
Robots designed to support the transportation of goods are expected to play an active role in addressing social problems caused by the declining birthrate and aging population. Conventional mobile robots that support cargo transportation are generally three- or four-wheeled. As a result, most of them are large and difficult to maneuver, and are primarily used in factories and warehouses where the environment is stable. However, they have not been put to practical use in situations where the environment changes rapidly, such as in human living environments. In this study, we demonstrate that an inverted two-wheeled vehicle, which has the advantages of being relatively lightweight, highly maneuverable, and having a small footprint, can move agilely in a complex and confined space by applying human manipulation force, and that it can move in cooperation with humans through force control. A force sensor was mounted on top of an inverted two-wheeled vehicle, and the operator's manipulation force was applied to the vehicle. In response to the applied manipulation force, force control is performed using admittance control with a variable viscosity coefficient, and the inverted two-wheeled vehicle moves in a coordinated manner to minimize the manipulation force. With the introduction of this system, the operator can transport a load with a minimum force, even in a complex environment.
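The admittance control described in this abstract can be illustrated with a minimal sketch. It assumes the common first-order admittance law M dv/dt + D v = F, where F is the measured manipulation force and v is the velocity command passed to the vehicle. The abstract does not state how the viscosity coefficient varies, so the rule in `variable_damping` (damping decreases as the applied force grows) and all numerical values are hypothetical, and the balance controller of the inverted two-wheeled vehicle is not modeled.

```python
def variable_damping(force, d_max=20.0, d_min=4.0, force_scale=10.0):
    """Hypothetical rule: viscosity falls from d_max to d_min as |force| grows."""
    blend = min(abs(force) / force_scale, 1.0)
    return d_max - (d_max - d_min) * blend


def admittance_step(velocity, force, dt, mass=5.0):
    """One explicit Euler step of M * dv/dt + D(force) * v = force."""
    damping = variable_damping(force)
    acceleration = (force - damping * velocity) / mass
    return velocity + acceleration * dt


def simulate(forces, dt=0.01):
    """Return the velocity command after each force sample, starting at rest."""
    velocity = 0.0
    history = []
    for force in forces:
        velocity = admittance_step(velocity, force, dt)
        history.append(velocity)
    return history
```

With a constant force, the velocity command settles at force divided by damping, so a lower viscosity under a firm push lets the operator move the vehicle with less effort, while a higher viscosity under a light touch keeps it steady.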
The key parameters in the brain-computer interface (BCI) are input speed, accuracy, ease of use, and the number of inputs. Steady-state visual evoked potential (SSVEP)-BCIs, which are excellent in the first three categories, have problems with the number of inputs. We designed a 50-selective SSVEP-BCI to increase the number of inputs, with the aim of realizing Japanese and PC keyboard input in the future. To increase the number of inputs, we improved the frequency resolution. Changing the frequency resolution of the stimuli from 0.2 to 0.1 Hz doubles the number of stimulus frequencies available, and hence the number of inputs. We conducted canonical correlation analysis on the subject’s raw and pseudo-signal data. The noise is extremely large, and the conventional analysis method, which outputs the maximum value of the canonical correlation vector, has a low correct response rate. Thus, we applied a frequency band restriction that discriminates SSVEP components with a threshold by frequency. We also introduced a majority voting algorithm to eliminate the nontypeable data. Consequently, the average correct response rate was 55.11%, and the maximum was 79.53%; the average information transfer rate was 28.05 bits/min, and the maximum was 45.16 bits/min. These experimental results show that the number of inputs can be increased by improving the frequency resolution.
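An information transfer rate such as the one reported in this abstract is commonly computed with the Wolpaw formula, B = log2(N) + P log2(P) + (1 - P) log2((1 - P) / (N - 1)) bits per selection. The following is a minimal sketch under that assumption. The abstract does not state which definition or selection time was used, so `seconds_per_selection` is a hypothetical parameter and the sketch is not expected to reproduce the reported values.

```python
import math


def bits_per_selection(n_targets, accuracy):
    """Wolpaw information per selection for N targets and accuracy P."""
    if n_targets < 2:
        raise ValueError("at least two targets are required")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    bits = math.log2(n_targets)
    if accuracy > 0.0:  # the term P * log2(P) vanishes as P approaches 0
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:  # the error term vanishes as P approaches 1
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_targets - 1))
    return bits


def itr_bits_per_min(n_targets, accuracy, seconds_per_selection):
    """Scale bits per selection to bits per minute."""
    return bits_per_selection(n_targets, accuracy) * 60.0 / seconds_per_selection
```

At perfect accuracy, doubling the number of targets adds exactly one bit per selection, which is why increasing the number of inputs is attractive as long as accuracy and selection time do not degrade too much.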
Early diagnosis is important for the treatment of dementia, but people with dementia are often reluctant to see a doctor. In our laboratory, we are developing an inexpensive and easy screening tool for cognitive function using the P300-based Spelling Brain-Computer Interface (Spelling-BCI). By creating an attention-focused plot using discrimination scores and comparing the results, we aim to confirm the differences among patients with dementia and the value of the plot for dementia screening. The BCI estimates characters using discrimination scores calculated from the P300 component for rows and columns. Therefore, we developed an attention-focused plot based on the score for each character. As a result, the plot appeared whiter for normal controls (NC) and became blacker for mild cognitive impairment (MCI) and Alzheimer's disease (AD). Of the total number of characters, 83.3% were estimated for NC, 63.0% for MCI, and 33.3% for AD. In addition, participants with AD tended to respond to rows and columns other than those containing the target character. These results are thought to be due to a decrease in attention concentration caused by a decrease in cognitive function. These findings suggest that the differences among patients with dementia can be confirmed from the attention-focused plot, which is useful for dementia screening.
In this study, we hypothesized that there is a relationship between the success rate of motor imagery and coherence, and we conducted experiments and analyzed the results. In the experiment, the motor imagery part was performed 40 times for two motions after training 60 times for three motions. In the motor imagery part, the electroencephalogram (EEG) was measured before and during motor imagery. Coherence analysis was performed on each combination of measuring electrodes, and the coherence before and during motor imagery was compared; a straight line was drawn between the electrode pairs whose coherence was significantly reduced. When comparing the strong and weak groups, with success rates of over 50% and under 50%, respectively, the coherence of the entire brain of the strong group was confirmed to decrease because of motor imagery. Additionally, when the relationship between the success rate of motor imagery and the number of electrodes with significantly reduced coherence was plotted on a scatter plot, the graph was upward-sloping when imagining the right arm, and a strong positive correlation was observed. When the left arm was imagined, the graph was also upward-sloping, and a positive correlation was observed. This indicates a relationship between the success of motor imagery and the decrease in coherence of the entire brain during motor imagery.
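The coherence between two electrodes referred to in this abstract can be illustrated with a minimal sketch of magnitude-squared coherence, C_xy(f) = |S_xy(f)|^2 / (S_xx(f) S_yy(f)), where the spectra are averaged over segments. The abstract does not specify the estimator, segment length, or frequency band, so the non-overlapping, unwindowed segments and the single frequency bin used here are assumptions for illustration only.

```python
import cmath
import math
import random


def dft_bin(segment, k):
    """Discrete Fourier transform of one segment at frequency bin k."""
    n = len(segment)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(segment))


def coherence_at_bin(x, y, seg_len, k):
    """Magnitude-squared coherence at bin k, averaged over non-overlapping segments."""
    sxx = 0.0
    syy = 0.0
    sxy = 0j
    for start in range(0, min(len(x), len(y)) - seg_len + 1, seg_len):
        fx = dft_bin(x[start:start + seg_len], k)
        fy = dft_bin(y[start:start + seg_len], k)
        sxx += abs(fx) ** 2
        syy += abs(fy) ** 2
        sxy += fx * fy.conjugate()
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return abs(sxy) ** 2 / (sxx * syy)


def demo(seed=0, seg_len=32, n_segments=64, k=4):
    """Compare a linearly related channel pair with an independent pair."""
    rng = random.Random(seed)
    n = seg_len * n_segments
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    related = [2.0 * v for v in x]
    independent = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return coherence_at_bin(x, related, seg_len, k), coherence_at_bin(x, independent, seg_len, k)
```

Averaging over several segments is essential: with a single segment the estimate equals 1 for any pair of signals. A comparison of values before and during motor imagery, as in the abstract, would apply the same estimate to the two recording periods.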
Although verbal information is adequate to recognize high-level psychological states, it is not compatible with passive and continuous monitoring systems. In this work, emotion-reacting wear using a passive and continuous emotion recognition system operating in real time is proposed. Specifically, a basic design of a wearable system, whose core function is to estimate emotions by analyzing facial expressions obtained from facial images, is proposed. Based on the “retro-futurism” concept, 3D prototyping is used to determine detailed fashion design components. The design and implementation of hardware and software modules that provide audiovisual stimuli based on emotions are described, and a prototype is developed. An extended collar integrated with a digital camera module is made of a transparent polycarbonate material. In this way, facial image capturing, which can be used for facial expression analysis, can be achieved. In the emotion-reacting wear, methods for audiovisual stimuli based on an illuminating pattern generator and a search keyword generator are also discussed.
Kawaii is a unique aspect of Japanese culture that attracts attention around the world. In this study, we focused on pink as a typical kawaii color. As in our previous study in 2020, we selected four pink colors and used them in our questionnaire to collect data about the most kawaii and most favorite pink colors as well as behavior in using pink products, including clothing and makeup. We compared the questionnaire results between 2020 and 2021 to clarify similarities and differences between these two years. As a result, we identified the pink color that has a constant trend over time and the one that tends to be influenced by the annual trend. The results suggest that different pink colors may give different impressions according to the annual trend, a finding that can contribute to the fashion industry.
Although 2D computer pattern making and 3D virtual prototyping are skills regarded as essential for the next generation of fashion experts, a concrete educational method for 3D virtual prototyping has yet to be established because of its high technical hurdle. This paper sought to develop an effective and creative educational program by examining the learning challenges surrounding 2D computer pattern making and 3D virtual prototyping based on a quantitative analysis of questionnaire surveys. Results showed the importance of scientific factors underlying garment design, which consist of the mathematical aspects of garment geometry and physical aspects of materials. Educators aiming to construct effective and creative educational programs for fashion talents must emphasize scientific backgrounds in apparel design. The key factor in these educational programs that would nurture creative talents in the fashion industry and academics is an intuitive and experimental method for teaching scientific factors fundamental to apparel design.
With the explosive growth of online discussions nowadays, fostering interesting and satisfying group discussions for all group members has become challenging. In this work, we particularly seek to address the issue of online group formation where diverse participants with various topic interest levels gather and carry on open-ended synchronous discussions in small groups. In these groups, members often encounter various difficulties, especially when their degree of interest in the discussed topic decreases drastically. Our proposed method is a boids-model-inspired algorithm that captures group discussion dynamics in terms of the evolution of discussed topics over time and variations in group members’ degree of interest in the discussed topics. Discussion topics are modeled as multidimensional vectors whose dimensions correspond to factors that are associated with group members’ interest vectors. In this paper, we present the proposed method and discuss its potential for achieving dynamic detection of left-out members. We also present our next steps towards investigating the meaningfulness of our approach through more complex simulations.
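The idea of topic vectors, interest vectors, and left-out members can be illustrated with a minimal sketch. It is not the algorithm of the abstract, whose rules are not given. It assumes that a member's degree of interest is the cosine similarity between the topic vector and that member's interest vector, that the topic drifts toward the mean interest of the current speakers (a cohesion-style step loosely borrowed from boids), and that a member is left out when the similarity falls below a hypothetical threshold.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def drift_topic(topic, speaker_interests, rate=0.5):
    """Cohesion-style step: move the topic toward the speakers' mean interest."""
    count = len(speaker_interests)
    centroid = [sum(vec[d] for vec in speaker_interests) / count for d in range(len(topic))]
    return [t + rate * (c - t) for t, c in zip(topic, centroid)]


def left_out(topic, interests, threshold=0.3):
    """Names of members whose interest in the current topic is below threshold."""
    return sorted(name for name, vec in interests.items() if cosine(topic, vec) < threshold)
```

For example, with interests `{"a": [1, 0, 0], "b": [0.9, 0.1, 0], "c": [0, 0, 1]}` and the topic `[1, 0, 0]`, only member `c` is flagged; if the topic then drifts fully toward `c`'s interest, members `a` and `b` are flagged instead.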
This study aimed to solve the problems that speakers experience in group discussions that take place in learning situations. It examined the relationship between factors such as frequency of speaking out, direction of conversation, and length of silence and the speakers' evaluations. Three group discussions were analyzed, and speakers were asked to evaluate the conversations. The results showed that (1) the frequency of speaking out and the direction of the conversation can be combined to more accurately determine how lively the conversation is, and (2) the number of times the conversation goes silent roughly takes into account the turn-taking between the speakers and can be used together with other elements, such as how much each speaker talks and the direction of the conversation, to aid in predicting the balance of opportunity to speak out.
In this paper, we studied the influence of the combination of linear velocity and rotational velocity on vection and visually induced motion sickness. The experiment was performed with subjects wearing a head-mounted display. While the stimulus videos were shown, the strength of vection, the time during which vection was experienced, and the level of motion sickness were evaluated by the subjects on a 5-level scale. We found that as linear velocity increases, vection induced by rotational motion decreases. Furthermore, in the presence of rotational motion, the sense of vection experienced was greater as the rotational velocity increased at low linear velocities, accompanied by a longer duration of vection and a higher level of motion sickness. In contrast, it is possible to make the level of motion sickness the same as that for a video without rotational motion by limiting the rotational velocity to 20°/s, and it is possible to suppress motion sickness to an extent without impairing immersion by limiting rotational velocity to within a certain range.
Today, when even people lacking experience in keeping animals have increased opportunities to interact with animals, there is a need to lay a foundation for better relationships with animals. Accordingly, to improve the mutual relationship that exists between dogs and humans, this study conducted an experiment on distinguishing dog emotions, using visual information obtained with an eye tracker. It was found that a high rate of correct answers (80% or higher) was obtained for all four questions asked, and that subjects watched facial expressions for a longer duration than the tail and looked at facial expressions more frequently. The frequency of gazing away from the face and at the ears or tail was low, and the duration was short. However, as the ears and tail remain within the field of vision, it is thought that they are taken to be auxiliary elements for distinguishing emotions. From the results of a questionnaire concerning dogs, little difference was found in line of sight and rate of correct answers attributable to degree of interest in dogs and differences in experience.
This paper applies a paint-based graphical user interface (GUI), previously used in an impression-based multimedia retrieval system, to impression-based picture retrieval. Because this GUI enables users to specify the weights of impression words, a weight conversion is introduced. This paper shows the possibility of invoking picture search engines by using this conversion.
Several educational institutions have opted to implement distance learning, such as online classes, because of COVID-19. However, students turn off their cameras for privacy reasons when they are online. Nonverbal communication is difficult when the camera is turned off; however, it can be conducted if avatars are used as substitutes. It is crucial to combine multiple nonverbal cues for nonverbal communication. In this study, we attempted to communicate drowsiness using only the eye closure rate, and we asked the subjects to observe its expression on the avatars to verify whether it was communicated. The results indicated that drowsiness could be transmitted using a single nonverbal cue. Multiple nonverbal cues should be combined for more accurate transmission. Furthermore, the approach must be verified in an online situation.
In Japan, there has been a recent increase in the incidence of mental illness. A related designation is the “highly sensitive person (HSP)”. HSP is not a diagnosis of mental illness but refers to individual temperament. However, this cluster of traits shares characteristics with both attention-deficit/hyperactivity disorder and generalized anxiety disorder. The central feature of the HSP is a high level of empathy. HSP status is evaluated by self-report on a psychological questionnaire, but scores on such measures can be inaccurate because they depend on the self-awareness of the test-taker. Therefore, in this study, empathy was evaluated through the measurement of emotional contagion and mirror system activity using an electroencephalograph. The results were compared to participants’ scores on the Highly Sensitive Person Scale (HSPS). We found that participants with an HSPS score of 100 or higher showed event-related desynchronization (ERD) of 50% or higher, indicative of mirror system activity. In addition, participants with an HSPS score of 100 or higher exhibited lower alpha wave band power values when presented with an image of a happy face. Since alpha waves are associated with relaxed states of non-arousal, it can be inferred that the happy face induced happy feelings, increasing arousal and reducing alpha rhythms. Thus, it was found that the higher the HSPS score, the greater the level of mirror system activity and emotional contagion.
The latent value of food has a great impact on customers’ feelings when purchasing and eating food. Our previous study revealed that the latent value of curry roux for home use induces an empathetic bond between people. When eating together, people need to communicate during the meal, and complex operations are required. In a recent study that investigated eating behavior and compared eating alone with eating together, the author explains that the purpose of the gathering movement is to reduce uneasiness when eating together. We previously revealed that stirring and gathering movements (hereinafter referred to as Movements) induce comfort and an empathetic bond between people when eating curry together. To examine the universality of this effect, we experimented on spaghetti, another dish requiring Movements. We conducted paired experiments (N=12) in two different eating styles: with and without Movements. Comfort was measured using biometric information (electrocardiogram) and Kansei evaluation (the Curve Drawing Method). Fork synchronization was also measured as an indicator of empathy. There were no differences in HF, LF/HF ratio, surface areas of the Curve Drawings, or rate of fork synchronization between eating with Movements and eating without Movements. Movements when eating spaghetti did not have an impact on comfort and empathy, in contrast to Movements with curry. The interesting effect caused by Movements with curry may partially explain why it is eaten repeatedly as a national food.
Food mileage simply means "total amount of food transported × distance". Japan is now facing a food mileage problem because food mileage in the country is significantly higher than that of other countries. In this article, we explain the reason behind the high food mileage in Japan and how it can be reduced. In addition, we explain the kinds of efforts that are being made to distribute domestic foods throughout Japan without relying solely on imported products. We looked at Japanese food from several perspectives and considered the results of the literature. Finally, we hope to help change the consumer mindset from cost-oriented consumption to fashionable consumption.
In this study, we surveyed consumer evaluations of the taste and functionality of a product containing the tomato Shonan Pomoron, which is rich in lycopene, from Kanagawa Prefecture in Japan. This is an initiative to register foods with functional claims (kinou sei hyoji shokuhin). The purpose is to use the insights gained into consumer sensitivity and behavior to explore future possibilities for the labeling and display of products with health benefits. The investigation indicates that the perceived effect of lycopene on cholesterol differs depending on gender, and we propose changing the nutritional display depending on the target market. We believe that examining the labeling method is an important means of improving the welfare of consumers.
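The ERD percentage used in this abstract can be illustrated with a minimal sketch. It assumes ERD is expressed as the percentage decrease in band power during the event relative to a reference period, (R - A) / R × 100, so that a positive value means desynchronization; some texts use the opposite sign. The band-pass filtering that would precede the power computation is omitted, and the function names are hypothetical.

```python
def mean_power(samples):
    """Mean squared amplitude of an already band-pass-filtered signal."""
    if not samples:
        raise ValueError("samples must not be empty")
    return sum(x * x for x in samples) / len(samples)


def erd_percent(reference_power, event_power):
    """Percentage decrease in band power from the reference period to the event."""
    if reference_power <= 0.0:
        raise ValueError("reference power must be positive")
    return (reference_power - event_power) / reference_power * 100.0
```

Under this convention, a drop in band power from 10 to 5 (arbitrary units) corresponds to an ERD of 50%, the threshold mentioned in the abstract.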
DX stands for digital experience, meaning a digitally structured experience. Recently, the use of DX has become particularly widespread. To improve the richness of DX, we consider visual-olfactory integration in perception by adding olfactory stimuli to DX. We formulate a Bayesian perceptual model to explain DX richness. The model predicts that olfactory stimulation and the degree of activeness in the user’s manipulation of an avatar enhance both the sense of presence and the sense of existence as components of richness. To verify the model prediction, we conducted an experiment to obtain participants’ responses regarding DX richness for all combinations of olfactory presentation and kinds of activeness of controller manipulation in a virtual reality environment. The experimental results show significant effects of both olfactory presentation and activeness on both the sense of presence and the sense of existence, supporting the model predictions. The results suggest that olfactory stimulation enhances the richness of DX.
Emotion modeling has been a significant task in engineering design for optimizing users’ feedback. Psychology studies have established an explanatory relation between an agent’s resulting emotion and the fluency of perceived stimuli. Friston proposed the free-energy principle, which models the human perception process in a Bayesian form and makes it possible to apply emotion models in engineering. Joffily further suggested that free-energy reduction is encouraged by the positive emotion generated. This study proposes a hierarchical Bayesian model that mathematically explains emotion feedback in the perception process. To support the model, our study further links the free-energy paradigm with fluency and informatics. We suggest that successful category recognition causes free-energy reduction and thus positive emotional valence. The informatics approach based on efficient encoding provides a practical simulation of the human perception process that helps to generate proper visual stimuli for experimental evidence. Two types of stimuli are introduced as evaluation material and compared with their non-category transformations, which share the same features. The results show that category recognition can reduce the sense of novelty as part of free energy. The emotional valence is positively correlated with the novelty reduced through recognition, where a moderate level of novelty reduction can be more effective. From the perspective of engineering, design features promoting category recognition can contribute to the optimization of users’ emotion feedback.
We developed a method for three-dimensional (3D) garment modeling with sleeves using the contours of garment images, with the aim of applying it to garment patternmaking. We made models of a jacket bodice and sleeve by deforming each reference 3D model using the contours of jacket images from front and side views. After adjusting the positions of the 3D models of the bodice and sleeve, we superimposed the models and then created a jacket model with a sleeve. We obtained the armhole from the intersection between the surfaces of the bodice and sleeve models. We successfully made a 3D garment model and its armhole using the proposed method. We also obtained bodice and sleeve patterns that reflected the armhole. The simulated patterns exhibited a curved shape along the arm.
Communication robots such as partner or pet robots are machines that use artificial intelligence to communicate with humans. These robots need to behave naturally and convey a friendly impression to be a healing companion. Human imperfections such as making poor judgements, forgetting, and dropping things are at the heart of our human identity. Hence, to convey a more friendly impression, robots should conceivably display some believable imperfections. Therefore, this study aims to develop robots that make mistakes to convey a humanlike or friendly impression. Furthermore, this paper implements an emotional-state-driven decision model to reduce the bad impression that deliberate mistakes might convey, making it possible for human partners to relate robots’ mistakes to their emotional state. This is expected to let humans interpret the causes of such mistakes and tolerate them by associating robots’ mistakes with a certain form of humanlike behavior. Moreover, this study developed an interactive environment in which humans and robots play a simple volleyball game to investigate the meaningfulness of our approach. Thereafter, an experimental evaluation was conducted in which participants interacted with two types of robots, perfect and imperfect. The experimental findings show that the imperfect robot displays some mistakes but conveys a good impression, demonstrating the potential of the proposed approach toward achieving stronger long-term social interaction between humans and robots.
People’s affective states when watching sports videos can be related to their physiological responses. This study proposes a method for extracting sports video highlights from viewers’ physiological data using fuzzy logic. Fuzzy logic is a multivalued logic that is similar to human thinking and interpretation, and it has been widely applied to emotion recognition. An experimental evaluation of the proposed method was conducted in a soccer game viewing context. Viewers’ heart rate and galvanic skin response while watching a soccer game were collected and processed following fuzzy rules. Viewers were also asked to indicate time intervals when they felt excited during the game. To improve the accuracy, we used a genetic algorithm to optimize the fuzzy rules. A comparison of the results of our fuzzy-logic-based approach with the viewers’ labels showed the effectiveness of the proposed method.
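The processing of heart rate and galvanic skin response with fuzzy rules can be illustrated with a minimal sketch. The membership functions, the four rules, the weighted-average defuzzification, and the highlight threshold are all hypothetical, since the abstract does not give them. In the study these are the kind of parameters that the genetic algorithm would tune.

```python
def ramp_up(value, low, high):
    """Membership in 'high': 0 at or below low, 1 at or above high, linear between."""
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)


def excitement(hr_change, gsr_change):
    """Fuzzy excitement score in [0, 1] from changes relative to a resting baseline."""
    hr_high = ramp_up(hr_change, 0.0, 20.0)   # beats per minute above baseline
    gsr_high = ramp_up(gsr_change, 0.0, 2.0)  # microsiemens above baseline
    hr_low = 1.0 - hr_high
    gsr_low = 1.0 - gsr_high
    # Each rule: (firing strength using min as fuzzy AND, output excitement level).
    rules = [
        (min(hr_high, gsr_high), 1.0),
        (min(hr_high, gsr_low), 0.5),
        (min(hr_low, gsr_high), 0.5),
        (min(hr_low, gsr_low), 0.0),
    ]
    total = sum(strength for strength, _ in rules)
    return sum(strength * level for strength, level in rules) / total


def highlights(samples, threshold=0.6):
    """Indices of (hr_change, gsr_change) samples whose score reaches the threshold."""
    return [i for i, (hr, gsr) in enumerate(samples) if excitement(hr, gsr) >= threshold]
```

The indices returned by `highlights` could then be compared with the intervals that viewers labeled as exciting, which is the kind of evaluation the abstract describes.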
We propose a music recommendation system based on the Kansei retrieval agent (KaRA) model using fuzzy inference to present music data that a user satisfies to obtain user preference rules for music. The KaRA model learns the user-preference using user evaluation information based on the fuzzy inference parameters of each agent that is optimized by a genetic algorithm, and it searches user-preference objects from a database. A previous study has proposed the KaRA model using fuzzy inference and applied a character coordination system as a prototype system for obtaining user-preference rules. However, we did not apply the KaRA model for music retrieval. In this study, we used the KaRA model with fuzzy reasoning to retrieve user-preference music data based on the following musical score features: main tune, tempo, the number of instrumentals, and beat, among others. Further, we examined the effectiveness of the proposed system from the viewpoint of obtaining user preference rules for music. From the experimental results, the proposed system can obtain the user preference rules as fuzzy rules of the KaRA model and recommend user preferred music data.
This study investigates the influence of ambiguous sun and moon images on stimuli perception based on pupillometry. A random stimulus was presented in a few seconds, and another few seconds, as feedback, observers reported the stimulus was perceived as the moon or the sun. To overcome the lack of previous studies that have not been able to segregate the physical (Glare effect) and cognition factors of image stimulus, the data were grouped into two categories, i.e., as the actual image (the ground truth, “GT”) and observers’ perception (“PR”) responses. As a result, the pupil constricted significantly when the stimulus is perceived as the sun. Furthermore, this pupillary response is unassociated with the average physical luminance of images. This result indicates that high- level cognition influences perception pupillary response.
Fluctuations in nonlinear systems can enhance synchronization with weak input signals. Chaotic resonance (CR) is one such phenomenon, caused by system-intrinsic chaotic fluctuation. CR is observed in systems with chaos-chaos intermittency (CCI), where a chaotic orbit switches intermittently between separate regions. Based on the characteristics of CR, we previously proposed a novel method for controlling the chaotic state to an appropriate CR state by adopting a feedback signal from the system itself. This method is called the reduced-region-of-orbit (RRO) feedback method. The RRO feedback method was applied to discrete-time and continuous-time chaotic systems, and its versatility was confirmed. Moreover, we have applied this method in an intervention to facilitate the transition from the disturbed circadian rhythm underlying bipolar disorder to healthy periodic activity, based on a neural system model of the frontal and sensory cortical areas proposed by Hadaeghi et al. In this study, we further examined the responsiveness of CCI to a weak periodic signal by extending the parameter regions of the system. As a result, we confirmed the effectiveness of the RRO feedback method for stabilizing the neural activity observed in the bipolar-disorder model over a wide range of parameters.
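As a rough illustration of chaos-chaos intermittency only, the sketch below iterates a textbook cubic map, x_{n+1} = a*x_n - x_n^3. This map is an assumed stand-in: it is neither the neural system model of Hadaeghi et al. nor the RRO feedback method, and the function names and parameter values are hypothetical. For this map, an orbit started at x > 0 can cross into the x < 0 region only when a exceeds 3*sqrt(3)/2 (about 2.598), so counting sign changes gives a simple indicator of switching between the two regions.

```python
def cubic_map(x, a):
    """One step of the illustrative cubic map x -> a*x - x**3."""
    return a * x - x ** 3


def count_region_switches(a, x0=0.1, steps=5000):
    """Count how often the orbit moves between the x > 0 and x <= 0 regions.

    a     : map parameter (assumed range 1 < a < 3 so the orbit stays bounded)
    x0    : initial condition (hypothetical choice)
    steps : number of iterations
    """
    x = x0
    switches = 0
    for _ in range(steps):
        nxt = cubic_map(x, a)
        # A switch is a change of sign between consecutive iterates.
        if (nxt > 0) != (x > 0):
            switches += 1
        x = nxt
    return switches


if __name__ == "__main__":
    for a in (2.5, 2.8):
        print(a, count_region_switches(a))
```

Below the threshold the orbit stays in one region and the count is zero. Above it the orbit hops between the two regions, which is the kind of intrinsic fluctuation that a feedback term could then be used to adjust.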
Numerous studies have been conducted to determine the factors that contribute to facial attractiveness. In recent years, interest has increased in using deep learning to predict facial attractiveness and to extract the features that are important for such a prediction. In this study, the features that are important to a facial attractiveness prediction model are visualized via two methods, i.e., gradient-weighted class activation mapping (Grad-CAM) and guided Grad-CAM, and the results are compared. The results show that Grad-CAM visualizes primarily the feature space that is important for attractiveness prediction, whereas guided Grad-CAM visualizes more detailed information such as contours. These methods may facilitate the understanding of facial attractiveness factors.
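The difference between the two visualizations can be sketched with the standard Grad-CAM formulation: channel weights are the spatial average of the gradients, and the map is the ReLU of the weighted sum of activations. The code below is a minimal NumPy sketch under the assumption that the activations and gradients of one convolutional layer have already been extracted from a model; the function names and array shapes are hypothetical and do not describe the study's implementation.

```python
import numpy as np


def grad_cam(activations, gradients):
    """Compute a normalized Grad-CAM map.

    activations : array (K, H, W), feature maps of one convolutional layer
    gradients   : array (K, H, W), gradient of the predicted score with
                  respect to those feature maps
    """
    # Channel importance: global average of the gradients over H and W.
    weights = gradients.mean(axis=(1, 2))              # shape (K,)
    # Weighted sum of the feature maps over the channel axis.
    cam = np.tensordot(weights, activations, axes=1)   # shape (H, W)
    # Keep only regions that raise the predicted score.
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam


def guided_grad_cam(guided_backprop, cam):
    """Combine a guided-backpropagation map with a Grad-CAM map.

    Assumes cam has already been upsampled to the spatial size of
    guided_backprop; the result is their element-wise product.
    """
    return guided_backprop * cam
```

Because the Grad-CAM map has the coarse resolution of the convolutional layer, it highlights regions, while the element-wise product with the pixel-level guided-backpropagation map retains fine structure such as contours.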
In general, learning English is difficult for non-native speakers because of the differences in vowels and consonants. There are several ways to practice English pronunciation, such as shadowing. However, if the voice features of the model audio differ greatly from the learner’s voice, they might impede learning and sound reproduction. To solve this problem, we propose a method that makes the model pronunciation data resemble the learner’s own voice by using UTAU and interactive evolutionary computation. The experiments showed that this method was capable of finding highly evaluated solutions. The Wilcoxon signed-rank test was used to examine the statistical difference between the evaluations of the initial and final generations, and a significant difference was observed at p < 0.01. Regarding the pitch parameters, we found different tendencies between males and females, which suggests that the parameters were actually making the voice similar to the examinee’s voice. However, some problems remained, such as parameters that did not work well, the UTAU voice quality, and the lack of female examinees. We plan to eliminate or at least reduce the effects of these problems in future experiments and to build a better system so that English learners can learn more efficiently.
In recent years, computer vision technology has advanced rapidly. Meanwhile, customer satisfaction is essential for industry to improve its services, but it is still assessed ineffectively with traditional methods. Using technology such as computer vision, we can now collect the information we are looking for directly from a human. Humans can use many kinds of modalities to interact with computers, and hands are perhaps the largest source of body language information after the face. To understand the meaning of a gesture, we can use MediaPipe Hands, developed at Google LLC, to track and recognize human hands. However, to recognize specific hand gestures with MediaPipe Hands, we need to define the conditions ourselves using if-else statements. In this research, we collected many variations of each hand gesture and used the x, y, and z coordinates of the 21 key points as features. As classifiers, we chose the support vector machine (SVM) and the artificial neural network (ANN). We found that the SVM with a polynomial kernel was the best of the classifiers we tested on the 3D values of the 21 key points of the hand skeleton. It achieved an accuracy of 86.26% and an F1-score of 82%, the best performance for each class among all the methods used in this research.
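The feature representation described above can be sketched as follows: the 21 key points, each with x, y, and z coordinates, are flattened into one 63-dimensional vector, and two such vectors are compared with a polynomial kernel of the form (gamma * <a, b> + coef0)^degree. This is a minimal sketch, not the study's code. The function names, the flattening order, and the kernel parameters (degree, gamma, coef0) are assumptions, since the text does not state them, and the landmark extraction itself (MediaPipe Hands) is not shown.

```python
import numpy as np

NUM_KEYPOINTS = 21  # number of hand key points stated in the text


def landmarks_to_features(landmarks):
    """Flatten 21 (x, y, z) key points into a 63-dimensional feature vector."""
    pts = np.asarray(landmarks, dtype=float)
    if pts.shape != (NUM_KEYPOINTS, 3):
        raise ValueError("expected 21 key points with x, y, z coordinates")
    # Row-major flattening: x0, y0, z0, x1, y1, z1, ...
    return pts.reshape(-1)


def polynomial_kernel(a, b, degree=3, gamma=1.0, coef0=1.0):
    """Polynomial kernel value between two feature vectors.

    degree, gamma and coef0 are illustrative defaults, not values
    reported in the text.
    """
    return (gamma * float(np.dot(a, b)) + coef0) ** degree


if __name__ == "__main__":
    # Two made-up hands used only to exercise the functions.
    hand_a = [[0.01 * i, 0.02 * i, 0.0] for i in range(NUM_KEYPOINTS)]
    hand_b = [[0.01 * i, 0.01 * i, 0.0] for i in range(NUM_KEYPOINTS)]
    fa, fb = landmarks_to_features(hand_a), landmarks_to_features(hand_b)
    print(fa.shape, polynomial_kernel(fa, fb))
```

Feature vectors of this form could be passed to an SVM implementation that supports a polynomial kernel, which replaces hand-written if-else conditions with a decision boundary learned from labeled gesture samples.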
Today, the population is aging not only in Japan but also on a global scale, and the number of people who need long-term care is increasing accordingly. Nurses need skills as caregivers, and in the future many people other than nurses will also need those skills. In current textbooks for nurses, the movements that caregivers should be conscious of during nursing care are indicated by arrows. However, these arrows are drawn from the intuition and experience of the textbook author and are not based on kinematic data. In this study, we develop an interface for visualizing kinematic data that focuses on the caregiver’s movement. For this purpose, we develop a system that calculates the centers of gravity of three body segments: the trunk, the right foot, and the left foot. The three-dimensional coordinates and trajectories of the centers of gravity are shown on front, side, and top view planes. The interface makes the movement of a skilled caregiver easy to understand visually.
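One way to picture the calculation is as a mass-weighted average of points on a body segment, followed by a projection onto each view plane. The sketch below is illustrative only: the function names, the use of marker points with optional weights, and the axis convention (x for left-right, y for front-back, z for vertical) are all assumptions and are not taken from the study.

```python
def center_of_gravity(points, masses=None):
    """Mass-weighted mean of 3D points belonging to one body segment.

    points : list of (x, y, z) tuples
    masses : optional list of weights; equal weights are assumed if omitted
    """
    if masses is None:
        masses = [1.0] * len(points)
    total = float(sum(masses))
    return tuple(
        sum(m * p[axis] for m, p in zip(masses, points)) / total
        for axis in range(3)
    )


def project_views(cog):
    """Project a 3D center of gravity onto front, side and top view planes.

    Assumed axes: x = left-right, y = front-back, z = vertical.
    """
    x, y, z = cog
    return {"front": (x, z), "side": (y, z), "top": (x, y)}


if __name__ == "__main__":
    # Made-up marker positions for a single segment, in meters.
    trunk_markers = [(0.0, 0.0, 1.0), (0.2, 0.0, 1.4)]
    cog = center_of_gravity(trunk_markers)
    print(cog, project_views(cog))
```

Applying these two steps to every frame of a recording, once per segment, would give the trajectories that are drawn on each view plane.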