In this paper, we make two proposals. The first aims to accelerate similarity calculations by only using a subset of the rating information (namely the highest ratings), while the second attempts to improve the accuracy of listwise collaborative filtering using a simple missing value estimation process. Experiments using the MovieLens 1M (6,040 users, 3,952 items and 1,000,209 ratings), 10M (71,567 users, 10,681 items and 10,000,054 ratings) and Jester (48,483 users, 100 items and 3,519,448 ratings) datasets demonstrate that these proposals can considerably reduce the computation time (by a factor of up to 50) and improve the normalized discounted cumulative gain value by up to 0.02 compared with ListCF, a well-known listwise collaborative filtering algorithm.
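As a point of reference for the accuracy metric mentioned above, the following is a minimal sketch of normalized discounted cumulative gain (NDCG) for a single ranked list. The exponential gain `2**rel - 1` and the function names are assumptions made for illustration; the abstract does not state which NDCG variant or cutoff was used.

```python
import math

def dcg_at_k(relevances, k):
    # Discounted cumulative gain over the top-k items of a ranked list.
    # Assumption: exponential gain (2**rel - 1) with a log2 position discount.
    return sum((2 ** rel - 1) / math.log2(rank + 2)
               for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    # Normalize by the DCG of the ideal (descending) ordering of the same items.
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

Here `relevances` holds the true ratings of the items in the order the recommender ranked them, so a perfectly ordered list scores 1.0.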
In recent years, various systems have been developed to encourage people to engage in walking-related activities. In this paper, we propose one such system and outline the concept of how a smart walking navigation system could be developed based on the notion of perceived exertion. Our system utilizes information about the geographical characteristics of the route along with data from social media to recommend a route for the user. In addition, we describe the results of three feasibility experiments carried out with subjects on the recommended routes, which overall support our proposed concept.
How users’ experiences and interactions on social networking sites influence personality traits has not been well studied. In this paper, we examined Twitter users to clarify the relationship between feedback from other users and changes in personality traits, as defined by the Five-Factor Model: neuroticism, extraversion, openness, conscientiousness, and agreeableness. First, using an API provided by IBM, we estimated users’ personality traits in the past and at present, and verified how much the traits changed between the two. Next, we obtained data on feedback actions such as replies, retweets, and Likes that users received from others. We conducted a nonlinear multiple regression analysis to find direct effects of feedback on changes in personality traits. The primary results demonstrated that neuroticism increases when users often receive retweets but decreases when they receive Likes, and that agreeableness decreases when users receive more Likes but increases when they receive more replies. Finally, we discuss implications, limitations, and future work.
In this paper, we propose a new analogical description for dishes and a method to generate it automatically. The analogical description consists of a similar dish and the differences between the two dishes, such as ‘Okonomiyaki = Korean pancake - leek + cabbage’. To extract a similar dish and the differences for the description, we analyze recipe data from cooking sites. We propose two types of feature vectors to calculate the similarity. We specifically examine six elements as differences between a dish and its similar dish (e.g., foods and cooking-related verbs). We demonstrate through a subject experiment that our proposed description is expressive.
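The form of the description above can be illustrated with a small sketch that builds the string from the set differences between two ingredient lists. This is only an illustration of the output format: the function name and the ingredient lists are hypothetical, and it does not reproduce the feature vectors or the six difference elements used in the paper.

```python
def analogical_description(target, target_items, similar, similar_items):
    # Items only in the similar dish are subtracted; items only in the
    # target dish are added. Sorting keeps the output deterministic.
    removed = sorted(set(similar_items) - set(target_items))
    added = sorted(set(target_items) - set(similar_items))
    parts = [similar] + [f"- {x}" for x in removed] + [f"+ {x}" for x in added]
    return f"{target} = " + " ".join(parts)

# Hypothetical ingredient lists, chosen only to reproduce the example format.
okonomiyaki = ["flour", "egg", "cabbage"]
korean_pancake = ["flour", "egg", "leek"]
description = analogical_description(
    "Okonomiyaki", okonomiyaki, "Korean pancake", korean_pancake)
```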
In recent years, expectations for data exchange and use across multiple fields have been rising. However, creating data-driven innovation by coordinating data across different fields first requires a correct understanding of existing data structures and relationships. It is thus important to investigate the structural characteristics of data ensembles rather than analyzing individual datasets. Data Jacket (DJ) is a framework for describing an overview of data while keeping the data itself confidential. This paper utilizes DJs to quantitatively assess overall data trends and characteristics and to understand the structure and system of data, their variables, and their sharing policies. The analysis revealed that the network of data has local proximity and a loose global structure. Moreover, public data and private data in the data market have different variables and characteristics in the network.
In this paper, as an application of text mining in companies, we propose a method that extracts new relevant companies by using common elements estimated from multiple customer companies. For example, if the customer companies are “Canon”, “Epson” and “Brother Industries”, our method extracts “printer” and “inkjet” as common elements and then uses them to extract “Ricoh” and “Roland DG” as new relevant companies. Our method estimates the common elements based on important words extracted from PDF files of the companies’ summaries of financial statements. Furthermore, our method classifies each extracted company as either directly or indirectly related to the common elements.
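The two steps described above can be sketched as a set intersection followed by an overlap ranking. This is a simplified illustration, not the paper's estimation method: the function names and all keyword lists below are made up for the example, and the real important-word extraction from PDF files is not shown.

```python
def common_elements(keywords_by_company):
    # Words that appear in the important-word list of every customer company.
    sets = [set(words) for words in keywords_by_company.values()]
    return set.intersection(*sets) if sets else set()

def related_companies(common, candidates):
    # Rank candidate companies by how many common elements they share.
    scored = {name: len(common & set(words)) for name, words in candidates.items()}
    return sorted((n for n, s in scored.items() if s > 0),
                  key=lambda n: (-scored[n], n))

# Hypothetical keyword lists, for illustration only.
customers = {
    "Canon": ["printer", "inkjet", "camera"],
    "Epson": ["printer", "inkjet", "projector"],
    "Brother Industries": ["printer", "inkjet", "sewing machine"],
}
candidates = {
    "Ricoh": ["printer", "copier"],
    "Roland DG": ["inkjet", "printer", "plotter"],
    "Unrelated Co": ["steel"],
}
common = common_elements(customers)
ranked = related_companies(common, candidates)
```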
In the domain of tourism navigation, tourist spots and the routes that visit them are important information to present to tourists. However, it is difficult to determine the tourist spots to visit and the route all at once. An existing study has tried to solve this tourism navigation problem by extending the Traveling Salesman Problem (TSP) with weights for nodes (spots), assuming the shortest path as the trajectory between one spot and another, but no study has considered the user’s preferences for routes based on their contents. This paper proposes a formulation of the tourism navigation problem that assigns all factors needed for a solution to edges only. A solution method using simulated annealing is also proposed.
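To make the idea of an edge-only formulation solved by simulated annealing concrete, here is a minimal sketch under simplifying assumptions: every spot is visited exactly once in a closed tour, all costs live in an edge cost matrix `cost[i][j]`, and a segment-reversal move with geometric cooling is used. The function names, parameters, and cooling schedule are assumptions for illustration and are not taken from the paper, whose formulation also covers the choice of spots.

```python
import math
import random

def tour_cost(tour, cost):
    # Sum of edge costs along the closed tour; all factors live on edges.
    return sum(cost[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))

def anneal(cost, steps=5000, t0=10.0, alpha=0.999, seed=0):
    rng = random.Random(seed)
    n = len(cost)
    current = list(range(n))
    current_cost = tour_cost(current, cost)
    best, best_cost = current[:], current_cost
    temp = t0
    for _ in range(steps):
        # Propose a neighbor by reversing a random segment of the tour.
        i, j = sorted(rng.sample(range(n), 2))
        candidate = current[:i] + current[i:j + 1][::-1] + current[j + 1:]
        cand_cost = tour_cost(candidate, cost)
        delta = cand_cost - current_cost
        # Always accept improvements; accept worse tours with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, cand_cost
            if current_cost < best_cost:
                best, best_cost = current[:], current_cost
        temp *= alpha  # geometric cooling
    return best, best_cost
```

Because the cost matrix need not be symmetric, an edge cost can encode direction-dependent factors such as the scenery along a road, which is the kind of route preference a node-weighted formulation cannot express.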
This research aims to develop a system that generates minutes through repeated structuring and summarizing of the discussion, carried out interactively between a user and the system. To realize the system, we propose a novel tree structure dedicated to representing the discussion structure of a meeting. The tree is based on the relative importance of utterances, which is derived from the semantic structure of meeting records using verbal and non-verbal information. The system aims to help users grasp the point of the meeting efficiently by letting them interactively manipulate this tree structure and by giving a different viewpoint to each user. In an evaluation of important utterance extraction using the proposed system, the ROUGE-2 score was higher than that of an existing document summarization technique. Moreover, an evaluation with 10 subjects showed that the system helps users grasp meeting contents efficiently, which indicates its usefulness.
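For readers unfamiliar with the metric used above, the following is a minimal sketch of ROUGE-2 recall as clipped bigram overlap between a candidate summary and a reference. Whitespace tokenization and the function names are assumptions for illustration; the abstract does not say which ROUGE variant or toolkit was used.

```python
from collections import Counter

def bigrams(tokens):
    # Multiset of adjacent token pairs.
    return Counter(zip(tokens, tokens[1:]))

def rouge2_recall(candidate, reference):
    cand = bigrams(candidate.split())
    ref = bigrams(reference.split())
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference bigram is matched at most as often as it occurs in the candidate.
    overlap = sum(min(count, cand[bg]) for bg, count in ref.items())
    return overlap / total
```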
By using the skin tone modifiers adopted in Unicode 8.0, it is possible to change the skin color of an emoji that depicts a person. To investigate their applicability to user profile estimation, this study analyzed the usage of skin tone modifiers on Twitter. We found that a high proportion of users use, in their own tweets, the same kind of skin tone modifier as the one used in their profile. We also found that many replies are addressed to users who use the same type of skin tone modifier.
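A minimal sketch of how skin tone modifier usage could be counted in text is shown below. The five modifiers occupy the code points U+1F3FB through U+1F3FF; the function names and the idea of taking the most frequent modifier per user are assumptions for illustration, not the analysis procedure of the study.

```python
from collections import Counter

# Emoji modifiers Fitzpatrick type 1-2 through type 6 (U+1F3FB..U+1F3FF).
SKIN_TONE_MODIFIERS = {chr(cp) for cp in range(0x1F3FB, 0x1F400)}

def count_skin_tones(text):
    # Count each skin tone modifier character that appears in the text.
    return Counter(ch for ch in text if ch in SKIN_TONE_MODIFIERS)

def dominant_skin_tone(texts):
    # Most frequently used modifier across a user's texts, or None if unused.
    total = Counter()
    for text in texts:
        total.update(count_skin_tones(text))
    return total.most_common(1)[0][0] if total else None
```

Comparing `dominant_skin_tone` over a user's tweets with the modifier found in the profile text would give one simple way to check the kind of agreement reported above.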
In large-scale disasters, many evacuation sites must be set up immediately, and relief goods including food must be supplied continuously to every site for a certain period. However, in a large-scale disaster, it is difficult to collect reliable information and to distribute the necessary goods to the appropriate sites within an acceptable time. To solve this problem, we employ the response threshold model for ant colonies and propose a sustainable relief goods distribution system for large-scale disasters. To demonstrate the effectiveness of the proposed system, we conducted a simulation.
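The response threshold model referred to above is commonly written as a probability of responding, s^n / (s^n + θ^n), where s is the stimulus intensity and θ is the agent's threshold. The sketch below implements that standard form; the mapping of the stimulus to a site's shortage of goods and the default exponent n = 2 are assumptions for illustration, since the abstract does not give the system's actual variables.

```python
def response_probability(stimulus, threshold, n=2):
    # Probability that an agent responds to a task with the given stimulus.
    # Low thresholds make an agent respond even to weak stimuli.
    if stimulus <= 0:
        return 0.0
    return stimulus ** n / (stimulus ** n + threshold ** n)

# Hypothetical use: a carrier with threshold 5 facing a site whose shortage is 5
# responds with probability 0.5; a larger shortage raises the probability.
p_equal = response_probability(5, 5)
p_high = response_probability(20, 5)
```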
In this paper, we aim to extract frequent sub-sequences from a given long sequence. In particular, the method satisfies the following four requirements: (1) online learning, (2) extraction of multiple sub-sequences, (3) extraction of sub-sequences of various lengths, and (4) control of the frequency threshold. The proposed method uses a two-block neural network. The network consists of spiking neurons based on the leaky integrate-and-fire (LIF) model and is trained by a method based on spike-timing-dependent plasticity (STDP). As a result, the network extracted sub-sequences whose frequency exceeds a threshold that is determined by only one parameter. Concretely, the network extracted sub-sequences of three symbols from a sequence of length 3,000. In this case, the sub-sequences appeared with frequencies of 0.4%, 3%, or 5%, and by adjusting only one parameter the network extracted either the sub-sequences with a frequency of 3% or more, or only those with a frequency of 5%.
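The two building blocks named above can be sketched in a few lines: a leaky integrate-and-fire neuron integrated with the Euler method, and a pair-based STDP weight change. All parameter values, names, and the exponential STDP window are assumptions for illustration; the paper's two-block architecture and its training rule are not reproduced here.

```python
import math

def simulate_lif(currents, dt=1.0, tau=10.0, v_rest=0.0, v_th=1.0, v_reset=0.0):
    # Euler integration of dv/dt = (-(v - v_rest) + I) / tau.
    v = v_rest
    spikes = []
    for step, i_in in enumerate(currents):
        v += dt * (-(v - v_rest) + i_in) / tau
        if v >= v_th:          # threshold crossing emits a spike
            spikes.append(step)
            v = v_reset        # membrane potential is reset after firing
    return spikes

def stdp_delta(dt_spike, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    # dt_spike = t_post - t_pre. Pre-before-post strengthens the synapse,
    # post-before-pre weakens it, both decaying with the time difference.
    if dt_spike > 0:
        return a_plus * math.exp(-dt_spike / tau_plus)
    if dt_spike < 0:
        return -a_minus * math.exp(dt_spike / tau_minus)
    return 0.0
```

With these defaults, a constant input current above the threshold produces regular firing, while an input that settles below the threshold never triggers a spike.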
We aim to develop a swimming motion coaching system for beginner and intermediate swimmers using a single inertial sensor. One requirement of the system is to automatically estimate and segment the sections of swimming motions (such as strokes and turns) from the sensor data. A previous study that estimated swimming motions automatically with non-ensemble learning could not remove individual differences in motion patterns, and its generalization ability was low. In this paper, in order to learn the pattern common to each motion and estimate motions with high accuracy, we propose a method of estimating the turn section using random forest, an ensemble learning method. The results suggest that the turn section can be estimated with higher accuracy than with the non-ensemble learning method in all four swimming styles.
In our previous research, we discussed a method of extracting both the similarities/style component and the differences/characteristic component from walking motions using four wearable motion sensors. The results suggested that segmented walking-motion data could be used to identify individuals. However, we did not discuss the physical meanings of the similarities/style and differences/characteristic components of a subject obtained from the walking data. In this paper, we discuss a method of determining which segment data contribute to the similarities/style and differences/characteristic components, as a step toward understanding the physical meanings of these components.
This paper compares the distribution of gaze patterns during walking between regular visitors and first-time visitors. An experiment was carried out in which participants walked through a street wearing an eye tracker to measure their gaze patterns. The results show that the gaze distribution of first-time visitors is wider than that of regular visitors. Further, regular visitors tend to look into the spaces in front of them, whereas first-time visitors tend to look into the open spaces to their sides. This is because first-time visitors, compared with regular visitors, distribute their attention to grasp the situation of the spaces to their sides.
In this paper, we aim to improve the response to successive input patterns in SpikeProp, a kind of spiking neural network. We propose two methods: (1) changing the spike response function, which determines the network behavior, and (2) training on combined patterns. The response was improved by using both methods. Concretely, the network trained with the proposed methods achieved a 70% success rate for successive inputs in a case where the network trained with the previous method failed.
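For context on what a spike response function is, the sketch below implements the kernel commonly used in the original SpikeProp formulation, ε(t) = (t/τ)·exp(1 − t/τ) for t > 0 and 0 otherwise. This is the standard kernel, shown only as background; it is not the modified function proposed in the paper, which the abstract does not specify. The function name and the default τ are assumptions.

```python
import math

def spike_response(t, tau=7.0):
    # Postsynaptic potential contributed by a spike that arrived t time units ago.
    # The kernel rises from 0, peaks at exactly 1.0 when t == tau, then decays.
    if t <= 0:
        return 0.0
    return (t / tau) * math.exp(1.0 - t / tau)
```

Because the kernel decays slowly after its peak, residual potential from one input pattern can still be present when the next pattern arrives, which is one reason the shape of this function matters for successive inputs.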