In this paper, we propose a method for compiling travel information automatically. For the compilation, we focus on travel blog entries, which are defined as travel journals written by bloggers in diary form. We consider travel blog entries a useful source of travel information because many bloggers record their travel experiences in this form. First, we identified travel blog entries in a blog database. Next, we extracted souvenir information and tourist spot information from them as travel information. Furthermore, we extracted hyperlinks from travel blog entries and constructed a collection of travel information links. We confirmed the effectiveness of our method by experiment. For the identification of travel blog entries, we obtained scores of 38.1% for Recall and 86.7% for Precision. In the extraction of travel information from travel blog entries, we obtained Precision of 74.0% and 71.0% for the top 100 extracted local products and tourist spots, respectively, thereby confirming that travel blog entries are a useful source of travel information. In the construction of the collection of travel information links, we obtained high precision and recall.
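The evaluation measures named in this abstract (Recall, Precision, and Precision at the top 100) can be sketched as follows. This is a minimal illustration of the standard definitions, not the authors' evaluation code; the function names and the toy data are hypothetical.

```python
# Hypothetical helper names; standard definitions of the measures only.
def precision_recall(predicted, relevant):
    """Return (precision, recall) of a predicted set against a gold set."""
    predicted, relevant = set(predicted), set(relevant)
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that belong to the gold set."""
    if k <= 0:
        return 0.0
    relevant = set(relevant)
    return sum(1 for item in ranked[:k] if item in relevant) / k
```

With a ranked list of extracted items and a manually judged gold set, `precision_at_k(ranked, gold, 100)` would correspond to the "top 100" figure, assuming the judgments are binary.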
This paper proposes a “visual summary”, which visualizes topics and their transitions in a thread of a BBS (Bulletin Board System) using an interactive information visualization technique, so that a user can grasp the topic transitions in a thread without reading the corresponding comments. Recently, various types of information have become available on the Web. Among them, CGM (consumer generated media) such as BBS, blogs and SNS (Social Networking Services) is a distinctive kind of information that can be used for marketing by companies as well as for decision making by individuals. Although topic transition in a BBS thread is also useful information, it is difficult for a user to read the large number of comments in a thread. The visual summary proposed in this paper visualizes the topic transition with animation, based on the co-occurrence of keywords within a certain time period. A user can analyze a thread interactively by changing the time window and the focused keywords. An experiment with test participants shows the effectiveness of the proposed visual summary for identifying topic transition patterns and estimating topics. An analysis of the participants' behavior revealed that they used the same clues as those used when grasping topic transition by reading threads. It also revealed reasonable analysis behavior that did not depend on the individual participant.
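The idea of counting keyword co-occurrence within a time window can be sketched as below. This is only an illustration under assumptions: each comment is taken to be a `(timestamp, keywords)` pair with keywords already extracted, and the function name is hypothetical. The abstract does not specify how the paper extracts keywords or weights co-occurrence.

```python
from collections import Counter
from itertools import combinations


# Hypothetical function; assumes keywords were already extracted per comment.
def cooccurrence_in_window(comments, start, end):
    """Count keyword pairs appearing together in comments posted in [start, end)."""
    counts = Counter()
    for timestamp, keywords in comments:
        if start <= timestamp < end:
            # Sort so each unordered pair is always stored under the same key.
            for pair in combinations(sorted(set(keywords)), 2):
                counts[pair] += 1
    return counts
```

Sliding `start` and `end` along the thread would give one co-occurrence table per window, which is the kind of data an animated view of topic transition could be drawn from.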
Text data with spatio-temporal information are becoming common with the spread of GPS-equipped mobile phones and microblog services such as Twitter. This research proposes a system that supports operators who manage an area in the real world. Examples of such operators are managers of a facility such as a theme park and operators in a disaster prevention center. Our system has three functions: (i) automatic classification, which assigns messages to a fixed set of categories, (ii) clustering, which aggregates similar messages, and (iii) burst detection, which detects an event in which messages arrive at high frequency. We asked 120 people to send text data with spatio-temporal information from mobile phones in the Expo memorial park, and we evaluated our system using this data.
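Burst detection in the sense described above (an interval in which messages arrive at unusually high frequency) can be illustrated with a simple threshold rule. The abstract does not say which burst detection algorithm the system uses, so the sketch below substitutes a basic moving-baseline rule; the function name and the `bin_width`, `history`, and `threshold` parameters are assumptions made for illustration.

```python
from statistics import mean, pstdev


# Hypothetical rule, not the paper's algorithm: flag a time bin when its
# message count exceeds mean + threshold * stdev of the preceding bins.
def detect_bursts(timestamps, bin_width, history=3, threshold=2.0):
    """Return the indices of time bins with an unusually high message count."""
    if not timestamps:
        return []
    origin = min(timestamps)
    n_bins = int((max(timestamps) - origin) // bin_width) + 1
    counts = [0] * n_bins
    for t in timestamps:
        counts[int((t - origin) // bin_width)] += 1
    bursts = []
    for i in range(history, n_bins):
        window = counts[i - history:i]
        # Floor the spread at 1.0 so a perfectly flat baseline is not oversensitive.
        limit = mean(window) + threshold * max(pstdev(window), 1.0)
        if counts[i] > limit:
            bursts.append(i)
    return bursts
```

For spatio-temporal data, the same rule could be applied per map cell, although that extension is not shown here.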
In this paper, we propose a method for describing a Japanese word, not with explanatory or defining sentences, but with figurative descriptions. Utilizing a simile pattern, our method gathers a large number of noun-noun relations from the World Wide Web. On the basis of those relations and their statistical information, associative pieces of knowledge called descriptors are estimated. The descriptors, which describe a query word figuratively, are ranked by their level of descriptive ability, taking generality and locality into account. Moreover, by combining properties of the figurative relation with some fixed patterns, the descriptors are classified into concept words, attribute words, and others. As output, a set of sorted descriptors is shown in several types of output forms. Some experiments using a prototype system “Murasaki” have been conducted. The experimental results show that the fundamental performance of our method is significantly better than the bag-of-words approach. Additionally, an evaluation of responsiveness to hot keywords on information retrieval web sites yielded 60% precision, which exceeds that of a common dictionary. The method also functioned effectively in ranking performance (74% on MRR) and classification performance (63% accuracy). Furthermore, the proposed method could be comparable to Wikipedia if steady coverage of the figurative descriptions for a query word could be ensured.
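The ranking measure reported above, MRR (mean reciprocal rank), can be sketched as follows. This shows the standard definition only; the function name is hypothetical and the abstract does not describe how correct descriptors were judged.

```python
# Hypothetical helper; standard mean reciprocal rank.
def mean_reciprocal_rank(rankings, relevant):
    """rankings: list of ranked lists; relevant: list of sets of correct items."""
    total = 0.0
    for ranked, gold in zip(rankings, relevant):
        for position, item in enumerate(ranked, start=1):
            if item in gold:
                # Only the first correct item in each ranking contributes.
                total += 1.0 / position
                break
    return total / len(rankings) if rankings else 0.0
```

A query whose ranking contains no correct item contributes zero, so the score is averaged over all queries, not only the successful ones.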
This paper proposes a method for automatically generating the phenotype-mapping space for applications of iGA (interactive Genetic Algorithm). The proposed method constructs the search space for iGA from the degree of association between solutions, so iGA system developers do not need to bear the burden of labeling each solution. The degree of association between solutions is calculated from preference information accumulated on the Web. Recently, online services such as shopping sites and social bookmarking services store large numbers of users' action logs. This information reflects the users' Kansei and preferences. We call this type of information collective preference. Users find their preferences by iGA search in the space generated from the collective preference. To verify the effectiveness of the proposed method, we obtained the degree of association between products from online shopping sites and built the phenotype-mapping space. This relationship reflects the users' buying behavior. An analysis of the generated space confirmed that the proposed method is able to generate the space automatically. Furthermore, we performed subjective experiments with an iGA system that simulated a shopping site. The experimental results verified that the subjects' search logs in the generated space were personalized according to their own preferences.
In this study, we proposed the “Graffiti Map”, a map-style communication tool on which everyone can post and share area information. We deployed the Graffiti Map in three different situations to investigate the reactions, communication styles, and changes in awareness of local residents. Through the analysis of posted information, participants' behaviors, interview data, and questionnaire data, we revealed that local residents had a desire for communication via map-related information across age boundaries, from elementary school students to elderly persons. We also showed that the resident-led participation style made participants look back on their own area information and fostered affection for the area.
This paper discusses the application of the fuzzy c-means (FCM) based classifier to large scale data sets. The first type of large scale data set is one containing a huge number of samples (patterns). The number can be reduced by sampling, but the accuracy of the classifier on the test set may deteriorate, and the accuracy on the available data worsens. The FCM classifier uses covariance matrices whose size does not increase with the number of training samples, and its training time is proportional to the number of samples. In a comparison with the support vector machine (SVM) classifier, which is known as one of the highest performance classifiers, the paper shows that the FCM classifier nearly attains the accuracy of SVM and outperforms it in training time and testing time. If the feature dimension of the samples is relatively small, or the dimension can be reduced by principal component analysis (PCA), the training of the FCM classifier converges in a short period of time. However, if the feature dimension is too large, the covariance matrices cannot be stored in computer memory and the computation becomes infeasible. The paper therefore proposes a modified algorithm to cope with high dimensional feature data. As an example, a subset of the COREL image database is used to compare its performance with the approach that uses PCA to compress the data set.
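For readers unfamiliar with fuzzy c-means, the textbook membership computation is sketched below. This is the standard FCM rule with squared Euclidean distance, offered as background only; the FCM classifier discussed in the abstract uses covariance matrices, which this sketch does not model. The function name and the fuzzifier default `m=2.0` are assumptions.

```python
# Textbook FCM membership rule with squared Euclidean distance.
# Not the paper's classifier, which relies on covariance matrices.
def fcm_memberships(point, centers, m=2.0):
    """Return the membership of `point` in each cluster; values sum to 1."""
    d2 = [sum((p - c) ** 2 for p, c in zip(point, center)) for center in centers]
    # A point lying exactly on one or more centers belongs fully to them.
    zeros = [i for i, d in enumerate(d2) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(centers))]
    exponent = 1.0 / (m - 1.0)
    inverse = [(1.0 / d) ** exponent for d in d2]
    total = sum(inverse)
    return [value / total for value in inverse]
```

Each membership depends only on the distances from one sample to the cluster centers, which is consistent with the abstract's remark that training cost grows linearly with the number of samples.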
Ant Colony Optimization (ACO) is one of the promising meta-heuristics for graph search problems such as shortest path planning and the traveling salesman problem. In recent years, some attempts have shown that ACO algorithms are applicable to 0-1 Integer Programming Problems (0-1IP). ACO algorithms for 0-1IP are called Binary ACO (BACO) algorithms. Although the balance between exploitation and exploration in the search can be expected to be important in ACO for 0-1IP, no previous work has proposed an algorithm that adjusts this balance. This paper proposes a method, BASqueen, designed by applying the Queen Ant Strategy (ASqueen) to BACO algorithms. The proposed method is expected to find high-quality solutions owing to its subpopulation structure and the adjustment of the search area by a queen ant. Experimental results on 0-1 Knapsack problems show that the search performance of the proposed BASqueen is better than that of other BACO algorithms, Simulated Annealing, and Discrete Particle Swarm Optimization.
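The general shape of a binary ACO search on a 0-1 knapsack problem can be sketched as below. This is a toy illustration, not BASqueen and not any specific published BACO variant: the pheromone update, the repair step, the clamping bounds, and all parameter names are assumptions chosen to keep the example short. Item weights are assumed to be positive.

```python
import random


# Toy binary ACO sketch (hypothetical design, not the paper's BASqueen).
def binary_aco_knapsack(values, weights, capacity, n_ants=20, n_iters=50,
                        evaporation=0.1, seed=0):
    """tau[i] acts as the probability that an ant selects item i."""
    rng = random.Random(seed)
    n = len(values)
    tau = [0.5] * n
    best, best_value = [0] * n, 0

    def repair(bits):
        # Drop items with the worst value/weight ratio until the load fits.
        chosen = sorted((i for i in range(n) if bits[i]),
                        key=lambda i: values[i] / weights[i])
        load = sum(weights[i] for i in chosen)
        for i in chosen:
            if load <= capacity:
                break
            bits[i] = 0
            load -= weights[i]
        return bits

    for _ in range(n_iters):
        for _ in range(n_ants):
            bits = repair([1 if rng.random() < tau[i] else 0 for i in range(n)])
            value = sum(v for v, b in zip(values, bits) if b)
            if value > best_value:
                best, best_value = bits, value
        # Evaporate, then reinforce toward the best solution found so far.
        tau = [(1 - evaporation) * t + evaporation * b for t, b in zip(tau, best)]
        # Clamp so that exploration never stops completely.
        tau = [min(0.95, max(0.05, t)) for t in tau]
    return best, best_value
```

In this sketch the evaporation rate and the clamping bounds are the only handles on the exploitation/exploration balance, which is the aspect the abstract says earlier BACO work left unadjusted.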
This paper proposes a method for speeding up the generation of fuzzy if-then rules for pattern classification problems. The proposed method uses a parallel implementation on GPGPUs to reduce computation time. CUDA, a development environment for GPGPU, includes a library for performing matrix operations in parallel. In the proposed method, the published source code for matrix multiplication is modified so that the membership values of the given training patterns in the antecedent fuzzy sets are calculated. Computational experiments show that the computation time is reduced for problems that require high computational effort.
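The computation that the abstract describes parallelizing, the membership values of training patterns in antecedent fuzzy sets, has a matrix-like structure: one entry per (pattern, rule) pair. The sequential sketch below shows that structure under assumptions not stated in the abstract: triangular membership functions with `left < peak < right`, and the product of per-attribute memberships as the compatibility grade. The function names are hypothetical, and this is plain CPU code, not the paper's CUDA implementation.

```python
# Sequential CPU sketch; triangular sets and the product operator are assumptions.
def triangular(x, left, peak, right):
    """Membership of x in a triangular fuzzy set (assumes left < peak < right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def compatibility_matrix(patterns, rules):
    """matrix[p][r] = product over attributes of the membership of pattern p
    in the antecedent fuzzy sets of rule r."""
    matrix = []
    for pattern in patterns:
        row = []
        for rule in rules:
            grade = 1.0
            for x, (left, peak, right) in zip(pattern, rule):
                grade *= triangular(x, left, peak, right)
            row.append(grade)
        matrix.append(row)
    return matrix
```

The triple loop mirrors matrix multiplication, with the multiply-accumulate step replaced by a membership evaluation and a product, which suggests why a matrix multiplication kernel is a natural starting point for a parallel version.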
Electronic word-of-mouth (ewom) is an important information source that influences consumer purchase intentions. Previous work showed that the potency of ewom information that does not match prior attitudes tends to be limited. This paper investigates these limitation effects for ewom information about product attributes from the viewpoint of two layers of consumer prior attitudes: toward products and toward their attributes. The experiment was designed in two parts: (1) subjects evaluated digital cameras using press releases from the makers to form each layer of prior attitudes, and (2) subjects changed their purchase intentions toward the cameras based on ewom information about the product attributes. A two-way ANOVA with prior attitude toward the products and prior attitude toward their attributes as the two factors was performed on the data set from questionnaire surveys of 152 university students. The results supported three hypotheses that reflected the limitation effects of the two layers of prior attitudes.