This paper reports our survey activities related to the practical use of big data. (1) We surveyed the development trends, elemental technologies, and application cases of practical big data use. (2) We clarified the relationship between the types of benefits gained from big data and the elemental technologies required to obtain them, and we provide a guideline for selecting big data technologies suited to producing each type of benefit.
Preventive medicine has attracted much attention owing to rising medical expenses in Japan. As a preventive measure against disease, this paper presents a system for predicting three types of mood, namely good, normal, and bad, from biological and weather information. Specifically, we focus on analyzing the factors relevant to the prediction. Evaluation experiments demonstrate that the proposed system predicts the next day's mood with 73% accuracy by learning from past biological information, weather information, and moods with multiple classifiers. Moreover, we found that body fat, maximum blood pressure, and the amount of insolation are important factors for the prediction. The experiments indicate that the proposed system is useful for the prevention of bipolar disorder and lifestyle diseases.
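As a rough illustration of the multi-classifier setup described above, the sketch below trains a simple voting ensemble on synthetic three-class mood data. The features (body fat, blood pressure, insolation), the label-generating rule, and the classifier choices are all invented for illustration and are not the paper's.

```python
# Hypothetical sketch: predict mood (0=bad, 1=normal, 2=good) from
# biological and weather features with a hard-voting ensemble.
import numpy as np
from sklearn.ensemble import VotingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 300
# Columns (invented): body fat [%], blood pressure [mmHg], insolation [MJ/m^2]
X = np.column_stack([
    rng.normal(22, 4, n),
    rng.normal(120, 12, n),
    rng.normal(12, 5, n),
])
# Synthetic rule: more insolation and lower pressure tend toward "good".
score = 0.3 * X[:, 2] - 0.1 * (X[:, 1] - 120) + rng.normal(0, 1, n)
y = np.digitize(score, [1.0, 4.0])          # three mood classes

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = VotingClassifier([
    ("lr", LogisticRegression(max_iter=1000)),
    ("rf", RandomForestClassifier(random_state=0)),
    ("dt", DecisionTreeClassifier(random_state=0)),
])
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)                  # held-out accuracy
```

The ensemble simply takes a majority vote of the three base classifiers; the paper's actual classifier set and features may differ.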
A facility maintenance support system is proposed that features annotation interpretation in addition to digital pen and paper-digital document management technologies. A handwritten annotation consists of character strings and drawings such as lead lines. Annotation interpretation extracts, from the handwritten annotation on a maintenance sheet, which part of the facility a maintenance worker's insight describes and when it was written. The annotated facility part is estimated using printed-object information provided by paper-digital document management, and the time stamp of each annotation is acquired from the digital pen. Based on this interpretation, character-string images are organized in time series by facility part and presented to maintenance workers both on a monitor in the office and on the maintenance sheet in the field. The proposed system was applied to maintenance work at the wastewater plant in our research laboratory. Interviews with the maintenance workers in this field trial indicate that sharing knowledge of workers' insights can improve the efficiency and reliability of maintenance work.
In Japan, ATMs and vending machines with banknote handling devices are widely used. Fatigued banknotes are the main cause of paper jams in these machines, so a technique for distinguishing fatigued banknotes more efficiently is in demand. In this paper, we propose a method that solves this problem by extracting features of fatigued bills that do not depend greatly on the classifier. The method extracts, as the feature quantity, the frequency spectral difference of the acoustic signals of banknotes. This feature quantity can cope with the time-series change of the acoustic signal caused by the degree of banknote fatigue. Discrimination experiments on fatigued banknotes were performed with a Support Vector Machine (SVM) using the frequency spectral difference as input. The results showed that this feature quantity is effective for classifying fatigued banknotes.
We develop an intelligent tutoring system that responds to learners' answers to problems in case-based e-learning. A facilitator instantiates answers and tutoring advice as tutoring rules in advance, and the system automatically identifies the instantiated answer that corresponds to a learner's input sentence. Although various tutoring rules are given for a problem, their instantiated answers are often very similar to one another even when the rules differ, so an input sentence may resemble the instantiated answer of the wrong tutoring rule, which makes correct rule selection difficult. The proposed method selects the tutoring rule for an input sentence by learning rule selection with a multi-class SVM (Support Vector Machine). Experimental results show that the proposed method improves the accuracy of selecting tutoring rules by 21% compared with a method that selects the most similar instantiated answer.
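The rule-selection step can be sketched as a multi-class text classifier: learner answers are vectorized and mapped to rule labels by a linear SVM (one-vs-rest). The sentences, rule names, and TF-IDF representation below are invented examples, not the paper's data or exact features.

```python
# Toy sketch of selecting a tutoring rule for a learner's answer
# with a multi-class linear SVM over TF-IDF features.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

answers = [
    "the loop never terminates because the counter is not updated",
    "the counter variable is never incremented inside the loop",
    "the array index starts at one instead of zero",
    "indexing should begin from zero, not one",
    "the function forgets to return the computed value",
    "there is no return statement at the end of the function",
]
rules = ["rule_loop", "rule_loop", "rule_index", "rule_index",
         "rule_return", "rule_return"]

clf = make_pipeline(TfidfVectorizer(), LinearSVC())
clf.fit(answers, rules)
pred = clf.predict(["the loop counter is never updated"])[0]
```

Training on labeled (answer, rule) pairs rather than picking the nearest instantiated answer is what allows the classifier to separate rules whose instantiated answers are superficially similar.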
While market investigation is important in game software development, there has been no effective way to identify the factors behind users' evaluations of software. In this research, we paid attention to corpora (electronic collections of documents), considering that the factor relationships underlying users' evaluations are latently expressed in their opinions. To realize this idea, we extracted useful knowledge by applying structural equation modeling (SEM) and a topic model in a visual and quantitative analysis process. Experimental results showed that the proposed process can effectively extract, in an interpretable form, the topics users pay attention to when they evaluate game software.
To support decisions on actions after anomaly detection, we have developed a method for determining anomaly-related sensors. In this method, two-dimensional distribution densities of normal data are obtained beforehand for every pair of sensors, and the pairs for which the distribution of anomaly data is far from that of normal data are searched for automatically. To find such sensors, an "isolation index" is calculated for each sensor using image processing. The proposed method was evaluated on real data from generators and semiconductor manufacturing equipment, and it was confirmed that the sensors related to each detected anomaly can be determined correctly. By investigating the signals of the sensors determined to be anomaly-related, anomaly phenomena such as a timing gap between sensor signals during transitions and a loss of synchronism between sensor signals were recognized.
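A simplified numeric sketch of the idea: estimate a 2-D density of normal data for each sensor pair, then score how far anomaly samples fall outside it. The per-sensor score below (fraction of anomaly points landing in bins never occupied by normal data, summed over pairs) is a plain density-based proxy for the paper's image-processing "isolation index", not its actual formulation.

```python
# Pairwise 2-D density check: which sensor do anomalies isolate?
import numpy as np
from itertools import combinations

rng = np.random.default_rng(2)
normal = rng.normal(0, 1, size=(500, 3))   # 3 sensors, normal operation
anomaly = normal[:20].copy()
anomaly[:, 2] += 6.0                       # sensor 2 drifts away from normal

edges = np.linspace(-8, 8, 33)             # 32 x 32 histogram bins
iso = np.zeros(3)
for i, j in combinations(range(3), 2):
    hist, _, _ = np.histogram2d(normal[:, i], normal[:, j],
                                bins=[edges, edges], density=True)
    xi = np.clip(np.digitize(anomaly[:, i], edges) - 1, 0, 31)
    xj = np.clip(np.digitize(anomaly[:, j], edges) - 1, 0, 31)
    # Fraction of anomaly points in zero-density (never-seen) bins.
    apart = np.mean(hist[xi, xj] == 0.0)
    iso[i] += apart
    iso[j] += apart

suspect = int(np.argmax(iso))              # sensor most isolated by anomalies
```

Sensor 2 accumulates "apart" contributions from both of its pairs, while the (0, 1) pair stays inside the normal density, so the drifting sensor stands out.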
We propose a high-throughput file-level deduplication method for primary file servers, called partial data background pre-fetch (PDBP). To achieve high deduplication throughput, the method reduces the number of disk I/Os issued during the deduplication process. Before deduplication runs, the proposed method pre-fetches part of the data of the shared files referred to by deduplicated files. The method then processes the files that are larger than a file-size threshold defined by administrators. In this paper, we evaluate deduplication processing time using a simulation model of PDBP and confirm that PDBP reduces processing time by about 50% compared with a conventional file deduplication method when the threshold is set to 4 KB.
This paper presents an approach for improving the efficiency of solving linear systems by applying a genetic algorithm (GA) to the GMRES(m) method. At every restart of GMRES(m), the initial vectors are regarded as chromosomes. When the restart process stagnates, the GA performs a crossover on the chromosomes to create new chromosomes for the next restart stage, using a weighted-average algorithm to perform the crossover effectively. To further enhance performance, the concept of "chromosome-wide stagnation" is introduced, enabling on-the-fly detection of a slowdown in the GA's convergence, and a way to adjust the value of m automatically at the onset of such stagnation is proposed. The proposed method was tested on several sample matrices and showed satisfactory improvements in execution time.
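A hedged sketch of the crossover idea only (not a full GMRES(m) solver): two restart initial guesses act as chromosomes, and a child is formed as their weighted average, with weights derived from the residual norms so that the better guess contributes more. The specific weighting below is our assumption, not necessarily the paper's scheme.

```python
# Weighted-average crossover of two candidate initial vectors for a
# linear system A x = b, weighted by (inverse) residual quality.
import numpy as np

rng = np.random.default_rng(3)
n = 50
A = np.diag(np.arange(1.0, n + 1)) + 0.01 * rng.normal(size=(n, n))
b = rng.normal(size=n)

def residual(x):
    return np.linalg.norm(b - A @ x)

def weighted_average_crossover(x1, x2):
    r1, r2 = residual(x1), residual(x2)
    w1 = r2 / (r1 + r2)          # smaller residual -> larger weight
    return w1 * x1 + (1.0 - w1) * x2

x1 = rng.normal(size=n)                                 # poor guess
x2 = np.linalg.solve(A, b) + 0.1 * rng.normal(size=n)   # near the solution
child = weighted_average_crossover(x1, x2)
```

Since the residual is affine in x, the child's residual is bounded by the convex combination of the parents' residuals, so the crossover can never produce a restart vector worse than the worse parent.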
This paper proposes a new method that can recognize both activities and gestures from acceleration data. Although activity recognition and gesture recognition both employ acceleration data, the two have been studied independently because of the large difference between the characteristics of activity sensor data and gesture sensor data. In this study, we combine the two kinds of recognition using several weak classifiers that are widely used to recognize activities and/or gestures (e.g., FFT-based and DTW-based classifiers).
This paper proposes a chance index that estimates whether a node is a chance in a co-occurrence network. Recently, chance discovery research has attracted attention in several domains; by using chance discovery, we can, for example, develop new businesses or predict earthquakes. However, chance discovery requires an analyst's inference from a visualized network, so its success or failure depends on the analyst. To solve this problem, we analyzed the features reported in previous chance discovery research and built two hypotheses: (1) chance nodes have high betweenness centrality, and (2) chance nodes connect to others through weak links. Based on these hypotheses, the chance index is formulated from two terms concerning betweenness centrality and link strength. We confirm the usefulness of the chance index through verification experiments using the bush network, questionnaire network, interview network, and editorial network.
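A minimal sketch of a chance-index-style score on a toy co-occurrence network: combine each node's betweenness centrality with the weakness of its links. The exact combination below (product of the two terms, with link weakness as the inverse mean edge weight) is our assumption, not necessarily the paper's formula.

```python
# Score nodes by betweenness centrality x weak-link term; the bridge
# node connecting two dense clusters through weak edges should win.
import networkx as nx

G = nx.Graph()
cluster1 = [("a1", "a2", 5), ("a2", "a3", 5), ("a1", "a3", 5)]
cluster2 = [("b1", "b2", 5), ("b2", "b3", 5), ("b1", "b3", 5)]
bridge   = [("a1", "c", 1), ("c", "b1", 1)]   # weak links through "c"
G.add_weighted_edges_from(cluster1 + cluster2 + bridge)

bc = nx.betweenness_centrality(G)

def weak_link_term(node):
    weights = [d["weight"] for _, _, d in G.edges(node, data=True)]
    return 1.0 / (sum(weights) / len(weights))   # inverse mean edge weight

chance = {n: bc[n] * weak_link_term(n) for n in G}
top = max(chance, key=chance.get)
```

Node "c" has both the highest betweenness (every inter-cluster path crosses it) and only weak links, matching the two hypotheses in the abstract.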
In a large-scale system such as a building air-conditioning system, time-series data are measured by many kinds of sensors. It is difficult for administrators to detect faults because only a limited number of experts can diagnose unusual system behavior. For this reason, a method that can automatically detect faults from the measured data by computer is required. This paper proposes a fault detection method based on information extraction from measured time-series data in a building air-conditioning system. A fault in such a system causes the data to exhibit "hunting", a condition consisting of repeated rises and falls. The proposed method converts the target time-series data into frequency components in order to extract the hunting condition, and detects faults by discriminant analysis. Practical experiments confirm that the proposed method can detect all faults in a building air-conditioning system as well as fault diagnosis based on expert knowledge can.
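The detection chain can be sketched under simplifying assumptions: hunting (a sustained slow oscillation) shows up as a low-frequency spectral peak, which a linear discriminant then separates from normal operation. The synthetic "temperature" series, oscillation period, and noise level below are invented for illustration.

```python
# FFT magnitude spectrum + linear discriminant analysis on
# synthetic normal vs. hunting time series.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(4)
n = 256

def series(hunting):
    base = 20 + 0.5 * rng.normal(size=n)              # steady signal + noise
    if hunting:
        base = base + 2.0 * np.sin(2 * np.pi * np.arange(n) / 40)
    return base

def spectrum(hunting):
    s = series(hunting)
    return np.abs(np.fft.rfft(s - s.mean()))          # drop the DC offset

X = np.array([spectrum(h) for h in [False] * 30 + [True] * 30])
y = np.array([0] * 30 + [1] * 30)

lda = LinearDiscriminantAnalysis().fit(X[::2], y[::2])  # train on half
acc = lda.score(X[1::2], y[1::2])                       # test on the rest
```

Working in the frequency domain concentrates the hunting energy into a few bins, which is exactly what makes a simple discriminant sufficient.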
Mobile devices that can sense their locations by GPS or Wi-Fi have become popular, and location information collected from many users can be analyzed to examine traffic flow, conduct marketing analysis, and so on. However, some users hesitate to provide their accurate location information. Methods have therefore been proposed in which users' location information is anonymized on their devices before being sent to the data collection server. These methods protect users' privacy while allowing the server to estimate the distribution of users' locations statistically, but they require many users to participate in the data collection. In our proposed method, each user sends several dummy locations to the data collection server, and the server can estimate the location distribution with high accuracy. Through mathematical analysis and simulations, we show that our method reduces the estimation errors by approximately 85% to 99%.
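A toy sketch of the dummy-location idea: each user reports its true cell plus k dummy cells, and the server subtracts the expected dummy contribution to recover the true distribution. Drawing the dummies uniformly at random is our simplifying assumption; the paper's dummy-generation and estimation schemes may be more sophisticated.

```python
# Server-side distribution estimation from reports mixed with dummies.
import numpy as np

rng = np.random.default_rng(7)
n_cells, n_users, k = 10, 5000, 4
true_p = np.array([0.3, 0.2, 0.15, 0.1, 0.05,
                   0.05, 0.05, 0.04, 0.03, 0.03])
true_cells = rng.choice(n_cells, n_users, p=true_p)

reports = np.zeros(n_cells)
for c in true_cells:
    reports[c] += 1                      # the user's real cell
    for d in rng.choice(n_cells, k):     # k uniform dummy cells
        reports[d] += 1

# Server: subtract the expected uniform dummy count from every cell.
est = (reports - n_users * k / n_cells) / n_users
err = np.abs(est - true_p).sum()         # total variation-style error
```

Because the dummy distribution is known, its expected contribution cancels exactly in expectation, leaving only sampling noise in the estimate.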
Automatic dynamic measurement systems that weigh objects in motion are used in logistics, the food industry, and elsewhere. Such a system, called a checkweigher, is composed of a belt conveyor and a load cell. Checkweighers have the problem that the noise caused by error factors changes with the belt conveyor speed and the installation environment; as a result, variable filters that can reduce the noise after installation are better suited to them. In this paper, we propose a design method for quasi-equiripple variable linear-phase FIR (finite impulse response) filters using an iterative weighted least-squares method in the frequency domain. The proposed variable filter has piecewise high attenuation in the stopband, and its variable parameters can change multiple factors of the stopband characteristics as well as several notch frequencies. In the proposed filter, the filter coefficients are approximated by polynomials in the variable parameters. Because the number of polynomial coefficients increases when there are multiple variable parameters or the polynomial orders are high, we also propose a method for reducing the number of polynomial coefficients. The usefulness of the proposed design method is verified through examples.
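As a fixed-coefficient baseline only, the sketch below designs a linear-phase FIR lowpass by weighted least squares with SciPy's `firls`, weighting the stopband more heavily. The paper's contribution, parameterizing the coefficients by polynomials in the variable parameters and iterating the weights toward quasi-equiripple behavior, is not reproduced here; band edges and weights are illustrative.

```python
# Weighted least-squares linear-phase FIR lowpass (fixed, non-variable).
import numpy as np
from scipy.signal import firls, freqz

fs = 1000.0                              # assumed sampling rate [Hz]
taps = firls(101,                        # odd tap count -> Type I FIR
             bands=[0, 40, 60, fs / 2],  # passband 0-40 Hz, stopband 60+ Hz
             desired=[1, 1, 0, 0],
             weight=[1, 10],             # emphasize stopband error
             fs=fs)
w, h = freqz(taps, worN=2048, fs=fs)
stop_atten_db = -20 * np.log10(np.max(np.abs(h[w >= 60])) + 1e-12)
```

Raising the stopband weight trades passband flatness for deeper attenuation, which is the same lever the iterative weighted least-squares method adjusts automatically.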
In this paper, a new method is proposed for detecting broken rotor bars in a cage induction motor. In this method, characteristic frequency components are extracted by frequency analysis and displayed on a two-dimensional feature distribution. Based on this feature distribution, the probability of a broken rotor bar, which serves as a diagnostic indicator, is derived, and the condition of the rotor is diagnosed from this probability. The effectiveness of the proposed diagnosis method is verified by experiments.
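An illustrative sketch of the frequency-analysis step: broken rotor bars modulate the stator current, producing sidebands at (1 ± 2s)·f around the supply frequency f (s being the slip), and a simple feature is the sideband-to-fundamental amplitude ratio. The signals, slip value, and sideband amplitudes below are synthetic; the paper's 2-D feature distribution and probability derivation are not reproduced.

```python
# Extract (1 +/- 2s)f sideband amplitudes from a simulated stator current.
import numpy as np

fs, f, s = 5000.0, 50.0, 0.03        # sampling rate, supply freq, slip
t = np.arange(0, 10, 1 / fs)         # 10 s record -> 0.1 Hz resolution

def current(broken):
    amp = 0.05 if broken else 0.001  # assumed sideband amplitude
    return (np.sin(2 * np.pi * f * t)
            + amp * np.sin(2 * np.pi * (1 - 2 * s) * f * t)
            + amp * np.sin(2 * np.pi * (1 + 2 * s) * f * t))

def sideband_ratio(x):
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    peak = lambda f0: spec[np.argmin(np.abs(freqs - f0))]
    return (peak((1 - 2 * s) * f) + peak((1 + 2 * s) * f)) / peak(f)

healthy = sideband_ratio(current(False))
faulty = sideband_ratio(current(True))
```

The long record length matters: with slip 0.03 the sidebands sit only 3 Hz from the 50 Hz fundamental, so sub-hertz frequency resolution is needed to resolve them.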
In this paper, methods of consistent estimation are studied for the identification of linear discrete-time systems in the presence of input and output noise, a setting usually called the "errors-in-variables" (EIV) model. It is well known that the least squares (LS) method gives biased parameter estimates in EIV situations. To solve this bias problem, instrumental variable (IV) methods and the least correlation (LC) method are often used. IV- and LC-based methods can be applied under more general noise conditions, but they suffer from poor accuracy of the estimated parameters because their coefficient matrices may become ill-conditioned. To obtain numerically stable estimates, the methods presented in this paper use biased extended LC (XLC) estimates, which are defined using extended vectors and pre-filters. Applying the bias compensation principle (BCP), we develop bias-compensated XLC (BCXLC) methods, and we also examine how to reduce the computational load. The results of simulated examples indicate that the proposed methods provide numerically stable and good estimates.
We study the problem of recovering the structure of an indoor scene from a single image. Nearly all existing methods, whether they use conventional or omnidirectional images, cover only part of the scene, so the recovered spatial layouts refer to incomplete structures; we summarize these cases as "open geometry". To obtain the entire structure, we impose an analogous description of the geometric constraints of a full-view image, called "close geometry". We then propose a new model for indoor scene understanding from a single full-view image, together with a novel method that estimates room structure by searching for the structure that best fits the extracted line segments. Experiments demonstrate that our approach based on close geometry performs well at estimating indoor structure.
In recent years, developments in electronic circuits have made it possible to detect odors, and there has been considerable interest in odor sensors in various fields. In our research, we use Quartz Crystal Microbalance (QCM) sensors as odor sensors because they are inexpensive and have properties similar to those of the human nose. Although an odor is detected by combining many QCM sensors with different detection characteristics, efficient algorithms for selecting these sensors have been discussed insufficiently in this field because the detection task with them is time-consuming. To solve this problem, we propose a sensor selection algorithm based on Ant Colony Optimization and a Support Vector Machine. Its characteristic point is that it rapidly obtains an efficient set of sensors for detecting an odor. Experimental results with 22 QCM sensors confirmed that the proposed method allows us to obtain an efficient set of sensors for odor detection.
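A greatly simplified sketch of ACO-based sensor selection with an SVM fitness, on synthetic "sensor response" data: pheromone levels bias which sensors each ant samples, and cross-validated SVM accuracy reinforces the winning subset. The colony size, evaporation rate, and data are illustrative choices, not the paper's settings.

```python
# Toy ant-colony loop: pheromone-biased sensor subsets scored by an SVM.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(5)
n_samples, n_sensors, k = 120, 10, 3
y = np.array([0] * 60 + [1] * 60)            # two odor classes
X = rng.normal(size=(n_samples, n_sensors))
X[:, 0] += 2.5 * y                           # only sensors 0 and 1
X[:, 1] -= 2.5 * y                           # respond to the target odor

def fitness(subset):
    return cross_val_score(SVC(), X[:, subset], y, cv=3).mean()

pheromone = np.ones(n_sensors)
for _ in range(20):                          # ACO iterations
    iter_best, iter_fit = None, -1.0
    for _ant in range(8):                    # ants per iteration
        p = pheromone / pheromone.sum()
        subset = rng.choice(n_sensors, size=k, replace=False, p=p)
        f = fitness(subset)
        if f > iter_fit:
            iter_best, iter_fit = subset, f
    pheromone *= 0.8                         # evaporation
    pheromone[iter_best] += iter_fit         # reinforce the iteration best

selected = set(np.argsort(pheromone)[-2:].tolist())
```

Pheromone reinforcement concentrates sampling on informative sensors without evaluating every subset, which is the source of the speed-up the abstract claims.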
In practice, several types of uncertainty should be considered in production scheduling, and robust scheduling is a method that takes such uncertainty into account. In this paper, an enhanced robust scheduling technique for manufacturing systems is proposed to handle uncertain processing times. Its effectiveness is evaluated through a case study of the flexible job-shop scheduling problem (FJSP) with uncertain job processing times. The proposed robust scheduling method for the FJSP hybridizes a Genetic Algorithm (GA) and Binary Particle Swarm Optimization (BPSO), and is named HGABPSO. It uses scenarios of routings and sequences to find schedules that are reliable and less sensitive to processing-time uncertainty. A bi-objective evaluation measure of schedule robustness is defined as minimizing both the expected makespan under the possible scenarios and its variability. Computational results indicate that the proposed method produces better solutions than a conventional method with respect to this robustness measure, for different problem sizes and different levels of processing-time uncertainty.
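The robustness measure alone can be sketched as follows: given the makespans a candidate schedule achieves across sampled processing-time scenarios, score it by expected makespan plus its variability. Collapsing the two objectives into one weighted sum (and the weight `lam`) is our assumption; the paper treats them as a bi-objective measure.

```python
# Scenario-based robustness score: mean makespan + variability penalty.
import numpy as np

def robustness_score(scenario_makespans, lam=1.0):
    m = np.asarray(scenario_makespans, dtype=float)
    return m.mean() + lam * m.std()    # lower is better

stable = robustness_score([100, 101, 99, 100])    # consistent schedule
volatile = robustness_score([90, 130, 85, 120])   # sometimes fast, risky
```

The volatile schedule is occasionally faster, but its spread dominates the score, so the measure prefers the schedule that is "confident and less sensitive" to processing-time changes.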
In this letter, a new technique that adaptively adjusts a parameter of Cuckoo Search (CS) is constructed after a qualitative analysis of the basic CS parameters. The performance of the proposed CS with the adaptive parameter-adjustment function is verified through numerical simulations on three types of typical benchmark problems.
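A minimal Cuckoo Search sketch with one adaptive twist: the discovery probability pa is raised when the best value stagnates (to encourage exploration) and lowered when it improves. The adaptation rule, step scale, and benchmark (the sphere function) are illustrative assumptions, not the letter's exact scheme.

```python
# Cuckoo Search with Levy flights and a simple adaptive pa.
import numpy as np
from math import gamma, sin, pi

rng = np.random.default_rng(6)

def sphere(x):
    return float(np.sum(x * x))

def levy_step(dim, beta=1.5):
    # Mantegna's algorithm for Levy-distributed step lengths.
    sigma = (gamma(1 + beta) * sin(pi * beta / 2)
             / (gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, dim)
    v = rng.normal(0.0, 1.0, dim)
    return u / np.abs(v) ** (1 / beta)

dim, n_nests = 5, 15
nests = rng.uniform(-5, 5, (n_nests, dim))
fit = np.array([sphere(x) for x in nests])
init_best = fit.min()

pa, prev_best = 0.25, np.inf
for _ in range(200):
    best = nests[np.argmin(fit)].copy()
    for i in range(n_nests):                    # Levy-flight moves
        cand = nests[i] + 0.01 * levy_step(dim) * (nests[i] - best)
        f = sphere(cand)
        if f < fit[i]:
            nests[i], fit[i] = cand, f
    # Abandon a fraction pa of the worst nests (discovery by the host).
    worst = np.argsort(fit)[-max(1, int(pa * n_nests)):]
    nests[worst] = rng.uniform(-5, 5, (len(worst), dim))
    fit[worst] = [sphere(x) for x in nests[worst]]
    # Adaptive rule (our assumption): raise pa on stagnation, else lower it.
    pa = min(0.5, pa * 1.05) if fit.min() >= prev_best else max(0.1, pa * 0.95)
    prev_best = fit.min()

best_value = float(fit.min())
```

Coupling pa to observed progress is the spirit of the letter's adaptive adjustment: a fixed pa must trade exploration against exploitation once, whereas the adaptive version shifts that balance online.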