Personalization of activity recognition has become a topic of interest as a means of improving recognition performance for diverse users. Recent research shows that deep neural networks improve generalization performance in activity recognition using inertial sensors such as accelerometers and gyroscopes; however, personalizing deep neural networks is challenging because they have thousands or millions of parameters, whereas personalization must generally be done with only a small amount of labeled data.
This paper proposes a novel way to personalize deep neural networks that prevents overfitting by using unlabeled data. This is done by adding an output-distribution similarity regularization term between the reference model and the personalized models, an extension of the distillation technique recently proposed by Hinton. Experiments on the Opportunity activity recognition dataset, one of the best-known datasets in the field, demonstrate that the proposed regularization prevents overfitting even when only a few labeled samples are available per target class per user, and yields better recognition performance than other personalization techniques. We also conduct further experiments, including a setting with no labeled data and combinations of the proposed method with widely used personalization techniques, to examine whether the proposed method complements or competes with existing methods. The results suggest that the proposed regularization works well in various settings and is complementary to existing methods.
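As a rough illustration of the output-distribution similarity regularization described above, the following sketch adds a Hinton-style distillation term, the KL divergence between the reference model's and the personalized model's temperature-scaled output distributions on unlabeled data, to the supervised loss. The function names, the temperature `T`, and the weight `lam` are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(z, T=1.0):
    # temperature-scaled softmax, as in Hinton-style distillation
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def distillation_regularizer(ref_logits, pers_logits, T=2.0):
    # KL(ref || pers) between output distributions; computed on unlabeled
    # data, it pulls the personalized model toward the reference model
    # and thereby limits overfitting to the few labeled samples
    p = softmax(ref_logits, T)
    q = softmax(pers_logits, T)
    return float(np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1)))

def personalized_loss(ce_labeled, ref_logits_unlab, pers_logits_unlab,
                      lam=0.5, T=2.0):
    # total loss = supervised cross-entropy on the few labeled samples
    # + lam * output-distribution similarity on unlabeled samples
    return ce_labeled + lam * distillation_regularizer(
        ref_logits_unlab, pers_logits_unlab, T)
```

The regularizer is zero when the two models agree exactly and grows as the personalized model drifts away from the reference, which is what prevents overfitting when labeled data are scarce.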
Crowdsourcing platforms such as Amazon Mechanical Turk provide an attractive solution for processing numerous tasks at a low cost. However, insufficient quality control remains a major concern. We therefore developed a private crowdsourcing system that allows us to devise quality control methods. In the present study, we propose a grade-based training method for workers that avoids the simple exclusion of low-quality workers and the resulting shrinkage of the crowdsourcing market. Our training method uses probabilistic networks to estimate correlations between tasks from workers' records for 18.5 million tasks, and then allocates pre-learning tasks to workers so as to raise their accuracy on target tasks according to the task correlations. In an experiment, the method automatically allocated 31 pre-learning task categories for 9 target task categories, and after training on the pre-learning tasks, the accuracy of the target tasks rose by 7.8 points on average. This result was higher than those obtained with pre-learning tasks allocated by other methods, such as decision trees. We thus confirmed that task correlations can be estimated from a large amount of worker records and are useful for the grade-based training of low-quality workers.
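The allocation step can be sketched in miniature as follows. The real method estimates task correlations with probabilistic networks over 18.5 million records; here a plain Pearson correlation over a small per-worker accuracy matrix stands in for that model. The function name and data layout are hypothetical.

```python
import numpy as np

def allocate_prelearning(acc, categories, target, k=2):
    # acc: workers x categories matrix of per-category accuracy records.
    # Estimate category-to-category correlations from the records (a
    # stand-in for the paper's probabilistic networks), then return the
    # k categories most correlated with the target category, to be used
    # as pre-learning tasks for the workers.
    acc = np.asarray(acc, dtype=float)
    t = categories.index(target)
    corr = np.corrcoef(acc, rowvar=False)   # category-by-category correlations
    order = np.argsort(-corr[t])            # most correlated first
    return [categories[i] for i in order if i != t][:k]
```

With synthetic records in which accuracy on category "B" tracks accuracy on "A" while "C" does not, the sketch allocates "B" as the pre-learning category for target "A".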
Deep Convolutional Neural Networks (CNNs) have achieved great success in many computer vision tasks. However, they remain difficult to use in practical tasks, especially small-scale tasks, because of the large quantity of labeled training data required to train them. In this paper, we present two approaches that ease the adaptation of CNNs to small-scale tasks: the Minimum Entropy Loss (MEL) approach and the Minimum Reconstruction Error (MRE) approach. The basic idea of both is to select informative filters from pre-trained CNN models and reuse them to initialize CNNs designed for small-scale tasks. Unlike the popular fine-tuning approach, which also reuses pre-trained CNNs by training them further without changing their architectures, MEL and MRE make it easy to use pre-trained models in novel model architectures. This provides high flexibility when dealing with small-scale tasks. We evaluated the two approaches on practical small-scale tasks and confirmed their high performance and high flexibility.
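One plausible reading of the entropy-based filter selection is sketched below: treat each pre-trained filter's mean activations over a sample set as a distribution, and keep the filters whose response entropy is lowest, i.e., those that respond selectively rather than uniformly. This is an illustrative interpretation, not the paper's actual MEL criterion; the function names and the `1e-12` smoothing constant are assumptions.

```python
import numpy as np

def filter_entropy(responses):
    # responses: samples x filters matrix of non-negative mean activations.
    # Normalize each filter's responses over the samples into a
    # distribution and compute its Shannon entropy; a low-entropy filter
    # concentrates its response on few samples and is treated as
    # informative (selective), a high-entropy filter responds uniformly.
    r = np.asarray(responses, dtype=float) + 1e-12
    p = r / r.sum(axis=0, keepdims=True)
    return -(p * np.log(p)).sum(axis=0)

def select_filters(responses, k):
    # indices of the k lowest-entropy filters, to be copied into a new
    # (possibly differently shaped) model as its initialization
    ent = filter_entropy(responses)
    return np.argsort(ent)[:k].tolist()
```

Because the selected filters are copied by index rather than retrained in place, they can initialize a network with a different architecture, which is the flexibility the abstract contrasts with fine-tuning.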
This paper focuses on a classification problem for volatile time series. Among the most popular approaches to time series classification are dynamic time warping and feature-based machine learning architectures. In many previous studies, these algorithms have performed satisfactorily on various datasets. However, most of these methods are not suitable for chaotic time series, because superficial changes in measured values are not essential to chaotic dynamics. In general, time series datasets include both chaotic and non-chaotic series; it is therefore necessary to extract more essential features of a time series. In this paper, we propose a new approach to volatile time series classification. Our approach generates a novel feature by extracting the structure of the attractor with topological data analysis, representing the transition rules of the time series. Because this feature captures an essential property of the system underlying the time series, our approach is effective for both chaotic and non-chaotic series. We applied a learning architecture inspired by convolutional neural networks to this feature and found that the proposed approach improves performance on a human activity recognition problem by 18.5% compared with conventional approaches.
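The pipeline of "reconstruct the attractor, then summarize it topologically" can be sketched as follows, under simplifying assumptions: the attractor is reconstructed by Takens delay embedding, and only 0-dimensional persistence is computed, using the fact that the death times of connected components equal the edge lengths of a minimum spanning tree over the point cloud. The paper's actual feature and its CNN-style architecture are not reproduced here.

```python
import numpy as np

def delay_embed(x, dim=3, tau=1):
    # reconstruct the attractor of a scalar time series by Takens delay
    # embedding: each point is (x[t], x[t+tau], ..., x[t+(dim-1)*tau])
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * tau
    return np.stack([x[i:i + n] for i in range(0, dim * tau, tau)], axis=1)

def h0_persistence(points):
    # 0-dimensional persistence of the point cloud: component death
    # times equal the minimum-spanning-tree edge lengths, computed here
    # with Prim's algorithm on the pairwise distance matrix
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    d = np.linalg.norm(pts[:, None] - pts[None, :], axis=2)
    in_tree, deaths = [0], []
    best = d[0].copy()
    for _ in range(n - 1):
        best[in_tree] = np.inf          # never re-add tree vertices
        j = int(np.argmin(best))        # cheapest vertex to connect
        deaths.append(float(best[j]))
        in_tree.append(j)
        best = np.minimum(best, d[j])
    return sorted(deaths)
```

The sorted death times form a fixed-interpretation summary of the attractor's shape that, unlike raw measured values, is insensitive to superficial amplitude changes.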
In this paper, a new local search approach that uses the search history in evolutionary multi-criterion optimization (EMO) is proposed. The approach is built on two opposing mechanisms, escaping from local optima and convergence search, and is intended to be incorporated into a usual EMO algorithm to strengthen its search ability. Its main feature is a highly efficient search achieved by switching between the two mechanisms according to the search state. If the search appears stagnated, the escape mechanism is applied to shift the search point elsewhere. If, on the other hand, no improvement of solutions is observed after repeating the escape mechanism for a fixed period, the convergence mechanism is applied to improve solution quality through an intensive local search. We call this approach "escaping from local optima and convergence mechanisms based on search history" (SPLASH). Experimental results on the WFG test suites showed the effectiveness of SPLASH and the workings of its two mechanisms.
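The interplay of the two mechanisms can be illustrated with a deliberately simplified single-objective hill climber (SPLASH itself is multi-objective and history-based; every name and threshold below is an assumption for illustration only): after `stall_limit` iterations without improvement the escape mechanism jumps to a distant point, and after `escape_limit` unproductive escapes the convergence mechanism shrinks the step size for an intensive search around the best solution found.

```python
import random

def splash_like_search(f, x0, iters=200, stall_limit=10, escape_limit=3, seed=0):
    # Illustrative single-objective sketch of the two mechanisms:
    # escape shifts the search point away when stagnation is detected;
    # convergence intensifies the search once escapes stop helping.
    rng = random.Random(seed)
    best_x, best_f = x0, f(x0)
    x, step, stall, escapes = x0, 1.0, 0, 0
    for _ in range(iters):
        cand = x + rng.uniform(-step, step)
        if f(cand) < f(x):               # greedy local move
            x, stall = cand, 0
        else:
            stall += 1
        if f(x) < best_f:
            best_x, best_f, escapes = x, f(x), 0
        if stall >= stall_limit:         # stagnation detected
            stall = 0
            if escapes < escape_limit:
                escapes += 1
                x = best_x + rng.uniform(-10, 10)  # escape: jump away
            else:
                step *= 0.5                        # convergence: intensify
                x = best_x
    return best_x, best_f
```

The key design point mirrored from the abstract is the ordering: escape is tried first, and convergence is invoked only after repeated escapes yield no improvement.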
The subtree kernel and the information tree kernel defined here allow us to measure the syntactic characteristics and similarity of sentences. The subtree kernel is the total number of subtrees common to two trees, and the information tree kernel is defined as the total Shannon information content of the common subtrees. The information tree kernel lets us capture structural characteristics peculiar to individual writers' styles. Analyses using these kernels reveal syntactic characteristics and similarities among the writing styles of 31 Japanese authors. In particular, the results for five great authors, Soseki Natume, Ryunosuke Akutagawa, Osamu Dazai, Nankiti Niimi, and Kenzi Miyazawa, show, for example, that (1) Natume more often writes sentences whose dependency structure contains the same subtree structure multiple times; (2) Akutagawa more often uses dependency structures for extra or detailed expressions that modify a noun phrase than the others do; (3) Dazai often uses dependency structures consisting of many shallow subtrees arranged in parallel, whereas the others seldom write sentences with such parallel subtree structures; and (4) Niimi uses simpler dependency structures than Miyazawa does, while Miyazawa writes short sentences in a greater variety of dependency structures.
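A minimal sketch of the subtree kernel, counting pairs of nodes whose complete subtrees are identical, can be written by serializing every subtree to a canonical string and multiplying multiset counts. The tuple-based tree encoding and function names are assumptions; the information tree kernel would additionally weight each matched string by its Shannon information content under a corpus distribution, which is omitted here.

```python
from collections import Counter

def subtrees(tree):
    # tree: (label, [children]); collect a canonical string for the
    # complete subtree rooted at every node of the tree
    out = []
    def canon(node):
        label, children = node
        s = "(" + label + "".join(canon(c) for c in children) + ")"
        out.append(s)
        return s
    canon(tree)
    return out

def subtree_kernel(t1, t2):
    # total number of pairs of nodes (one from each tree) whose
    # complete subtrees are identical
    c1, c2 = Counter(subtrees(t1)), Counter(subtrees(t2))
    return sum(c1[s] * c2[s] for s in c1)
```

For example, a tree "(S(N)(V))" compared with itself yields 3 (the two leaf subtrees and the whole tree), while compared with "(S(N))" it yields 1, since only the "(N)" subtree is shared.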