To improve learners' cutting skills, we developed a method for improving the skill involved in creating paper cuttings, based on the steering task from the field of human-computer interaction. We made patterns using the white and black boundaries that make up a picture. The index of difficulty (ID) is a numerical value based on the width and distance terms of the steering law. First, we evaluated novice and expert pattern-cutters, measuring their movement time (MT), error rate, and compliance with the steering law, and confirmed that MT and error rate are affected by pattern width and distance. Moreover, we quantified the skills of novices and experts using models based on ID and MT. We then observed changes in the cutting skills of novices who practiced with various widths and evaluated the impact of difficulty level on skill improvement. Patterns of moderate difficulty for novices led to a significant improvement in skills.
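The steering-law relationship underlying the ID and MT models can be sketched as follows. This is a generic illustration of the standard model (for a straight path, ID = D/W, and MT = a + b·ID fitted by least squares); the sample values are made up for illustration, not measured data from the study.

```python
def steering_id(distance, width):
    """Index of difficulty for a straight tunnel: ID = D / W."""
    return distance / width

def fit_mt_model(samples):
    """Least-squares fit of MT = a + b * ID over (id, mt) pairs."""
    n = len(samples)
    sx = sum(i for i, _ in samples)
    sy = sum(t for _, t in samples)
    sxx = sum(i * i for i, _ in samples)
    sxy = sum(i * t for i, t in samples)
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b

# Hypothetical widths (in mm) for a 100 mm path and observed times (in s).
samples = [(steering_id(100, w), mt)
           for w, mt in [(20, 1.2), (10, 2.1), (5, 4.0), (4, 5.1)]]
a, b = fit_mt_model(samples)  # narrower patterns -> higher ID -> longer MT
```

The fitted slope b quantifies how strongly MT grows with difficulty, which is the kind of per-person model the abstract uses to compare novices and experts.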
Most existing methods of effort estimation in software development are manual, labor-intensive, and subjective, resulting in overestimation, which causes bids to fail, and underestimation, which causes monetary loss. This paper investigates the effectiveness of sequence models for estimating development effort, in man-months, from software project data. Four architectures are compared in terms of man-month differences: (1) averaged word vectors with a Multi-Layer Perceptron (MLP), (2) averaged word vectors with Support Vector Regression (SVR), (3) a Gated Recurrent Unit (GRU) sequence model, and (4) a Long Short-Term Memory (LSTM) sequence model. The approach is evaluated on two datasets: ISEM (1,573 English software project descriptions), which is raw text, and ISBSG (9,100 software project records), a structured data table describing the characteristics of each project. The LSTM sequence model achieves the lowest mean absolute error on ISEM (0.705 man-months) and the second lowest on ISBSG (14.077 man-months), while the MLP model achieves the lowest mean absolute error on ISBSG (14.069 man-months).
The outcome of document clustering depends on the scheme used to assign a weight to each term in a document. While recent works have tried to use class-related distributions to enhance discrimination ability, it is worth exploring whether a deviation approach or an entropy approach is more effective. This paper presents a comparison between deviation-based and entropy-based distributions as constraints in term weighting. In addition, their potential combinations are investigated to find optimal solutions for guiding the clustering process. In the experiments, the seeded k-means method is used for clustering, and the performances of the deviation-based, entropy-based, and hybrid approaches are analyzed using two English text datasets and one Thai text dataset. The results show that the deviation-based distribution outperforms the entropy-based distribution, and that a suitable combination of these distributions increases clustering accuracy by 10%.
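The two kinds of constraints being compared can be sketched as follows. This is a generic illustration of the underlying idea, not the paper's exact formulas: an entropy-based score measures how evenly a term is spread across classes (lower entropy means more discriminative), while a deviation-based score measures how far the term's class distribution departs from uniform (higher deviation means more discriminative).

```python
import math

def class_entropy(counts):
    """Entropy of a term's occurrence distribution over classes."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log2(p) for p in probs)

def class_deviation(counts):
    """Standard deviation of class proportions from the uniform mean 1/k."""
    total = sum(counts)
    probs = [c / total for c in counts]
    mean = 1 / len(counts)
    return math.sqrt(sum((p - mean) ** 2 for p in probs) / len(counts))

# Hypothetical counts over three classes: a term concentrated in one class
# versus a term spread almost evenly across all classes.
concentrated = [18, 1, 1]
even = [7, 7, 6]
```

Both scores rank the concentrated term as more class-discriminative; the study's question is which ranking works better as a term-weighting constraint.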
This paper presents a compromising strategy based on constraint relaxation for automated negotiating agents in the nonlinear utility domain. Automated negotiating agents have been studied widely and are one of the key technologies for a future society in which multiple heterogeneous agents act collaboratively and competitively in order to help humans perform daily activities. A pressing issue is that most of the proposed negotiating agents utilize an ad hoc compromising process, in which they basically just adjust/reduce a threshold to forcibly accept their opponents' offers. Because the threshold is simply lowered and the agent accepts an offer whenever its value exceeds the threshold, it is very difficult to show how and what the agent conceded, even after an agreement has been reached. To address this issue, we describe an explainable concession process that uses constraint relaxation. In this process, an agent changes its belief by relaxing, i.e., removing, constraints so that it can accept the opponent's offer. We also propose three types of compromising strategies. Experimental results demonstrate that these strategies are efficient.
Depression is a major mental health problem in Thailand. Depression rates have been increasing rapidly, and over 1.17 million Thai people suffer from this mental illness. It is important that a reliable depression screening tool be made available so that depression can be detected early. Given that Facebook is the most popular social network platform in Thailand, it could serve as a large-scale resource for developing a depression detection tool. This research develops a depression detection algorithm for the Thai language on Facebook, where people use the platform to share opinions, feelings, and life events. To establish reliable results, the Thai Mental Health Questionnaire (TMHQ), a standardized psychological inventory that measures major mental health problems including depression, is used as the baseline; its depression scale comprises 20 items. Furthermore, this study also performs factor analysis to reduce the number of depression items. Data were collected from over 600 Facebook users, and descriptive statistics, exploratory factor analysis, and internal consistency analysis were conducted. The results provide an optimized version of the TMHQ-depression scale that contains 9 items, categorized into four factors: suicidal ideation, sleep problems, anhedonia, and guilty feelings. Internal consistency analysis shows that this short version of the TMHQ-depression has good to excellent reliability (Cronbach's alpha > .80). The findings suggest that the optimized TMHQ-depression questionnaire has good psychometric properties and can be used for depression detection.
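The internal consistency reported above is Cronbach's alpha; a minimal pure-Python version of the standard formula is sketched below. The toy scores are illustrative only, not the study's data.

```python
def cronbach_alpha(items):
    """items: one score list per questionnaire item, aligned across respondents.
    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    def var(xs):  # sample variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    return k / (k - 1) * (1 - sum(var(it) for it in items) / var(totals))

# Three hypothetical items answered by five respondents on a 1-5 scale;
# the items co-vary strongly, so alpha comes out high.
toy_items = [[1, 2, 3, 4, 5], [1, 2, 3, 4, 5], [2, 2, 3, 4, 4]]
alpha = cronbach_alpha(toy_items)
```

Values above roughly .80 are conventionally read as good reliability, which is the threshold the abstract cites for the 9-item short form.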
Objective interestingness measures play a vital role in association rule mining of large-scale databases because they are used for extracting, filtering, and ranking patterns. In the past, several measures have been proposed, but their similarities and relations have not been sufficiently explored. This work investigates sixty-one objective interestingness measures on patterns of the form A → B to analyze their similarity and dissimilarity as well as their relationships. Three probability patterns, P(A), P(B), and P(AB), are enumerated on both linear and exponential scales, and each measure's value under these conditions is calculated, forming synthetic data for investigation. The behavior of each measure is explored by pairwise comparison based on these three probability patterns. The relationships among the sixty-one interestingness measures are characterized with correlation analysis and association rule mining. In the experiments, the relationships are summarized using a heat map and mined association rules. As a result, an appropriate interestingness measure can be selected using the generated heat map and association rules.
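Measures in this family are functions of P(A), P(B), and P(AB) alone, which is what makes the enumeration over probability patterns possible. Three standard examples (a sketch of the kind of measure compared, not the paper's full list of sixty-one):

```python
# Each objective measure for a rule A -> B depends only on P(A), P(B), P(AB).
def confidence(pa, pb, pab):
    return pab / pa

def lift(pa, pb, pab):
    return pab / (pa * pb)

def leverage(pa, pb, pab):
    return pab - pa * pb

# When A and B are independent, P(AB) = P(A) * P(B), so lift is 1 and
# leverage is 0; all three measures can be evaluated on the same grid of
# (P(A), P(B), P(AB)) triples and then compared pairwise.
values = [m(0.4, 0.5, 0.2) for m in (confidence, lift, leverage)]
```

Enumerating such triples and correlating the resulting value vectors is exactly the kind of pairwise comparison the abstract describes.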
End-to-end delay, which captures how much time a traffic load generated by a Mobile Node (MN) takes to reach a Sink Node (SN), is a principal objective of most new trends in Wireless Sensor Networks (WSNs). It directly reflects the minimum time a packet sent by the MN can be expected to take before being received by the SN. Most importantly, knowing the average minimum transmission time limit is a crucial piece of information for determining the future output of the network and the kinds of technologies to implement. In this paper, we take network load and transmission delay into account in estimating the Average Minimum Time Limit (AMTL) needed for a healthy operating cognitive WSN. To estimate the AMTL based on network load, an end-to-end delay analysis mechanism is presented that considers the total delay (service, queue, ACK, and MAC). This work aims to determine the AMTL needed before implementing any cognitive-based WSN algorithm. Various time intervals and cognitive channel usages with different application payloads are used for the result analysis. Through extensive simulations, our mechanism is able to identify the average time intervals needed depending on the load and the MN broadcast interval in any cognitive WSN.
We propose an idea generation support system known as "GUNGEN-Heartbeat" that uses heartbeat variations to support the creation of high-quality ideas during brainstorming. The system shows "an indication of a checklist" or "an indication to promote deep breathing" when the variance of heart rates exceeds a threshold value. We also carried out comparison experiments to evaluate the usefulness of the system.
For embedded systems, verifying both real-time properties and logical validity is important. An embedded system is required not only to operate accurately but also to strictly satisfy real-time properties. Verifying real-time properties is a key problem in model checking. To verify the real-time properties of assembly programs, we develop a simulator and propose a model checking method for verifying assembly programs. We propose a timed Kripke structure, which extends the Kripke structure with execution times, and implement a simulator of the processor of the robot to be verified. For an input assembly program, the simulator generates a timed Kripke structure by dynamic program analysis. We also implement a model checker that, once the timed Kripke structure has been generated, verifies whether it satisfies RTCTL formulas. Finally, to evaluate the proposed method, we conduct experiments with an implementation of the verification system. To address a real-world problem, we experimented with real microcontroller software.
The triaxial accelerometer is one of the most important sensors for human activity recognition (HAR). It has been observed that the relations between the axes of a triaxial accelerometer play a significant role in improving the accuracy of activity recognition. However, existing research rarely focuses on these relations, concentrating instead on the fusion of multiple sensors. In this paper, we propose a data fusion-based convolutional neural network (CNN) approach to effectively use the relations between the axes. We design a single-channel data fusion method and a multichannel data fusion method in consideration of the diversified formats of sensor data. After obtaining the fused data, a CNN is used to extract features and perform classification. The experiments show that the proposed approach has an accuracy advantage over a plain CNN. Moreover, the single-channel model achieves an accuracy of 98.83% on the WISDM dataset, which is higher than that of state-of-the-art methods.
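One plausible reading of single-channel fusion, shown here as a hypothetical sketch rather than the paper's exact layout, is to stack the three axis streams (plus a derived magnitude row) into one 2-D array, so that a CNN's 2-D kernels mix information across axes instead of treating each axis independently:

```python
import math

def single_channel_fuse(x, y, z):
    """Stack x, y, z and the per-sample magnitude into a (4, T) 'image'.
    A 2-D convolution over this array spans multiple axes at once."""
    assert len(x) == len(y) == len(z)
    magnitude = [math.sqrt(a * a + b * b + c * c)
                 for a, b, c in zip(x, y, z)]
    return [list(x), list(y), list(z), magnitude]

# A tiny two-sample window of hypothetical accelerometer readings.
window = single_channel_fuse([0.0, 1.0], [0.0, 2.0], [0.0, 2.0])
```

The magnitude row is one simple way to expose inter-axis relations to the network explicitly; the paper's actual fusion formats may differ.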
Owing to its superior performance, deep learning has been widely applied to various applications, including image classification, bioinformatics, and cybersecurity. Nevertheless, research on deep learning in adversarial environments is still at a preliminary stage. The emerging adversarial learning methods, e.g., generative adversarial networks, have introduced two vital questions: to what degree is deep learning secure in the presence of adversarial examples, and how can the performance of deep learning models in adversarial environments be evaluated so as to offer security advice ensuring that an application system based on deep learning is resistant to adversarial examples? To answer these questions, we use image classification as an example application scenario and propose a framework, Evaluating Deep Learning for Image Classification (EDLIC), for comprehensive quantitative analysis. Moreover, we introduce a set of evaluation metrics to measure the performance of different attacking and defensive techniques. We then conduct extensive experiments on the performance of deep learning for image classification under different adversarial environments to validate the scalability of EDLIC. Finally, we give some advice on the selection of deep learning models for image classification based on these comparative results.
In order not to disrupt a team member concentrating on his/her own task, an interrupter needs to wait for a proper time. In this research, we examined the feasibility of predicting the prospective interruptible times of office workers who use PCs. An analysis of actual working data collected from 13 participants revealed the relationship between uninterruptible durations and four features, i.e., the type of application software, the rate of PC operation activity, the activity ratio between keystrokes and mouse clicks, and the switching frequency of application software. On the basis of these results, we developed a probabilistic work continuance model whose probability changes according to the four features. Leave-one-out cross-validation indicated positive correlations between the actual and predicted durations; the medians of the actual and predicted durations were 539 s and 519 s, respectively. The main contribution of this study is demonstrating the feasibility of predicting uninterruptible durations in an actual working scenario.
The two issues of art image creation and data hiding are integrated and solved by a single approach in this study. An automatic method is proposed for generating a new type of computer art, called a stained glass image, which imitates a stained-glass window picture. The method is based on the use of a tree structure for region growing to construct the art image. Also proposed is a data hiding method that utilizes a general feature of the tree structure, namely the number of tree nodes, to encode the data to be embedded. The method can be modified for use in three information protection applications, namely covert communication, watermarking, and image authentication. Besides the artistic stego-image content, which may distract a hacker's attention from the hidden data, data security is also enhanced by randomizing both the input data and the seed locations for region growing, yielding a stego-image that is robust against a hacker's attacks. Good experimental results proving the feasibility of the proposed methods are also included.
Since deep learning was introduced, a series of achievements has been published in the field of automatic machine translation (MT). However, Korean-Vietnamese MT systems face many challenges because of a lack of data, the multiple meanings of individual words, and grammatical diversity that depends on context. Therefore, the quality of Korean-Vietnamese MT systems is still sub-optimal. This paper discusses a method for applying Named Entity Recognition (NER) and Part-of-Speech (POS) tagging to Vietnamese sentences to improve the performance of Korean-Vietnamese MT systems. In terms of implementation, we used a tool to tag NER and POS in Vietnamese sentences. In addition, we had access to a Korean-Vietnamese parallel corpus with more than 450K paired sentences from our previous research. The experimental results indicate that tagging NER and POS in Vietnamese sentences can improve the quality of Korean-Vietnamese Neural MT (NMT) in terms of the Bi-Lingual Evaluation Understudy (BLEU) and Translation Error Rate (TER) scores. On average, our MT system improved by 1.21 BLEU points or 2.33 TER points after applying both NER and POS tagging to the Vietnamese corpus. Owing to the structural features of the languages, MT systems in the Korean-to-Vietnamese direction always give better BLEU and TER results than systems in the reverse direction.
This paper presents a Siamese architecture model with two identical Convolutional Neural Networks (CNNs) to identify code clones: two code fragments are represented as Abstract Syntax Trees (ASTs), CNN-based subnetworks extract feature vectors from the ASTs of pairwise code fragments, and the output layer produces a score indicating how similar or dissimilar they are. Experimental results demonstrate that CNN-based feature extraction is effective in detecting code clones at the source code or bytecode level.
Malicious attackers on the Internet use automated attack programs to disrupt the use of services via mass spamming, unnecessary bulletin board posting, and account creation. The Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA) is used as a security solution to prevent such automated attacks. CAPTCHA is a system that determines whether the user is a machine or a person by providing distorted letters, voices, and images that only humans can understand. However, new attack techniques such as optical character recognition (OCR) and deep neural networks (DNNs) have been used to bypass CAPTCHA. In this paper, we propose a method to generate CAPTCHA images by using the fast gradient sign method (FGSM), iterative FGSM (I-FGSM), and the DeepFool method. We used the CAPTCHA images provided by Python as the dataset and TensorFlow as the machine learning library. The experimental results show that the CAPTCHA images generated via the FGSM, I-FGSM, and DeepFool methods exhibit a 0% recognition rate with ε = 0.15 for FGSM, a 0% recognition rate with α = 0.1 and 50 iterations for I-FGSM, and a 45% recognition rate with 150 iterations for the DeepFool method.
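The FGSM perturbation itself is the standard one-step formula x_adv = clip(x + ε·sign(∇x J)); a minimal sketch applied to a flat list of pixel values follows. The gradient values are made up for illustration; in practice they come from backpropagating the recognizer's loss through the image.

```python
def fgsm_step(pixels, grads, eps):
    """One FGSM step: move each pixel by eps in the sign direction of its
    loss gradient, then clip back to the valid [0, 1] pixel range."""
    sign = lambda g: (g > 0) - (g < 0)
    return [min(1.0, max(0.0, p + eps * sign(g)))
            for p, g in zip(pixels, grads)]

# Three hypothetical pixels with hypothetical loss gradients, eps = 0.15
# (the value the abstract reports for the 0% recognition rate).
adv = fgsm_step([0.5, 0.99, 0.0], [1.2, 0.3, -0.7], eps=0.15)
```

I-FGSM repeats this step with a smaller step size α, re-clipping each iteration, which matches the α = 0.1, 50-iteration setting reported above.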
We propose a multi-targeted backdoor that misleads different models into different classes. The method trains multiple models with data that include specific triggers that will be misclassified by different models into different classes. For example, an attacker can use a single multi-targeted backdoor sample to make model A recognize it as a stop sign, model B as a left-turn sign, model C as a right-turn sign, and model D as a U-turn sign. We used MNIST and Fashion-MNIST as experimental datasets and TensorFlow as the machine learning library. Experimental results show that the proposed method with a trigger can cause misclassification into different classes by different models with a 100% attack success rate on MNIST and Fashion-MNIST while maintaining 97.18% and 91.1% accuracy, respectively, on data without a trigger.
Internal user threats such as information leakage or system destruction can cause significant damage to an organization; however, it is very difficult to prevent or detect such attacks in advance. In this paper, we propose an anomaly-based insider threat detection method combining local features and global statistics, under the assumption that a user exhibits patterns different from regular behavior while performing harmful actions. We experimentally show that our detection mechanism achieves superior performance compared with state-of-the-art approaches on the CMU CERT dataset.
This letter reveals that an edge-triggered master-slave flip-flop (FF) using the well-known soft error tolerant DICE (dual interlocked storage cell) is vulnerable to soft errors occurring around the clock edge. This letter presents a design for a soft error tolerant FF based on the master-slave FF using DICE. The proposed design modifies the connection between the master and slave latches so that the FF is not vulnerable to these errors. The hardware overhead is almost the same as that of the original edge-triggered FF using DICE.
We propose an effective 2D-image-based end-to-end deep learning model for malware detection by introducing a black-and-white embedding to preserve bit information and adapting the convolution architecture. Experimental results show that our proposed scheme achieves superior performance on both training and testing datasets compared with well-known image recognition deep learning models (VGG and ResNet).
To prevent proxy test-taking among examinees in unsynchronized e-Testing, a previous work proposed online handwriting authentication. That method, however, could be applied only at the end of each answer. For free-response tests that need authentication throughout the answer, we used Bayesian prior information to build a sequential handwriting authentication procedure. The evaluation results indicate that this procedure is more accurate than the previous method for authenticating examinees during mathematics exams involving Chinese characters.
This article introduces our investigation of learning state estimation in e-learning under the condition that visual observation and recording of a learner's behaviors are possible. In this research, we examined methods of adaptation for a new learner for whom only a small number of ground truth data can be obtained.
In this paper, we propose a salient region detection method with multi-feature fusion and an edge constraint. First, an image feature extraction and fusion network based on a dense connection structure and multi-channel convolution is designed. Then, a multi-scale atrous convolution block is applied to enlarge the receptive field. Finally, to increase accuracy, a combined loss function including a classification loss and an edge loss is built for multi-task training. Experimental results verify the effectiveness of the proposed method.