Due to its on-demand and pay-as-you-go properties, cloud computing has become an attractive alternative for HPC applications. However, communication-intensive applications with complex communication patterns still cannot be performed efficiently on cloud platforms, which are equipped with MapReduce technologies, such as Hadoop and Spark. In particular, one major obstacle is that MapReduce's simple programming model cannot explicitly manipulate data transfers between compute nodes. Another obstacle is the cloud's relatively poor network performance compared with traditional HPC platforms. The traditional Strassen's algorithm for square matrix multiplication has a recursive and complex pattern on the HPC platform. Therefore, it cannot be directly applied to the cloud platform. In this paper, we demonstrate how to make Strassen's algorithm with complex communication patterns “cloud-friendly”. By reorganizing Strassen's algorithm in an iterative pattern, we completely separate its computations and communications, making it fit the MapReduce programming model. By adopting a novel data/task parallel strategy, we solve Strassen's data dependency problems, making it well balanced. This is the first instance of Strassen's algorithm in MapReduce-style systems, which also matches Strassen's communication lower bound. Further experimental results show that it achieves a speedup ranging from 1.03× to 2.50× over the classical Θ(n^3) algorithm. We believe the principle can be applied to many other complex scientific applications.
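For reference, the textbook recursive form of Strassen's algorithm, which the abstract takes as its starting point, can be sketched as follows. This is not the iterative MapReduce reorganization the paper proposes; the function names, the `cutoff` parameter, and the assumption that the matrix size is a power of two are illustrative choices.

```python
def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def naive_mul(A, B):
    # Classical Theta(n^3) multiplication, used as the base case.
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def split(M):
    # Split a square matrix into its four quadrants.
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])

def strassen(A, B, cutoff=1):
    # Assumption: A and B are n x n with n a power of two.
    n = len(A)
    if n <= cutoff:
        return naive_mul(A, B)
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    # Seven recursive products instead of eight.
    M1 = strassen(mat_add(A11, A22), mat_add(B11, B22), cutoff)
    M2 = strassen(mat_add(A21, A22), B11, cutoff)
    M3 = strassen(A11, mat_sub(B12, B22), cutoff)
    M4 = strassen(A22, mat_sub(B21, B11), cutoff)
    M5 = strassen(mat_add(A11, A12), B22, cutoff)
    M6 = strassen(mat_sub(A21, A11), mat_add(B11, B12), cutoff)
    M7 = strassen(mat_sub(A12, A22), mat_add(B21, B22), cutoff)
    C11 = mat_add(mat_sub(mat_add(M1, M4), M5), M7)
    C12 = mat_add(M3, M5)
    C21 = mat_add(M2, M4)
    C22 = mat_add(mat_add(mat_sub(M1, M2), M3), M6)
    # Stitch the quadrants back together.
    top = [l + r for l, r in zip(C11, C12)]
    bottom = [l + r for l, r in zip(C21, C22)]
    return top + bottom
```

Each product M1 to M7 depends on sums of quadrants, and each output quadrant depends on several products; this interleaving of additions and recursive calls is the kind of dependency the abstract describes as hard to express directly in MapReduce.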
Gentle AdaBoost is widely used in object detection and pattern recognition due to its efficiency and stability. To focus on instances with small margins, Gentle AdaBoost assigns larger weights to these instances during the training. However, misclassification of small-margin instances can still occur, which will cause the weights of these instances to become larger and larger. Eventually, several large-weight instances might dominate the whole data distribution, encouraging Gentle AdaBoost to choose weak hypotheses that fit only these instances in the late training phase. This phenomenon, known as “classifier distortion”, degrades the generalization error and can easily lead to overfitting since the deviation of all selected weak hypotheses is increased by the late-selected ones. To solve this problem, we propose a new variant which we call “Penalized AdaBoost”. In each iteration, our approach not only penalizes the misclassification of instances with small margins but also restrains the weight increase for instances with minimal margins. Our method performs better than Gentle AdaBoost because it avoids the “classifier distortion” effectively. Experiments show that our method achieves far lower generalization errors and a similar training speed compared with Gentle AdaBoost.
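The weight growth described above can be illustrated with the standard Gentle AdaBoost reweighting step, w_i ← w_i · exp(−y_i · f(x_i)) followed by normalization. The sketch below shows only that baseline step on made-up weak-hypothesis outputs; it does not implement the penalty term of Penalized AdaBoost, which the abstract does not specify, and the function name and toy numbers are assumptions.

```python
import math

def gentle_adaboost_reweight(weights, labels, outputs):
    # labels are +1 or -1; outputs are real-valued weak-hypothesis responses f(x_i).
    # A misclassified instance has y * f < 0, so its weight is multiplied by a factor > 1.
    updated = [w * math.exp(-y * f) for w, y, f in zip(weights, labels, outputs)]
    total = sum(updated)
    return [w / total for w in updated]

# Toy run: four positive instances, the last one misclassified in every round.
weights = [0.25, 0.25, 0.25, 0.25]
labels = [1, 1, 1, 1]
outputs = [0.5, 0.5, 0.5, -0.5]  # hypothetical responses, repeated each round
for _ in range(10):
    weights = gentle_adaboost_reweight(weights, labels, outputs)
```

After ten rounds the repeatedly misclassified instance holds almost all of the probability mass, which is the situation in which late weak hypotheses end up fitting only a few instances.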
Some unilateral lower-limb amputees, through continued exertion, increase the foot reaction force of the sound leg. The asymmetric gait with a prosthetic leg may thus negatively affect the musculoskeletal health of the leg on the healthy side. Therefore, it is important for these amputees to learn how to adjust the balance of each foot load in training. The aim of this study is to develop a training support system visualizing floor-reaction forces using a color-depth sensor. The pose of the entire body of the amputee is measured by the depth sensor, and the floor reaction force is estimated based on Zero Moment Point (ZMP), which is calculated using the center of mass of the amputee. Evaluation experiments of the proposed method were performed and they confirmed the effectiveness of the estimation method and the training with the visualization of reaction force.
Appropriate turn-taking is as important in spoken dialogue systems as generating correct responses. Especially if the dialogue features quick responses, a user utterance is often incorrectly segmented by voice activity detection (VAD) due to short pauses within it. Incorrectly segmented utterances cause problems both in the automatic speech recognition (ASR) results and in turn-taking: i.e., an incorrect VAD result leads to ASR errors and causes the system to start responding even though the user is still speaking. We develop a method that performs a posteriori restoration for incorrectly segmented utterances and implement it as a plug-in for the MMDAgent open-source software. A crucial part of the method is to classify whether the restoration is required or not. We cast it as a binary classification problem of detecting originally single utterances from pairs of utterance fragments. Various features are used representing timing, prosody, and ASR result information. Experiments show that the proposed method outperformed a baseline with manually-selected features by 4.8% and 3.9% in cross-domain evaluations with two domains. More detailed analysis revealed that the dominant and domain-independent features were utterance intervals and results from the Gaussian mixture model (GMM).
Most error correction interfaces for speech recognition applications on smartphones require the user to first mark an error region and choose the correct word from a candidate list. We propose a simple multimodal interface to make the process more efficient. We develop Long Context Match (LCM) to get candidates that complement the conventional word confusion network (WCN). Assuming that not only the preceding words but also the succeeding words of the error region are validated by users, we use such contexts to search higher-order n-grams corpora for matching word sequences. For this purpose, we also utilize the Web text data. Furthermore, we propose a combination of LCM and WCN (“LCM + WCN”) to provide users with candidate lists that are more relevant than those yielded by WCN alone. We compare our interface with the WCN-based interface on the Corpus of Spontaneous Japanese (CSJ). Our proposed “LCM + WCN” method improved the 1-best accuracy by 23%, improved the Mean Reciprocal Rank (MRR) by 28%, and our interface reduced the user's load by 12%.
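The two ranking metrics reported above, 1-best accuracy and Mean Reciprocal Rank, can be computed as in the following sketch. The function names and the toy candidate lists are illustrative assumptions, not data from the paper; a correct word that is missing from its candidate list contributes a reciprocal rank of zero.

```python
def one_best_accuracy(candidate_lists, correct_words):
    # Fraction of error regions whose top-ranked candidate is the correct word.
    hits = sum(1 for cands, word in zip(candidate_lists, correct_words)
               if cands and cands[0] == word)
    return hits / len(correct_words)

def mean_reciprocal_rank(candidate_lists, correct_words):
    # Average of 1/rank of the correct word over all error regions.
    total = 0.0
    for cands, word in zip(candidate_lists, correct_words):
        if word in cands:
            total += 1.0 / (cands.index(word) + 1)
    return total / len(correct_words)

# Hypothetical candidate lists for three error regions.
lists = [["a", "b"], ["x", "y", "z"], ["p"]]
truth = ["a", "z", "q"]
accuracy = one_best_accuracy(lists, truth)
mrr = mean_reciprocal_rank(lists, truth)
```

In this toy case the correct word is ranked first once, third once, and missing once, so the accuracy is 1/3 and the MRR is (1 + 1/3 + 0) / 3.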
This study improves the compression efficiency of Lee's colorization-based coding framework by introducing a novel colorization matrix construction and an adaptive color conversion. Colorization-based coding methods reconstruct color components in the decoder by colorization, which adds color to a base component (a grayscale image) using scant color information. The colorization process can be expressed as a linear combination of a few column vectors of a colorization matrix. Thus it is important for colorization-based coding to make a colorization matrix whose column vectors effectively approximate color components. To make a colorization matrix, Lee's colorization-based coding framework first obtains a base and color components by RGB-YCbCr color conversion, and then performs a segmentation method on the base component. Finally, the entries of a colorization matrix are created using the segmentation results. To improve compression efficiency on this framework, we construct a colorization matrix based on a correlation of base-color components. Furthermore, we embed an edge-preserving smoothing filtering process into the colorization matrix to reduce artifacts. To achieve more improvement, our method uses adaptive color conversion instead of RGB-YCbCr color conversion. Our proposed color conversion maximizes the sum of the local variance of a base component, which resulted in increment of the difference of intensities at region boundaries. Since segmentation methods partition images based on the difference, our adaptive color conversion leads to better segmentation results. Experiments showed that our method has higher compression efficiency compared with the conventional method.
Middle-level parts have attracted great attention in the computer vision community, acting as discriminative elements for objects. In this paper we propose an unsupervised approach to mine discriminative parts for object detection. This work features three aspects. First, we introduce an unsupervised, exemplar-based training process for part detection. We generate initial parts by selective search and then train part detectors by exemplar SVM. Second, a part selection model based on consistency and distinctiveness is constructed to select effective parts from the candidate pool. Third, we combine discriminative part mining with the deformable part model (DPM) for object detection. The proposed method is evaluated on the PASCAL VOC2007 and VOC2010 datasets. The experimental results demonstrate the effectiveness of our method for object detection.
In recent years, many variants of key point based image descriptors have been designed for image matching, and they have achieved remarkable performance. However, for some images, local features appear to be inapplicable. Since these images usually have many local changes around key points compared with a normal image, we define this special image category as the image with local changes (IL). An IL pair (ILP) refers to an image pair which contains a normal image and its IL. An ILP usually loses local visual similarities between the two images while still holding global visual similarity. When an IL is given as a query image, the purpose of this work is to match the corresponding ILP in a large scale image set. As a solution, we use a compressed HOG feature descriptor to extract global visual similarity. For the nearest neighbor search problem, we propose random projection indexed KD-tree forests (rKDFs) to match ILP efficiently instead of exhaustive linear search. rKDFs are built with large scale low-dimensional KD-trees. Each KD-tree is built in a random projection indexed subspace and contributes to the final result equally through a voting mechanism. We evaluated our method on a benchmark which contains 35,000 candidate images and 5,000 query images. The results show that our method is efficient for solving local-changes invariant image matching problems.
It is commonly believed that, to improve interaction between humans and electronic devices, it is effective to draw the viewer's attention to a particular object. Augmented reality (AR) applications can call attention to real objects by overlaying highlight effects or visual stimuli (such as arrows) on a physical scene. Sometimes, more subtle effects would be desirable, in which case it would be necessary to smoothly and naturally guide the user's gaze without external stimuli. Here, a novel image modification method is proposed for directing a viewer's gaze to specific regions of interest. The proposed method uses saliency analysis and color modulation to create modified images in which the region of interest is the most salient region in the entire image. The proposed saliency map model that is used during saliency analysis reduces computational costs and improves the naturalness of the image using the LAB color space and simplified normalization. During color modulation, the modulation value of each LAB component is determined in order to consider the relationship between the LAB components and the saliency value. With the image obtained in this manner, the viewer's attention is smoothly and naturally attracted to a specific region. Gaze measurements as well as a subjective experiment were conducted to prove the effectiveness of the proposed method. These results show that a viewer's visual attention is indeed attracted toward the specified region without any sense of discomfort or disruption when the proposed method is used.
We have proposed a new Bayesian network model (BNM) framework for single-trial-EEG-based Brain-Computer Interface (BCI). The BNM was constructed as follows. In order to discriminate between left and right hands to be imagined from single-trial EEGs measured during the movement imagery tasks, the BNM has the following three steps: (1) independent component analysis (ICA) for each of the single-trial EEGs; (2) equivalent current dipole source localization (ECDL) for projections of each IC on the scalp surface; (3) BNM construction using the ECDL results. The BNMs were composed of nodes and edges which correspond to the brain sites where ECDs are located, and their connections, respectively. The connections were quantified as node activities by conditional probabilities calculated by probabilistic inference in each trial. The BNM-based BCI is compared with the common spatial pattern (CSP) method. For ten healthy subjects, there was no significant difference between the two methods. Our BNM might reflect each subject's strategy for task execution.
In a previous study, we proposed a technique to recommend candidate verbs for a method name so that developers can consistently use various verbs. In this study, we improve the rule extraction technique proposed in this previous study. Moreover, we confirm that the rank of each correct verb recommended by the new technique is higher than that by the previous technique.
Index compression is partially responsible for the current performance achievements of Internet search engines. Among many latest compression techniques, Simple9 can pack as many integers as possible into a single 32-bit machine word using 9 different padding modes. However, the number of wasted bits in Simple9 remains large. In previous works, researchers have focused on reducing the unused trailing bits of the padding modes and have proposed various additional modes that make full use of the cases of the status bits. Instead, we focus on the wasted bits in the integer list, padding extra zeros for a complete dense mode when the number of integers is not enough to fit a complete mode. More precisely, we first propose a novel index compression method called SimpleD with dense padding modes to achieve a more compact storage compared with that of Simple9. We then design an innovative metric for extracting the inserted extra zero integers during the decoding phase. Experiments on the TREC WT2G and GOV2 datasets show that our encoder outperforms Simple9 while still retaining a very fast decompression speed.
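The baseline Simple9 scheme mentioned above can be sketched as follows: each 32-bit word holds a 4-bit selector and 28 data bits, and the nine modes pack 28, 14, 9, 7, 5, 4, 3, 2, or 1 integers of 1, 2, 3, 4, 5, 7, 9, 14, or 28 bits each. This is a minimal sketch of plain Simple9, not of SimpleD; the bit layout (selector in the top four bits) and the function names are assumptions, and real implementations differ in such details.

```python
# (count, bits) for the nine Simple9 modes, densest first.
MODES = [(28, 1), (14, 2), (9, 3), (7, 4), (5, 5), (4, 7), (3, 9), (2, 14), (1, 28)]

def simple9_encode(values):
    words = []
    pos = 0
    while pos < len(values):
        for selector, (count, bits) in enumerate(MODES):
            chunk = values[pos:pos + count]
            # A mode is usable only if enough integers remain and all of them fit.
            if len(chunk) == count and all(0 <= v < (1 << bits) for v in chunk):
                word = selector << 28
                for i, v in enumerate(chunk):
                    word |= v << (i * bits)
                words.append(word)
                pos += count
                break
        else:
            raise ValueError("value is negative or does not fit in 28 bits")
    return words

def simple9_decode(words):
    out = []
    for word in words:
        count, bits = MODES[word >> 28]
        mask = (1 << bits) - 1
        for i in range(count):
            out.append((word >> (i * bits)) & mask)
    return out
```

Encoding the short list `[5, 6, 7]` shows the waste the abstract targets: each value fits in 3 bits, but only three integers remain, so the encoder falls back to the 3 × 9-bit mode instead of the 9 × 3-bit mode.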
To accomplish secure communication in vehicular networks, public key infrastructure (PKI) can be employed. However, traditional PKI systems are not suitable because a unique certificate is assigned to each vehicle and thus no anonymity is guaranteed. In the combinatorial certificate schemes, each vehicle is assigned multiple certificates from a shared certificate pool and each certificate in the pool is assigned to multiple vehicles to achieve a level of anonymity. When a certificate assigned to a misbehaving vehicle is revoked, a certificate replacement procedure is executed to all vehicles sharing the certificate. To replace the revoked certificate, a randomized certificate replacement scheme probabilistically assigns different certificates to different vehicles, which can reduce collateral damage caused by repeatedly misusing a certificate and its replacement certificates. Unfortunately, previous randomized certificate replacement schemes allow unbounded collateral damage; a finite number of certificate replacements cannot detect the misbehaving vehicle with certainty. To address this problem, we propose a new randomized certificate replacement scheme with bounded collateral damage.
Considering diversified HTTP types, the performance bottleneck of signature-based classification must be resolved. We define a signature model classifying the traffic in multiple dimensions and suggest a hierarchical signature structure to remove signature redundancy and minimize search space. Our experiments on campus traffic demonstrated 1.8 times faster processing speed than the Aho-Corasick matching algorithm in Snort.
In this letter, we propose a novel kind of uncertain query, the top (k1,k2) query. The x-tuple model and the possible world semantics are used to describe data objects in uncertain datasets. The top (k1,k2) query finds the k2 x-tuples with the largest probabilities of belonging to the result of a top k1 query in a possible world. First, we design a basic algorithm for the top (k1,k2) query based on dynamic programming. Then, some pruning strategies are designed to improve its efficiency. An improved initialization method is proposed for further acceleration. Experiments on real and synthetic datasets demonstrate the performance of our methods.
Although sparse coding has emerged as an extremely powerful tool for texture and image classification, it neglects the relationship of coding coefficients from the same class in the training stage, which may cause a decline in the classification performance. In this paper, we propose a novel coding strategy named compact sparse coding for ground-based cloud classification. We add a constraint on coding coefficients into the objective function of traditional sparse coding. In this way, coding coefficients from the same class can be forced to their mean vector, making them more compact and discriminative. Experiments demonstrate that our method achieves better performance than the state-of-the-art methods.
Template tracking has been extensively studied in Computer Vision with a wide range of applications. A general framework is to construct a parametric model to predict movement and to track the target. The difference in intensity between the pixels belonging to the current region and the pixels of the selected target allows a straightforward prediction of the region position in the current image. Traditional methods track the object based on the assumption that the relationship between the intensity difference and the region position is linear or non-linear. They result in poor tracking performance when just one model is adopted. This paper proposes a method, called Mixture Hyperplanes Approximation, which is based on a finite mixture of generalized linear regression models to perform robust tracking. Moreover, a fast learning strategy is discussed, which improves the robustness against noise. Experiments demonstrate the performance and stability of Mixture Hyperplanes Approximation.
This paper presents a new connected component labeling algorithm. The proposed algorithm scans image lines every three lines and processes pixels three by three. When processing the current three pixels, we also utilize the information obtained before to reduce the repeated work for checking pixels in the mask. Experimental results demonstrated that our method is more efficient than the fastest conventional labeling algorithm.
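As background for the labeling task, the conventional two-pass approach with a forward scan mask and union-find can be sketched as below. It scans one pixel at a time and is not the three-line, three-pixel scan proposed in the abstract; 8-connectivity, the function name, and the list-of-lists image format are assumptions.

```python
def label_components(image):
    # image: list of rows containing 0 (background) or 1 (foreground).
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    parent = [0]  # union-find forest; index 0 is reserved for background

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # First pass: assign provisional labels and record equivalences.
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            roots = []
            # Mask of already-scanned 8-neighbours.
            for dr, dc in ((-1, -1), (-1, 0), (-1, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if rr >= 0 and 0 <= cc < cols and labels[rr][cc]:
                    roots.append(find(labels[rr][cc]))
            if not roots:
                parent.append(len(parent))
                labels[r][c] = len(parent) - 1
            else:
                root = min(roots)
                labels[r][c] = root
                for other in roots:
                    parent[other] = root

    # Second pass: replace provisional labels by consecutive final labels.
    remap = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = remap.setdefault(root, len(remap) + 1)
    return labels, len(remap)
```

The mask check in the first pass is the repeated work that faster labeling algorithms try to reduce, for example by reusing information from previously processed pixels as the abstract describes.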
Nonnegative matrix factorization (NMF) is an unsupervised technique to represent nonnegative data as linear combinations of nonnegative bases, which has shown impressive performance for source separation. However, its source separation performance degrades when one signal can also be described well with the bases for the interfering source signals. In this paper, we propose a discriminative NMF (DNMF) algorithm which exploits the reconstruction error for the interfering signals as well as the target signal based on target bases. The objective function for training the bases is constructed so as to yield high reconstruction error for the interfering source signals while guaranteeing low reconstruction error for the target source signals. Experiments show that the proposed method outperformed the standard NMF and another DNMF method in terms of both the perceptual evaluation of speech quality score and signal-to-distortion ratio in various noisy environments.
A Bayer-like White-RGB (W-RGB) color filter array (CFA) was invented to overcome the weaknesses of the commonly used RGB-based Bayer CFA. In order to reproduce full-color images from the Bayer-like W-RGB CFA, demosaicing, or CFA interpolation, which estimates the missing color channels of raw mosaiced images from the CFA, is an essential process for single-sensor digital cameras having a CFA. In the case of the Bayer CFA, numerous demosaicing methods with remarkable performance have already been proposed. In order to take advantage of both the remarkable performance of demosaicing methods for the Bayer CFA and the high sensitivity of the Bayer-like W-RGB CFA, a new method of transforming the Bayer-like W-RGB pattern to the Bayer pattern is required. Therefore, in this letter, we present such a method. The proposed method mainly uses the color difference assumption between different channels, which can be applied to practical consumer digital cameras.
Point spread function (PSF) estimation plays a paramount role in image deblurring processing, and traditionally it is solved by parameter estimation of a certain preassumed PSF shape model. In real life, the PSF shape is generally arbitrary and complicated, and thus it is assumed in this manuscript that a PSF may be decomposed as a weighted sum of a certain number of Gaussian kernels, with weight coefficients estimated in an alternating manner, and an l1 norm-based total variation (TVl1) algorithm is adopted to recover the latent image. Experiments show that the proposed method can achieve satisfactory performance on synthetic and realistic blurred images.
Anchor graph hashing (AGH) is a promising hashing method for nearest neighbor (NN) search. AGH realizes efficient search by generating and utilizing a small number of points that are called anchors. In this paper, we propose a method for improving AGH, which considers data distribution in a similarity space and selects suitable anchors by performing principal component analysis (PCA) in the similarity space.
Controlling fluid simulation is one of the important research topics in computer graphics. In this paper, we focus on controlling the simulation of cumuliform cloud formation. With a previously proposed method for controlling cloud simulation, the convergence speed is very slow; therefore, it takes a long time before the clouds form the desired shapes. We improved the method and accelerated the convergence by introducing a new mechanism for controlling the amount of water vapor added. We demonstrate the effectiveness of the proposed method with several examples.
Isosurface extraction is one of the most popular techniques for visualizing scalar volume data. However, volume data contains infinitely many isosurfaces. Furthermore, a single isosurface might contain many connected components, or contours, with each representing a different object surface. Hence, it is often a tedious and time-consuming manual process to find and extract contours that are interesting to users. This paper describes a novel method for automatically extracting salient contours from volume data. For this purpose, we propose a contour gradient tree (CGT) that contains the information of salient contours and their saliency magnitude. We organize the CGT in a hierarchical way to generate a sequence of contours in saliency order. Our method was applied to various medical datasets. Experimental results show that our method can automatically extract salient contours that represent regions of interest in the data.
This letter presents a method for active noise cancelation (ANC) for headphone application. The method improves the performance of ANC by deriving a flexible independent component analysis (ICA) algorithm in a hybrid structure combining feedforward and feedback configurations with correlation-based wind detection. The effectiveness of the method is demonstrated through simulation.