IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Volume E92.D , Issue 9
Regular Section
  • Kengo TERASAWA, Yuzuru TANAKA
    Type: PAPER
    Subject area: Algorithm Theory
    2009 Volume E92.D Issue 9 Pages 1609-1619
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    This paper describes a novel algorithm for approximate nearest neighbor searching. For solving this problem, especially in high-dimensional spaces, one of the best-known algorithms is Locality-Sensitive Hashing (LSH). This paper presents a variant of the LSH algorithm that outperforms previously proposed methods when the dataset consists of vectors normalized to unit length, which is often the case in pattern recognition. The LSH scheme is based on a family of hash functions that preserves the locality of points. This paper points out that for this special case we can design efficient hash functions that map a point on the hypersphere to the closest vertex of a randomly rotated regular polytope. The computational analysis confirmed that the proposed method could improve the exponent ρ, the main indicator of the performance of the LSH algorithm. The practical experiments also supported the efficiency of our algorithm both in time and in space.
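The hashing idea in the abstract above (map a unit vector to the closest vertex of a randomly rotated regular polytope) can be sketched as follows. This is only an illustration under stated assumptions: the polytope is taken to be the cross-polytope with vertices ±e_i, the rotation is built by Gram-Schmidt on Gaussian vectors, and the names `random_rotation` and `orthoplex_hash` are hypothetical, not taken from the paper.

```python
import math
import random

def random_rotation(dim, rng):
    """Random orthonormal basis (list of rows), built by Gram-Schmidt on Gaussian vectors."""
    basis = []
    while len(basis) < dim:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        for b in basis:
            dot = sum(x * y for x, y in zip(v, b))
            v = [x - dot * y for x, y in zip(v, b)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 1e-9:  # skip a (very unlikely) degenerate draw
            basis.append([x / norm for x in v])
    return basis

def orthoplex_hash(point, rotation):
    """Hash a unit vector to the nearest vertex (+/- e_i) of the rotated cross-polytope.

    Assumption: the cross-polytope is only one possible choice of regular polytope.
    The nearest vertex is the axis with the largest absolute rotated coordinate.
    """
    rotated = [sum(r * p for r, p in zip(row, point)) for row in rotation]
    axis = max(range(len(rotated)), key=lambda i: abs(rotated[i]))
    return (axis, 1 if rotated[axis] >= 0 else -1)

rng = random.Random(0)
rotation = random_rotation(8, rng)
query = [1.0] + [0.0] * 7
bucket = orthoplex_hash(query, rotation)  # (axis index, sign) identifies the bucket
```

Nearby unit vectors tend to share a bucket, and several independent rotations would normally be combined to build the hash tables.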
  • Shuichi MIYAZAKI, Naoyuki MORIMOTO, Yasuo OKABE
    Type: PAPER
    Subject area: Algorithm Theory
    2009 Volume E92.D Issue 9 Pages 1620-1627
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    The purpose of the online graph exploration problem is to visit all the nodes of a given graph and come back to the starting node with the minimum total traverse cost. However, unlike the classical Traveling Salesperson Problem, information of the graph is given online. When an online algorithm (called a searcher) visits a node ν, then it learns information on nodes and edges adjacent to ν. The searcher must decide which node to visit next depending on partial and incomplete information of the graph that it has gained in its searching process. The goodness of the algorithm is evaluated by the competitive analysis. If input graphs to be explored are restricted to trees, the depth-first search always returns an optimal tour. However, if graphs have cycles, the problem is non-trivial. In this paper we consider two simple cases. First, we treat the problem on simple cycles. Recently, Asahiro et al. proved that there is a 1.5-competitive online algorithm, while no online algorithm can be (1.25-ε)-competitive for any positive constant ε. In this paper, we give an optimal online algorithm for this problem; namely, we give a $\frac{1+\sqrt{3}}{2}(\simeq1.366)$-competitive algorithm, and prove that there is no $(\frac{1+\sqrt{3}}{2}-\epsilon)$-competitive algorithm for any positive constant ε. Furthermore, we consider the problem on unweighted graphs. We also give an optimal result; namely we give a 2-competitive algorithm and prove that there is no (2-ε)-competitive online algorithm for any positive constant ε.
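The remark above that depth-first search is optimal on trees can be illustrated with a small sketch. The searcher below reads a node's neighbor list only once it is standing on that node, mimicking online access. The graph encoding (a dict of dicts) and the name `dfs_tour_cost` are assumptions made for illustration.

```python
def dfs_tour_cost(graph, start):
    """Cost of a depth-first tour that visits every node and returns to `start`.

    `graph[node]` maps each neighbor to the cost of the connecting edge.
    On a tree every edge is walked exactly twice (down and back up), which any
    closed tour must also do, so the resulting cost is optimal.
    """
    visited = {start}

    def explore(node):
        cost = 0
        for neighbor, weight in graph[node].items():
            if neighbor not in visited:
                visited.add(neighbor)
                cost += weight + explore(neighbor) + weight  # go down, explore, come back
        return cost

    return explore(start)

tree = {
    "a": {"b": 2, "c": 3},
    "b": {"a": 2, "d": 1},
    "c": {"a": 3},
    "d": {"b": 1},
}
total = dfs_tour_cost(tree, "a")  # 12, twice the total edge weight of 6
```

On graphs with cycles this strategy is no longer optimal, which is where the competitive ratios quoted in the abstract come in.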
  • Amir Sabbagh MOLAHOSSEINI, Chitra DADKHAH, Keivan NAVI, Mohammad ESHGH ...
    Type: PAPER
    Subject area: Computer Systems
    2009 Volume E92.D Issue 9 Pages 1628-1638
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    In this paper, the new residue number system (RNS) moduli sets $\{2^{2n}, 2^n-1, 2^{n+1}-1\}$ and $\{2^{2n}, 2^n-1, 2^{n-1}-1\}$ are introduced. These moduli sets have a 4n-bit dynamic range and well-formed moduli, which can result in high-performance residue-to-binary converters as well as efficient RNS arithmetic units. Next, efficient residue-to-binary converters for the proposed moduli sets based on the mixed-radix conversion (MRC) algorithm are presented. The converters are ROM-free and are realized using carry-save adders and modulo adders. Comparison with other residue-to-binary converters for 4n-bit dynamic range moduli sets shows that the presented designs based on the new moduli sets $\{2^{2n}, 2^n-1, 2^{n+1}-1\}$ and $\{2^{2n}, 2^n-1, 2^{n-1}-1\}$ improve the conversion delay and result in hardware savings. Also, the proposed moduli sets can lead to efficient binary-to-residue converters, and they can speed up internal RNS arithmetic processing, compared with the other 4n-bit dynamic range moduli sets.
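As a software illustration of the mixed-radix conversion mentioned above, the sketch below converts residues back to an integer for the first moduli set with n = 4. It is a generic textbook MRC, not the hardware converter of the paper; the function names are hypothetical, and `pow(a, -1, m)` requires Python 3.8 or later.

```python
def to_residues(x, moduli):
    """Binary-to-residue: represent x by its remainders modulo each modulus."""
    return [x % m for m in moduli]

def mixed_radix_convert(residues, moduli):
    """Residue-to-binary via mixed-radix conversion (moduli must be pairwise coprime).

    Finds digits a_1, a_2, ... with x = a_1 + a_2*m_1 + a_3*m_1*m_2 + ...
    """
    digits = []
    for i, (r, m) in enumerate(zip(residues, moduli)):
        t = r
        for j in range(i):
            inv = pow(moduli[j], -1, m)  # modular inverse of m_j modulo m_i
            t = (t - digits[j]) * inv % m
        digits.append(t)
    value, weight = 0, 1
    for d, m in zip(digits, moduli):
        value += d * weight
        weight *= m
    return value

n = 4
moduli = [2 ** (2 * n), 2 ** n - 1, 2 ** (n + 1) - 1]  # [256, 15, 31]
x = 100000
recovered = mixed_radix_convert(to_residues(x, moduli), moduli)  # equals x
```

The power-of-two modulus makes its residue a plain bit slice of x, which is one reason such sets are attractive in hardware.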
  • Junkil RYU, Chanik PARK
    Type: PAPER
    Subject area: Computer Systems
    2009 Volume E92.D Issue 9 Pages 1639-1649
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    Silent data corruptions, which are induced by latent sector errors, phantom writes, DMA parity errors and so on, can be detected by explicitly issuing a read command to a disk controller and comparing the corresponding data with their checksums. Because some of the data stored in a storage system may not be accessed for a long time, there is a high chance of silent data corruption occurring undetected, resulting in data loss. Therefore, periodic checking of the entire data in a storage system, known as data scrubbing, is essential to detect such silent data corruptions in time. The errors detected by data scrubbing will be recovered by the replica or the redundant information maintained to protect against permanent data loss. The longer the period between data scrubbings, the higher the probability of a permanent data loss. This paper proposes a Markov failure and repair model to conservatively analyze the effect of data scrubbing on the reliability of a storage system. We show the relationship between the period of a data scrubbing operation and the number of data replicas to manage the reliability of a storage system by using the proposed model.
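The qualitative claim above (a longer scrubbing period raises the chance of permanent loss, and more replicas lower it) can be illustrated with a deliberately simple toy calculation. This is not the Markov failure and repair model of the paper: it assumes independent replicas, exponentially distributed latent-error arrivals, and perfect repair at every scrub, and the error rate used is a made-up value.

```python
import math

def loss_probability_per_period(error_rate, scrub_period, replicas):
    """Chance that every replica picks up a latent error within one scrub period.

    Toy assumptions: replicas fail independently, errors arrive at a constant
    `error_rate` (per hour), and each scrub restores all replicas.
    """
    p_corrupt = 1.0 - math.exp(-error_rate * scrub_period)
    return p_corrupt ** replicas

RATE = 1e-5  # hypothetical latent-error rate per hour, for illustration only
weekly = loss_probability_per_period(RATE, 7 * 24, replicas=2)
monthly = loss_probability_per_period(RATE, 30 * 24, replicas=2)
monthly_three = loss_probability_per_period(RATE, 30 * 24, replicas=3)
```

Under these assumptions `monthly` exceeds `weekly`, and adding a third replica brings the monthly figure back down, which is the trade-off the abstract describes.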
  • Iver STUBDAL, Arda KARADUMAN, Hideharu AMANO
    Type: PAPER
    Subject area: Fundamentals of Software and Theory of Programs
    2009 Volume E92.D Issue 9 Pages 1650-1656
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    Code density is often a critical issue in embedded computers, since the memory size of embedded systems is strictly limited. Echo instructions have been proposed as a method for reducing code size. This paper presents a new type of echo instruction, split echo, and evaluates an implementation of both split echo and traditional echo instructions on a MIPS R3000 based processor. Evaluation results show that memory requirement is reduced by 12% on average with small additional hardware cost.
  • Jianfeng XU, Haruhisa KATO, Akio YONEYAMA
    Type: PAPER
    Subject area: Contents Technology and Web Information Systems
    2009 Volume E92.D Issue 9 Pages 1657-1667
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    This paper presents a content-based retrieval algorithm for motion capture data, which is required to re-use a large-scale database that has many variations in the same category of motions. The most challenging problem is that logically similar motions may not be numerically similar due to the motion variations in a category. Our algorithm can effectively retrieve logically similar motions to a query, where a distance metric between our novel short-term features is defined properly as a fundamental component in our system. We extract the features based on short-term analysis of joint velocities after dividing an entire motion capture sequence into many small overlapped clips. In each clip, we select not only the magnitude but also the dynamic pattern of the joint velocities as our features, which can discard the motion variations while keeping the significant motion information in a category. Simultaneously, the amount of data is reduced, alleviating the computational cost. Using the extracted features, we define a novel distance metric between two motion clips. By dynamic time warping, a motion dissimilarity measure is calculated between two motion capture sequences. Then, given a query, we rank all the motions in our dataset according to their motion dissimilarity measures. Our experiments, which are performed on a test dataset consisting of more than 190 motions, demonstrate that our algorithm greatly improves the performance compared to two conventional methods according to a popular evaluation measure P(NR).
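The ranking step described above (dynamic time warping over per-clip features, then sorting by dissimilarity) might look like the sketch below. The clip features here are single numbers and the clip distance is an absolute difference, both placeholders for the short-term features and the distance metric proposed in the paper, which the abstract does not spell out.

```python
def dtw_distance(seq_a, seq_b, clip_distance=lambda x, y: abs(x - y)):
    """Dynamic time warping dissimilarity between two sequences of clip features."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = clip_distance(seq_a[i - 1], seq_b[j - 1])
            # extend the cheapest of: match, stretch seq_a, stretch seq_b
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    return cost[n][m]

def rank_motions(query, dataset):
    """Return motion names sorted from most to least similar to the query."""
    return sorted(dataset, key=lambda name: dtw_distance(query, dataset[name]))

dataset = {
    "walk_slow": [1, 2, 2, 3],  # hypothetical feature sequences
    "jump": [5, 9, 5],
}
ranking = rank_motions([1, 2, 3], dataset)  # ["walk_slow", "jump"]
```

Because warping absorbs differences in timing, `walk_slow` matches the query with zero cost even though it has one more clip.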
  • Shu-Ling SHIEH, I-En LIAO
    Type: PAPER
    Subject area: Data Mining
    2009 Volume E92.D Issue 9 Pages 1668-1674
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    The Self-Organizing Map (SOM) is a powerful tool for exploratory clustering. Clustering is the most important task in unsupervised learning, and clustering validity is a major issue in cluster analysis. In this paper, a new clustering validity index is proposed to generate the clustering result of a two-level SOM. This is performed by using the inter-cluster separation rate, the inter-cluster relative density, and the intra-cluster cohesion rate. The clustering validity index is proposed to find the optimal number of clusters and determine which two neighboring clusters can be merged in a hierarchical clustering of a two-level SOM. Experiments show that the proposed algorithm is able to cluster data more accurately than the classical clustering algorithms that are based on a two-level SOM and is better able to find an optimal number of clusters by maximizing the clustering validity index.
  • Yun GE, Guojun WANG, Qing ZHANG, Minyi GUO
    Type: PAPER
    Subject area: Networks
    2009 Volume E92.D Issue 9 Pages 1675-1682
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    We propose a Multiple Zones-based (M-Zone) routing protocol to discover node-disjoint multipath routes efficiently and effectively in large-scale MANETs. Compared with single-path routing, multipath routing can improve the robustness, load balancing and throughput of a network. However, it is very difficult to achieve node-disjoint multipath routing in large-scale MANETs. To ensure finding node-disjoint multiple paths, the M-Zone protocol divides the region between a source and a destination into multiple zones based on geographical location, and each path is mapped to a distinct zone. Performance analysis shows that M-Zone has good stability, and the control complexity and storage complexity of M-Zone are lower than those of the well-known AODVM protocol. Simulation studies show that the average end-to-end delay of M-Zone is lower than that of AODVM and the routing overhead of M-Zone is less than that of AODVM.
  • Xiaolei ZHOU, Xiangshi REN
    Type: PAPER
    Subject area: Human-computer Interaction
    2009 Volume E92.D Issue 9 Pages 1683-1691
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    Three experiments were conducted in this study to investigate the human ability to control pen pressure and pen tilt input, by coupling this control with cursor position, angle and scale. Pen pressure input and pen tilt input were compared in all three experiments. Experimental results show that decreasing pressure input resulted in very poor performance and was not a good input technique for any of the three experiments. In “Experiment 1-Coupling to Cursor Position”, the tilt input technique performed relatively better than the increasing pressure input technique in terms of time, even though the tilt technique had a slightly higher error rate. In “Experiment 2-Coupling to Cursor Angle”, the tilt input performed a little better than the increasing pressure input in terms of time, but the gap between them was not as apparent as in Experiment 1. In “Experiment 3-Coupling to Cursor Scale”, tilt input performed a little better than increasing pressure input in terms of adjustment time. Based on the results of our experiments, we have inferred several design implications and guidelines.
  • Wen-Bing HORNG, Chun-Wen CHEN
    Type: PAPER
    Subject area: Pattern Recognition
    2009 Volume E92.D Issue 9 Pages 1692-1701
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    In this paper, we present a revision of using eigenvalues of covariance matrices proposed by Tsai et al. as a measure of significance (i.e., curvature) for boundary-based corner detection. We first show the pitfall of Tsai et al.'s approach. We then further investigate the properties of eigenvalues of covariance matrices of three different types of curves and point out a mistake made by Tsai et al.'s method. Finally, we propose a modification of using eigenvalues as a measure of significance for corner detection to remedy their defect. The experiment results show that under the same conditions of the test patterns, in addition to correctly detecting all true corners, the spurious corners detected by Tsai et al.'s method disappear in our modified measure of significance.
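To make the underlying idea concrete, the sketch below computes the eigenvalues of the 2x2 covariance matrix of a short run of boundary points. It only illustrates why such eigenvalues carry curvature information (the smaller one vanishes on a straight segment and grows at a bend). The exact significance measure of Tsai et al. and the revision proposed in the paper are not given in the abstract and are not reproduced here.

```python
import math

def covariance_eigenvalues(points):
    """Return (larger, smaller) eigenvalues of the covariance matrix of 2-D points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    cyy = sum((y - mean_y) ** 2 for _, y in points) / n
    cxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # closed-form eigenvalues of the symmetric matrix [[cxx, cxy], [cxy, cyy]]
    half_trace = (cxx + cyy) / 2.0
    spread = math.sqrt(((cxx - cyy) / 2.0) ** 2 + cxy ** 2)
    return half_trace + spread, half_trace - spread

straight = [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4)]
corner = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
_, small_straight = covariance_eigenvalues(straight)  # about 0: points are collinear
_, small_corner = covariance_eigenvalues(corner)      # clearly positive: a right-angle bend
```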
  • Yoichiro BABA, Akira HIROSE
    Type: PAPER
    Subject area: Pattern Recognition
    2009 Volume E92.D Issue 9 Pages 1702-1715
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    To obtain text information included in a scene image, we first need to extract text regions from the image before recognizing the text. In this paper, we examine human vision and propose a novel method to extract text regions by evaluating textural variation. Human beings are often attracted by textural variation in scenes, which causes foveation. We hypothesize that text also has a similar property that distinguishes it from the natural background. In our method, we calculate the spatial variation of texture to obtain the distribution of the likelihood of a text region. Here we evaluate the changes in the local spatial spectrum as the textural variation. We investigate two options for evaluating the spectrum, namely those based on one- and two-dimensional Fourier transforms. In particular, in this paper, we put emphasis on the one-dimensional transform, which functions like the Gabor filter. The proposal can be applied to a wide range of characters mainly because it employs neither templates nor heuristics concerning character size, aspect ratio, specific direction, alignment, and so on. We demonstrate that the method effectively extracts text regions contained in various general scene images. We present a quantitative evaluation of the method by using databases open to the public.
  • Changliang LIU, Fuping PAN, Fengpei GE, Bin DONG, Hongbin SUO, Yonghon ...
    Type: PAPER
    Subject area: Speech and Hearing
    2009 Volume E92.D Issue 9 Pages 1716-1724
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    This paper describes a reading miscue detection system based on the conventional Large Vocabulary Continuous Speech Recognition (LVCSR) framework [1]. In order to incorporate the knowledge of the reference (what the reader ought to read) and some error patterns into the decoding process, two methods are proposed: Dynamic Multiple Pronunciation Incorporation (DMPI) and Dynamic Interpolation of Language Model (DILM). DMPI dynamically adds some pronunciation variations into the search space to predict reading substitutions and insertions. To resolve the conflict between the coverage of error predictions and the perplexity of the search space, only the pronunciation variants related to the reference are added. DILM dynamically interpolates the general language model based on the analysis of the reference and so keeps the active paths of decoding relatively near the reference. It makes the recognition more accurate, which further improves the detection performance. At the final stage of detection, an improved dynamic programming (DP) procedure is used to align the confusion network (CN) from speech recognition with the reference to generate the detection result. The experimental results show that the two proposed methods achieve a 14% relative reduction in the Equal Error Rate (EER), from 46.4% to 39.8%.
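The final alignment stage can be pictured with a plain edit-distance alignment between the reference text and a recognized word sequence. This is a simplification: the paper aligns a confusion network using an improved DP whose details are not given in the abstract, whereas the sketch below assumes a single one-best word sequence and unit costs. The name `detect_miscues` is hypothetical.

```python
def detect_miscues(reference, recognized):
    """Align two word lists by edit distance and list substitutions, deletions, insertions."""
    n, m = len(reference), len(recognized)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if reference[i - 1] == recognized[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j - 1] + sub,  # match or substitution
                             dist[i - 1][j] + 1,        # reference word skipped
                             dist[i][j - 1] + 1)        # extra word read
    miscues = []
    i, j = n, m
    while i > 0 or j > 0:  # trace the optimal path back to the start
        differ = i > 0 and j > 0 and reference[i - 1] != recognized[j - 1]
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + (1 if differ else 0):
            if differ:
                miscues.append(("substitution", reference[i - 1], recognized[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            miscues.append(("deletion", reference[i - 1], None))
            i -= 1
        else:
            miscues.append(("insertion", None, recognized[j - 1]))
            j -= 1
    miscues.reverse()
    return miscues

found = detect_miscues("the cat sat on the mat".split(), "the cat sit on mat".split())
# [("substitution", "sat", "sit"), ("deletion", "the", None)]
```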
  • Bing-Fei WU, Chuan-Tsai LIN, Yen-Lin CHEN
    Type: PAPER
    Subject area: Image Recognition, Computer Vision
    2009 Volume E92.D Issue 9 Pages 1725-1735
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    This paper presents new approaches for the estimation of range between the preceding vehicle and the experimental vehicle, estimation of vehicle size and its projective size, and dynamic camera calibration. First, our proposed approaches adopt a camera model to transform coordinates from the ground plane onto the image plane to estimate the relative position between the detected vehicle and the camera. Then, to estimate the actual and projective size of the preceding vehicle, we propose a new estimation method. This method can estimate the range from a preceding vehicle to the camera based on contact points between its tires and the ground and then estimate the actual size of the vehicle according to the positions of its vertexes in the image. Because the projective size of a vehicle varies with respect to its distance to the camera, we also present a simple and rapid method of estimating a vehicle's projective height, which allows a reduction in computational time for size estimation in real-time systems. Errors caused by the application of different camera parameters are also estimated and analyzed in this study. The estimation results are used to determine suitable parameters during camera installation to suppress estimation errors. Finally, to guarantee robustness of the detection system, a new efficient approach to dynamic calibration is presented to obtain accurate camera parameters, even when they are changed by camera vibration owing to on-road driving. Experimental results demonstrate that our approaches can provide accurate and robust estimation results of range and size of target vehicles.
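A rough feel for range estimation from the tire contact point can be had from standard flat-road pinhole geometry. The sketch below is not the camera model of the paper: it assumes a camera at a known height with its optical axis parallel to a flat road, no roll, and a focal length expressed in pixels. All parameter names and values are hypothetical.

```python
def range_from_contact_row(focal_px, camera_height_m, contact_row, horizon_row):
    """Distance along a flat road to a ground point imaged at `contact_row`.

    By similar triangles, a ground point at distance Z appears
    focal_px * camera_height_m / Z pixels below the horizon row.
    """
    offset = contact_row - horizon_row
    if offset <= 0:
        raise ValueError("contact point must lie below the horizon row")
    return focal_px * camera_height_m / offset

def projected_height_px(focal_px, object_height_m, distance_m):
    """Image height in pixels of an upright object at the given distance."""
    return focal_px * object_height_m / distance_m

distance = range_from_contact_row(focal_px=800.0, camera_height_m=1.2,
                                  contact_row=300.0, horizon_row=240.0)  # about 16 m
height_px = projected_height_px(800.0, 1.5, distance)                    # about 75 px
```

The strong dependence on `horizon_row` hints at why the abstract stresses dynamic calibration: a small pitch change from vibration shifts the horizon and changes the estimated range.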
  • Mehdi CHEHEL AMIRANI, Ali A. BEHESHTI SHIRAZI
    Type: PAPER
    Subject area: Image Recognition, Computer Vision
    2009 Volume E92.D Issue 9 Pages 1736-1744
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    In this paper, we propose a new approach to rotation-invariant texture analysis. This method uses the Radon transform with some considerations in direction estimation of textural images. Furthermore, it utilizes the information obtained from the number of peaks in the variance array of the Radon transform as a realty feature. The textural features are then generated after rotation of the texture along the principal direction. Also, a simple technique is presented to eliminate the error introduced by rotation of the texture. Experimental results on a set of images from the Brodatz album show that the proposed method performs well in comparison with some recent texture analysis methods.
  • Masaki YAMAZAKI, Sidney FELS
    Type: PAPER
    Subject area: Image Recognition, Computer Vision
    2009 Volume E92.D Issue 9 Pages 1745-1751
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    PCA-SIFT is an extension to SIFT which aims to reduce SIFT's high dimensionality (128 dimensions) by applying PCA to the gradient image patches. However PCA is not a discriminative representation for recognition due to its global feature nature and unsupervised algorithm. In addition, linear methods such as PCA and ICA can fail in the case of non-linearity. In this paper, we propose a new discriminative method called Supervised Kernel ICA (SKICA) that uses a non-linear kernel approach combined with Supervised ICA-based local image descriptors. Our approach blends the advantages of supervised learning with nonlinear properties of kernels. Using five different test data sets we show that the SKICA descriptors produce better object recognition performance than other related approaches with the same dimensionality. The SKICA-based representation has local sensitivity, non-linear independence and high class separability providing an effective method for local image descriptors.
  • Hayato KOBAYASHI, Tsugutoyo OSAKI, Tetsuro OKUYAMA, Joshua GRAMM, Akir ...
    Type: PAPER
    Subject area: Multimedia Pattern Processing
    2009 Volume E92.D Issue 9 Pages 1752-1761
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    This paper describes an interactive experimental environment for autonomous soccer robots, which is a soccer field augmented by utilizing camera input and projector output. This environment, in a sense, plays an intermediate role between simulated environments and real environments. We can simulate some parts of real environments, e.g., real objects such as robots or a ball, and reflect simulated data into the real environments, e.g., to visualize the positions on the field, so as to create a situation that allows easy debugging of robot programs. The significant point compared with analogous work is that virtual objects are touchable in this system owing to projectors. We also show the portable version of our system that does not require ceiling cameras. As an application in the augmented environment, we address the learning of goalie strategies on real quadruped robots in penalty kicks. We make our robots utilize virtual balls in order to perform only quadruped locomotion in real environments, which is quite difficult to simulate accurately. Our robots autonomously learn and acquire more beneficial strategies without human intervention in our augmented environment than those in a fully simulated environment.
  • Hirofumi YAMAMOTO, Hideo OKUMA, Eiichiro SUMITA
    Type: PAPER
    Subject area: Natural Language Processing
    2009 Volume E92.D Issue 9 Pages 1762-1770
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    In current statistical machine translation (SMT), erroneous word reordering is one of the most serious problems. To resolve this problem, many word-reordering constraint techniques have been proposed. Inversion transduction grammar (ITG) is one of these constraints. In ITG constraints, target-side word order is obtained by rotating nodes of the source-side binary tree. In these node rotations, the source binary tree instance is not considered. Therefore, stronger constraints for word reordering can be obtained by imposing further constraints derived from the source tree on the ITG constraints. For example, for the source word sequence { a b c d }, ITG constraints allow a total of twenty-two target word orderings. However, when the source binary tree instance ((a b) (c d)) is given, our proposed “imposing source tree on ITG” (IST-ITG) constraints allow only eight word orderings. The reduction in the number of word-order permutations by our proposed stronger constraints efficiently suppresses erroneous word orderings. In our experiments with IST-ITG using the NIST MT08 English-to-Chinese translation track's data, the proposed method resulted in a 1.8-point improvement in character BLEU-4 (35.2 to 37.0) and a 6.2-point lower CER (74.1% to 67.9%) compared with our baseline condition.
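The two counts quoted above (twenty-two orderings under ITG, eight once the source tree ((a b) (c d)) is fixed) can be checked by brute-force enumeration. The sketch below assumes the usual reading of the constraints: ITG may split a span at any point and keep or swap the two halves, while IST-ITG may only keep or swap the children of the given tree's nodes. The function names are hypothetical.

```python
def itg_orderings(words):
    """All target orderings reachable under ITG constraints (any binary bracketing)."""
    words = tuple(words)
    if len(words) == 1:
        return {words}
    result = set()
    for k in range(1, len(words)):  # try every split point
        for left in itg_orderings(words[:k]):
            for right in itg_orderings(words[k:]):
                result.add(left + right)  # straight
                result.add(right + left)  # inverted
    return result

def ist_itg_orderings(tree):
    """Orderings reachable when only the nodes of the given source tree may be rotated."""
    if isinstance(tree, str):
        return {(tree,)}
    left_tree, right_tree = tree
    result = set()
    for left in ist_itg_orderings(left_tree):
        for right in ist_itg_orderings(right_tree):
            result.add(left + right)
            result.add(right + left)
    return result

itg = itg_orderings("abcd")
ist = ist_itg_orderings((("a", "b"), ("c", "d")))
print(len(itg), len(ist))  # prints "22 8"
```

The only two of the 24 permutations that ITG excludes are b d a c and c a d b, and every IST-ITG ordering is also an ITG ordering.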
  • Ryuichiro HIGASHINAKA, Mikio NAKANO
    Type: PAPER
    Subject area: Natural Language Processing
    2009 Volume E92.D Issue 9 Pages 1771-1782
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    This paper discusses the discourse understanding process in spoken dialogue systems. This process enables a system to understand user utterances from the context of a dialogue. Ambiguity in user utterances caused by multiple speech recognition hypotheses and parsing results sometimes makes it difficult for a system to decide on a single interpretation of a user intention. As a solution, the idea of retaining possible interpretations as multiple dialogue states and resolving the ambiguity using succeeding user utterances has been proposed. Although this approach has proven to improve discourse understanding accuracy, carefully created hand-crafted rules are necessary in order to accurately rank the dialogue states. This paper proposes automatically ranking multiple dialogue states using statistical information obtained from dialogue corpora. The experimental results in the train ticket reservation and weather information service domains show that the statistical information can significantly improve the ranking accuracy of dialogue states as well as the slot accuracy and the concept error rate of the top-ranked dialogue states.
  • Sang-Hyuk LEE, Keun Ho RYU, Gyoyong SOHN
    Type: LETTER
    Subject area: Computation and Computational Models
    2009 Volume E92.D Issue 9 Pages 1783-1786
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    In this study, we investigated the relationship between similarity measures and entropy for fuzzy sets. First, we developed fuzzy entropy by using the distance measure for fuzzy sets. We pointed out that the distance between the fuzzy set and the corresponding crisp set equals fuzzy entropy. We also found that the sum of the similarity measure and the entropy between the fuzzy set and the corresponding crisp set constitutes the total information in the fuzzy set. Finally, we derived a similarity measure from entropy and showed by a simple example that the maximum similarity measure can be obtained using a minimum entropy formulation.
  • Hyeon-Gyu KIM, Woo-Lam KANG, Yoon-Joon LEE, Myoung-Ho KIM
    Type: LETTER
    Subject area: Database
    2009 Volume E92.D Issue 9 Pages 1787-1790
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    In this paper, we propose a predicate indexing method which handles equality and inequality tests separately. Our method uses a hash table for the equality test and a balanced binary search tree for the inequality test. Such a separate structure reduces the height of the search tree and the number of comparisons per tree node, as well as the cost of tree rebalancing. We compared our method with the IBS-tree, which is one of the popular indexing methods suitable for data stream processing. Our experimental results show that the proposed method provides better insertion and search performance than the IBS-tree.
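The split between equality and inequality predicates can be sketched as below. Two substitutions are made for brevity and should be read as assumptions: a Python `dict` plays the role of the hash table, and sorted lists searched with `bisect` stand in for the balanced binary search tree (lookups stay logarithmic, but insertion is linear, unlike a real balanced tree). The class and method names are hypothetical.

```python
import bisect
from collections import defaultdict

class PredicateIndex:
    """Index of single-attribute predicates: value == c, value < c, value > c."""

    def __init__(self):
        self._eq = defaultdict(list)             # constant -> query ids (hash table)
        self._lt_keys, self._lt_ids = [], []     # sorted constants for "value < c"
        self._gt_keys, self._gt_ids = [], []     # sorted constants for "value > c"

    def add_eq(self, constant, query_id):
        self._eq[constant].append(query_id)

    def add_lt(self, constant, query_id):
        i = bisect.bisect_left(self._lt_keys, constant)
        self._lt_keys.insert(i, constant)
        self._lt_ids.insert(i, query_id)

    def add_gt(self, constant, query_id):
        i = bisect.bisect_left(self._gt_keys, constant)
        self._gt_keys.insert(i, constant)
        self._gt_ids.insert(i, query_id)

    def match(self, value):
        """Return the ids of all predicates satisfied by `value`."""
        hits = list(self._eq.get(value, ()))
        # "value < c" holds for every constant strictly greater than value
        hits += self._lt_ids[bisect.bisect_right(self._lt_keys, value):]
        # "value > c" holds for every constant strictly smaller than value
        hits += self._gt_ids[:bisect.bisect_left(self._gt_keys, value)]
        return hits

index = PredicateIndex()
index.add_eq(5, "q1")
index.add_lt(10, "q2")
index.add_gt(3, "q3")
index.add_gt(7, "q4")
matched = sorted(index.match(5))  # ["q1", "q2", "q3"]
```

Keeping equality predicates out of the ordered structure means an equality-heavy workload never touches the tree at all.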
  • Byounghee SON, Youngchoong PARK, Euiseok NAHM
    Type: LETTER
    Subject area: Networks
    2009 Volume E92.D Issue 9 Pages 1791-1793
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    This paper introduces a system that provides both high-speed transmission and high quality for Internet services on an HFC (Hybrid Fiber Coaxial) network. The system modulates the phase and the amplitude of the IPMS (Internet Protocol Multicasting Service) signal. An IP-cable transmitter, an IP-cable modem, and IP-cable management servers that support 30-Mbps IPMS on the HFC network were developed. The system provides a 21-Mbps HDTV transport stream on a cable TV network. It can sustain a clear picture for a long time.
  • Kihyeon KIM, Hanseok KO
    Type: LETTER
    Subject area: Speech and Hearing
    2009 Volume E92.D Issue 9 Pages 1794-1797
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    In this Letter, a robust system identification method is proposed for the generalized sidelobe canceller using dual microphones. The conventional transfer-function generalized sidelobe canceller employs the non-stationarity characteristics of the speech signal to estimate the relative transfer function and thus is difficult to apply when the noise is also non-stationary. Under the assumption of W-disjoint orthogonality between the speech and the non-stationary noise, the proposed algorithm finds the speech-dominant time-frequency bins of the input signal by inspecting the system output and the inter-microphone time delay. Only these bins are used to estimate the relative transfer function, so reliable estimates can be obtained under non-stationary noise conditions. The experimental results show that the proposed algorithm significantly improves the performance of the transfer-function generalized sidelobe canceller, while only sustaining a modest estimation error in adverse non-stationary noise environments.
  • Xiang XIAO, Xiang ZHANG, Haipeng WANG, Hongbin SUO, Qingwei ZHAO, Yong ...
    Type: LETTER
    Subject area: Speech and Hearing
    2009 Volume E92.D Issue 9 Pages 1798-1802
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    The GMM-UBM framework has proven to be one of the most effective approaches to the automatic speaker verification (ASV) task in recent years. In this letter, we first propose an approximate decision function for the traditional GMM-UBM, which shows that each Gaussian component contributes equally to classification. However, research in speaker perception shows that different speech sound units, each defined by a Gaussian component, make different contributions to speaker verification. This motivates us to emphasize the sound units that discriminate between speakers while de-emphasizing the speech sound units that contain little information for speaker verification. Experiments on the 2006 NIST SRE core task show that the proposed approach outperforms the traditional GMM-UBM approach in classification accuracy.
  • Zhen SUN, Zhe-Ming LU, Hao LUO
    Type: LETTER
    Subject area: Image Processing and Video Processing
    2009 Volume E92.D Issue 9 Pages 1803-1806
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    This Letter proposes a new kind of feature for color image retrieval based on Distance-weighted Boundary Predictive Vector Quantization (DWBPVQ) Index Histograms. For each color image in the database, six histograms (two for each color component) are calculated from the six corresponding DWBPVQ index sequences. The retrieval simulation results show that, compared with the traditional Spatial-domain Color-Histogram-based (SCH) features and the DCTVQ index histogram-based (DCTVQIH) features, the proposed DWBPVQIH features can greatly improve the recall and precision performance.
  • Chang Sik SON, Suk Tae SEO, In Keun LEE, Hye Cheun JEONG, Soon Hak KWO ...
    Type: LETTER
    Subject area: Image Recognition, Computer Vision
    2009 Volume E92.D Issue 9 Pages 1807-1810
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    We propose a thresholding method based on interval-valued fuzzy sets which are used to define the grade of a gray level belonging to one of the two classes, an object and the background of an image. The effectiveness of the proposed method is demonstrated by comparing our classification results on eight test images to results from the conventional methods.
  • Tang YINGJUN, Xu DE, Yang XU, Liu QIFANG
    Type: LETTER
    Subject area: Image Recognition, Computer Vision
    2009 Volume E92.D Issue 9 Pages 1811-1814
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    We present a novel model, named the Integrated Latent Topic Model (ILTM), to learn and recognize natural scene categories. Unlike previous work, which considered the discrepancy and common property separately among all categories, our approach combines universal topics from all categories with specific topics from each category. As a result, the model produces a few specific topics and more generic topics among categories, and each category is represented in a different topic simplex, which correlates well with human scene understanding. We investigate the classification performance with variable scene category tasks. The experiments show that our model outperforms latent-space methods with less training data.
  • Li LU, Pengfei SHI
    Type: LETTER
    Subject area: Image Recognition, Computer Vision
    2009 Volume E92.D Issue 9 Pages 1815-1818
    Published: September 01, 2009
    Released: September 01, 2009
    JOURNALS FREE ACCESS
    A novel age estimation method is presented which improves performance by fusing complementary information acquired from global and local features of the face. Two-directional two-dimensional principal component analysis ($(2D)^2$PCA) is used for dimensionality reduction and construction of individual feature spaces. Each feature space contributes a confidence value which is calculated by support vector machines (SVMs). The confidence values of all the facial features are then fused for final age estimation. Experimental results demonstrate that fusing multiple facial features can achieve significant accuracy gains over any single feature. Finally, we propose a fusion method that further improves accuracy.