IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Volume E96.D , Issue 11
Regular Section
  • Pooia LALBAKHSH, Bahram ZAERI, Ali LALBAKHSH
    Type: PAPER
    Subject area: Fundamentals of Information Systems
    2013 Volume E96.D Issue 11 Pages 2309-2318
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    The paper introduces a novel pheromone update strategy to improve the functionality of ant colony optimization algorithms. This modification tries to extend the search area by an optimistic reinforcement strategy in which not only the most desirable sub-solution is reinforced in each step, but some of the other partial solutions with acceptable levels of optimality are also favored. Therefore, it improves the desire for the other potential solutions to be selected by the following artificial ants towards a more exhaustive algorithm by increasing the overall exploration. The modifications can be adopted in all ant-based optimization algorithms; however, this paper focuses on two static problems: the travelling salesman problem and classification rule mining. To work on these challenging problems we considered two ACO algorithms, ACS (Ant Colony System) and AntMiner 3.0, and modified their pheromone update strategy. As shown by simulation experiments, the novel pheromone update method can improve the behavior of both algorithms regarding almost all the performance evaluation metrics.
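    The idea of reinforcing near-best options as well as the best one can be illustrated with a minimal Python sketch. It is not the update rule from the paper: the function name, the `tolerance` threshold, and the deposit formula are all assumptions chosen only to show the general shape of such a strategy.

```python
def optimistic_pheromone_update(pheromone, quality, rho=0.1, tolerance=0.9):
    """Evaporate all trails, then reinforce the best option and near-best ones.

    pheromone: dict mapping option -> current pheromone level
    quality:   dict mapping option -> quality score (higher is better, > 0)
    rho:       evaporation rate in (0, 1)                     (assumed parameter)
    tolerance: options scoring >= tolerance * best also get a deposit (assumed)
    """
    best = max(quality.values())
    updated = {}
    for option, tau in pheromone.items():
        tau = (1.0 - rho) * tau  # evaporation applies to every option
        if quality[option] >= tolerance * best:
            # Deposit scaled by relative quality, so the best option gains most.
            tau += rho * quality[option] / best
        updated[option] = tau
    return updated


trails = {"a": 1.0, "b": 1.0, "c": 1.0}
scores = {"a": 10.0, "b": 9.5, "c": 4.0}
new_trails = optimistic_pheromone_update(trails, scores)
```

    In this toy run, option "b" is within the tolerance of the best option "a", so it is also reinforced, while "c" only evaporates.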
  • Jinwei WANG, Xirong MA, Yuanping ZHU, Jizhou SUN
    Type: PAPER
    Subject area: Fundamentals of Information Systems
    2013 Volume E96.D Issue 11 Pages 2319-2326
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    Modern GPUs have evolved into more general processors capable of executing scientific and engineering computations. They provide a highly parallel computing environment due to their large number of computing cores, which are suitable for numerous data parallel arithmetic computations, particularly linear algebra operations. Matrix-vector multiplication is one of the most important dense linear algebraic operations. It is applied to a diverse set of applications in many fields and must therefore be fully optimized to achieve high performance. In this paper, we propose a novel auto-tuning method for matrix-vector multiplication on GPUs, where the number of assigned threads that are used to compute one element of the result vector can be auto-tuned according to the size of the matrix. On Nvidia's GTX 650 GPU with the most recent Kepler architecture, we developed an auto-tuner that can automatically select the optimal number of assigned threads for calculation. Based on the auto-tuner's result, we developed a versatile generic matrix-vector multiplication kernel with the CUDA programming model. A series of experiments on different shapes and sizes of matrices were conducted to compare the performance of our kernel with that of the kernels from CUBLAS 5.0, MAGMA 1.3 and a warp method. The experimental results show that the performance of our matrix-vector multiplication kernel approaches the optimal behavior as the size of the matrix increases and has very little dependency on the shape of the matrix, which is a significant improvement compared to the other three kernels that exhibit unstable performance behavior for different shapes of matrices.
  • Asahi TAKAOKA, Satoshi TAYU, Shuichi UENO
    Type: PAPER
    Subject area: Fundamentals of Information Systems
    2013 Volume E96.D Issue 11 Pages 2327-2332
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    We consider the minimum feedback vertex set problem for some bipartite graphs and degree-constrained graphs. We show that the problem is linear time solvable for bipartite permutation graphs and NP-hard for grid intersection graphs. We also show that the problem is solvable in O(n^2 log^6 n) time for n-vertex graphs with maximum degree at most three.
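    For readers unfamiliar with the problem, a feedback vertex set is a set of vertices whose removal leaves the graph acyclic. The Python sketch below only illustrates that definition with an exponential brute-force search on a tiny graph; it is not the algorithm from the paper, and the function names are hypothetical.

```python
from itertools import combinations


def is_forest(n, edges, removed):
    """Return True if the graph on vertices 0..n-1 is acyclic once `removed` is deleted."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        if u in removed or v in removed:
            continue
        ru, rv = find(u), find(v)
        if ru == rv:
            return False  # this edge closes a cycle
        parent[ru] = rv
    return True


def minimum_fvs(n, edges):
    """Brute force: try vertex subsets in increasing size (exponential time)."""
    for k in range(n + 1):
        for subset in combinations(range(n), k):
            if is_forest(n, edges, set(subset)):
                return set(subset)


# K4 is a 4-vertex graph in which every vertex has degree three.
k4_edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
k4_fvs = minimum_fvs(4, k4_edges)
```

    Removing any single vertex of K4 leaves a triangle, so the smallest feedback vertex set found here has two vertices.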
  • Yohei HORI, Toshihiro KATASHITA, Hirofumi SAKANE, Kenji TODA, Akashi S ...
    Type: PAPER
    Subject area: Computer System
    2013 Volume E96.D Issue 11 Pages 2333-2343
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    Protecting the confidentiality and integrity of a configuration bitstream is essential for the dynamic partial reconfiguration (DPR) of field-programmable gate arrays (FPGAs). This is because erroneous or falsified bitstreams can cause fatal damage to FPGAs. In this paper, we present a high-speed and area-efficient bitstream protection scheme for DPR systems using the Advanced Encryption Standard with Galois/Counter Mode (AES-GCM), which is an authenticated encryption algorithm. Unlike many previous studies, our bitstream protection scheme also provides a mechanism for error recovery and tamper resistance against configuration block deletion, insertion, and disorder. The implementation and evaluation results show that our DPR scheme achieves a higher performance, in terms of speed and area, than previous methods.
  • Chuanyi LIU, Jie LIN, Binxing FANG
    Type: PAPER
    Subject area: Computer System
    2013 Volume E96.D Issue 11 Pages 2344-2353
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    Cloud computing is broadly recognized as the prevalent trend in IT. However, in cloud computing mode, customers lose the direct control of their data and applications hosted by the cloud providers, which leads to the trustworthiness issue of the cloud providers, hindering the widespread use of cloud computing. This paper proposes a trustworthiness verification and audit mechanism on cloud providers called T-YUN. It introduces a trusted third party to cyclically attest the remote clouds, which are instrumented with the trusted chain covering the whole architecture stack. According to the main operations of the clouds, remote verification protocols are also proposed in T-YUN, with a dedicated key management scheme. This paper also implements a proof-of-concept emulator to validate the effectiveness and performance overhead of T-YUN. The experimental results show that T-YUN is effective and the extra overhead incurred by it is acceptable.
  • Jinghua YAN, Xiaochun YUN, Hao LUO, Zhigang WU, Shuzhuang ZHANG
    Type: PAPER
    Subject area: Information Network
    2013 Volume E96.D Issue 11 Pages 2354-2364
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    Traffic classification has recently gained much attention in both academic and industrial research communities. Many machine learning methods have been proposed to tackle this problem and have shown good results. However, when applied to traffic with out-of-sequence packets, the accuracy of existing machine learning approaches decreases dramatically. We observe the main reason is that the out-of-sequence packets change the spatial representation of feature vectors, which means the property of linear mapping relation among features used in machine learning approaches cannot hold any more. To address this problem, this paper proposes an Improved Dynamic Time Warping (IDTW) method, which can align two feature vectors using non-linear alignment. Experimental results on two real traces show that IDTW achieves better classification accuracy in out-of-sequence traffic classification, in comparison to existing machine learning approaches.
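    As background for the non-linear alignment mentioned above, here is a minimal Python sketch of classic dynamic time warping on two numeric sequences. It shows the standard textbook recurrence, not the Improved DTW (IDTW) method of the paper, and the function name and the absolute-difference cost are assumptions for illustration.

```python
def dtw_distance(a, b):
    """Classic DTW cost between two numeric sequences, using |x - y| as local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


same = dtw_distance([1, 2, 3], [1, 2, 3])
stretched = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shorter = dtw_distance([1, 2, 3], [1, 3])
```

    Because one element may be matched to several elements of the other sequence, a repeated value in the second sequence adds no cost, which a fixed position-by-position comparison could not achieve.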
  • Xiuyan JIANG, Dejian YE, Yiming CHEN, Xuejun TIAN
    Type: PAPER
    Subject area: Information Network
    2013 Volume E96.D Issue 11 Pages 2365-2375
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    Smart TVs are expected to play a leading role in the future networked intelligent screen market. Currently, many operators are planning to deploy them on a large scale in a few years. Therefore, it is necessary for smart TVs to provide high quality services for users. Packet loss is one critical reason that decreases the QoS in smart TVs. Even a very small amount of packet loss (1-2%) can decrease the QoS and seriously affect users' experience. This paper applies stochastic differential equations to analyze the queue in the buffer of access points in smart TV multicast systems, demonstrates the reason for packet loss, and then proposes an end-to-end error recovery scheme (OPRSFEC for short) whose core algorithm is based on Reed-Solomon theory, and optimizes four aspects in finite fields: 1) using a Cauchy matrix instead of a Vandermonde matrix to code and decode; 2) generating the inverse matrix by table look-up; 3) changing the matrix multiplication into table look-up; 4) originally dividing the matrix multiplication. This paper implements the scheme on the application layer, which screens the heterogeneity of terminals and servers, corrects 100% of packet loss (at loss rates of 1%-2%) in multicast systems, and has very little effect on real-time user experience. Simulations demonstrate that the proposed scheme performs well, successfully runs on Sigma and Mstar Moca TV terminals, and increases the QoS of smart TVs. Recently, OPRSFEC middleware has become a part of IPTV2.0 standard in Shanghai Telecom and has been running on the Mstar boards of Haier Moca TVs properly.
  • Dung Duc NGUYEN, Maike ERDMANN, Tomoya TAKEYOSHI, Gen HATTORI, Kazunor ...
    Type: PAPER
    Subject area: Artificial Intelligence, Data Mining
    2013 Volume E96.D Issue 11 Pages 2376-2384
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    The abundance of information published on the Internet makes filtering of hazardous Web pages a difficult yet important task. Supervised learning methods such as Support Vector Machines (SVMs) can be used to identify hazardous Web content. However, scalability is a big challenge, especially if we have to train multiple classifiers, since different policies exist on what kind of information is hazardous. We therefore propose two different strategies to train multiple SVMs for personalized Web content filters. The first strategy identifies common data clusters and then performs optimization on these clusters in order to obtain good initial solutions for individual problems. This initialization shortens the path to the optimal solutions and reduces the training time on individual training sets. The second approach is to train all SVMs simultaneously. We introduce an SMO-based kernel-biased heuristic that balances the reduction rate of individual objective functions and the computational cost of the kernel matrix. The heuristic primarily relies on the optimality conditions of all optimization problems and secondly on the pre-calculated part of the whole kernel matrix. This strategy increases the amount of information sharing among learning tasks, thus reducing the number of kernel calculations and the training time. In our experiments on inconsistently labeled training examples, both strategies were able to predict hazardous Web pages accurately (> 91%) with a training time of only 26% and 18% compared to that of the normal sequential training.
  • Kai SHI, Yuichi GOTO, Zhiliang ZHU, Jingde CHENG
    Type: PAPER
    Subject area: Artificial Intelligence, Data Mining
    2013 Volume E96.D Issue 11 Pages 2385-2396
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    Avoiding runway incursions is a significant challenge and a top priority in aviation. Because all causes of runway incursions are human factors, runway incursion prevention systems should remove humans from the system operation loop as much as possible. Although current runway incursion prevention systems have made considerable progress in obtaining accurate and sufficient information about aircraft/vehicles, they cannot predict and detect runway incursions as early as experienced air traffic controllers using the same surveillance information, nor can they give explicit instructions and/or suggestions to prevent runway incursions as real air traffic controllers do. In short, humans still play an important role in current runway incursion prevention systems. In order to remove human factors from the system operation loop as much as possible, this paper proposes a new type of runway incursion prevention system based on logic-based reasoning. The system predicts and detects runway incursions, then gives explicit instructions and/or suggestions to pilots/drivers to avoid runway incursions/collisions. The features of the system include long-range prediction of incidents, explicit instructions and/or suggestions, and a flexible model for different policies and airports. To evaluate our system, we built a simulation system and evaluated our system using both real historical scenarios and conventional fictional scenarios. The evaluation showed that our system is effective at providing earlier prediction of incidents than current systems, giving explicit instructions and/or suggestions for handling the incidents effectively, and customizing for specific policies and airports using a flexible model.
  • Warin WATTANAPORNPROM, Prabhas CHONGSTITVATANA
    Type: PAPER
    Subject area: Artificial Intelligence, Data Mining
    2013 Volume E96.D Issue 11 Pages 2397-2408
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    This article introduces the Coincidence Algorithm (COIN) to solve several multimodal puzzles. COIN is an algorithm in the category of Estimation of Distribution Algorithms (EDAs) that makes use of probabilistic models to generate solutions. The model of COIN is a joint probability table of adjacent events (coincidence) derived from the population of candidate solutions. A unique characteristic of COIN is the ability to learn from a negative sample. Various experiments show that learning from a negative example helps to prevent premature convergence, promotes diversity and preserves good building blocks.
  • Yong-Soo SEOL, Han-Woo KIM
    Type: PAPER
    Subject area: Human-computer Interaction
    2013 Volume E96.D Issue 11 Pages 2409-2416
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    To understand human emotion, it is necessary to be aware of the surrounding situation and individual personalities. In most previous studies, however, these important aspects were not considered. Emotion recognition has been considered as a classification problem. In this paper, we attempt new approaches to utilize a person's situational information and personality for use in understanding emotion. We propose a method of extracting situational information and building a personalized emotion model for reflecting the personality of each character in the text. To extract and utilize situational information, we propose a situation model using lexical and syntactic information. In addition, to reflect the personality of an individual, we propose a personalized emotion model using KBANN (Knowledge-based Artificial Neural Network). Our proposed system has the advantage of using a traditional keyword-spotting algorithm. In addition, we also reflect the fact that the strength of emotion decreases over time. Experimental results show that the proposed system can more accurately and intelligently recognize a person's emotion than previous methods.
  • Trung-Nghia PHUNG, Thanh-Son PHAN, Thang Tat VU, Mai Chi LUONG, Masato ...
    Type: PAPER
    Subject area: Speech and Hearing
    2013 Volume E96.D Issue 11 Pages 2417-2426
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    The most important advantage of HMM-based TTS is its high intelligibility. However, speech synthesized by HMM-based TTS is muffled and far from natural, especially under limited data conditions, which is mainly caused by its over-smoothness. Therefore, the motivation for this paper is to improve the naturalness of HMM-based TTS trained under limited data conditions while preserving its intelligibility. To achieve this goal, a hybrid TTS between HMM-based TTS and the modified restricted Temporal Decomposition (MRTD), named HTD in this paper, was proposed. Here, TD is an interpolation model of decomposing a spectral or prosodic sequence of speech into sparse event targets and dynamic event functions, and MRTD is one simplified version of TD. With a determination of event functions close to the concept of co-articulation in speech, MRTD can synthesize smooth speech and the smoothness in synthesized speech can be adjusted by manipulating event targets of MRTD. Previous studies have also found that event functions of MRTD can represent linguistic information of speech, which is important to perceive speech intelligibility, while sparse event targets can convey the non-linguistic information, which is important to perceive the naturalness of speech. Therefore, prosodic trajectories and MRTD event functions of the spectral trajectory generated by HMM-based TTS were kept unchanged to preserve the high and stable intelligibility of HMM-based TTS. In contrast, MRTD event targets of the spectral trajectory generated by HMM-based TTS were rendered with an original speech database to enhance the naturalness of synthesized speech. Experimental results with small Vietnamese datasets revealed that the proposed HTD was equivalent to HMM-based TTS in terms of intelligibility but was superior to it in terms of naturalness. Further discussions show that HTD had a small footprint. Therefore, the proposed HTD showed its strong efficiency under limited data conditions.
  • Kazu MISHIBA, Masaaki IKEHARA, Takeshi YOSHITOME
    Type: PAPER
    Subject area: Image Processing and Video Processing
    2013 Volume E96.D Issue 11 Pages 2427-2436
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    In this paper, we propose a novel content-aware image resizing method based on grid transformation. Our method focuses on not only keeping important regions unchanged but also keeping the aspect ratio of the main object in an image unchanged. The dual conditions can avoid distortion which often occurs when only using the former condition. Our method first calculates image importance. Next, we extract the main objects on an image by using image importance. Finally, we calculate the optimal grid transformation which suppresses changes in size of important regions and in the aspect ratios of the main objects. Our method uses lower and upper thresholds for transformation to suppress distortion due to extreme shrinking and enlargement. To achieve better resizing results, we introduce a boundary discarding process. This process can assign wider regions to important regions, reducing distortions on important regions. Experimental results demonstrate that our proposed method resizes images with less distortion than other resizing methods.
  • Fuji REN, Bo LI, Qimei CHEN
    Type: PAPER
    Subject area: Image Processing and Video Processing
    2013 Volume E96.D Issue 11 Pages 2437-2449
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    Considering the non-linear properties of the human visual system, many non-linear operators and models have been developed, particularly the logarithmic image processing (LIP) model proposed by Jourlin and Pinoli, which has been proved to be physically justified in several laws of the human visual system and has been successfully applied in image processing areas. Recently, several modifications based on this logarithmic mathematical framework have been presented, such as parameterized logarithmic image processing (PLIP), pseudo-logarithmic image processing, and homomorphic logarithmic image processing. In this paper, a new single parameter logarithmic model for image processing with an adaptive parameter-based Sobel edge detection algorithm is presented. On the basis of analyzing the distributive law, the subtractive law, and the isomorphic property of the PLIP model, the five parameters in PLIP are replaced by a single parameter to ensure the completeness of the model and physical constancy with the nature of an image, and then an adaptive parameter-based Sobel edge detection algorithm is proposed. By using an image noise estimation method to evaluate the noise level of an image, the adaptive parameter in the single parameter LIP model is calculated based on the noise level and grayscale value of a corresponding image area, followed by the single-parameter LIP-based Sobel operation to overcome the noise-sensitive problem of classical LIP-based Sobel edge detection methods, especially in the dark area of an image, while retaining edge sensitivity. Compared with the classical LIP and PLIP models, the given single parameter LIP achieves satisfactory results in noise suppression and edge accuracy.
  • Wei LI, Yang WU, Masayuki MUKUNOKI, Michihiko MINOH
    Type: PAPER
    Subject area: Image Recognition, Computer Vision
    2013 Volume E96.D Issue 11 Pages 2450-2461
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    Multiple-shot person re-identification, which is valuable for application in visual surveillance, tackles the problem of building the correspondence between images of the same person from different cameras. It is challenging because of the large within-class variations due to the changeable body appearance and environment and the small between-class differences arising from the possibly similar body shape and clothes style. A novel method named “Bi-level Relative Information Analysis” is proposed in this paper for the issue by treating it as a set-based ranking problem. It creatively designs a relative dissimilarity using set-level neighborhood information, called “Set-level Common-Near-Neighbor Modeling”, complementary to the sample-level relative feature “Third-Party Collaborative Representation” which has recently been proven to be quite effective for multiple-shot person re-identification. Experiments implemented on several public benchmark datasets show significant improvements over state-of-the-art methods.
  • Chuzo IWAMOTO, Yuta MATSUI
    Type: LETTER
    Subject area: Fundamentals of Information Systems
    2013 Volume E96.D Issue 11 Pages 2462-2465
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    Pyramid is a solitaire game, where the object is to remove all cards from both a pyramidal layout and a stock of cards. Two exposed cards can be matched and removed if their values total 13. Any exposed card of value 13 and the top card of the stock can be discarded immediately. We prove that the generalized version of Pyramid is NP-complete.
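    The matching rule described above can be sketched in a few lines of Python. This is only an illustration of the rule "two exposed cards totalling 13, or a single card of value 13"; it ignores the pyramidal layout and the stock, and the function name is hypothetical.

```python
def removable_moves(exposed):
    """Return (singles, pairs) for a list of exposed card values (1..13).

    singles: indices of cards of value 13, removable on their own
    pairs:   index pairs (i, j) with i < j whose values total 13
    """
    singles = [i for i, value in enumerate(exposed) if value == 13]
    pairs = [
        (i, j)
        for i in range(len(exposed))
        for j in range(i + 1, len(exposed))
        if exposed[i] + exposed[j] == 13
    ]
    return singles, pairs


singles, pairs = removable_moves([13, 6, 7, 1, 12, 5])
```

    Listing legal moves is easy; the hardness result concerns deciding whether some sequence of such moves clears the generalized game.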
  • Zhong ZHENG, Zhiying WANG, Li SHEN
    Type: LETTER
    Subject area: Computer System
    2013 Volume E96.D Issue 11 Pages 2466-2469
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    Power consumption has become a critical factor for embedded systems, especially for battery powered ones. Caches in these systems consume a large portion of the whole chip power. Embedded systems usually adopt set-associative caches to get better performance. However, accessing cache ways in parallel incurs more energy dissipation. This paper proposes a region-based way-partitioning scheme that reduces cache way accesses, and thereby cache power consumption, without sacrificing performance. The stack accesses and non-stack accesses are isolated and redirected to different ways of the L1 data cache. Under way-partitioning, cache way accesses are reduced, as well as the memory reference interference. Experimental results show that the proposed approach could save around 27.5% of L1 data cache energy on average, without significant performance degradation.
  • Takashi ISHIO, Hiroki WAKISAKA, Yuki MANABE, Katsuro INOUE
    Type: LETTER
    Subject area: Software System
    2013 Volume E96.D Issue 11 Pages 2470-2472
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    Logging the execution process of a program is a popular activity for practical program understanding. However, understanding the behavior of a program from a complete execution trace is difficult because a system may generate a substantial number of runtime events. To focus on a small subset of runtime events, a dynamic object process graph (DOPG) has been proposed. Although a DOPG can potentially facilitate program understanding, the logging process has not been adapted for DOPGs. If a developer is interested in the behavior of a particular object, only the runtime events related to the object are necessary to construct a DOPG. The vast majority of runtime events in a complete execution trace are irrelevant to the interesting object. This paper analyzes actual DOPGs and reports that a logging tool can be optimized to record only the runtime events related to a particular object specified by a developer.
  • Jinho AHN
    Type: LETTER
    Subject area: Dependable Computing
    2013 Volume E96.D Issue 11 Pages 2473-2477
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    This paper presents a new scalable method to considerably reduce the rollback propagation effect of the conventional optimistic message logging by utilizing positive features of reliable FIFO group communication links. To satisfy this goal, the proposed method forces group members to replicate different receive sequence numbers (RSNs), which they assigned for each identical message to their group respectively, into their volatile memories. As the degree of redundancy of RSNs increases, the possibility of local recovery for each crashed process may be significantly higher. Experimental results show that our method can outperform the previous one in terms of the rollback distance of non-faulty processes with a little normal time overhead.
  • Xin LI, Jielin PAN, Qingwei ZHAO, Yonghong YAN
    Type: LETTER
    Subject area: Speech and Hearing
    2013 Volume E96.D Issue 11 Pages 2478-2482
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    Morphemes, which are obtained from morphological parsing, and statistical sub-words, which are derived from data-driven splitting, are commonly used as the recognition units for speech recognition of agglutinative languages. In this letter, we propose a discriminative approach to select the splitting result, which is more likely to improve the recognizer's performance, for each distinct word type. An objective function which involves the unigram language model (LM) probability and the count of misrecognized phones on the acoustic training data is defined and minimized. After determining the splitting result for each word in the text corpus, we select the frequent units to build a hybrid vocabulary including morphemes and statistical sub-words. Compared to a statistical sub-word based system, the hybrid system achieves a 0.8% reduction in letter error rate (LER) on the test set.
  • Jangwon CHOI, Yoonsik CHOE, Yong-Goo KIM
    Type: LETTER
    Subject area: Image Processing and Video Processing
    2013 Volume E96.D Issue 11 Pages 2483-2486
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    This letter proposes a novel depth-guided inpainting scheme for high-quality hole filling in 2D-to-3D video conversion. The proposed scheme detects and removes foreground depth layers in an image patch, enabling appropriate patch formation using only disoccluded background information. This background-only patch formation helps to avoid the propagation of wrong depths over the hole area, and thus improves the overall quality of the converted 3D video experience. Experimental results demonstrate that the proposed scheme provides visually much more pleasing inpainting results with better preserved object edges compared to the state-of-the-art depth-guided inpainting schemes.
  • Jin-Ping HE, Kun GAO, Guo-Qiang NI, Guang-Da SU, Jian-Sheng CHEN
    Type: LETTER
    Subject area: Image Processing and Video Processing
    2013 Volume E96.D Issue 11 Pages 2487-2491
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    Considering the real existent fact of the ideal edge and the learning style of image analogy without reference parameters, a blind image recovery algorithm using a self-adaptive learning method is proposed in this paper. We show that a specific local image patch with degradation characteristic can be utilized for restoring the whole image. In the training process, a clear counterpart of the local image patch is constructed based on the ideal edge assumption so that identification of the Point Spread Function is no longer needed. Experiments demonstrate the effectiveness of the proposed method on remote sensing images.
  • Sungchan OH, Hyug-Jae LEE, Gyeonghwan KIM
    Type: LETTER
    Subject area: Image Processing and Video Processing
    2013 Volume E96.D Issue 11 Pages 2492-2495
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    This letter presents a method of adding a virtual halo effect to an object of interest in video sequences. A modified graph-cut segmentation algorithm extracts object layers. The halo is modeled by the accumulation of gradually changing Gaussians. With a synthesized blooming effect, the experimental results show that the proposed method conveys a realistic halo effect.
  • Kun LU, Xin ZHANG
    Type: LETTER
    Subject area: Image Recognition, Computer Vision
    2013 Volume E96.D Issue 11 Pages 2496-2499
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    This letter presents a novel approach for automatic multimodal affect recognition. The audio and visual channels provide complementary information for human affective state recognition, and we utilize Boltzmann zippers as model-level fusion to learn intrinsic correlations between the different modalities. We extract effective audio and visual feature streams with different time scales and feed them to two component Boltzmann chains respectively. Hidden units of the two chains are interconnected to form a Boltzmann zipper which can effectively avoid local energy minima during training. Second-order methods are applied to Boltzmann zippers to speed up the learning and pruning process. Experimental results on audio-visual emotion data recorded by ourselves in Wizard of Oz scenarios and collected from the SEMAINE naturalistic database both demonstrate that our approach is robust and outperforms the state-of-the-art methods.
  • Guoqi LIU, Zhiheng ZHOU, Shengli XIE, Dongcheng WU
    Type: LETTER
    Subject area: Image Recognition, Computer Vision
    2013 Volume E96.D Issue 11 Pages 2500-2503
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    Vector field convolution (VFC) provides a successful external force for an active contour model. However, it fails to extract complex geometries, especially deep concavities, when the initial contour is set outside the object or the concave region. In this letter, a dynamically constrained vector field convolution (DCVFC) external force is proposed to solve this problem. In DCVFC, the indicator function with respect to the evolving contour is introduced to restrain the correlation of external forces generated by different edges, and the forces dynamically generated by complex concave edges gradually make the contour move to the object. On the other hand, the traditional vector field, a component of the proposed DCVFC, makes the evolving contour stop at the object boundary. The connections between VFC and DCVFC are also analyzed. DCVFC maintains desirable properties of VFC, such as robustness to initialization. Experimental results demonstrate that the DCVFC snake provides a much better segmentation than the VFC snake.
  • Jea-Yul YOON, Chai-Jong SONG, Hochong PARK
    Type: LETTER
    Subject area: Music Information Processing
    2013 Volume E96.D Issue 11 Pages 2504-2507
    Published: November 01, 2013
    Released: November 01, 2013
    JOURNALS FREE ACCESS
    A new method for predominant melody extraction from polyphonic music signals based on harmonic structure is proposed. The proposed method first extracts a set of fundamental frequency candidates by analyzing the distance between spectral peaks. Then, the predominant fundamental frequency is selected by pitch tracking according to the harmonic strength of the selected candidates. Finally, the method runs pitch smoothing on a large temporal scale for eliminating pitch doubling error, and conducts voicing frame detection. The proposed method shows the best overall performance for ADC 2004 DB in the MIREX 2011 audio melody extraction task.