IEICE Transactions on Information and Systems

Regular Section

The State-of-the-Art in Handling Occlusions for Visual Object Tracking

Kourosh MESHGI, Shin ISHII

Article type: SURVEY PAPER
Subject area: Image Recognition, Computer Vision
2015 Volume E98.D Issue 7 Pages 1260-1274
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2014EDR0002

JOURNAL FREE ACCESS

This paper reports on the trending literature of occlusion handling in the task of online visual tracking. The discussion first explores the visual tracking realm and pinpoints the necessity of dedicated attention to the occlusion problem. The findings suggest that although occlusion detection has facilitated tracking impressively, it has been largely ignored. The literature further shows that mainstream research is concentrated on human tracking and crowd analysis. This is followed by a novel taxonomy of the types of occlusion and the challenges arising from it, during and after the emergence of an occlusion. The discussion then focuses on an investigation of the approaches to handling occlusion on a frame-by-frame basis. Literature analysis reveals that researchers have examined every aspect of tracker design hypothesized to be beneficial for robust tracking under occlusion. State-of-the-art solutions identified in the literature involve various camera settings, simplifying assumptions, appearance and motion models, target state representations and observation models. The identified clusters are then analyzed and discussed, and their merits and demerits are explained. Finally, areas of potential for future research are presented.

System Status Aware Hadoop Scheduling Methods for Job Performance Improvement

Masatoshi KAWARASAKI, Hyuma WATANABE

Article type: PAPER
Subject area: Fundamentals of Information Systems
2015 Volume E98.D Issue 7 Pages 1275-1285
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2014EDP7385

JOURNAL FREE ACCESS

MapReduce and its open software implementation Hadoop are now widely deployed for big data analysis. As MapReduce runs over a cluster of massive machines, data transfer often becomes a bottleneck in job processing. In this paper, we explore the influence of data transfer on job processing performance and analyze the mechanism of job performance deterioration caused by data-transfer-oriented congestion at disk I/O and/or network I/O. Based on this analysis, we update Hadoop's Heartbeat messages to contain the real-time system status of each machine, such as disk I/O and link usage rate. This enhancement makes Hadoop's scheduler aware of each machine's workload so that it can make more accurate scheduling decisions. Experiments evaluate the effectiveness of the enhanced scheduling methods, and the several proposed scheduling policies are compared and discussed.

Effect Analysis of Coding Convention Violations on Readability of Post-Delivered Code

Taek LEE, Jung-Been LEE, Hoh Peter IN

Article type: PAPER
Subject area: Software Engineering
2015 Volume E98.D Issue 7 Pages 1286-1296
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2014EDP7327

JOURNAL FREE ACCESS

Adherence to coding conventions during the code production stage of software development is essential. Benefits include enabling programmers to quickly understand the context of shared code, communicate with one another in a consistent manner, and easily maintain the source code at low costs. In reality, however, programmers tend to doubt or ignore the degree to which the quality of their code is affected by adherence to these guidelines. This paper addresses research questions such as “Do violations of coding conventions affect the readability of the produced code?”, “What kinds of coding violations reduce code readability?”, and “How much do variable factors such as developer experience, project size, team size, and project maturity influence coding violations?” To respond to these research questions, we explored 210 open-source Java projects with 117 coding conventions from the Sun standard checklist. We believe our findings and the analysis approach used in the paper will encourage programmers and QA managers to develop their own customized and effective coding style guidelines.

Model-Based Contract Testing of Graphical User Interfaces

Tugkan TUGLULAR, Arda MUFTUOGLU, Fevzi BELLI, Michael LINSCHULTE

Article type: PAPER
Subject area: Software Engineering
2015 Volume E98.D Issue 7 Pages 1297-1305
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2014EDP7364

JOURNAL FREE ACCESS

Graphical User Interfaces (GUIs) are critical for the security, safety and reliability of software systems. Injection attacks, for instance via SQL, succeed due to insufficient input validation and can be avoided if contract-based approaches, such as Design by Contract, are followed in the software development lifecycle of GUIs. This paper proposes a model-based testing approach for detecting GUI data contract violations, which may result in serious failures such as a system crash. A contract-based model of GUI data specifications is used to develop test scenarios and to serve as a test oracle. The technique introduced uses multi-terminal binary decision diagrams, which are designed as an integral part of decision table-augmented event sequence graphs, to implement a GUI testing process. A case study that validates the presented approach on a port scanner written in the Java programming language is presented.

Method of Spread Spectrum Watermarking Using Quantization Index Modulation for Cropped Images

Takahiro YAMAMOTO, Masaki KAWAMURA

Article type: PAPER
Subject area: Data Engineering, Web Information Systems
2015 Volume E98.D Issue 7 Pages 1306-1315
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2014EDP7390

JOURNAL FREE ACCESS

We propose a method of spread spectrum digital watermarking with quantization index modulation (QIM) and evaluate the method on the basis of IHC evaluation criteria. The spread spectrum technique can make watermarks robust by using spread codes. Since watermarks can have redundancy, messages can be decoded from a degraded stego-image. Under IHC evaluation criteria, it is necessary to decode the messages without the original image. To do so, we propose a method in which watermarks are generated by using the spread spectrum technique and are embedded by QIM. QIM is an embedding method that can decode without an original image. The IHC evaluation criteria include JPEG compression and cropping as attacks. JPEG compression is lossy compression. Therefore, errors occur in watermarks. Since watermarks in stego-images are out of synchronization due to cropping, the position of embedded watermarks may be unclear. Detecting this position is needed while decoding. Therefore, both error correction and synchronization are required for digital watermarking methods. As countermeasures against cropping, the original image is divided into segments to embed watermarks. Moreover, each segment is divided into 8×8 pixel blocks. A watermark is embedded into a DCT coefficient in a block by QIM. To synchronize in decoding, the proposed method uses the correlation between watermarks and spread codes. After synchronization, watermarks are extracted by QIM, and then, messages are estimated from the watermarks. The proposed method was evaluated on the basis of the IHC evaluation criteria. The PSNR had to be higher than 30 dB. Ten 1920×1080 rectangular regions were cropped from each stego-image, and 200-bit messages were decoded from these regions. Their BERs were calculated to assess the tolerance. As a result, the BERs were less than 1.0%, and the average PSNR was 46.70 dB. Therefore, our method achieved a high image quality when using the IHC evaluation criteria. 
In addition, the proposed method was also evaluated by using StirMark 4.0. As a result, we found that our method has robustness for not only JPEG compression and cropping but also additional noise and Gaussian filtering. Moreover, the method has an advantage in that detection time is small since the synchronization is processed in 8×8 pixel blocks.
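
The QIM step described in this abstract can be illustrated with a minimal scalar sketch. It is not the authors' implementation: it handles a single coefficient rather than DCT coefficients in 8×8 blocks, it omits the spread codes, synchronization, and error correction, and the function names and the step size are assumptions made for illustration only.

```python
def qim_embed(coeff: float, bit: int, step: float) -> float:
    """Move coeff to the nearest point of the lattice assigned to `bit`.

    Bit 0 uses multiples of `step`; bit 1 uses the same lattice shifted
    by step / 2. The step size is a hypothetical tuning parameter.
    """
    offset = bit * step / 2.0
    return round((coeff - offset) / step) * step + offset


def qim_decode(coeff: float, step: float) -> int:
    """Return the bit whose lattice has the point closest to coeff.

    No original (unmarked) coefficient is needed, which is the property
    the abstract relies on.
    """
    distances = [abs(coeff - qim_embed(coeff, bit, step)) for bit in (0, 1)]
    return 0 if distances[0] <= distances[1] else 1
```

With this construction a marked coefficient decodes correctly as long as the distortion added to it stays below a quarter of the step, so a larger step trades image quality for robustness.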

An Evaluation on Two-Handed and One-Handed Control Methods for Positioning Object in Immersive Virtual Environments

Noritaka OSAWA, Kikuo ASAI

Article type: PAPER
Subject area: Human-computer Interaction
2015 Volume E98.D Issue 7 Pages 1316-1324
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2014EDP7336

JOURNAL FREE ACCESS

A two-handed distance control method is proposed for precisely and efficiently manipulating a virtual 3D object by hand in an immersive virtual reality environment. The proposed method enhances direct manipulation by hand and is used to precisely control and efficiently adjust the position of an object and the viewpoint using the distance between the two hands. The two-handed method is evaluated and compared with the previously proposed one-handed speed control method, which adjusts the position of an object in accordance with the speed of one hand. The results from experimental evaluation show that two-handed methods, which make position and viewpoint adjustments, are the best among six combinations of control and adjustment methods.

Objective No-Reference Video Quality Assessment Method Based on Spatio-Temporal Pixel Analysis

Wyllian B. da SILVA, Keiko V. O. FONSECA, Alexandre de A. P. POHL

Article type: PAPER
Subject area: Image Processing and Video Processing
2015 Volume E98.D Issue 7 Pages 1325-1332
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2014EDP7155

JOURNAL FREE ACCESS

Digital video signals are subject to several distortions due to compression processes, transmission over noisy channels or video processing. Therefore, the video quality evaluation has become a necessity for broadcasters and content providers interested in offering a high video quality to the customers. Thus, an objective no-reference video quality assessment metric is proposed based on the sigmoid model using spatial-temporal features weighted by parameters obtained through the solution of a nonlinear least squares problem using the Levenberg-Marquardt algorithm. Experimental results show that when it is applied to MPEG-2 streams our method presents better linearity than full-reference metrics, and its performance is close to that achieved with full-reference metrics for H.264 streams.

A Real-Time Cascaded Video Denoising Algorithm Using Intensity and Structure Tensor

Xin TAN, Yu LIU, Huaxin XIAO, Maojun ZHANG

Article type: PAPER
Subject area: Image Processing and Video Processing
2015 Volume E98.D Issue 7 Pages 1333-1342
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2014EDP7435

JOURNAL FREE ACCESS

A cascaded video denoising method based on frame averaging is proposed in this paper. A novel segmentation approach using intensity and structure tensor is used for change compensation, which can effectively suppress noise while preserving the structure of an image. The cascaded framework solves the problem of noise residual caused by single-frame averaging. The classical Wiener filter is used for spatial denoising in changing areas. Our algorithm works in real-time on an FPGA, since it does not involve future frames. Experiments on standard grayscale videos for various noise levels demonstrate that the proposed method is competitive with current state-of-the-art video denoising algorithms on both peak signal-to-noise ratio and structural similarity evaluations, particularly when dealing with large-scale noise.

Reconstructing Sequential Patterns without Knowing Image Correspondences

Saba Batool MIYAN, Jun SATO

Article type: PAPER
Subject area: Image Recognition, Computer Vision
2015 Volume E98.D Issue 7 Pages 1343-1352
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2014EDP7153

JOURNAL FREE ACCESS

In this paper, we propose a method for reconstructing 3D sequential patterns from multiple images without knowing exact image correspondences and without calibrating linear camera sensitivity parameters on intensity. The sequential pattern is defined as a series of colored 3D points. We assume that the series of the points are obtained in multiple images, but the correspondence of individual points is not known among multiple images. For reconstructing sequential patterns, we consider a camera projection model which combines geometric and photometric information of objects. Furthermore, we consider camera projections in the frequency space. By considering the multi-view relationship on the new projection model, we show that the 3D sequential patterns can be reconstructed without knowing exact correspondence of individual image points in the sequential patterns; moreover, the recovered 3D patterns do not suffer from changes in linear camera sensitivity parameters. The efficiency of the proposed method is tested using real images.

Automatic Detection of the Carotid Artery Location from Volumetric Ultrasound Images Using Anatomical Position-Dependent LBP Features

Fumi KAWAI, Satoshi KONDO, Keisuke HAYATA, Jun OHMIYA, Kiyoko ISHIKAWA ...

Article type: PAPER
Subject area: Image Recognition, Computer Vision
2015 Volume E98.D Issue 7 Pages 1353-1364
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2014EDP7359

JOURNAL FREE ACCESS

We propose a fully automatic method for detecting the carotid artery from volumetric ultrasound images as a preprocessing stage for building three-dimensional images of the structure of the carotid artery. The proposed detector utilizes support vector machine classifiers to discriminate between carotid artery images and non-carotid artery images using two kinds of LBP-based features. The detector switches between these features depending on the anatomical position along the carotid artery. We evaluate our proposed method using actual clinical cases. Accuracies of detection are 100%, 87.5% and 68.8% for the common carotid artery, internal carotid artery, and external carotid artery sections, respectively.

A Breast Cancer Classifier Using a Neuron Model with Dendritic Nonlinearity

Zijun SHA, Lin HU, Yuki TODO, Junkai JI, Shangce GAO, Zheng TANG

Article type: PAPER
Subject area: Biocybernetics, Neurocomputing
2015 Volume E98.D Issue 7 Pages 1365-1376
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2014EDP7418

JOURNAL FREE ACCESS

Breast cancer is a serious disease across the world, and it is one of the largest causes of cancer death for women. The traditional diagnosis is not only time consuming but also easily affected. Hence, artificial intelligence (AI), especially neural networks, has been widely used to assist in detecting cancer. However, in recent years, the computational ability of a neuron has attracted more and more attention. The main computational capacity of a neuron is located in the dendrites. In this paper, a novel neuron model with dendritic nonlinearity (NMDN) is proposed to classify breast cancer in the Wisconsin Breast Cancer Database (WBCD). In NMDN, the dendrites possess nonlinearity when realizing the excitatory synapses, inhibitory synapses, constant-1 synapses and constant-0 synapses instead of being simply weighted. Furthermore, the nonlinear interaction among the synapses on a dendrite is defined as a product of the synaptic inputs. The soma adds all of the products of the branches to produce an output. A back-propagation-based learning algorithm is introduced to train the NMDN. The performance of the NMDN is compared with classic back propagation neural networks (BPNNs). Simulation results indicate that NMDN possesses superior capability in terms of the accuracy, convergence rate, stability and area under the ROC curve (AUC). Moreover, regarding ROC, for continuum values, the existing 0-connections branches after evolving can be eliminated from the dendrite morphology to reduce the computational load, with no influence on the performance of classification. The results disclose that the computational ability of the neuron has been undervalued, and the proposed NMDN can be an interesting choice for medical researchers in further research.
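
The structure stated in this abstract (nonlinear synapses, a product along each dendritic branch, a sum at the soma) can be sketched as a forward pass. This is an illustrative reading, not the authors' model: the sigmoid form of the synapse, the steepness `k`, and all parameter names are assumptions, and training by back-propagation is not shown.

```python
import math


def synapse(x: float, w: float, q: float, k: float = 5.0) -> float:
    """Sigmoid synapse (assumed form); w and q select its behaviour.

    For inputs in [0, 1], a large positive q gives an output near 0
    (a constant-0 connection) and a large negative q gives an output
    near 1 (a constant-1 connection).
    """
    return 1.0 / (1.0 + math.exp(-k * (w * x - q)))


def dendritic_neuron(inputs, weights, thresholds, k: float = 5.0) -> float:
    """Sum over branches of the product of that branch's synaptic outputs.

    weights[m][i] and thresholds[m][i] belong to branch m and input i.
    """
    soma_input = 0.0
    for w_row, q_row in zip(weights, thresholds):
        branch = 1.0
        for x, w, q in zip(inputs, w_row, q_row):
            branch *= synapse(x, w, q, k)
        soma_input += branch
    return soma_input
```

Because each branch is a product, a single synapse near 0 silences its whole branch, which is consistent with the abstract's remark that branches with 0-connections can be pruned without changing the classification.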

Software Maintenance Evaluation of Agile Software Development Method Based on OpenStack

Yoji YAMATO, Shinichiro KATSURAGI, Shinji NAGAO, Norihiro MIURA

Article type: LETTER
Subject area: Software Engineering
2015 Volume E98.D Issue 7 Pages 1377-1380
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2015EDL8049

JOURNAL FREE ACCESS

We evaluated software maintenance of an open source cloud platform system we developed using an agile software development method. We previously reported on a rapid service launch using the agile software development method in spite of large-scale development. For this study, we analyzed inquiries and the defect removal efficiency of our recently developed software throughout one year of operation. We found that the defect removal efficiency of our recently developed software was 98%. This indicates that we could achieve sufficient quality in spite of large-scale agile development. In terms of the maintenance process, we could answer all inquiries within three business days and could conduct version upgrades quickly. Thus, we conclude that software maintenance of agile software development is not ineffective.

Outage Performance of MIMO Multihop Relay Network with MRT/RAS Scheme

Xinjie WANG, Yuzhen HUANG, Yansheng LI, Zhe-Ming LU

Article type: LETTER
Subject area: Information Network
2015 Volume E98.D Issue 7 Pages 1381-1385
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2014EDL8102

JOURNAL FREE ACCESS

In this Letter, we investigate the outage performance of MIMO amplify-and-forward (AF) multihop relay networks with maximum ratio transmission/receiver antenna selection (MRT/RAS) over Nakagami-m fading channels, both with and without co-channel interference (CCI). In particular, lower bounds for the outage probability of MIMO AF multihop relay networks with and without CCI are derived, which provide an efficient means to evaluate the joint effects of key system parameters, such as the number of antennas, the interfering power, and the severity of channel fading. In addition, the asymptotic behavior of the outage probability is investigated, and the results reveal that the full diversity order can be achieved regardless of CCI. Finally, simulation results are provided to show the correctness of our derived analytical results.

Time Difference Estimation Based on Blind Beamforming for Wideband Emitter

Sen ZHONG, Wei XIA, Lingfeng ZHU, Zishu HE

Article type: LETTER
Subject area: Dependable Computing
2015 Volume E98.D Issue 7 Pages 1386-1390
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2015EDL8001

JOURNAL FREE ACCESS

In localization systems based on time difference of arrival (TDOA), multipath fading and interference sources deteriorate the localization performance. In response to this situation, TDOA estimation based on blind beamforming is proposed in the frequency domain. An additional constraint condition is designed for blind beamforming based on maximum power collecting (MPC). The relationship between the weight coefficients of the beamformer and TDOA is revealed. According to this relationship, TDOA is estimated by discrete Fourier transform (DFT). The efficiency of the proposed estimator is demonstrated by simulation results.

Fusion on the Wavelet Coefficients Scale-Related for Double Encryption Holographic Halftone Watermark Hidden Technology

Zifen HE, Yinhui ZHANG

Article type: LETTER
Subject area: Artificial Intelligence, Data Mining
2015 Volume E98.D Issue 7 Pages 1391-1395
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2014EDL8239

JOURNAL FREE ACCESS

We present a new framework for embedding holographic halftone watermarking data into images by fusion of scale-related wavelet coefficients. The halftone watermarking image is obtained using an error-diffusion method and converted into a Fresnel hologram, which is considered to be the initial password. After encryption, a watermarking image scrambled through the Arnold transform is embedded into the host image during the halftoning process. We characterize the multi-scale representation of the original image using the discrete wavelet transform. The boundary information of the target image is fused by correlation of wavelet coefficients across wavelet transform layers to increase the pixel resolution scale. We apply the inter-scale fusion method to obtain the fine-scale fusion coefficient, which takes into account both the detail of the image and approximate information. Using the proposed method, the watermarking information can be embedded into the host image with recovery against the halftoning operation. The experimental results show that the proposed approach provides security and robustness against JPEG compression and different attacks compared to previous alternatives.
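
The Arnold transform mentioned in this abstract is a standard, reversible pixel-position scrambling for square images. The sketch below shows only that scrambling step on a nested-list image; the function names and the `rounds` parameter are illustrative choices, and the hologram generation, encryption, and wavelet fusion stages of the paper are not represented.

```python
def arnold_scramble(image, rounds=1):
    """Apply the Arnold map (x, y) -> (x + y, x + 2y) mod n, `rounds` times.

    `image` is a square list of rows; a new image is returned.
    """
    n = len(image)
    for _ in range(rounds):
        out = [[None] * n for _ in range(n)]
        for y in range(n):
            for x in range(n):
                out[(x + 2 * y) % n][(x + y) % n] = image[y][x]
        image = out
    return image


def arnold_unscramble(image, rounds=1):
    """Invert arnold_scramble by reading each pixel back from its mapped position."""
    n = len(image)
    for _ in range(rounds):
        out = [[None] * n for _ in range(n)]
        for y in range(n):
            for x in range(n):
                out[y][x] = image[(x + 2 * y) % n][(x + y) % n]
        image = out
    return image
```

The map is periodic for any fixed image size, so the number of rounds effectively acts as a key: too many rounds can return the original image.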

Manifold Kernel Metric Learning for Larger-Scale Image Annotation

Lihua GUO

Article type: LETTER
Subject area: Pattern Recognition
2015 Volume E98.D Issue 7 Pages 1396-1400
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2014EDL8216

JOURNAL FREE ACCESS

An appropriate similarity measure between images is one of the key techniques in search-based image annotation models. In order to capture the nonlinear relationships between visual features and image semantics, many kernel distance metric learning (KML) algorithms have been developed. However, when challenged with large-scale image annotation, their metrics cannot explicitly represent the similarity between image semantics, and their algorithms suffer from high computation cost. Therefore, they lose their efficiency. In this paper, we propose a manifold kernel metric learning (M_KML) algorithm. Our M_KML algorithm simultaneously learns the manifold structure and the image annotation metrics. The main merit of our M_KML algorithm is that the distance metrics are built on the interior manifold structure of the image features, and the dimensionality reduction on the manifold structure can handle the high dimensionality challenge faced by KML. Final experiments verify our method's efficiency and effectiveness by comparing it with state-of-the-art image annotation approaches.

Learning Deep Dictionary for Hyperspectral Image Denoising

Leigang HUO, Xiangchu FENG, Chunlei HUO, Chunhong PAN

Article type: LETTER
Subject area: Pattern Recognition
2015 Volume E98.D Issue 7 Pages 1401-1404
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2014EDL8246

JOURNAL FREE ACCESS

Using traditional single-layer dictionary learning methods, it is difficult to reveal the complex structures hidden in hyperspectral images. Motivated by deep learning techniques, a deep dictionary learning approach is proposed for hyperspectral image denoising, which consists of hierarchical dictionary learning, feature denoising and fine-tuning. Hierarchical dictionary learning is helpful for uncovering the hidden factors in the spectral dimension, and fine-tuning is beneficial for preserving the spectral structure. Experiments demonstrate the effectiveness of the proposed approach.

Error Evaluation of an F0-Adaptive Spectral Envelope Estimator in Robustness against the Additive Noise and F0 Error

Masanori MORISE

Article type: LETTER
Subject area: Speech and Hearing
2015 Volume E98.D Issue 7 Pages 1405-1408
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2015EDL8015

JOURNAL FREE ACCESS

This paper describes an evaluation of a temporally stable spectral envelope estimator proposed in our past research. The past research demonstrated that the proposed algorithm can synthesize speech that is as natural as the input speech. This paper focuses on an objective comparison, in which the proposed algorithm is compared with two modern estimation algorithms in terms of estimation performance and temporal stability. The results show that the proposed algorithm is superior to the others in both aspects.

Speech Emotion Recognition Based on Sparse Transfer Learning Method

Peng SONG, Wenming ZHENG, Ruiyu LIANG

Article type: LETTER
Subject area: Speech and Hearing
2015 Volume E98.D Issue 7 Pages 1409-1412
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2015EDL8028

JOURNAL FREE ACCESS

In traditional speech emotion recognition systems, when the training and testing utterances are obtained from different corpora, the recognition rates decrease dramatically. To tackle this problem, in this letter, inspired by recent developments in sparse coding and transfer learning, a novel sparse transfer learning method is presented for speech emotion recognition. Firstly, a sparse coding algorithm is employed to learn a robust sparse representation of emotional features. Then, a novel sparse transfer learning approach is presented, where the distance between the feature distributions of source and target datasets is considered and used to regularize the objective function of sparse coding. The experimental results demonstrate that, compared with the automatic recognition approach, the proposed method achieves promising improvements on recognition rates and significantly outperforms the classic dimension reduction based transfer learning approach.

Fast Barrel Distortion Correction for Wide-Angle Cameras

Tae-Hwan KIM

Article type: LETTER
Subject area: Image Processing and Video Processing
2015 Volume E98.D Issue 7 Pages 1413-1416
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2014EDL8236

JOURNAL FREE ACCESS

Barrel distortion is a critical problem that can hinder the successful application of wide-angle cameras. This letter presents an implementation method for fast correction of the barrel distortion. In the proposed method, the required scaling factor is obtained by interpolating a mapping polynomial with a non-uniform spline instead of calculating it directly, which reduces the number of computations required for the distortion correction. This reduction in the number of computations leads to faster correction while maintaining quality: when compared to the conventional method, the reduction ratio of the correction time is about 89%, and the correction quality is 35.3 dB in terms of the average peak signal-to-noise ratio.
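
The idea of replacing direct polynomial evaluation with interpolation over non-uniform knots can be sketched as follows. Everything specific here is an assumption for illustration: the radial polynomial and its coefficients `K1` and `K2`, the knot positions, and the use of piecewise-linear interpolation (a first-order spline) in place of whatever spline the letter actually uses.

```python
from bisect import bisect_right

K1, K2 = -0.25, 0.05  # hypothetical distortion coefficients


def scale_direct(r: float) -> float:
    """Scaling factor from a (hypothetical) radial mapping polynomial."""
    r2 = r * r
    return 1.0 + K1 * r2 + K2 * r2 * r2


# Hypothetical non-uniform knots over the normalized radius [0, 1],
# with the polynomial evaluated once at each knot ahead of time.
KNOTS = [0.0, 0.4, 0.65, 0.8, 0.9, 1.0]
VALUES = [scale_direct(r) for r in KNOTS]


def scale_interpolated(r: float) -> float:
    """Approximate scale_direct by linear interpolation between knots."""
    i = min(max(bisect_right(KNOTS, r) - 1, 0), len(KNOTS) - 2)
    r0, r1 = KNOTS[i], KNOTS[i + 1]
    t = (r - r0) / (r1 - r0)
    return VALUES[i] + t * (VALUES[i + 1] - VALUES[i])
```

At run time each pixel then costs one table lookup and one multiply-add instead of a full polynomial evaluation; how much accuracy is lost depends on the knot placement.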

A Study on Consistency between MINAVE and MINMAX in SSIM Based Independent Perceptual Video Coding

Chao WANG, Xuanqin MOU, Lei ZHANG

Article type: LETTER
Subject area: Image Processing and Video Processing
2015 Volume E98.D Issue 7 Pages 1417-1421
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2014EDL8253

JOURNAL FREE ACCESS

In this letter, we study the R-D properties of independent sources based on MSE and SSIM, and compare the bit allocation performance under the MINAVE and MINMAX criteria in video encoding. The results show that, when SSIM is used, MINMAX yields an average distortion similar to that of MINAVE, which illustrates the consistency between these two criteria in independent perceptual video coding. Furthermore, MINMAX results in lower quality fluctuation, which shows its advantage for perceptual video coding.

Learning Discriminative Features for Ground-Based Cloud Classification via Mutual Information Maximization

Shuang LIU, Zhong ZHANG, Baihua XIAO, Xiaozhong CAO

Article type: LETTER
Subject area: Image Recognition, Computer Vision
2015 Volume E98.D Issue 7 Pages 1422-1425
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2014EDL8252

JOURNAL FREE ACCESS

Texture feature descriptors such as local binary patterns (LBP) have proven effective for ground-based cloud classification. Traditionally, these texture feature descriptors are predefined in a handcrafted way. In this paper, we propose a novel method which automatically learns discriminative features from labeled samples for ground-based cloud classification. Our key idea is to learn these features through mutual information maximization, which learns a transformation matrix for the local difference vectors of LBP. The experimental results show that our learned features greatly improve the performance of ground-based cloud classification when compared to the other state-of-the-art methods.
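
For orientation, the handcrafted baseline that this abstract starts from can be sketched in a few lines: the local difference vector of a 3×3 patch and the basic 8-neighbor LBP code derived from it. The neighbor ordering and function names are illustrative choices, and the learned transformation matrix proposed in the letter is not shown.

```python
# Neighbor positions (row, col) in a 3x3 patch, clockwise from top-left.
NEIGHBORS = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def local_differences(patch):
    """Return the 8 neighbor-minus-center differences of a 3x3 patch."""
    center = patch[1][1]
    return [patch[r][c] - center for r, c in NEIGHBORS]


def lbp_code(patch):
    """Basic LBP: threshold each difference at 0 and pack the bits."""
    return sum(1 << i for i, d in enumerate(local_differences(patch)) if d >= 0)
```

The handcrafted code keeps only the sign of each difference; a learned approach such as the one described operates on the difference vector itself before any such thresholding.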

Expose Spliced Photographic Basing on Boundary and Noise Features

Jun HOU, Yan CHENG

Article type: LETTER
Subject area: Image Recognition, Computer Vision
2015 Volume E98.D Issue 7 Pages 1426-1429
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2014EDL8232

JOURNAL FREE ACCESS

The paper proposes an algorithm to expose spliced photographs. Firstly, a graph-based segmentation, which defines a predictor to measure boundary evidence between two neighboring regions, is used to make greedy decisions. Then the algorithm obtains a prediction error image using non-negative linear least-squares prediction. For each pair of segmented neighboring regions, the proposed algorithm gathers their statistical features and calculates features of the gray level co-occurrence matrix. K-means clustering is applied to create a dictionary, and the vector quantization histogram is taken as the fixed-length result vector. For a tampered image, its noise follows a zero-mean Gaussian distribution. The proposed method checks the similarity between the noise distribution and a zero-mean Gaussian distribution, followed by local flatness and texture measurements. Finally, all features are fed to a support vector machine classifier. The algorithm has low computational cost. Experiments show its effectiveness in exposing forgery.

Visual Speech Recognition Using Weighted Dynamic Time Warping

Kyungsun LEE, Minseok KEUM, David K. HAN, Hanseok KO

Article type: LETTER
Subject area: Image Recognition, Computer Vision
2015 Volume E98.D Issue 7 Pages 1430-1433
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2015EDL8002

JOURNAL FREE ACCESS

It is unclear whether Hidden Markov Model (HMM) or Dynamic Time Warping (DTW) mapping is more appropriate for visual speech recognition when only small data samples are available. In this letter, the two approaches are compared in terms of sensitivity to the amount of training samples and computing time with the objective of determining the tipping point. The limited training data problem is addressed by exploiting straightforward template matching via weighted DTW (WDTW). The proposed framework refines DTW by adjusting the warping paths with judiciously injected weights to ensure a smooth diagonal path for accurate alignment without added computational load. The proposed WDTW is evaluated on three databases (two in the public domain and one developed in-house) for visual recognition performance. Subsequent experiments indicate that the proposed WDTW significantly enhances the recognition rate compared to the DTW and HMM based algorithms, especially under limited data samples.
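
One simple way to bias DTW toward the diagonal is to scale each local cost by a weight that grows with the distance from the diagonal. The sketch below uses that scheme on scalar sequences; the linear weight and the `penalty` parameter are assumptions for illustration and are not taken from the letter, whose actual weighting may differ.

```python
def weighted_dtw(a, b, penalty=0.1):
    """DTW distance with local costs scaled by 1 + penalty * |i - j|.

    With penalty = 0 this reduces to plain DTW with an absolute-difference
    local cost. The weighting only changes a constant factor per cell, so
    the run time stays O(len(a) * len(b)).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            weight = 1.0 + penalty * abs(i - j)
            local = weight * abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # step in a only
                                     cost[i][j - 1],      # step in b only
                                     cost[i - 1][j - 1])  # diagonal step
    return cost[n][m]
```

In a template-matching recognizer, a test sequence would be assigned the label of the stored template with the smallest such distance.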

Discriminative Semantic Parts Learning for Object Detection

Yurui XIE, Qingbo WU, Bing LUO

Article type: LETTER
Subject area: Image Recognition, Computer Vision
2015 Volume E98.D Issue 7 Pages 1434-1438
Published: July 01, 2015
Released on J-STAGE: July 01, 2015

DOI: https://doi.org/10.1587/transinf.2015EDL8014

JOURNAL FREE ACCESS

In this letter, we propose a new semantic parts learning approach to address the object detection problem with only the bounding boxes of object category labels. Our main observation is that even though the appearance and arrangement of object parts might vary across the instances of different object categories, the constituent parts still maintain geometric consistency. Specifically, we propose a discriminative clustering method with sparse representation refinement to discover the mid-level semantic part set automatically. Then each semantic part detector is learned by a linear SVM in a one-vs-all manner. Finally, we utilize the learned part detectors to score the test image and integrate all the response maps of the part detectors to obtain the detection result. The learned class-generic part detectors have the ability to capture objects across different categories. Experimental results show that our approach can outperform some recent competing methods.
