IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Volume E98.D, Issue 1
31 articles in this issue
Special Section on Enriched Multimedia
  • Isao ECHIZEN
    2015 Volume E98.D Issue 1 Pages 1
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    Download PDF (72K)
  • Noboru BABAGUCHI, Yuta NAKASHIMA
    Type: INVITED PAPER
    2015 Volume E98.D Issue 1 Pages 2-9
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    Our society has become increasingly privacy-sensitive. Users provide diverse personal information to information and communications technology (ICT) systems, such as IC cards, in exchange for their benefits. This information is accumulated as so-called big data, raising concern over privacy violation. Visual information such as images and videos is also privacy-sensitive: the growing deployment of surveillance cameras and social network services has created a privacy problem for information captured by various sensors. To protect the privacy of subjects appearing in visual information, their faces or figures are processed by means of pixelization or blurring. As image analysis technologies have made considerable progress, many attempts at automatic, flexible privacy protection have been made since 2000, and the utilization of privacy information under certain restrictions has also been considered in recent years. This paper reviews recent progress in privacy protection for visual information, presenting our research projects: PriSurv, Digital Diorama (DD), and Mobile Privacy Protection (MPP). Furthermore, we discuss the Harmonized Information Field (HIFI) for appropriate utilization of protected privacy information in a specific area.
    Download PDF (6693K)
  • Koichi KISE, Shinichiro OMACHI, Seiichi UCHIDA, Masakazu IWAMURA, Marc ...
    Type: INVITED PAPER
    2015 Volume E98.D Issue 1 Pages 10-20
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    This paper reviews several trials of re-designing a conventional communication medium, i.e., characters, to enrich their functions by using data-embedding techniques. For example, characters are re-designed to have better machine-readability even under various geometric distortions by embedding a geometric invariant into each character image to represent its class label. Another example is embedding various information into a handwriting trajectory by using a new pen device, called a data-embedding pen. An experimental result showed that 32 bits of information can be embedded into a 5-cm handwritten line with this pen device. In addition to these applications, we also discuss the relationship between data embedding and pattern recognition from a theoretical point of view. Several theories indicate that, given appropriate supplementary information through data embedding, pattern recognition performance can be enhanced up to 100%.
    Download PDF (1688K)
  • Toshihiro SAKANO, Yosuke KOBAYASHI, Kazuhiro KONDO
    Type: PAPER
    2015 Volume E98.D Issue 1 Pages 21-28
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    We propose and evaluate a speech intelligibility estimation method that does not require a clean speech reference signal. The proposed method uses the features defined in ITU-T standard P.563, which estimates the overall quality of speech without a reference signal. We selected two sets of features from the P.563 features: the basic 9-feature set, which includes basic features that characterize both speech and background noise, e.g., cepstrum skewness and LPC kurtosis; and the extended 31-feature set, with 22 additional features for a more accurate description of the degraded speech and noise, e.g., SNR, average pitch, and spectral clarity, among others. Four hundred noise samples were added to speech, and about 70% of these samples were used to train a support vector regression (SVR) model. The trained models were used to estimate the intelligibility of speech degraded by added noise. The proposed method showed a root mean square error (RMSE) of about 10% and a correlation with subjective intelligibility of about 0.93 for speech distorted with known noise types, and an RMSE of about 16% and a correlation of about 0.84 for speech distorted with unknown noise types, with either the 9- or the 31-dimension feature set. These results were better than estimation using frequency-weighted SNR calculated in critical frequency bands, which requires the clean reference signal for its calculation. We believe this level of accuracy proves the proposed method applicable to real-time speech quality monitoring in the field.
    Download PDF (967K)
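The abstract above reports estimation quality as RMSE and correlation with subjective scores. As a quick illustration (with made-up scores, not the paper's data), the two metrics can be computed as:

```python
import math

def rmse(estimates, references):
    """Root mean square error between estimated and subjective scores (in %)."""
    n = len(estimates)
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(estimates, references)) / n)

def pearson(xs, ys):
    """Pearson correlation coefficient between two score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical estimated vs. subjective intelligibility scores (percent)
est = [80.0, 55.0, 30.0, 90.0, 65.0]
sub = [85.0, 50.0, 35.0, 95.0, 60.0]
print(round(rmse(est, sub), 2))     # overall estimation error
print(round(pearson(est, sub), 3))  # agreement with subjective scores
```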
  • Shengbei WANG, Masashi UNOKI
    Type: PAPER
    2015 Volume E98.D Issue 1 Pages 29-37
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    This paper proposes a speech watermarking method based on the concept of formant tuning. The fact that formant tuning can improve the sound quality of synthesized speech is exploited to achieve inaudibility of the watermark. In the proposed method, formants are first extracted with linear prediction (LP) analysis, and watermarks are then embedded by symmetrically controlling pairs of line spectral frequencies (LSFs), as in formant tuning. We evaluated the proposed method against other methods in two kinds of experiments concerning inaudibility and robustness. Inaudibility was evaluated with objective and subjective tests, and robustness was evaluated with speech codecs and speech processing. The results revealed that the proposed method satisfies both the inaudibility and the robustness required for speech watermarking.
    Download PDF (1126K)
  • Masashi UNOKI, Ryota MIYAUCHI
    Type: PAPER
    2015 Volume E98.D Issue 1 Pages 38-48
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    We previously proposed an inaudible non-blind digital-audio watermarking approach based on cochlear delay (CD) characteristics. Three issues remained, however, regarding blind detectability, frame synchronization related to confidentiality, and reversibility. We attempted to solve these issues by developing a new approach that takes the blind detectability and reversibility of audio watermarking into consideration. Frame synchronization was also incorporated to improve confidentiality. We evaluated the inaudibility, robustness, and reversibility of the new approach by carrying out three objective tests (PEAQ, LSD, and bit-detection or SNR) and six robustness tests. The results revealed that inaudible, robust, blindly detectable, and semi-reversible watermarking based on CD can be accomplished.
    Download PDF (1764K)
  • Kenji OZAWA, Shota TSUKAHARA, Yuichiro KINOSHITA, Masanori MORISE
    Type: PAPER
    2015 Volume E98.D Issue 1 Pages 49-57
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    The sense of presence is crucial for evaluating the performance of audio-visual (AV) equipment and content. Previously, overall presence was evaluated for a set of AV content items by asking subjects to judge the presence of each entire item. In this study, the sense of presence is evaluated as a time series using the method of continuous judgment by category. Specifically, the audio signals of 40 content items, each approximately 30 s long, were recorded with a dummy head and presented as stimuli to subjects via headphones. The corresponding visual signals were recorded with a video camera in full-HD format and reproduced on a 65-inch display. In the experiments, 20 subjects evaluated the instantaneous sense of presence of each item on a seven-point scale under two conditions: audio-only and audio-visual. At the end of each time series, the subjects also evaluated the overall presence of the item on a seven-category scale. Based on these results, the effects of visual information on the sense of presence were examined. Overall presence is highly correlated with the ten-percentile-exceeded presence score, S10, i.e., the score exceeded during 10% of the response time. Based on the instantaneous presence data in this study, we are one step closer to our ultimate goal of developing a real-time operational presence meter.
    Download PDF (2236K)
  • Maki YOSHIDA, Kazuya OHKITA, Toru FUJIWARA
    Type: PAPER
    2015 Volume E98.D Issue 1 Pages 58-64
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    An important issue in fragile watermarking for images is to locate and restore tampered pixels individually and accurately. This issue has been resolved for concentrated tampering; in contrast, for diverse tampering, only localization has been realized. This paper presents a restoration method for the most accurate scheme tolerant against diverse tampering. We analyze the error probability and experimentally confirm that the proposed method accurately restores tampered pixels. We also show two variations based on the fact that the authentication data used for deriving the watermark is a maximum-length sequence code.
    Download PDF (3566K)
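The restoration scheme above relies on the authentication data being a maximum-length sequence code. As a hedged sketch of what such a sequence looks like (a generic Fibonacci LFSR, not the paper's exact construction):

```python
def lfsr_m_sequence(taps, seed, length):
    """Generate bits from a Fibonacci LFSR; with primitive feedback taps
    the output is a maximum-length sequence of period 2**n - 1."""
    state = list(seed)          # state[0] is the input stage
    out = []
    for _ in range(length):
        out.append(state[-1])   # output the last stage
        fb = 0
        for t in taps:          # tap positions counted from the output end
            fb ^= state[-t]
        state = [fb] + state[:-1]
    return out

# 3-stage LFSR for the primitive polynomial x^3 + x^2 + 1
seq = lfsr_m_sequence(taps=(1, 2), seed=(1, 0, 0), length=14)
print(seq[:7])              # one full period of 2**3 - 1 = 7 bits
print(seq[:7] == seq[7:])   # the sequence then repeats
```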
  • Dong WANG, Hiroyuki MITSUHARA, Masami SHISHIBORI
    Type: PAPER
    2015 Volume E98.D Issue 1 Pages 65-77
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    Better search methods are needed to handle the rapidly increasing volume of multimedia data. For nearest neighbor (NN) search in metric spaces, the TLAESA (Tree Linear Approximating and Eliminating Search Algorithm) is a state-of-the-art fast search method. In this paper, a method is proposed that improves the TLAESA by revising the tree structure so that an optimal number of selected global pivots serve as representatives in the higher levels, and by employing the best-first search strategy. It builds on an improved version of the TLAESA that uses the best-first search strategy to greatly reduce the number of distance calculations, but whose drawback is that the saved calculations come at the price of a lower pruning rate of branches. A lower pruning rate in turn reduces search efficiency, because the priority queue used in the best-first search strategy stores information on visited but unpruned nodes and must be frequently accessed and sorted. To enhance the pruning rate, the proposed method places more of the selected global pivots in the higher levels of the search tree as representatives. Since more real distances to the node representatives, instead of lower-bound estimations, are used for approximating the closest node and for branch and bound, the method both evaluates more effectively which nodes are close to the query object and enhances the pruning rate. Experiments show that for k-NN queries in Euclidean space, with a proper pivot selection strategy, the proposed method attains the same minimal number of distance calculations as the LAESA (Linear Approximating and Eliminating Search Algorithm), which saves more calculations than the TLAESA, and achieves higher search efficiency than the TLAESA.
    Download PDF (1304K)
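The best-first, branch-and-bound traversal described above can be sketched in miniature. This is a hypothetical toy metric tree with pivot/radius nodes and a priority queue keyed by lower bounds, not the TLAESA itself:

```python
import heapq
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def build(points):
    """Tiny metric tree: each internal node keeps a pivot and a radius
    covering every point in its subtree."""
    if len(points) <= 2:
        return {"points": points}
    pivot = points[0]
    rest = sorted(points[1:], key=lambda p: dist(p, pivot))
    mid = len(rest) // 2
    return {"pivot": pivot, "radius": dist(rest[-1], pivot),
            "children": [build([pivot] + rest[:mid]), build(rest[mid:])]}

def bound(node, q):
    """Lower bound on the distance from q to any point under node."""
    if "points" in node:
        return 0.0
    return max(0.0, dist(q, node["pivot"]) - node["radius"])

def nn_best_first(root, q):
    """Best-first search: always expand the node with the smallest lower
    bound, pruning branches that cannot beat the best distance so far."""
    best, best_d, tie = None, float("inf"), 0
    heap = [(0.0, tie, root)]
    while heap:
        lb, _, node = heapq.heappop(heap)
        if lb >= best_d:
            break                          # all remaining branches prunable
        if "points" in node:
            for p in node["points"]:
                d = dist(p, q)
                if d < best_d:
                    best, best_d = p, d
        else:
            for child in node["children"]:
                tie += 1
                heapq.heappush(heap, (bound(child, q), tie, child))
    return best, best_d

points = [(0, 0), (1, 0), (0, 1), (5, 5), (6, 5), (5, 6),
          (2, 3), (8, 1), (7, 2), (3, 7)]
tree = build(points)
print(nn_best_first(tree, (5.4, 5.2)))  # nearest point and its distance
```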
  • Hoang-Quoc NGUYEN-SON, Minh-Triet TRAN, Hiroshi YOSHIURA, Noboru SONEH ...
    Type: PAPER
    2015 Volume E98.D Issue 1 Pages 78-88
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    While online social networking is a popular way for people to share information, it carries the risk of unintentionally disclosing personal information. One way to reduce this risk is to anonymize personal information in messages before they are posted. Furthermore, if personal information is somehow disclosed, the person who disclosed it should be identifiable. Several methods developed for anonymizing personal information in natural language text simply remove sensitive phrases, making the anonymized text message unnatural. Other methods change the message by using synonymization or structural alteration to create fingerprints for detecting disclosure, but they do not support the creation of a sufficient number of fingerprints for friends of an online social network user. We have developed a system for anonymizing personal information in text messages that generalizes sensitive phrases. It also creates a sufficient number of fingerprints of a message by using synonyms so that, if personal information is revealed online, the person who revealed it can be identified. A distribution metric is used to ensure that the degree of anonymization is appropriate for each group of friends. A threshold is used to improve the naturalness of the fingerprinted messages so that they do not catch the attention of attackers. Evaluation using about 55,000 personal tweets in English demonstrated that our system creates sufficiently natural fingerprinted messages for friends and groups of friends. The practicality of the system was demonstrated by creating a web application for controlling messages posted on Facebook.
    Download PDF (1930K)
  • Harumi MURATA, Akio OGIHARA, Masaki UESAKA
    Type: LETTER
    2015 Volume E98.D Issue 1 Pages 89-94
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    Yajima et al. proposed a method based on amplitude and phase coding of audio signals. This method achieves relatively high sound quality because human auditory properties are considered during embedding. However, its tolerance to attacks tends to be weak. We therefore propose a high-tolerance watermarking method using BCH codes, a class of error-correcting codes. This paper evaluates whether our method preserves sound quality while ensuring high tolerance.
    Download PDF (552K)
  • Sangwook LEE, Ji Eun SONG, Wan Yeon LEE, Young Woong KO, Heejo LEE
    Type: LETTER
    2015 Volume E98.D Issue 1 Pages 95-97
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    For digital forensic investigations, the proposed scheme verifies the integrity of video contents in legacy surveillance camera systems that have no built-in integrity protection. The scheme exploits video frames remaining in the slack space of storage media, instead of timestamp information, which is vulnerable to tampering. The scheme is applied to integrity verification of video contents in AVI or MP4 format from automobile black boxes.
    Download PDF (391K)
  • Shigeyuki KOMURO, Shigeru KURIYAMA, Takao JINNO
    Type: LETTER
    2015 Volume E98.D Issue 1 Pages 98-102
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    Multimedia contents can be enriched by navigation with image codes readable by camera-equipped mobile devices such as smartphones. Data-hiding technologies have been utilized to embed such codes inconspicuously, reducing the esthetic damage to visual media. This article proposes a method of embedding two-dimensional codes into images based on successive color mixture in the blue color channel. This technology makes the colors of the codes mimic those used in the cover image while preserving their readability for current general-purpose image sensors.
    Download PDF (994K)
Regular Section
  • Sung Kwon KIM
    Type: PAPER
    Subject area: Fundamentals of Information Systems
    2015 Volume E98.D Issue 1 Pages 103-107
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    Given an edge-weighted tree with n vertices and a positive integer L, the length-constrained maximum-density path problem is to find a path of length at least L with maximum density in the tree. The density of a path is the sum of the weights of the edges in the path divided by the number of edges in the path. We present an O(n) time algorithm for the problem. The previously known algorithms run in O(nL) or O(n log n) time.
    Download PDF (333K)
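The density definition above can be illustrated with a brute-force reference implementation (O(n^2) pair enumeration over the tree, not the paper's O(n) algorithm):

```python
from itertools import combinations

def max_density_path(n, edges, L):
    """Brute-force reference for the length-constrained maximum-density path:
    enumerate all vertex pairs, take the unique tree path between them, and
    keep the best density among paths with at least L edges."""
    adj = {v: [] for v in range(n)}
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    def path_weights(u, v):
        # DFS along the unique u-v path in the tree, collecting edge weights
        stack = [(u, None, [])]
        while stack:
            node, parent, ws = stack.pop()
            if node == v:
                return ws
            for nxt, w in adj[node]:
                if nxt != parent:
                    stack.append((nxt, node, ws + [w]))

    best = float("-inf")
    for u, v in combinations(range(n), 2):
        ws = path_weights(u, v)
        if len(ws) >= L:                      # length constraint (edge count)
            best = max(best, sum(ws) / len(ws))
    return best

# Path graph 0-1-2-3 with edge weights 2, 8, 2
edges = [(0, 1, 2), (1, 2, 8), (2, 3, 2)]
print(max_density_path(4, edges, L=2))  # e.g. path 0-1-2: (2+8)/2 = 5.0
```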
  • Kazuyuki AMANO, Atsushi SAITO
    Type: PAPER
    Subject area: Fundamentals of Information Systems
    2015 Volume E98.D Issue 1 Pages 108-118
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    Recently, Impagliazzo et al. constructed a nontrivial algorithm for the satisfiability problem for sparse threshold circuits of depth two, a class of circuits with cn wires. We construct a nontrivial algorithm for a larger class of circuits. Two gates in the bottom level of a depth-two threshold circuit are dependent if the output of one is always greater than or equal to the output of the other. We give a nontrivial satisfiability algorithm for a class of circuits with dependent gates that may not be sparse. One of our motivations is to clarify the relationship between various circuit classes and the complexity of the corresponding circuit satisfiability problems. Another is proving strong lower bounds for TC0 circuits, exploiting the connection between circuit satisfiability algorithms and lower bounds initiated by Ryan Williams.
    Download PDF (262K)
  • Yu KASHIMA, Takashi ISHIO, Katsuro INOUE
    Type: PAPER
    Subject area: Software Engineering
    2015 Volume E98.D Issue 1 Pages 119-130
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    Program slicing is an important approach for debugging, program comprehension, impact analysis, etc. There are various program slicing techniques ranging from the lightweight to the more accurate but heavyweight. Comparative analyses are important for selecting the most appropriate technique. This paper presents a comparative study of four backward program slicing techniques for Java. The results show the scalability and precision of these techniques. We develop guidelines that indicate which slicing techniques are appropriate for different situations, based on the results.
    Download PDF (863K)
  • Jie REN, Ling GAO, Hai WANG, Yan CHEN
    Type: PAPER
    Subject area: Information Network
    2015 Volume E98.D Issue 1 Pages 131-139
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    Smartphone battery life still suffers from limited battery capacity, and apps add to the burden when they download large data over slow networks, so managing download tasks is important. To this end, we propose a low-energy smartphone download strategy called CLSA (Concentrated Download and Low Power and Stable Link Selection Algorithm). The CLSA is intended to reduce the overhead of large downloads by delaying them appropriately, based on three major factors: the current network situation, the length of the download request queue, and local information on the smartphone. We evaluate the CLSA using a music player implemented on a ZTE V880 smartphone running the Android operating system, and compare it with two common download strategies, Minimum Delay and WiFi Only. Experiments show that our algorithm achieves a better trade-off between energy and delay than the other two.
    Download PDF (1024K)
  • Jing XUN, Ke-Ping LI, Yuan CAO
    Type: PAPER
    Subject area: Information Network
    2015 Volume E98.D Issue 1 Pages 140-147
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    Headway irregularity not only increases average passenger waiting time but also causes additional energy consumption and delay. A real-time headway control model is proposed to maintain headway regularity in railway networks by adjusting the travel time of each train on each segment. The adjustment of travel time is based on a consensus algorithm in which the control law is obtained by solving the Riccati equation; the minimum running time on a segment is also taken into account. An analysis of the computation time of the proposed method shows that it satisfies the requirements of real-time operation. The proposed model is tested in simulation, where the consensus trend of headways can be observed. The simulation results also show that average passenger waiting time decreases from 52 to 50 seconds per passenger; in addition, delay time is reduced by at least 6.5% and energy consumption by up to 0.1%.
    Download PDF (763K)
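The consensus idea above can be sketched as a plain first-order consensus update on headways (without the Riccati-based control law the paper derives); all numbers here are illustrative:

```python
def consensus_step(headways, k=0.3):
    """One consensus update: each train nudges its headway toward those of
    its neighbours (the preceding and following trains on a loop line)."""
    n = len(headways)
    return [h + k * ((headways[i - 1] - h) + (headways[(i + 1) % n] - h))
            for i, h in enumerate(headways)]

# Illustrative irregular headways (seconds) for four trains on a loop
h = [120.0, 90.0, 150.0, 100.0]
for _ in range(50):
    h = consensus_step(h)
print([round(x, 1) for x in h])  # all headways approach the mean, 115.0
```

Because the update is symmetric, the total (and hence average) headway is conserved while the irregularity decays geometrically.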
  • Xiaoyun WANG, Jinsong ZHANG, Masafumi NISHIDA, Seiichi YAMAMOTO
    Type: PAPER
    Subject area: Speech and Hearing
    2015 Volume E98.D Issue 1 Pages 148-156
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    This paper describes a novel method to improve the performance of second language speech recognition when the mother tongue of users is known. Considering that second language speech usually includes less fluent pronunciation and more frequent pronunciation mistakes, the authors propose using a reduced phoneme set generated by a phonetic decision tree (PDT)-based top-down sequential splitting method instead of the canonical one of the second language. The authors verify the efficacy of the proposed method using second language speech collected with a translation game type dialogue-based English CALL system. Experiments show that a speech recognizer achieved higher recognition accuracy with the reduced phoneme set than with the canonical phoneme set.
    Download PDF (1049K)
  • Yusuke IJIMA, Hideyuki MIZUNO
    Type: PAPER
    Subject area: Speech and Hearing
    2015 Volume E98.D Issue 1 Pages 157-165
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    This paper analyzes the correlation between various acoustic features and perceptual voice quality similarity, and proposes a perceptually similar speaker selection technique based on distance metric learning. To analyze the relationship between acoustic features and voice quality similarity, we first conduct a large-scale subjective experiment using the voices of 62 female speakers and perceptual voice quality similarity scores between all pairs of speakers are acquired. Next, multiple linear regression analysis is carried out; it shows that four acoustic features are highly correlated to voice quality similarity. The proposed speaker selection technique first trains a transform matrix based on distance metric learning using the perceptual voice quality similarity acquired in the subjective experiment. Given an input speech, acoustic features of the input speech are transformed using the trained transform matrix, after which speaker selection is performed based on the Euclidean distance on the transformed acoustic feature space. We perform speaker selection experiments and evaluate the performance of the proposed technique by comparing it to speaker selection without feature space transformation. The results indicate that transformation based on distance metric learning reduces the error rate by 53.9%.
    Download PDF (762K)
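The selection step described above, Euclidean distance in a feature space transformed by a learned matrix, can be sketched as follows; the matrix, features, and speaker names are hypothetical:

```python
def transform(W, x):
    """Apply the learned linear transform W to an acoustic feature vector x."""
    return [sum(W[i][j] * x[j] for j in range(len(x))) for i in range(len(W))]

def euclid(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

def select_speaker(W, query, speakers):
    """Pick the perceptually closest speaker in the transformed feature space."""
    tq = transform(W, query)
    return min(speakers, key=lambda s: euclid(tq, transform(W, speakers[s])))

# Hypothetical learned transform that down-weights the second feature,
# as metric learning might if that feature correlates poorly with similarity
W = [[1.0, 0.0], [0.0, 0.1]]
speakers = {"spk_A": [1.0, 9.0], "spk_B": [2.0, 1.0]}
query = [1.2, 1.5]
print(select_speaker(W, query, speakers))
```

With the identity matrix instead of W, the selection flips to the other speaker, which is the point of learning the transform from perceptual similarity scores.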
  • Nga H. DO, Keiji YANAI
    Type: PAPER
    Subject area: Image Processing and Video Processing
    2015 Volume E98.D Issue 1 Pages 166-172
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    In this paper, we propose a novel ranking method called VisualTextualRank, which ranks media data according to its relevance to specified keywords. We apply the method to a video shot ranking system that aims to automatically obtain, from Web videos, video shots corresponding to given action keywords. The keywords can be any type of action, such as “surfing wave” (sport) or “brushing teeth” (daily activity), and top-ranked video shots are expected to be relevant to them. While our baseline exploits only the visual features of the data, the proposed method employs both textual information (tags) and visual features. Our method is based on random walks over a bipartite graph that effectively integrates the visual information of video shots and the tag information of Web videos. Note that instead of treating the textual information as an additional feature for shot ranking, we exploit the mutual reinforcement between shots and the textual information of their corresponding videos to improve the ranking. We validated our framework on the database used by the baseline. Experiments showed that VisualTextualRank significantly improved the performance of video shot extraction over the baseline.
    Download PDF (2656K)
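A random walk over a shot-tag bipartite graph can be sketched roughly as below. This is a simplified power-iteration model with hypothetical shots and tags, not the paper's exact formulation:

```python
def bipartite_rank(shot_tag, query_tags, alpha=0.85, iters=50):
    """Random-walk ranking over a shot-tag bipartite graph (a simplified
    sketch in the spirit of VisualTextualRank, not the paper's model).
    Relevance mass starts at tags matching the query keywords and flows
    back and forth between tags and the shots annotated with them."""
    shots = sorted(shot_tag)
    tags = sorted({t for ts in shot_tag.values() for t in ts})
    restart = {t: (1.0 / len(query_tags) if t in query_tags else 0.0)
               for t in tags}
    tag_score = dict(restart)
    shot_score = {s: 0.0 for s in shots}
    for _ in range(iters):
        # tags -> shots: each tag spreads its score over the shots using it
        shot_score = {s: 0.0 for s in shots}
        for s in shots:
            for t in shot_tag[s]:
                deg = sum(t in shot_tag[x] for x in shots)
                shot_score[s] += tag_score[t] / deg
        # shots -> tags, with restart mass kept on the query tags
        tag_score = {t: (1 - alpha) * restart[t] for t in tags}
        for s in shots:
            for t in shot_tag[s]:
                tag_score[t] += alpha * shot_score[s] / len(shot_tag[s])
    return sorted(shots, key=shot_score.get, reverse=True)

# Hypothetical Web-video shots and their tags
shot_tag = {"shot1": {"surfing", "wave"},
            "shot2": {"surfing", "beach"},
            "shot3": {"cat"}}
print(bipartite_rank(shot_tag, {"surfing", "wave"}))  # shot1 ranked first
```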
  • Ho-Gun HA, Dae-Chul KIM, Wang-Jun KYUNG, Yeong-Ho HA
    Type: PAPER
    Subject area: Image Processing and Video Processing
    2015 Volume E98.D Issue 1 Pages 173-179
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    In digital cinema, an image goes through many processes, such as scanning, mastering, and the digital intermediate. Among them, the digital intermediate process plays a central role because it determines the final color of the image by editing and changing its colors. However, color distortions such as color bleeding arise when editing and changing local colors in an image. In this paper, local color improvement for the digital intermediate is proposed based on color transfer. Our method is a simple and efficient color improvement that requires neither precise image segmentation nor feature matching. To prevent color distortions, a modified color influence map with color categories is proposed. First, the source image is roughly segmented using a color category map, which groups similar colors in color space. Second, the color influence map is modified by assigning different weights to the lightness and chroma components. Lastly, the modified color influence map and the color category map filtered with anisotropic diffusion are combined. Experimental results show that the proposed method produces less color distortion in the resulting image.
    Download PDF (4879K)
  • Lu TIAN, Shengjin WANG
    Type: PAPER
    Subject area: Image Recognition, Computer Vision
    2015 Volume E98.D Issue 1 Pages 180-188
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    Person re-identification is the challenging problem of matching observations of individuals across non-overlapping camera views. When pedestrians walk across disjoint camera views, continuous motion information is lost, so re-identification relies mainly on appearance matching. Person re-identification is in fact a special case of near-duplicate search in image retrieval: given a probe, the task is to find the gallery image containing the same person. Many state-of-the-art methods in image retrieval are based on the Bag-of-Words (BOW) model. By adapting the BOW model to this task, we propose Bag-of-Ensemble-Colors (BOEC) to tackle person re-identification in this paper. We combine a low-level color histogram and semantic color names to represent human appearance, and employ mature, efficient techniques from image retrieval, including soft quantization, a burstiness-punishing strategy, and negative evidence. Taking a priori knowledge of human body structure into consideration, efficient spatial constraints are proposed to weaken the influence of the background. Extensive experiments on the VIPeR and ETHZ databases test the effectiveness of our approach, and promising results are obtained on these public databases. Compared with other unsupervised methods, we obtain state-of-the-art performance: the recognition rate is 32.23% on the VIPeR dataset, 87% on ETHZ SEQ.#1, 83% on ETHZ SEQ.#2, and 91% on ETHZ SEQ.#3.
    Download PDF (2009K)
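One of the retrieval techniques named above, soft quantization, can be sketched as follows; the codebook, weighting, and parameters are hypothetical:

```python
import math

def soft_assign(feature, codebook, sigma=1.0, k=3):
    """Soft quantization: assign a colour feature to its k nearest visual
    words with Gaussian weights instead of a single hard vote."""
    dists = sorted((sum((f - c) ** 2 for f, c in zip(feature, word)) ** 0.5, i)
                   for i, word in enumerate(codebook))[:k]
    weights = [(i, math.exp(-(d ** 2) / (2 * sigma ** 2))) for d, i in dists]
    total = sum(w for _, w in weights)
    return {i: w / total for i, w in weights}  # normalized votes per word

# Hypothetical 1-D colour codebook (e.g. quantized hue values)
codebook = [[0.0], [0.5], [1.0], [2.0]]
votes = soft_assign([0.45], codebook, sigma=0.5, k=2)
print(votes)  # most of the vote goes to word 1 (0.5), the rest to word 0
```

Spreading each vote over several words makes the histogram less sensitive to quantization boundaries, which is why soft assignment helps appearance matching.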
  • Hong WANG, Yue-hua LI, Ben-qing WANG
    Type: LETTER
    Subject area: Fundamentals of Information Systems
    2015 Volume E98.D Issue 1 Pages 189-192
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    This paper presents a novel signal analysis algorithm, named the High-order Bi-orthogonal Fourier Transform (HBFT), which can be seen as an extension of the Fourier transform. The HBFT formula and its discrete form are derived, and some of their main characteristics are briefly discussed. The paper also uses the HBFT to analyze multi-LFM signals and obtain their modulation rate parameters, and to analyze high-dynamic signals and obtain the parameters of accelerated and varying-acceleration motion. The results show that the HBFT is suitable for analyzing non-stationary signals with high-order components.
    Download PDF (247K)
  • SangHyuck BAE, DoYoung JUNG, CheolSe KIM, KyoungMoon LIM, Yong-Surk LE ...
    Type: LETTER
    Subject area: Human-computer Interaction
    2015 Volume E98.D Issue 1 Pages 193-196
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    For a large-sized touch screen, we designed and evaluated a real-time touch microarchitecture using a field-programmable gate array (FPGA). A high-speed hardware accelerator based on a parallel touch algorithm is proposed and implemented in this letter. The touch controller also has a timing control unit and an analog-to-digital converter (ADC) control unit for the analog touch sensing circuits. Measurements of processing time showed that the touch controller with the proposed microarchitecture is five times faster than a 32-bit reduced instruction set computer (RISC) processor without the touch accelerator.
    Download PDF (978K)
  • Hyunki LIM, Jaesung LEE, Dae-Won KIM
    Type: LETTER
    Subject area: Pattern Recognition
    2015 Volume E98.D Issue 1 Pages 197-200
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    We propose a new multi-label feature selection method that does not require the multi-label problem to be transformed into a single-label problem. Using quadratic programming, the proposed multi-label feature selection algorithm provides markedly better learning performance than conventional methods.
    Download PDF (272K)
  • Yinhui ZHANG, Zifen HE, Changyu LIU
    Type: LETTER
    Subject area: Pattern Recognition
    2015 Volume E98.D Issue 1 Pages 201-205
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    Segmenting foreground objects from highly dynamic scenes with missing data is very challenging. We present a novel unsupervised segmentation approach that can cope with extensive scene dynamics as well as a substantial amount of missing data. To make this possible, we first apply convex optimization of total variation to images with missing data for which a depletion mask is available. Inpainting depleted images using total variation facilitates detecting ambiguous objects in highly dynamic images, because it tends to yield object regions with improved grayscale contrast. We then use a conditional random field that integrates both appearance and motion cues of the foreground objects. Our approach segments foreground object instances while inpainting the highly dynamic scene, with varying amounts of missing data, in a coupled way. We demonstrate this on a very challenging dataset from the UCSD Highly Dynamic Scene Benchmarks (HDSB), compare our method with two state-of-the-art unsupervised image sequence segmentation algorithms, and provide quantitative and qualitative performance comparisons.
    Download PDF (2722K)
  • Tomohiro TAKAHASHI, Kazunori URUMA, Katsumi KONISHI, Toshihiro FURUKAW ...
    Type: LETTER
    Subject area: Speech and Hearing
    2015 Volume E98.D Issue 1 Pages 206-209
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    This letter deals with a signal declipping algorithm based on the matrix rank minimization approach, which can be applied to signal restoration in linear systems. We focus on the null space of a low-rank matrix and provide a block-adaptive algorithm for matrix-rank-minimization-based signal declipping, built on the null space alternating optimization (NSAO) algorithm. Numerical examples show that the proposed algorithm is faster than, and performs better than, other algorithms.
    Download PDF (387K)
  • Hanhoon PARK, Kwang-Seok MOON
    Type: LETTER
    Subject area: Image Processing and Video Processing
    2015 Volume E98.D Issue 1 Pages 210-213
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    Speeded-up robust features (SURF) can detect and describe scale- and rotation-invariant features at high speed by relying on integral images for image convolutions. However, the time taken to match SURF descriptors is still long, which has been an obstacle to their use in real-time applications; moreover, the matching time increases in proportion to the number of features and the dimensionality of the descriptor. Therefore, we propose a fast matching method that rearranges the elements of SURF descriptors based on their entropies, divides the descriptors into sub-descriptors, and matches them sequentially and analytically. Our results show that matching time can be reduced by about 75% at the expense of a small drop in accuracy.
    Download PDF (421K)
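The two ideas in this letter, entropy-based rearrangement of descriptor dimensions and sequential matching with early termination, can be sketched with toy descriptors (not real SURF vectors):

```python
import math

def entropy(values, bins=8):
    """Shannon entropy of one descriptor dimension across the dataset."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0        # guard against constant dimensions
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

def rearrange_by_entropy(descriptors):
    """Order dimensions so the most discriminative (highest-entropy) come first."""
    dims = range(len(descriptors[0]))
    order = sorted(dims, key=lambda d: entropy([x[d] for x in descriptors]),
                   reverse=True)
    return order, [[x[d] for d in order] for x in descriptors]

def match(query, candidates, threshold):
    """Sequentially accumulate squared distance over the rearranged elements
    and abandon a candidate as soon as it exceeds the current best."""
    best, best_d = None, threshold
    for i, c in enumerate(candidates):
        acc = 0.0
        for qv, cv in zip(query, c):
            acc += (qv - cv) ** 2
            if acc >= best_d:
                break                      # early termination: cannot win
        else:
            best, best_d = i, acc
    return best

descs = [[0.1, 5.0], [0.1, 1.0], [0.1, 9.0]]
order, reordered = rearrange_by_entropy(descs)
print(order)                               # the varying dimension comes first
print(match([1.2, 0.1], reordered, threshold=4.0))
```

Putting high-entropy elements first makes the partial distance grow fastest where descriptors actually differ, so most candidates are rejected after only a few terms.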
  • Takuya TAKASU, Yoshiki KUMAGAI, Gosuke OHASHI
    Type: LETTER
    Subject area: Image Processing and Video Processing
    2015 Volume E98.D Issue 1 Pages 214-217
    Published: 2015
    Released: January 01, 2015
    JOURNALS FREE ACCESS
    We previously proposed a query-by-sketch image retrieval system that uses an edge relation histogram (ERH). However, it is difficult for this method to retrieve partial objects from an image, because the ERH is a feature of the entire image, not of each object. Therefore, we propose an object-extraction method that uses edge-based features in order to enable the query-by-sketch system to retrieve partial images. This method is applied to 20,000 images from the Corel Photo Gallery. We confirm that retrieval accuracy is improved by using the edge-based features for extracting objects, enabling the query-by-sketch system to retrieve partial images.
    Download PDF (2610K)
Errata