IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Volume E102.D , Issue 1
Special Section on Enriched Multimedia — Making Multimedia More Convenient and Safer —
  • Keiichi IWAMURA
    2019 Volume E102.D Issue 1 Pages 1
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS
  • Kenta IIDA, Hitoshi KIYA
    Type: PAPER
    2019 Volume E102.D Issue 1 Pages 2-10
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    When images are shared via social networking services (SNS) and cloud photo storage services (CPSS), the JPEG images uploaded to the services are known to be mostly re-compressed by the providers. For this reason, this paper proposes a new image identification scheme for double-compressed JPEG images. The aim is to detect a single-compressed image that has the same original image as the double-compressed ones. In the proposed scheme, a feature extracted only from the DC coefficients among the DCT coefficients is used for identification. Using this feature not only robustly avoids errors caused by double compression but also allows identification across images of different sizes. The simulation results demonstrate the effectiveness of the proposed scheme in terms of querying performance.

  • Tatsuya CHUMAN, Kenta IIDA, Warit SIRICHOTEDUMRONG, Hitoshi KIYA
    Type: PAPER
    2019 Volume E102.D Issue 1 Pages 11-18
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    Encryption-then-Compression (EtC) systems have been proposed to securely transmit images through an untrusted channel provider. In this study, EtC systems were applied to social media like Twitter that carry out image manipulations. The block scrambling-based encryption schemes used in EtC systems were evaluated in terms of their robustness against image manipulation on social media. The aim was to investigate how five social networking service (SNS) providers, Facebook, Twitter, Google+, Tumblr and Flickr, manipulate images and to determine whether the encrypted images uploaded to SNS providers can avoid being distorted by such manipulations. In an experiment, encrypted and non-encrypted JPEG images were uploaded to various SNS providers. The results show that EtC systems are applicable to the five SNS providers.
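
The block scrambling idea mentioned above can be sketched as a keyed permutation of image tiles. This is a minimal illustration only: the function name `scramble_blocks`, the 16-pixel block size, and the use of a seeded NumPy generator as the key are assumptions, and any further steps of the actual EtC scheme are not stated in the abstract and are not shown.

```python
import numpy as np

def scramble_blocks(img, key, block=16, inverse=False):
    """Permute non-overlapping block x block tiles of a 2-D image with a keyed permutation."""
    h, w = img.shape
    assert h % block == 0 and w % block == 0, "image must tile evenly"
    rows, cols = h // block, w // block
    # Split the image into tiles: shape (rows * cols, block, block).
    tiles = (img.reshape(rows, block, cols, block)
                .swapaxes(1, 2)
                .reshape(rows * cols, block, block))
    # Hypothetical key handling: the key seeds the permutation.
    perm = np.random.default_rng(key).permutation(rows * cols)
    if inverse:
        out = np.empty_like(tiles)
        out[perm] = tiles      # scrambled tile i returns to position perm[i]
    else:
        out = tiles[perm]      # scrambled tile i is original tile perm[i]
    # Reassemble the tiles into an image of the original shape.
    return (out.reshape(rows, cols, block, block)
               .swapaxes(1, 2)
               .reshape(h, w))
```

Calling the function twice with the same key, the second time with `inverse=True`, restores the original image; pixel values inside each tile are never altered, only the tile positions.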

  • Ippei HAMAMOTO, Masaki KAWAMURA
    Type: PAPER
    2019 Volume E102.D Issue 1 Pages 19-30
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    An autoencoder has the potential ability to compress and decompress information. In this work, we consider the process of generating a stego-image from an original image and watermarks as compression, and the process of recovering the original image and watermarks from the stego-image as decompression. We propose embedder and extractor neural networks based on the autoencoder. The embedder network learns a mapping from the DCT coefficients of the original image and a watermark to those of the stego-image. The extractor network learns a mapping from the DCT coefficients of the stego-image to the watermark. Once the proposed neural network has been trained, it can embed the watermark into unlearned test images and extract it from them. We investigated the relation between the number of neurons and network performance by computer simulations and found that the trained neural network could provide high-quality stego-images and watermarks with few errors. We also evaluated the robustness against JPEG compression and found that, when suitable parameters were used, the watermarks were extracted with an average BER lower than 0.01 and image quality over 35 dB when the quality factor Q was over 50. We also investigated how our neural network represents the watermarks in the stego-image. There are two possibilities: distributed representation and sparse representation. From an investigation of the output of the stego layer (3rd layer), we found that the distributed representation emerged at an early learning step and the sparse representation emerged at a later step.

  • Hiroshi ITO, Tadashi KASEZAWA
    Type: PAPER
    2019 Volume E102.D Issue 1 Pages 31-40
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    Generation of secure signatures suitable for spread-spectrum video watermarking is proposed. The method embeds a message, which is a two-dimensional binary pattern, into a three-dimensional volume, such as video, by addition of a signature. The message can be a mark or a logo indicating the copyright information. The signature is generated by shuffling or permuting random matrices along the third or time axis so that the message is extracted when they are accumulated after demodulation by the correct key. In this way, a message is hidden in the signature having equal probability of decoding any variation of the message, where the key is used to determine which one to extract. Security of the proposed method, stemming from the permutation, is evaluated as resistance to blind estimation of secret information. The matrix-based permutation allows the message to survive spatial down-sampling without sacrificing security. The downside of the proposed method is that it needs more data or frames to decode reliable information than conventional spread-spectrum modulation does. However, this drawback is minimized by segmenting the matrices and applying permutation to sub-matrices independently. Message detectability is theoretically analyzed. Superiority of our method in terms of robustness to blind message estimation and down-sampling is verified experimentally.

  • Minoru KURIBAYASHI, Takuya FUKUSHIMA, Nobuo FUNABIKI
    Type: PAPER
    2019 Volume E102.D Issue 1 Pages 41-47
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    The spaces between words and paragraphs are popular places for embedding data in data hiding techniques for text documents. Due to the low redundancy in text documents, the payload is necessarily small. As each bit of data is independently inserted into specific spaces in conventional methods, a malicious party may be able to modify the data without causing serious visible distortions. In this paper, we regard a collection of space lengths as a one-dimensional feature vector and embed a watermark into its frequency components. To keep the embedded information secret, a random permutation and dither modulation are introduced in the operation. Furthermore, robustness against additive noise is enhanced by controlling the payload. Through experiments, we evaluated the trade-off among payload, distortion, and robustness in the proposed method.
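
As a rough illustration of the dither modulation step named above, the sketch below embeds one bit per coefficient by quantizing it onto one of two interleaved lattices. The function names, the step size, and the way the dither vector is generated are assumptions for illustration; the frequency transform of the space lengths and the random permutation described in the abstract are omitted.

```python
import numpy as np

def qim_embed(coeffs, bits, step, dither):
    """Quantize each coefficient onto the lattice selected by its bit."""
    coeffs = np.asarray(coeffs, dtype=float)
    bits = np.asarray(bits)
    # Bit 1 shifts the quantization lattice by half a step relative to bit 0.
    offset = dither + bits * step / 2.0
    return np.round((coeffs - offset) / step) * step + offset

def qim_extract(coeffs, step, dither):
    """Pick, per coefficient, the bit whose lattice lies closest."""
    coeffs = np.asarray(coeffs, dtype=float)
    dists = []
    for b in (0, 1):
        offset = dither + b * step / 2.0
        nearest = np.round((coeffs - offset) / step) * step + offset
        dists.append(np.abs(coeffs - nearest))
    return (dists[1] < dists[0]).astype(int)
```

In this sketch the embedding changes each coefficient by at most half a step, and extraction stays correct as long as additive noise stays below a quarter of a step, which is one way to see the trade-off between distortion and robustness.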

  • Duc V. NGUYEN, Huyen T. T. TRAN, Truong Cong THANG
    Type: LETTER
    2019 Volume E102.D Issue 1 Pages 48-51
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    360-degree video is an important component of the emerging Virtual Reality. In this paper, we propose a new adaptation method for tiling-based viewport adaptive streaming of 360-degree video. The proposed method is able to dynamically select the best tiling scheme given the network conditions and user status. Experiments show that our proposed method can improve the viewport quality by up to 2.3 dB compared to a conventional fixed tiling method.

Regular Section
  • Takashi YOKOTA, Kanemitsu OOTSU, Takeshi OHKAWA
    Type: PAPER
    Subject area: Computer System
    2019 Volume E102.D Issue 1 Pages 52-74
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    State-of-the-art parallel systems employ a huge number of computing nodes connected by an interconnection network (ICN). The ICN plays an important role in a parallel system, since it is responsible for communication capability. In general, an ICN shows non-linear phenomena in its communication performance, most of which are caused by congestion. Thus, designing a large-scale parallel system requires sufficient discussion through repetitive simulation runs. This raises another problem: simulating large-scale systems at a reasonable cost. This paper presents a promising solution by introducing the cellular automata concept, which originated in our prior work. Assuming 2D-torus topologies to simplify the discussion, this paper discusses the fundamental design of router functions in terms of cellular automata, the data structure of packets, alternative modeling of a router function, and miscellaneous optimizations. The proposed models have a good affinity for GPGPU technology and, as a representative speed-up result, the GPU-based simulator accelerates simulation by up to about 1264 times relative to sequential execution on a single CPU. Furthermore, since the proposed models are applicable in the shared memory model, a multithreaded implementation of the proposed methods achieves a speed-up of about 162 times at the maximum.

  • Haijin JI, Song HUANG, Xuewei LV, Yaning WU, Yuntian FENG
    Type: PAPER
    Subject area: Software Engineering
    2019 Volume E102.D Issue 1 Pages 75-84
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    Software defect prediction (SDP) plays a significant part in allocating testing resources reasonably, reducing testing costs, and ensuring software quality. One of the most widely used algorithms for SDP models is Naive Bayes (NB) because of its simplicity, effectiveness and robustness. In NB, when a data set has continuous or numeric attributes, they are generally assumed to follow normal distributions, and the probability density function of the normal distribution is incorporated into their conditional probability estimates. However, after conducting a Kolmogorov-Smirnov test, we find that the 21 main software metrics follow non-normal distributions at the 5% significance level. Therefore, this paper proposes an improved NB approach, which estimates the conditional probabilities of NB with kernel density estimation on the training data sets, to help improve the prediction accuracy of NB for SDP. To evaluate the proposed method, we carry out experiments on 34 software releases obtained from 10 open source projects provided by the PROMISE repository. Four well-known classification algorithms are included for comparison, namely Naive Bayes, Support Vector Machine, Logistic Regression and Random Tree. The obtained results show that the new method is more successful than the four well-known classification algorithms on most of the software releases.
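
The idea of replacing the normal-distribution assumption with kernel density estimation can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Gaussian kernel, the fixed `bandwidth` value, and the function names are assumptions, since the abstract does not say how the kernel or bandwidth are chosen.

```python
import numpy as np

def kde_log_density(x, samples, bandwidth):
    """Log of a Gaussian kernel density estimate of `samples`, evaluated at scalar x."""
    z = (x - samples) / bandwidth
    kernel = np.exp(-0.5 * z ** 2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return np.log(kernel.mean() + 1e-300)   # small floor avoids log(0)

def kde_nb_predict(X_train, y_train, x, bandwidth=0.5):
    """Return the class maximizing log prior + sum of per-feature KDE log densities."""
    best_label, best_score = None, -np.inf
    for label in np.unique(y_train):
        rows = X_train[y_train == label]
        # Class prior estimated from class frequencies in the training data.
        score = np.log(len(rows) / len(X_train))
        # Naive Bayes assumption: features are conditionally independent.
        for j in range(X_train.shape[1]):
            score += kde_log_density(x[j], rows[:, j], bandwidth)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Unlike a fitted normal density, the kernel estimate follows whatever shape the training values of each metric take, for example a skewed or bimodal distribution.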

  • Guo-chao FAN, Chun-sheng HU, Xue-en ZHENG, Cheng-dong XU
    Type: PAPER
    Subject area: Data Engineering, Web Information Systems
    2019 Volume E102.D Issue 1 Pages 85-92
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    In the GNSS (Global Navigation Satellite System) Distributed Simulation Environment (GDSE), simulation tasks can be designed using models shared on the Internet. However, a great deal of model information and many relations between models need to be managed in GDSE. In particular, when there is a large number of shared models, model retrieval becomes extremely complex. To meet the management demands of GDSE and improve model retrieval efficiency, the characteristics of service simulation models are first analysed. A semantic management method for simulation models is then proposed, and a model management architecture is designed. Compared with the traditional retrieval approach, it takes less retrieval time and yields more accurate results. The simulation results show that retrieval in the semantic management module understands user needs well and helps users obtain appropriate models rapidly, which improves the efficiency of simulation task design.

  • Daisuke YAMAMOTO, Masaki MURASE, Naohisa TAKAHASHI
    Type: PAPER
    Subject area: Data Engineering, Web Information Systems
    2019 Volume E102.D Issue 1 Pages 93-103
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    Road generalization is a method for thinning out road networks to allow easy viewing according to the size of the map. Most conventional road generalization methods mainly focus on the length of a stroke, which is a chain of links with good continuity based on the principle of perceptual grouping applied to network data such as roads and rivers. However, in the case of facility search in a web map service, for example, a “restaurant guide map,” a road generalization mechanism can be more effective if it depends not only on the stroke length but also on the facility search results. Accordingly, in this study, we implement an on-demand road generalization method that adapts to both the facility search results and the stroke length. Moreover, a sufficiently fast response speed is achieved for practical use in web map services. In particular, this study proposes a fat-stroke model that links facility information to individual strokes and implements a road generalization method that uses this model to improve the response time. In addition, we develop a prototype based on the proposed system. The system evaluation results are based on three indicators, namely, response time of the road generalization system, connectivity between strokes, and connectivity between stroke and facilities. Our experimental results suggest that the proposed method can yield improved response times by a factor of 100 or more while affording higher connectivity.

  • Yasser MOHAMMAD, Kazunori MATSUMOTO, Keiichiro HOASHI
    Type: PAPER
    Subject area: Information Network
    2019 Volume E102.D Issue 1 Pages 104-115
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    Activity recognition from sensors is a classification problem over time-series data. Some research in the area utilizes handcrafted time- and frequency-domain features that differ between datasets. Another, categorically different approach is to use deep learning methods for feature learning. This paper explores a middle ground in which an off-the-shelf feature extractor is used to generate a large number of candidate time-domain features, followed by a feature selector that was designed to reduce the bias toward specific classification techniques. Moreover, this paper advocates the use of features that are mostly insensitive to sensor orientation and shows their applicability to the activity recognition problem. The proposed approach is evaluated using six different publicly available datasets collected under various conditions using different experimental protocols and shows accuracy comparable to or higher than that of state-of-the-art methods on most datasets, usually while using an order of magnitude fewer features.

  • Yeong-Mo YEON, Seung-Hee KIM
    Type: PAPER
    Subject area: Information Network
    2019 Volume E102.D Issue 1 Pages 116-123
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    The CC-Link proposed by the Mitsubishi Electric Company is an industrial network used exclusively in most industries. However, the probabilities of data loss and interference with equipment control increase if the transmission time is greater than the link scan time of 381µs. The link scan time can be reduced by designing the CC-Link module as an external microprocessor (MPU) interface of R-IN32M3; however, it then suffers from expandability issues. Thus, in this paper, we propose a new CC-Link module utilizing R-IN32M3 to improve the expandability. In our designed CC-Link module, we devise a dual-port RAM (DPRAM) function in an external I/O module, which enables parallel communication between the DPRAM and the external MPU. Our experiment with the implemented CC-Link prototype demonstrates that our CC-Link design improves the communication speed owing to the parallel communication between DPRAM and external MPU, and expandability of remote I/O. Our design achieves miniaturization of the CC-Link module, wiring reduction, and an approximately 30% reduction in the link scan time. Furthermore, because we utilize both the Renesas R-IN32M3 and Xilinx XC95144XL chips widely used in diverse application areas, the designed CC-Link module reduces the investment cost. The proposed design is expected to significantly contribute to the utilization of the programmable logic controller memory and I/O expansion for factory automation and improvement of the investment efficiency in the flat panel display industry.

  • Amin JAMALI, Seyed Mostafa SAFAVI HEMAMI, Mehdi BERENJKOUB, Hossein SA ...
    Type: PAPER
    Subject area: Information Network
    2019 Volume E102.D Issue 1 Pages 124-132
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    Device-to-device (D2D) communication in cellular networks is defined as direct communication between two mobile users without traversing the base station (BS) or core network. D2D communication can occur on the cellular frequencies (i.e., inband) or unlicensed spectrum (i.e., outband). A high-capacity IEEE 802.11-based outband device-to-device communication system for cellular networks is introduced in this paper. Transmissions in device-to-device connections are managed using our proposed medium access control (MAC) protocol. In the proposed MAC protocol, the backoff window size is adjusted dynamically considering the current network status and utilizing an appropriate transmission attempt rate. In our protocol design, we consider both the case in which the request to send/clear to send (RTS/CTS) mechanism is used and the case in which it is not. We also describe mechanisms for guaranteeing quality of service (QoS) and enhancing the reliability of the system. Moreover, the performance of the system in the presence of channel impairments is investigated analytically and through simulations. Analytical and simulation results demonstrate that our proposed system has high throughput and can provide different levels of QoS for its users.

  • Juan CHEN, Shen SU, Xianzhi WANG
    Type: PAPER
    Subject area: Information Network
    2019 Volume E102.D Issue 1 Pages 133-146
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    Location sharing services have recently gained momentum over mobile online social networks (mOSNs), given the increasing popularity of GPS-capable mobile devices such as smart phones. Despite the convenience of location sharing, it brings severe privacy risks. Although many efforts have been made to protect user privacy during location sharing, many of them rely on the extensive deployment of trusted Cellular Towers (CTs) and some incur excessive time overhead. More importantly, little research so far supports complete privacy, including location privacy, identity privacy and social relation privacy. We propose SAM, a new System Architecture for mOSNs, and P3S, a Privacy-Preserving Protocol based on SAM, to address the above issues for privacy-preserving location sharing over mOSNs. SAM and P3S differ from previous work in providing complete privacy for location sharing services over mOSNs. Theoretical analysis and extensive experimental results demonstrate the feasibility and efficiency of the proposed system and protocol.

  • Zhiming WU, Tao LIN, Ming LI
    Type: PAPER
    Subject area: Educational Technology
    2019 Volume E102.D Issue 1 Pages 147-155
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    Handwriting difficulties (HWDs) in children have adverse effects on their confidence and academic progress. Detecting HWDs is the first crucial step toward clinical or teaching intervention for children with HWDs. To date, automatically detecting HWDs remains a challenge, although digitizing tablets have provided an opportunity to automatically collect handwriting process information. In particular, to the best of our knowledge, no work has explored the potential of combining machine learning algorithms and handwriting process information to automatically detect Chinese HWDs in children. To bridge the gap, we first conducted an experiment to collect sample data and then compared the performance of five commonly used classification algorithms (Decision tree, Support Vector Machine (SVM), Artificial Neural Network, Naïve Bayesian and k-Nearest Neighbor) in detecting HWDs. The results showed that: (1) only a small proportion (13%) of children had Chinese HWDs, and each classification model on the imbalanced dataset (39 children at risk of HWDs versus 261 typical children) produced results that were better than random guesses, indicating the possibility of using classification algorithms to detect Chinese HWDs; (2) the SVM model had the best performance in detecting Chinese HWDs among the five classification models; and (3) the performance of the SVM model, especially its sensitivity, could be significantly improved by employing the Synthetic Minority Oversampling Technique to handle the class-imbalanced data. This study provides new insights into which handwriting features are predictive of Chinese HWDs in children and proposes a method that can help clinical and educational professionals to automatically detect children at risk of Chinese HWDs.
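
The Synthetic Minority Oversampling Technique mentioned in result (3) creates new minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below is a minimal NumPy version under stated assumptions: the function name `smote`, the default `k=5`, and the seed are illustrative, and the abstract does not say which implementation or oversampling ratio the authors used.

```python
import numpy as np

def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    minority = np.asarray(minority, dtype=float)
    rng = np.random.default_rng(seed)
    k = min(k, len(minority) - 1)            # requires at least two minority samples
    # Pairwise Euclidean distances; a point is never its own neighbour.
    diff = minority[:, None, :] - minority[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)
    neighbours = np.argsort(dist, axis=1)[:, :k]
    synthetic = np.empty((n_new, minority.shape[1]))
    for i in range(n_new):
        base = rng.integers(len(minority))
        other = neighbours[base, rng.integers(k)]
        gap = rng.random()                   # position along the segment, in [0, 1)
        synthetic[i] = minority[base] + gap * (minority[other] - minority[base])
    return synthetic
```

As a purely hypothetical usage, generating 222 synthetic samples from the 39 at-risk children would bring the minority class up to the 261 samples of the majority class before a classifier is trained.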

  • Siyang YU, Kazuaki KONDO, Yuichi NAKAMURA, Takayuki NAKAJIMA, Hiroaki ...
    Type: PAPER
    Subject area: Educational Technology
    2019 Volume E102.D Issue 1 Pages 156-164
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    Pronunciation is a fundamental factor in speaking and listening. However, instructions for important articulation have not been sufficiently provided in conventional computer-assisted language learning (CALL) systems. One typical case is the articulation of rounded vowels. Although lip protrusion is essential for their correct pronunciation, the perception of lip protrusion is often difficult for beginners. To tackle this issue, we propose an innovative method that will provide a comprehensive visual explanation for articulation. Lip movements are three-dimensionally measured, and face images or videos are pseudocoloured on the basis of the movements. The coloured regions represent the lip protrusion of rounded vowels. To verify the learning effect of the proposed method, we conducted experiments with Japanese undergraduates in Chinese classes. The results showed that our method has advantages over conventional video materials.

  • Takuya KAMITANI, Hiroki YOSHIMURA, Masashi NISHIYAMA, Yoshio IWAI
    Type: PAPER
    Subject area: Image Recognition, Computer Vision
    2019 Volume E102.D Issue 1 Pages 165-174
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    We propose a method for accurately identifying people using temporal and spatial changes in local movements measured from video sequences of body sway. Existing methods identify people using gait features that mainly represent the large swinging of the limbs. The use of gait features introduces a problem in that the identification performance decreases when people stop walking and maintain an upright posture. To extract informative features, our method measures small swings of the body, referred to as body sway. We extract the power spectral density as a feature from local body sway movements by dividing the body into regions. To evaluate the identification performance using our method, we collected three original video datasets of body sway sequences. The first dataset contained a large number of participants in an upright posture. The second dataset included variation over the long term. The third dataset represented body sway in different postures. The results on the datasets confirmed that our method using local movements measured from body sway can extract informative features for identification.
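
To make the power spectral density feature concrete, the sketch below computes a simple periodogram of one sway signal. It assumes the local movement of a body region has already been reduced to a 1-D displacement signal sampled at `fs` Hz; the function name and the choice of a plain periodogram are assumptions, as the abstract does not state how the signal is measured or which spectral estimator is used.

```python
import numpy as np

def power_spectral_density(signal, fs):
    """One-sided periodogram estimate of the PSD of a real signal sampled at fs Hz."""
    signal = np.asarray(signal, dtype=float)
    signal = signal - signal.mean()          # remove the constant offset
    n = len(signal)
    spectrum = np.fft.rfft(signal)
    psd = (np.abs(spectrum) ** 2) / (fs * n)
    # Fold the negative frequencies into the positive ones.
    # The Nyquist bin exists only for even n and must not be doubled.
    if n % 2 == 0:
        psd[1:-1] *= 2.0
    else:
        psd[1:] *= 2.0
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, psd
```

Computing one such spectrum per body region and concatenating the results would give a feature vector in the spirit of the description above.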

  • Chenggang GUO, Dongyi CHEN, Zhiqi HUANG
    Type: PAPER
    Subject area: Image Recognition, Computer Vision
    2019 Volume E102.D Issue 1 Pages 175-184
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    Sparse representation has been successfully applied to visual tracking. Recent progress in sparse tracking has mainly been made within the particle filter framework. However, most sparse trackers need to extract complex feature representations for each particle in the limited sample space, leading to high computational cost and yielding inferior tracking performance. To deal with the above issues, we propose a novel sparse tracking method based on the circulant reverse lasso model. Benefiting from the properties of circulant matrices, densely sampled target candidates are implicitly generated by cyclically shifting the base feature descriptors, and then embedded into a reverse sparse reconstruction model as a dictionary to encode a robust appearance template. The alternating direction method of multipliers is employed to solve the reverse sparse model, and the optimization can be carried out efficiently in the frequency domain, which enables the proposed tracker to run in real time. The calculated sparse coefficient map represents the similarity scores between the template and the circularly shifted samples. Thus, the target location can be directly predicted according to the coordinates of the peak coefficient. A scale-aware template updating strategy is combined with the correlation filter template learning to take into account both appearance deformations and scale variations. Both quantitative and qualitative evaluations on two challenging tracking benchmarks demonstrate that the proposed algorithm performs favorably against several state-of-the-art sparse representation based tracking methods.
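
The property of circulant matrices referred to above can be checked in a few lines: scoring a template against every cyclic shift of a base sample is a circular cross-correlation, which the FFT computes without building the shifted samples. The 1-D sketch below shows only this property; the function names are illustrative, and the reverse lasso model and its ADMM solver are not reproduced.

```python
import numpy as np

def scores_by_shifting(template, base):
    """Dot product of the template with every cyclic shift of the base sample."""
    n = len(base)
    return np.array([np.dot(template, np.roll(base, s)) for s in range(n)])

def scores_by_fft(template, base):
    """Same scores via the correlation theorem, in O(n log n)."""
    spectrum = np.fft.fft(template) * np.conj(np.fft.fft(base))
    return np.real(np.fft.ifft(spectrum))
```

When the template is itself a shifted copy of the base sample, the index of the largest score recovers the shift, which mirrors how a peak in a response map is read as the target location.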

  • Yotaro FUSE, Hiroshi TAKENOUCHI, Masataka TOKUMARU
    Type: PAPER
    Subject area: Kansei Information Processing, Affective Information Processing
    2019 Volume E102.D Issue 1 Pages 185-194
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    Herein, we propose a robot model that obeys the norm of a certain group by interacting with the group members. Using this model, a robot system learns the norm of the group as a group member itself. People with individual differences form a group and a characteristic norm that reflects the group members' personalities. When robots join a group that includes humans, the robots need to obey this characteristic norm, that is, the group norm. We investigated whether the robot system generates a decision-making criterion to obey group norms by learning from interactions through reinforcement learning. In the experiment, human group members and the robot system answered the same easy quizzes, which could have several vague answers. For cases in which the group members initially answered differently from one another, we investigated whether the group members answered the quizzes while considering the group norm. To avoid bias toward the system's answers, one of the participants in a group only obeys the system, whereas the other participants are unaware of the system. Our experiments revealed that the group comprising the participants and the robot system forms group norms. The proposed model enables a social robot to make decisions socially in order to adjust its behavior to common sense not only in a large human society but also in partial human groups, e.g., local communities. Therefore, we presume that such robots can join human groups by interacting with their members. To adapt to these groups, the robots adjust their own behaviors. However, further studies are required to reveal whether the robots' answers affect people and whether the participants can form a group norm based on a robot's answer even in a situation wherein the participants recognize that they are interacting in a group that includes a real robot. Moreover, to prevent biased answers, some participants in a group do not know that the other participant only obeys the system's decisions and pretends to answer questions.

  • Hyun-Chul YI, Joon-Young CHOI
    Type: LETTER
    Subject area: Software System
    2019 Volume E102.D Issue 1 Pages 195-197
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    We improve the cycle time performance of EtherCAT networks with an embedded Linux-based master by developing a Linux Ethernet driver optimized for EtherCAT operation. The Ethernet driver is developed to establish a direct interface between the master module and the Ethernet controllers of embedded systems by removing the involvement of the Linux network stack and the New API (NAPI) of standard Ethernet drivers. Consequently, time-consuming memory copy operations are reduced and the processing of EtherCAT frames is accelerated. To demonstrate the effect of the developed Ethernet driver, we set up EtherCAT networks composed of an embedded Linux-based master and commercial off-the-shelf slaves, and the experimental results confirm that the cycle time performance is significantly improved.

  • Yan SUN, Guorui FENG, Yanli REN
    Type: LETTER
    Subject area: Information Network
    2019 Volume E102.D Issue 1 Pages 198-201
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    In this paper, we propose a novel algorithm called multi-projection ensemble discriminant clustering (MPEDC) for JPEG steganalysis. The scheme uses the optimal projection of the linear discriminant analysis (LDA) algorithm to obtain additional projection vectors through the micro-rotation method. These vectors are similar to the optimal vector. MPEDC incorporates the unsupervised K-means algorithm to make a comprehensive classification decision adaptively. The power of the proposed method is demonstrated on three steganographic methods with three feature extraction methods. Experimental results show that the accuracy can be improved using iterative discriminant classification.

  • Yilu MA, Zhihui YE, Yuehua LI
    Type: LETTER
    Subject area: Pattern Recognition
    2019 Volume E102.D Issue 1 Pages 202-205
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    Conventional target recognition methods usually suffer from information loss and target-aspect sensitivity when applied to radar high resolution range profile (HRRP) recognition. Thus, effectively establishing a robust and discriminative feature representation can significantly improve the performance of practical radar applications. In this work, we present a novel feature extraction method, based on a modified collaborative auto-encoder, for millimeter-wave radar HRRP recognition. A latent frame-specific weight vector is trained for the samples in a frame, which helps retain local information for different targets. Experimental results demonstrate that the proposed algorithm obtains higher target recognition accuracy than conventional target recognition algorithms.

  • Tie HONG, Yuan Wei LI, Zhi Ying WANG
    Type: LETTER
    Subject area: Pattern Recognition
    2019 Volume E102.D Issue 1 Pages 206-209
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    This paper studies head action recognition, a specific problem in action recognition. Unlike in most existing research, our head action recognition problem is defined specifically to meet the requirements of some practical applications. Based on our definition, we build a corresponding head action dataset that contains many challenging cases. For action recognition, we propose a real-time head action recognition framework based on HOF and ELM. The framework consists of face detection based ROI determination, HOF feature extraction in the ROI, and ELM based action prediction. Experiments show that our method achieves good accuracy and is efficient enough for practical applications.

  • Jaihyun PARK, Bonhwa KU, Youngsaeng JIN, Hanseok KO
    Type: LETTER
    Subject area: Image Processing and Video Processing
    2019 Volume E102.D Issue 1 Pages 210-213
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    Side scan sonar using low frequency can quickly search a wide range, but the images acquired are of low quality. The image super resolution (SR) method can mitigate this problem. The SR method typically uses sparse coding, but accurately estimating sparse coefficients incurs substantial computational costs. To reduce processing time, we propose a region-selective sparse coding based SR system that emphasizes object regions. In particular, the region that contains objects of interest is detected in side scan sonar based underwater images so that the subsequent sparse coding based SR process can be selectively applied. The effectiveness of the proposed method is verified by the reduced processing time required for image reconstruction while preserving the same level of visual quality as conventional methods.

  • Ruibin GUO, Dongxiang ZHOU, Keju PENG, Yunhui LIU
    Type: LETTER
    Subject area: Image Recognition, Computer Vision
    2019 Volume E102.D Issue 1 Pages 214-218
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    Pose estimation is a basic requirement for the autonomous behavior of robots. In this article, we present a robust and fast visual odometry method to obtain camera poses by using RGB-D images. We first propose a motion estimation method based on a sparse geometric constraint and derive the analytic Jacobian of the geometric cost function to improve the convergence performance. We then use our motion estimation method to replace the tracking thread in ORB-SLAM to improve its runtime performance. Experimental results show that our method is twice as fast as ORB-SLAM while maintaining similar accuracy.

  • Jun OU, Yujian LI
    Type: LETTER
    Subject area: Biocybernetics, Neurocomputing
    2019 Volume E102.D Issue 1 Pages 219-222
    Published: January 01, 2019
    Released: January 01, 2019
    JOURNALS FREE ACCESS

    Speeding up the network layers and reducing the number of network parameters in convolutional neural networks (CNNs) is an active research topic. In this paper, we propose a novel method, namely, symmetric decomposition of convolution kernels (SDKs). It symmetrically separates k×k convolution kernels into (k×1 and 1×k) or (1×k and k×1) kernels. We conduct comparison experiments with network models designed using SDKs on the MNIST and CIFAR-10 datasets. Compared with the corresponding CNNs, we obtain good recognition performance, with 1.1×-1.5× speedup and more than 30% reduction of network parameters. The experimental results indicate that our method is useful and effective for CNNs in practice, in terms of speedup performance and reduction of parameters.
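
The effect of splitting a k×k kernel into a k×1 and a 1×k kernel on the weight count can be worked out with simple arithmetic. The sketch below is a back-of-the-envelope calculation under stated assumptions: biases are ignored, the intermediate feature map is assumed to keep the output channel count, and the helper names are illustrative rather than taken from the paper.

```python
def conv_params(c_in, c_out, kh, kw):
    """Weights in a convolution layer with a kh x kw kernel (bias ignored)."""
    return c_in * c_out * kh * kw

def sdk_params(c_in, c_out, k):
    """Weights when a k x k layer is replaced by a k x 1 layer followed by a 1 x k layer."""
    # Assumption: the intermediate feature map keeps c_out channels.
    return conv_params(c_in, c_out, k, 1) + conv_params(c_out, c_out, 1, k)

def reduction(c_in, c_out, k):
    """Fraction of weights saved by the decomposition."""
    return 1.0 - sdk_params(c_in, c_out, k) / conv_params(c_in, c_out, k, k)
```

For a 3×3 layer with 64 input and 64 output channels, this idealized count drops from 36864 to 24576 weights, a saving of one third. The actual network models in the paper may differ from this simplified setting.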
