-
Masao Yamanaka, Masakazu Matsugu, Masashi Sugiyama
2013 Volume 8 Issue 4 Pages
929-936
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
Detection of salient objects in images has been an active area of research in the computer vision community. However, existing approaches tend to perform poorly in noisy environments because probability density estimation involved in the evaluation of visual saliency is not reliable. Recently, a novel machine learning approach that directly estimates the ratio of probability densities was demonstrated to be a promising alternative to density estimation. In this paper, we propose a salient object detection method based on direct density-ratio estimation, and demonstrate its usefulness in experiments.
-
Masao Yamanaka, Masakazu Matsugu, Masashi Sugiyama
2013 Volume 8 Issue 4 Pages
937-943
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
We propose a method of unsupervised event detection from a video that compares probability distributions of past and current video sequence data in a sequential and hierarchical way. Because estimation of probability distributions is known to be difficult, naively comparing probability distributions via probability distribution estimation tends to be unreliable in practice. To cope with this problem, we use the state-of-the-art machine learning technique called density ratio estimation: the ratio of probability densities is directly estimated without density estimation, and thus probability distributions can be compared in a reliable way. Through experiments on a walking scene and a tennis match, we demonstrate the usefulness of the proposed approach.
-
Sorn Jarukasemratana, Tsuyoshi Murata
2013 Volume 8 Issue 4 Pages
944-960
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
Large graph visualization tools are important instruments for researchers to understand large graph data sets. Many tools are currently available, some for download and use under free licenses and others described in research papers or journals, each with its own functionalities and capabilities. This review introduces these large graph visualization tools and emphasizes the advantages of each over the others. A tool was selected for review if it was published recently (2009 or later) or if a new version was released during the last two years. The tools reviewed in this paper are igraph, Gephi, Cytoscape, Tulip, WiGis, CGV, VisANT, Pajek, In Situ Framework, and Honeycomb, along with two visualization toolkits, JavaScript InfoVis Toolkit and GraphGL. The last part of the review presents our suggestions for building a large graph visualization platform based on the advantages of the tools and toolkits reviewed.
-
Tomohisa Egawa, Naoki Nishimura, Kenichi Kourai
2013 Volume 8 Issue 4 Pages
961-970
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
In Infrastructure-as-a-Service (IaaS) clouds, users manage the systems in the provided virtual machines (VMs), called user VMs, through remote management software such as Virtual Network Computing (VNC). For dependability, they often perform out-of-band remote management via the management VM. Even in the case of system failures inside their VMs, the users can still access their systems directly. However, the management VM is not always trustworthy in IaaS. Once outside or inside attackers intrude into the management VM, they can easily eavesdrop on all the inputs and outputs in remote management. To solve this security issue, this paper proposes FBCrypt for preventing information leakage via the management VM in out-of-band remote management. FBCrypt encrypts the inputs and outputs between a VNC client and a user VM using the virtual machine monitor (VMM). Sensitive information is thus protected against the management VM, which sits between them. The VMM intercepts the reads of virtual devices by a user VM and decrypts the inputs, whereas it intercepts the updates of a framebuffer by a user VM and encrypts the pixel data. We have implemented FBCrypt for para-virtualized and fully-virtualized guest operating systems in Xen and TightVNC, and confirmed that no keystrokes or pixel data leaked.
-
Satoshi Yoshida, Takuya Kida
2013 Volume 8 Issue 4 Pages
971-977
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
In this study, we address the problem of improving variable-length-to-fixed-length codes (VF codes). A VF code is an encoding scheme that uses a fixed-length code, which provides easy access to compressed data. However, conventional VF codes generally have an inferior compression ratio compared with variable-length codes. A method proposed by Uemura et al. in 2010 delivered a good compression ratio comparable with that of gzip, but it was very time consuming. In this study, we propose a new VF coding method that applies a fixed-length code to a set of rules extracted using the Re-Pair algorithm, which was proposed by Larsson and Moffat in 1999. The Re-Pair algorithm is a simple offline grammar-based compression method that achieves a good compression ratio at moderate compression speed. We also present experimental results, which demonstrate that our proposed coding method is superior to the existing VF coding method.
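The pair-replacement idea behind Re-Pair can be sketched as follows. This is a naive illustration only, not the authors' VF coder: the function names `repair` and `expand`, the use of integers as nonterminals, and the quadratic-time loop are all assumptions made for this example, and the way fixed-length codewords are assigned to the extracted rules is not reproduced here.

```python
from collections import Counter


def repair(text):
    """Naive Re-Pair sketch: repeatedly replace the most frequent adjacent
    pair of symbols with a fresh nonterminal until no pair occurs twice."""
    seq = list(text)   # working sequence; terminals are 1-character strings
    rules = {}         # nonterminal (int) -> (left symbol, right symbol)
    next_id = 0
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break
        nonterminal = next_id
        next_id += 1
        rules[nonterminal] = pair
        # Left-to-right scan replacing non-overlapping occurrences of `pair`.
        # (The count above includes overlapping occurrences, so fewer
        # replacements than `count` may happen; the output stays correct.)
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(nonterminal)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(seq, rules):
    """Invert `repair` by expanding nonterminals back into terminals."""
    out = []
    stack = list(reversed(seq))
    while stack:
        symbol = stack.pop()
        if isinstance(symbol, int):      # nonterminal: push its right-hand side
            left, right = rules[symbol]
            stack.append(right)
            stack.append(left)
        else:
            out.append(symbol)
    return "".join(out)
```

Calling `repair("abracadabra abracadabra")` returns a shortened symbol sequence plus a rule table, and `expand` recovers the original string from them.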
-
Kouya Tochikubo
2013 Volume 8 Issue 4 Pages
978-986
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
We propose new construction methods of secret sharing schemes realizing general access structures. Our proposed construction methods are perfect secret sharing schemes and include Shamir's $(k, n)$-threshold schemes as a special case. Furthermore, except for some access structures for which the efficiency is the same as the previous ones, the proposed construction methods are more efficient than Benaloh and Leichter's scheme and the scheme I of TUM05.
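For orientation, Shamir's $(k, n)$-threshold scheme, the special case named above, can be sketched as follows. This is the classical scheme only, not the proposed construction for general access structures; the field prime `PRIME` and the function names `split` and `reconstruct` are assumptions chosen for this example.

```python
import secrets

# Illustrative field size (the Mersenne prime 2^127 - 1); an assumption,
# not a parameter taken from the paper.
PRIME = 2**127 - 1


def split(secret, k, n):
    """Return n shares (x, f(x)) of a random polynomial f of degree k - 1
    over GF(PRIME) with f(0) = secret. Any k shares determine the secret."""
    coeffs = [secret % PRIME] + [secrets.randbelow(PRIME) for _ in range(k - 1)]

    def f(x):
        acc = 0
        for c in reversed(coeffs):       # Horner evaluation mod PRIME
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, f(x)) for x in range(1, n + 1)]


def reconstruct(shares):
    """Recover f(0) by Lagrange interpolation at x = 0 from k shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

With `shares = split(123456789, 3, 5)`, any three of the five shares passed to `reconstruct` yield the secret again, while fewer than three shares of such a polynomial reveal nothing about it.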
-
Takanori Isobe, Toshihiro Ohigashi, Masakatu Morii
2013 Volume 8 Issue 4 Pages
987-994
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
This paper gives a first security evaluation of the lightweight stream cipher RAKAPOSHI. In particular, we analyze a slide property of RAKAPOSHI such that two different Key-IV pairs generate the same keystream but $n$-bit shifted. To begin with, we demonstrate that any Key-IV pair has a corresponding slide Key-IV pair that generates an $n$-bit shifted keystream with a probability of $2^{-2n}$. To support our results experimentally, some examples of such pairs are given. Then, we show that this property can be converted into key recovery attacks on RAKAPOSHI. In the related-key setting, our attack based on the slide property can recover a 128-bit key with a time complexity of $2^{41}$ and $2^{38}$ chosen IVs. Moreover, by using a variant of the slide property called a partial slide pair, this attack is further improved, and a 128-bit key can be recovered with a time complexity of $2^{33}$ and $2^{30}$ chosen IVs. Finally, we present a method for speeding up the brute force attack by a factor of 2 in the single key setting.
-
Ning Li, Yuki Kinebuchi, Hiromasa Shimada, Tatsuo Nakajima
2013 Volume 8 Issue 4 Pages
995-1004
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
Recent research has shown that running a monitoring service outside the target system, on top of a hypervisor, is an efficient way to protect the target system. Hypervisors isolate the monitoring service using MMU-based methods to improve security. However, MMU-based methods may cause heavy overhead when there is no hardware support, which makes them unsuitable for embedded processors, which are rarely equipped with hardware virtualization extensions. In addition, vulnerabilities in hypervisors may compromise the isolation. In this paper, we propose a secure OS architecture that fits embedded systems without depending on a hypervisor. It provides robust isolation between the monitoring service and the guest OS based on local memory, a hardware feature. To generalize this architecture, we adopt a secure pager that virtually extends the physically small local memory space through a swap mechanism with integrity checking of the monitoring service. The secure pager can also update the monitoring service to extend monitoring functions without disturbing the running guest OS. Comprehensive evaluations are made in our framework with one instance of embedded Linux as the guest OS and an isolated monitoring service running with the secure pager. The results demonstrate the functions of the secure pager and its influence on Linux in our system. On processors with a suitable architecture, we can build an extensible secure OS architecture with reasonable resource consumption and without imposing heavy overhead on the guest OS.
-
Masashi Saito, Shin-ichi Nakano
2013 Volume 8 Issue 4 Pages
1005-1009
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
A rectangular drawing is a partition of a rectangle into a set of rectangles. Rectangular drawings have many important applications including VLSI layout. Since the size of rectangular drawings may be huge, compact encodings are desired. Several compact encodings of rectangular drawings without degree four vertices are known. In this paper, we design two compact encodings for rectangular drawings with degree four vertices. We give $5f - B - n_4$ bits and $5f - B - W - 3$ bits encodings for rectangular drawings, where $f$ is the number of inner faces, $n_4$ is the number of vertices with degree four, and $B$ (resp. $W$) is the number of inner faces touching the bottommost horizontal (resp. rightmost vertical) line segments.
-
Makoto Fujisawa, Yojiro Mandachi, Kenjiro T. Miura
2013 Volume 8 Issue 4 Pages
1010-1016
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
This paper presents an accurate method for computing the surface velocity used to advect vertices in mesh-based surface tracking. We propose a curvature invariance condition that accurately captures the movement of a surface, especially in the case of rotating objects. The method uses the least-squares method and mesh fairing to solve the problem that the surface velocity cannot be calculated when the implicit function defining the surface does not change. We show that the method works well in scenes including rotation and deformation.
-
Masaaki Nishino, Norihito Yasuda, Tsutomu Hirao, Jun Suzuki, Masaaki N ...
2013 Volume 8 Issue 4 Pages
1017-1025
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
Multi-document summarization is the task of generating a summary from multiple documents, and the generated summary is expected to contain much of the information contained in the original documents. Previous work tries to realize this by (i) formulating the task as the combinatorial optimization problem of simultaneously maximizing relevance and minimizing redundancy, or (ii) formulating the task as a graph-cut problem. This paper improves summary quality by combining these two approaches into a synthesized optimization problem that is formulated in Integer Linear Programming (ILP). Though an ILP problem can be solved with an ILP solver, the problem is NP-hard and it is difficult to obtain the exact solution in situations where immediate responses are needed. Our solution is to propose optimization heuristics that exploit Lagrangian relaxation to obtain good approximate solutions within feasible computation times. Experiments on the Document Understanding Conference 2004 (DUC'04) dataset show that our Lagrangian relaxation based heuristics complete in feasible computation time while achieving higher ROUGE scores than state-of-the-art approximate methods.
-
Daigo Muramatsu, Yasushi Makihara, Yasushi Yagi
2013 Volume 8 Issue 4 Pages
1026-1030
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
We focus on gait recognition for criminal investigation. In criminal investigation, person authentication is performed by comparing target data from the crime scene with multiple gait data whose views differ slightly from that of the target data. For this task, we propose fusion of direct cross-view matching. Cross-view matching generally produces worse results than same-view matching when view-variant features are used. However, the correlation between cross-view matching results with different view pairs is low, so fusing them improves accuracy. Experiments on a large-scale dataset under settings resembling actual criminal investigation cases show that the proposed approach works well.
-
Ikuro Sato, Mitsuru Ambai, Koichiro Suzuki
2013 Volume 8 Issue 4 Pages
1031-1035
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
This paper addresses the problem of binary coding of real vectors for efficient similarity computations. It has been argued that an orthogonal transformation of center-subtracted vectors followed by the sign function produces binary codes that preserve similarities in the original space well, especially when the orthogonally transformed vectors have a covariance matrix with equal diagonal elements. We propose a simple hashing algorithm that can orthogonally transform an arbitrary covariance matrix into one with equal diagonal elements. We further extend this method to make the projection matrix sparse, which yields faster coding. It is demonstrated that the proposed methods achieve a level of similarity preservation comparable to that of existing methods.
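The coding step described above (subtract the mean, apply an orthogonal transform, take signs) can be sketched in two dimensions as follows. This is a toy illustration, not the proposed hashing algorithm: the helper names, the fixed 45-degree rotation, and the example covariance matrix are assumptions. It also checks that, for a diagonal 2-D covariance, this particular rotation equalizes the diagonal elements.

```python
import math


def matmul(a, b):
    """Multiply two small matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def rotate_cov(cov, rotation):
    """Covariance of the transformed vectors: R * C * R^T."""
    return matmul(matmul(rotation, cov), transpose(rotation))


def binary_code(x, mean, rotation):
    """Center x, apply the orthogonal transform, and keep only the signs."""
    centered = [xi - mi for xi, mi in zip(x, mean)]
    projected = [sum(r * c for r, c in zip(row, centered)) for row in rotation]
    return [1 if p >= 0 else 0 for p in projected]


def hamming(a, b):
    """Number of positions where two binary codes differ."""
    return sum(u != v for u, v in zip(a, b))


# Hypothetical example: a 45-degree rotation in 2-D.
theta = math.pi / 4
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta), math.cos(theta)]]
```

For the covariance `[[4.0, 0.0], [0.0, 1.0]]`, `rotate_cov` with `R` gives diagonal entries that are both 2.5, and nearby vectors such as `[1.0, 0.2]` and `[0.9, 0.3]` receive the same code while `[-1.0, -0.1]` differs in both bits.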
-
Kakeru Wakimoto, Yasushi Kanazawa, Naoya Ohta
2013 Volume 8 Issue 4 Pages
1036-1040
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
We present a method for enhancing the color recognition ability of dichromats. Whereas trichromats (people with typical color vision) recognize all colors in a 3-D color space, dichromats recognize only colors on a degenerate 2-D subspace of it. Our method compensates for the information lost along the degenerate direction of the color space by encoding it as the amount of noise in the image. Dichromats recognize the lost color information as noisy textures, while the original color information for trichromats is preserved. Our method is applicable not only to artificial figures such as graphs but also to natural photographs. We show the effectiveness of our method by experiments.
-
Masayuki Tanaka, Akihiko Torii, Masatoshi Okutomi
2013 Volume 8 Issue 4 Pages
1041-1045
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
In image retrieval applications, the Fisher vector of the Gaussian mixture model (GMM) with a diagonal-covariance structure is known as a powerful tool to describe an image by aggregating local descriptors extracted from the image. In this paper, we propose the Fisher vector of the GMM with a full-covariance structure. The closed-form approximation of the GMM with a full-covariance structure is derived. Our observation is that the Fisher vector of a higher dimensional GMM yields higher image retrieval performance. The Fisher vector for the GMM with a block-diagonal-covariance structure is also introduced to provide moderate dimensionality for the GMM. Experimental comparisons performed using two major datasets demonstrate that the proposed Fisher vector outperforms state-of-the-art algorithms.
-
Cuicui Zhang, Xuefeng Liang, Takashi Matsuyama
2013 Volume 8 Issue 4 Pages
1046-1050
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
Motion estimation and segmentation pose challenges in dynamic scenarios where multiple motions are mixed up and interdependent. However, existing approaches for 2D motion fields usually require the mixed motions to be independent. Algorithms incorporating 3D information have proven to be superior to purely 2D approaches in many studies. Inspired by this idea, we propose a new algorithm for evolving 3D potential surfaces using Helmholtz decomposition to represent a 2D motion field. In addition, a surface segmentation scheme is introduced to put different motions onto different layers, so that interdependent motions can be separated and recovered efficiently. Unlike other approaches, our method does not require prior knowledge of the motion model. The performance is demonstrated using real data under various complex scenarios.
-
Shigeki Sugimoto, Takaaki Kato, Kouma Motooka, Masatoshi Okutomi
2013 Volume 8 Issue 4 Pages
1051-1055
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
We propose a method for directly estimating a square grid ground surface from stereo images. We estimate the heights of all vertices in a square mesh, in which each square is divided into two triangular patches, drawn on a level plane of the ground, from a pair of images captured by nearly front-looking stereo cameras. We formulate a data term, representing the sum of the squared differences of photometrically transformed pixel values in homography-related projective triangular patches between the two stereo images, using the inverse compositional trick for both surface and photometric parameters to realize an efficient estimation algorithm. The main difficulty of this problem formulation lies in the instability of estimating the heights of vertices distant from the cameras, since the image projections of the distant triangular patches are crushed in the images. We effectively improve the stability by the combined use of an additional smoothness term, an update constraint term, and a hierarchical meshing approach. We demonstrate the validity of the proposed method through experiments using real images, and its usability for mobile robots by showing traversable area detection results on the ground surfaces estimated by the proposed method.
-
Tomohiko Yano, Shohei Nobuhara, Takashi Matsuyama
2013 Volume 8 Issue 4 Pages
1056-1060
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
This paper presents a new algorithm for full 3D shape reconstruction and online free-viewpoint rendering of objects in water. The key contributions are (1) a new calibration model for the refractive projection, and (2) a new 3D shape reconstruction algorithm based on the shape-from-silhouette (SfS) concept. We also propose an online free-viewpoint rendering system as a practical application.
-
Hidetoshi Goto, Hidekata Hontani
2013 Volume 8 Issue 4 Pages
1061-1065
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
We propose a new method that efficiently and accurately estimates the parameters of the Gaussian function that describes given local image profiles. The Gaussian function is non-linear with respect to the parameters to be estimated, and this non-linearity makes their efficient and accurate estimation difficult. In our proposed method, the weighted integral method is introduced to linearize the parameter estimation problem: a system of differential equations that is satisfied by the Gaussian function and is linear with respect to the parameters is first derived. The system is then converted to a system of integral equations. Given a local sub-window of the image, one can obtain the system of integral equations and estimate the parameters of the Gaussian that describes the appearance in the sub-window by solving the linear system for the parameters. Experimental results showed that our proposed method estimates the parameters more efficiently and accurately than existing state-of-the-art methods.
-
Ryo Yonetani, Hiroaki Kawashima, Takashi Matsuyama
2013 Volume 8 Issue 4 Pages
1066-1070
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
When we are watching videos, there are spatiotemporal gaps between where we look (points of gaze) and what we focus on (points of attentional focus), which result from temporally delayed responses or anticipation in eye movements. We focus on the underlying structure of those gaps and propose a novel learning-based model to predict where humans look in videos. The proposed model selects a relevant point of focus in the spatiotemporal neighborhood around a point of gaze, and jointly learns its salience and spatiotemporal gap with the point of gaze. It tells us “this point is likely to be looked at because there is a point of focus around the point with a reasonable spatiotemporal gap.” Experimental results with a public dataset demonstrate the effectiveness of the model to predict the points of gaze by learning a particular structure of gaps with respect to the types of eye movements and those of salient motions in videos.
-
Bo Zheng, Yongqi Sun, Jun Takamatsu, Katsushi Ikeuchi
2013 Volume 8 Issue 4 Pages
1071-1075
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
In this paper, we propose a novel local image descriptor, DoP, defined as the difference of images represented by polynomials of different degrees. Once an interest point/region is extracted by a common detector such as the Harris corner detector, our DoP descriptor is able to characterize the interest point/region with high distinctiveness, compactness, and robustness to viewpoint change, image blur, and illumination variation. To build the DoP descriptor efficiently, we propose to reduce the computational cost numerically by skipping the repeated calculation of polynomial representations. Our experimental results demonstrate better performance than several state-of-the-art candidates.
-
Tsuyoshi Kato, Wataru Takei, Shinichiro Omachi
2013 Volume 8 Issue 4 Pages
1076-1080
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
Face recognition is a multi-class classification problem that has long attracted many researchers in the community of image analysis. We consider using the Mahalanobis distance for the task. Classically, the inverse of a covariance matrix has been chosen as the Mahalanobis matrix, a parameter of the Mahalanobis distance. Modern studies often employ machine learning algorithms called metric learning to determine the Mahalanobis matrix so that the distance is more discriminative, although they resort to eigen-decomposition requiring heavy computation. This paper presents a new metric learning algorithm that finds discriminative Mahalanobis matrices efficiently without eigen-decomposition, and shows promising experimental results on real-world face-image datasets.
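As a reminder of how the Mahalanobis matrix enters classification, here is a minimal sketch of the distance and a nearest-prototype rule. The matrix `M` is hand-picked for illustration rather than learned, the function names and prototype data are hypothetical, and the metric learning algorithm of the paper is not reproduced.

```python
import math


def mahalanobis(x, y, M):
    """Distance sqrt((x - y)^T M (x - y)) for a positive semidefinite M
    given as a list of rows."""
    d = [a - b for a, b in zip(x, y)]
    Md = [sum(m * dj for m, dj in zip(row, d)) for row in M]
    return math.sqrt(sum(di * mdi for di, mdi in zip(d, Md)))


def nearest_class(query, prototypes, M):
    """Return the label whose prototype is closest to `query` under M."""
    return min(prototypes, key=lambda label: mahalanobis(query, prototypes[label], M))


# Hypothetical two-class example with one prototype per class.
prototypes = {"A": [0.0, 0.0], "B": [3.0, 3.0]}
identity = [[1.0, 0.0], [0.0, 1.0]]        # reduces to the Euclidean distance
downweight_y = [[1.0, 0.0], [0.0, 0.01]]   # nearly ignores the second feature
```

The query `[2.0, 0.5]` is assigned to class `"A"` under `identity` but to class `"B"` under `downweight_y`, which shows how the choice of the Mahalanobis matrix changes the decision.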
-
Min Lu, Bo Zheng, Jun Takamatsu, Katsushi Ikeuchi
2013 Volume 8 Issue 4 Pages
1081-1084
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
In this paper, we propose a novel method for comparing the shapes of similar objects. From the viewpoint of linear algebra, we turn the problem of detecting identifiable regions into a search for low-rank submatrices, and solve it with biclustering. Compared with traditional cluster analysis, our method looks for structural information along both the object index and local shape dimensions, which leads to more detailed local comparison results. The proposed method is evaluated on real-world data with satisfactory results, which verifies its effectiveness.
-
Hozuma Nakajima, Ikuhisa Mitsugami, Yasushi Yagi
2013 Volume 8 Issue 4 Pages
1085-1089
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
This paper proposes a novel gait feature representation that describes the characteristics of a walking person well from the perspective of a range sensor. Most existing methods for gait feature extraction use a sequence of silhouettes as their input, so they inevitably suffer from the difficulty of silhouette extraction in real scenes and from changes of view direction, which prevent them from being applied in practice. The proposed method, on the other hand, does not require such accurate segmentation, and is not affected by view change since the captured range data has three-dimensional information. In addition, our method can explicitly separate dynamic features from static ones, e.g., body shape, which has not been realized before. Experimental results of gait authentication show its effectiveness.
-
Bo Zheng, Ryo Ishikawa, Jun Takamatsu, Katsushi Ikeuchi, Takaaki Endo, ...
2013 Volume 8 Issue 4 Pages
1090-1094
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
MR image fusion is desired in various image-guided breast surgeries. However, it is often hampered by the difficulty of dealing with large deformations of the breast. This paper presents a novel method for efficiently modeling and inferring the physical parameters, including gravity, Young's modulus, and Poisson's ratio, which are important elements for handling the biomechanical deformations of the breast with a finite element model. Our method consists of two major steps: 1) deformation modeling and 2) non-rigid registration. The former builds a deformable implicit polynomial (DIP) model to encode the physical parameters according to deformation. The latter quickly registers the prior DIP to the online breast image so that image fusion can be achieved. Experimental results demonstrate the good performance of our method.
-
Kenji Inose, Shota Shimizu, Rei Kawakami, Yasuhiro Mukaigawa, Katsushi ...
2013 Volume 8 Issue 4 Pages
1095-1099
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
We propose an outdoor photometric stereo method that considers environmental lighting to improve the performance of surface normal estimation. In previous methods, the sky illumination effect has been either assumed to be constant throughout the scene or removed by pre-processing. This paper exploits a sky model that can derive the entire sky luminance from a sky zenith image; the sky illumination effect on a surface can then be correctly calculated by iteratively refining its normal direction. This paper also extends a two-source photometric stereo method by introducing RANSAC, so that the input images of this method can be taken within a single day. Experimental results with real outdoor objects show the effectiveness of the method.
-
Asad Ali, Imari Sato, Takahiro Okabe, Yoichi Sato
2013 Volume 8 Issue 4 Pages
1100-1104
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
In this work, we propose a novel method for modeling and synthesizing object appearance based on planned sampling. The proposed method can efficiently model the BRDF of an object with uniform and isotropic reflectance using a small number of light source directions. This is achieved by combining knowledge of the object's shape with the statistics of various BRDFs. The method considers the shape of the object, a compact basis representing variations in a reflectance dataset, a fixed view direction, and all possible light source directions around the object. Then, using an iterative optimization process that simulates the contribution of each light source in modeling the object appearance, our method identifies the most suitable set of light source directions for efficiently modeling the BRDF of the object's material. The selected light sources are then used to acquire actual images of the object for recovering its reflectance properties. Experiments conducted using several objects with varying shapes and a small number of light sources optimally selected by the method validate the effectiveness of the proposed approach in modeling object appearance.
-
Chika Inoshita, Seiichi Tagawa, Md. Abdul Mannan, Yasuhiro Mukaigawa, ...
2013 Volume 8 Issue 4 Pages
1105-1109
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
The full-dimensional (8-D) BSSRDF completely expresses the various light interactions on an object's surface, such as reflection and subsurface scattering. However, it is difficult to sample the full-dimensional BSSRDF because doing so requires many illuminations and observations from every direction. Many studies have approximated the BSSRDF as a low-dimensional function by considering only homogeneous media or by assuming isotropic scattering. In this paper, we present a novel method for sampling and analyzing the full-dimensional BSSRDF in real scenes. We sample the full-dimensional BSSRDF using a polyhedral mirror system that places many virtual cameras and projectors. In addition, we propose a method for decomposing the BSSRDF into isotropic and anisotropic components for scattering analysis. We show the empirical characteristics of subsurface scattering inside a real medium by analyzing the sampled full-dimensional BSSRDF.
-
Yuji Yamauchi, Mitsuru Ambai, Ikuro Sato, Yuichi Yoshida, Hironobu Fuj ...
2013 Volume 8 Issue 4 Pages
1110-1114
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
Image recognition in a client-server system suffers from the problem of data traffic. However, reducing data traffic degrades recognition performance. We therefore represent high-dimensional local features as binary codes on the client side and as real vectors on the server side. This suppresses the loss of performance, but it introduces two problems: a difference in the scale of the norm between feature vectors, and an increase in the computational cost of the distance computation. To solve the first problem, we optimize a scale factor so as to absorb the scale difference of the Euclidean norm. For the second problem, we compute the Euclidean distance efficiently by decomposing the real vector into weight factors and binary basis vectors. As a result, the proposed method achieves high-speed, high-precision keypoint matching even when the data traffic is reduced.
-
Atsushi Shimada, Hajime Nagahara, Rin-ichiro Taniguchi
2013 Volume 8 Issue 4 Pages
1115-1119
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
This paper discusses object detection based on spatio-temporal light field sensing. Our proposed method generates an arbitrary in-focus plane in the surveillance scene, and the background region can be filtered out by defocusing. A new feature representation, called Local Ray Pattern (LRP), is introduced to evaluate the spatial consistency of light rays. The combination of LRP and GMM-based background modeling realizes object detection on the in-focus plane. Experimental results demonstrate the effectiveness and applicability of the method for video surveillance.
-
Shohei Noguchi, Yoshihiro Watanabe, Masatoshi Ishikawa
2013 Volume 8 Issue 4 Pages
1120-1129
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
Sensing the 3D shape of a dynamic scene is not a trivial problem, but it is useful for various applications. Recently, sensing systems have been improved and are now capable of high sampling rates. However, particularly for dynamic scenes, there is a limit to improving the resolution at high sampling rates. In this paper, we present a method for improving the resolution of a 3D shape reconstructed from multiple range images acquired from a moving target. In our approach, the alignment and surface estimation problems are solved in a simultaneous estimation framework. Together with the use of an adaptive multi-level implicit surface for shape representation, this allows us to calculate the alignment by using shape features and to perform surface estimation according to the amount of movement of the point clouds for each range image. As a result, this approach achieves more precise estimation than a scheme that merely alternates between shape and alignment estimation. We present results of experiments evaluating the reconstruction accuracy with different point cloud densities and noise levels.
-
Yusuke Uchida, Shigeyuki Sakazawa
2013 Volume 8 Issue 4 Pages
1130-1139
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
In this paper, we propose a new, effective, and unified scoring method for local feature-based image retrieval. The proposed scoring method is derived by solving the large-scale image retrieval problem as a classification problem with a large number of classes. The resulting proposed score is based on the ratio of the probability density function of an object model to that of a background model, which is efficiently calculated via nearest neighbor density estimation. The proposed method has the following desirable properties: (1) has a sound theoretical basis, (2) is more effective than inverse document frequency-based scoring, (3) is applicable not only to quantized descriptors but also to raw descriptors, and (4) is easy and efficient in terms of calculation and updating. We show the effectiveness of the proposed method empirically by applying it to a standard and improved bag-of-visual words-based framework and a k-nearest neighbor voting framework.
-
Shaowei Chu, Jiro Tanaka
2013 Volume 8 Issue 4 Pages
1140-1153
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
Most existing digital camera user interfaces place little emphasis on self-portrait options. Therefore, it is not always easy to take self-portraits using conventional user interfaces. This paper presents a vision-based head gesture interface for controlling a self-portrait camera that helps users to take self-portraits effectively and efficiently. Intuitive nodding and head-shaking gestures control the camera zoom in/out on the face, and a mouth-opening gesture triggers the camera to take a picture. We evaluated its usability factors (effectiveness, efficiency, and satisfaction) and compared it to a remote control in a user study. The results suggest that our interface is useful for taking self-portrait pictures.
-
Kazuya Murao, Tsutomu Terada, Ai Yano, Ryuichi Matsukura
2013 Volume 8 Issue 4 Pages
1154-1165
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
Mobile phones and video game controllers using gesture recognition technologies enable easy and intuitive operations, such as scrolling a browser and drawing objects. However, usually only one of each kind of sensor is installed in a device, and the effect of multiple homogeneous sensors on recognition accuracy has not been investigated. Moreover, the effect of differences in the motion of a gesture has not been examined. We have investigated the use of a test mobile device with nine accelerometers and nine gyroscopes. We captured data for 27 kinds of gestures for a mobile tablet. We experimentally investigated the effects on recognition accuracy of changing the number and positions of sensors and the number and kinds of gestures.
-
Michael Paul, Andrew Finch, Eiichiro Sumita
2013 Volume 8 Issue 4 Pages
1166-1186
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
Recent research on multilingual statistical machine translation (SMT) focuses on the usage of pivot languages in order to overcome resource limitations for certain language pairs. This paper proposes a new method to translate a dialect language into a foreign language by integrating transliteration approaches based on Bayesian alignment (BA) models with pivot-based SMT approaches. The advantages of the proposed method with respect to standard SMT approaches are threefold: (1) it uses a standard language as the pivot language and acquires knowledge about the relation between dialects and a standard language automatically, (2) it avoids segmentation mismatches between the input and the translation model by mapping the character sequences of the dialect language to the word segmentation of the standard language, and (3) it reduces the translation task complexity by using monotone decoding techniques. Experimental results for translating five Japanese dialects (Kumamoto, Kyoto, Nagoya, Okinawa, Osaka) into four Indo-European languages (English, German, Russian, Hindi) and two Asian languages (Chinese, Korean) revealed that the proposed method improves the translation quality of dialect translation tasks and outperforms standard pivot translation approaches concatenating SMT engines for the majority of the investigated language pairs.
-
Haruyuki Iwama, Daigo Muramatsu, Yasushi Makihara, Yasushi Yagi
2013 Volume 8 Issue 4 Pages
1187-1199
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
This paper describes the first gait verification system for criminal investigation using footage from surveillance cameras. The system is designed so that criminal investigators, who are non-specialists in computer vision-based gait verification, can independently use it to verify unknown perpetrators as suspects or ex-convicts in criminal investigations. Each step of the gait verification process is carried out through interactive operation on a graphical user interface. Eventually, for each pair of compared subjects selected by a user, the system outputs a posterior probability on a verification result, which indicates that the compared subjects are the same, taking into account various circumstances of the subjects such as the size, frame rate, observation views, and clothing. One gait specialist and ten non-gait-specialists participated in operation tests of the system using five different datasets with various types of scenes, each of which contained two or three verification sets. It was shown that all the non-gait-specialists, as well as the gait specialist, could obtain reasonable verification results for almost all of the verification sets.
-
Hiroyuki Ishida, Jun-ichi Meguro, Yoshiko Kojima, Takashi Naito
2013 Volume 8 Issue 4 Pages
1200-1206
Published: 2013
Released on J-STAGE: December 15, 2013
JOURNAL
FREE ACCESS
This paper presents a novel method for detecting 3D road boundaries, such as walls, guardrails, and curbs, using on-board stereo cameras. The proposed method uses conformal geometric algebra, which can describe different shapes in a common representation. 3D road boundaries on straight and curved roads are seamlessly detected using this representation, and the framework is also applied to curb detection with a slight modification. Experimental results show that, despite its algorithmic simplicity, the proposed method exhibits competitive detection performance compared with conventional model fitting and curb detection methods.