IPSJ Transactions on Computer Vision and Applications
Online ISSN : 1882-6695
ISSN-L : 1882-6695
Volume 1
Displaying 1-26 of 26 articles from this issue
  • Naokazu Yokoya, Yasushi Yagi
    Article type: Foreword
    Subject area: Foreword
    2009 Volume 1 Pages 1-2
    Published: 2009
    Released on J-STAGE: January 30, 2009
    JOURNAL FREE ACCESS
    Download PDF (38K)
  • Weiwei Du, Kiichi Urahama
    Article type: Special Issue on ACCV2007 Regular Paper
    Subject area: Regular Paper
    2009 Volume 1 Pages 3-11
    Published: 2009
    Released on J-STAGE: January 30, 2009
    JOURNAL FREE ACCESS
    We present a semi-supervised technique of object extraction for natural image matting. First, we present a novel unsupervised graph-spectral algorithm for extracting homogeneous regions in an image. We then derive a semi-supervised scheme from this unsupervised algorithm. In our method, it is sufficient for users to draw strokes in only one of the object and background regions. The semi-supervised optimization problem is solved with an iterative method in which memberships are propagated from the strokes to their surroundings. We suggest a guideline for stroke placement by exploiting the same iterative solution process used in the unsupervised algorithm. We project the color vectors with linear discriminant analysis to improve color discriminability and speed up the convergence of the iterative method. The performance of the proposed method is examined on several images, and the results are compared with other methods and with ground-truth mattes.
    Download PDF (1932K)
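The LDA projection step mentioned in the abstract above can be sketched as follows. This is a minimal, hypothetical illustration (toy stroke colors, plain Fisher LDA), not the authors' implementation:

```python
import numpy as np

def lda_direction(fg_colors, bg_colors):
    """Fisher discriminant direction separating two sets of RGB colors.

    Projecting pixel colors onto this axis increases foreground/background
    color discriminability, as in the abstract's preprocessing step.
    """
    fg = np.asarray(fg_colors, dtype=float)
    bg = np.asarray(bg_colors, dtype=float)
    mu_f, mu_b = fg.mean(axis=0), bg.mean(axis=0)
    # Within-class scatter, regularized so the 3x3 solve stays stable.
    Sw = np.cov(fg.T) * (len(fg) - 1) + np.cov(bg.T) * (len(bg) - 1)
    Sw += 1e-6 * np.eye(3)
    w = np.linalg.solve(Sw, mu_f - mu_b)
    return w / np.linalg.norm(w)

# Toy strokes: reddish object pixels vs. bluish background pixels.
fg = [[200, 30, 40], [210, 25, 35], [190, 40, 45], [205, 35, 30]]
bg = [[20, 30, 200], [25, 40, 210], [30, 35, 190], [15, 25, 205]]
w = lda_direction(fg, bg)
separation = float(np.mean(np.dot(fg, w)) - np.mean(np.dot(bg, w)))
```

The separation of the two classes along w is guaranteed to be positive, since the regularized within-class scatter matrix is positive definite.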
  • Holger Handel
    Article type: Special Issue on ACCV2007 Regular Paper
    Subject area: Regular Paper
    2009 Volume 1 Pages 12-20
    Published: 2009
    Released on J-STAGE: January 30, 2009
    JOURNAL FREE ACCESS
    This article investigates the impact of camera warm-up on the image acquisition process and hence on the accuracy of segmented image features. Based on an experimental study, we show that the camera image shifts by a few tenths of a pixel after camera start-up. The drift correlates with the temperature of the sensor board and stops when the camera reaches thermal equilibrium. A further study of the observed image flow shows that it originates from a slight displacement of the image sensor due to thermal expansion of the camera's mechanical components. This sensor displacement can be modeled using standard methods of projective geometry combined with bi-exponential decay terms that capture the temporal dependency. The parameters of the proposed model can be calibrated and then used to compensate for warm-up effects. Further experiments show that our method is applicable to different types of cameras and that the warm-up behaviour is characteristic of a specific camera.
    Download PDF (2601K)
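The warm-up model described above can be sketched as a bi-exponential drift that saturates at thermal equilibrium; the parameter values below are hypothetical placeholders, not calibrated values from the paper:

```python
import math

def warmup_drift(t, a1, tau1, a2, tau2):
    """Predicted image drift (pixels) t seconds after camera start-up.

    Two exponential terms model a fast and a slow thermal component;
    the drift saturates at a1 + a2 once equilibrium is reached.
    """
    return a1 * (1.0 - math.exp(-t / tau1)) + a2 * (1.0 - math.exp(-t / tau2))

# Hypothetical calibration (amplitudes in pixels, time constants in seconds).
A1, T1, A2, T2 = 0.25, 60.0, 0.15, 600.0

def compensate(u, t):
    """Correct a measured image coordinate by the predicted warm-up drift."""
    return u - warmup_drift(t, A1, T1, A2, T2)

drift_early = warmup_drift(30.0, A1, T1, A2, T2)    # still warming up
drift_late = warmup_drift(3600.0, A1, T1, A2, T2)   # near equilibrium
```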
  • Yasuhiro Mukaigawa, Kohei Sumino, Yasushi Yagi
    Article type: Special Issue on ACCV2007 Regular Paper
    Subject area: Regular Paper
    2009 Volume 1 Pages 21-32
    Published: 2009
    Released on J-STAGE: January 30, 2009
    JOURNAL FREE ACCESS
    Measuring a bidirectional reflectance distribution function (BRDF) takes a long time because the target object must be illuminated from all incident angles and the reflected light measured at all reflection angles. In this paper, we introduce a rapid BRDF measurement system that uses an ellipsoidal mirror and a projector. Since the system changes the incident angle without any mechanical drive, a dense BRDF can be measured rapidly. Moreover, we show that the S/N ratio of the measured BRDF can be significantly increased by multiplexed illumination based on the Hadamard matrix.
    Download PDF (2507K)
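The Hadamard-based multiplexing can be illustrated with an S-matrix (0/1 entries, since illumination cannot be negative). A toy sketch of the demultiplexing arithmetic, not the measurement system itself:

```python
import numpy as np

def s_matrix(m):
    """0/1 illumination matrix of order n = 2**m - 1, derived from a
    Sylvester-type Hadamard matrix: drop the first row and column,
    then map {+1, -1} to {0, 1}. Row i selects which light sources
    are on during measurement i (always (n + 1) / 2 of them).
    """
    H = np.array([[1]])
    for _ in range(m):
        H = np.block([[H, H], [H, -H]])
    return (1 - H[1:, 1:]) // 2

# n = 7 incident directions; x holds the unknown per-direction
# responses (e.g., BRDF samples for one outgoing direction).
S = s_matrix(3).astype(float)
x = np.array([0.9, 0.1, 0.4, 0.0, 0.7, 0.2, 0.5])
y = S @ x                      # measurements under multiplexed lighting
x_hat = np.linalg.solve(S, y)  # demultiplexing recovers x exactly
```

Because each measurement gathers light from about half of the sources, demultiplexing reduces additive sensor noise by a factor of roughly (n + 1) / (2 * sqrt(n)) compared with turning on one source at a time.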
  • Tze Ki Koh, Amit Agrawal, Ramesh Raskar, Stephen P Morgan, Nicholas J ...
    Article type: Special Issue on ACCV2007 Regular Paper
    Subject area: Regular Paper
    2009 Volume 1 Pages 33-45
    Published: 2009
    Released on J-STAGE: January 30, 2009
    JOURNAL FREE ACCESS
    We present a simple and practical approach for segmenting un-occluded items in a scene by actively casting shadows. By ‘items’, we refer to objects (or parts of objects) enclosed by depth edges. Our approach exploits the fact that under varying illumination, un-occluded items cast shadows on occluded items or on the background, but are not shadowed themselves. We employ an active illumination approach, taking multiple images under different illumination directions with the illumination source close to the camera. Our approach ignores texture edges in the scene and uses only the shadow and silhouette information to determine occlusions. We show that such a segmentation does not require the estimation of a depth map or 3D information, which can be cumbersome and expensive and often fails due to a lack of texture or the presence of specular objects in the scene. Our approach can handle complex scenes with self-shadows and specularities. In addition, we show how to identify regions belonging to occluded objects and segment the scene into multiple layers. Our approach can recover the shape of an occluded object if none of its depth edges are occluded. Results on several real scenes, along with an analysis of failure cases, are presented.
    Download PDF (2019K)
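The core shadow cue can be sketched as follows: with light sources close to the camera, a pixel that is much darker in one flash image than in the max-composite of all flash images is shadowed there. A minimal toy illustration in the spirit of the abstract, not the authors' full method:

```python
import numpy as np

def shadow_masks(flash_images, thresh=0.5):
    """One boolean shadow mask per illumination direction.

    Dividing each flash image by the max-composite cancels surface
    texture, so a low ratio indicates shadow rather than dark albedo.
    """
    stack = np.stack([img.astype(float) for img in flash_images])
    composite = stack.max(axis=0) + 1e-9   # texture-bearing reference
    return (stack / composite) < thresh

# Toy 1-D "scene": the left flash shadows column 2, the right flash
# shadows column 0.
left = np.array([[100, 100, 10, 100]])
right = np.array([[10, 100, 100, 100]])
masks = shadow_masks([left, right])
```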
  • Raphaël Marée, Pierre Geurts, Louis Wehenkel
    Article type: Special Issue on ACCV2007 Regular Paper
    Subject area: Regular Paper
    2009 Volume 1 Pages 46-57
    Published: 2009
    Released on J-STAGE: January 30, 2009
    JOURNAL FREE ACCESS
    We propose a new method for content-based image retrieval which exploits the similarity measure and indexing structure of totally randomized tree ensembles induced from a set of subwindows randomly extracted from a sample of images. We also present the possibility of updating the model as new images come in, and the capability of comparing new images using a model previously constructed from a different set of images. The approach is quantitatively evaluated on various types of images and achieves high recognition rates despite its conceptual simplicity and computational efficiency.
    Download PDF (5776K)
  • Wei Du, Jean-Bernard Hayet, Jacques Verly, Justus Piater
    Article type: Special Issue on ACCV2007 Regular Paper
    Subject area: Regular Paper
    2009 Volume 1 Pages 58-71
    Published: 2009
    Released on J-STAGE: January 30, 2009
    JOURNAL FREE ACCESS
    This paper presents a novel approach to tracking ground targets across multiple cameras. A target is tracked not only in each camera but also in the ground plane by individual particle filters. These particle filters collaborate in two ways. First, the particle filters in each camera pass messages to those in the ground plane, where the multi-camera information is integrated by intersecting the targets' principal axes. This largely relaxes the dependence on precise foot positions when mapping targets from images to the ground plane using homographies. Second, the fusion results in the ground plane are incorporated by each camera as boosted proposal functions. A mixture proposal function is composed for each tracker in a camera by combining an independent transition kernel with the boosted proposal function. The general framework of our approach allows us to track individual targets distributively and independently, which is of potential use when we are interested only in the trajectories of a few key targets, or when we cannot track all the targets in the scene simultaneously.
    Download PDF (8009K)
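The fusion step, intersecting the targets' principal axes on the ground plane, reduces to homogeneous-coordinate line arithmetic. A sketch with hypothetical homographies (identity here, purely for illustration):

```python
import numpy as np

def map_line(H, line):
    """Map an image line into the ground plane.

    If points map as x' = H x, lines transform contravariantly
    as l' = H^{-T} l.
    """
    return np.linalg.inv(H).T @ line

def intersect(l1, l2):
    """The intersection of two homogeneous 2-D lines is their cross
    product, normalized so the last coordinate is 1."""
    p = np.cross(l1, l2)
    return p / p[2]

H1 = H2 = np.eye(3)                               # hypothetical homographies
axis1 = map_line(H1, np.array([1.0, 0.0, -2.0]))  # image line x = 2
axis2 = map_line(H2, np.array([0.0, 1.0, -3.0]))  # image line y = 3
foot = intersect(axis1, axis2)                    # ground-plane position
```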
  • Ali Shahrokni, Oliver Woodford, Ian Reid
    Article type: Special Issue on ACCV2007 Regular Paper
    Subject area: Regular Paper
    2009 Volume 1 Pages 72-81
    Published: 2009
    Released on J-STAGE: January 30, 2009
    JOURNAL FREE ACCESS
    In this paper we propose a method to construct a virtual sequence for a camera moving through a static environment, given an input sequence from a different camera trajectory. Existing image-based rendering techniques can generate photorealistic images given a set of input views, though the output images almost unavoidably contain small regions where the colour has been incorrectly chosen. In a single image these artifacts are often hard to spot, but become more obvious when viewing a real image with its virtual stereo pair, and even more so when a sequence of novel views is generated, since the artifacts are rarely temporally consistent. To address this problem of consistency, we propose a new spatio-temporal approach to novel video synthesis. Our method exploits epipolar geometry to impose constraints on temporal coherence of the rendered views. The pixels in the output video sequence are modelled as nodes of a 3-D graph. We define an MRF on the graph which encodes photoconsistency of pixels as well as texture priors in both space and time. Unlike methods based on scene geometry, which yield highly connected graphs, our approach results in a graph whose degree is independent of scene structure. The MRF energy is therefore tractable and we solve it for the whole sequence using a state-of-the-art message passing optimisation algorithm. We demonstrate the effectiveness of our approach in reducing temporal artifacts.
    Download PDF (2954K)
  • Hideo Saito, In So Kweon
    Article type: Preface
    Subject area: Preface
    2009 Volume 1 Pages 82
    Published: 2009
    Released on J-STAGE: March 31, 2009
    JOURNAL FREE ACCESS
    Download PDF (30K)
  • Robert Pless, Richard Souvenir
    Article type: Special Issue on MIRU2008 Invited Paper
    Subject area: Invited Paper
    2009 Volume 1 Pages 83-94
    Published: 2009
    Released on J-STAGE: March 31, 2009
    JOURNAL FREE ACCESS
    Many natural image sets are samples of a low-dimensional manifold in the space of all possible images. Understanding this manifold is a key first step in understanding many sets of images, and manifold learning approaches have recently been used within many application domains, including face recognition, medical image segmentation, gait recognition and hand-written character recognition. This paper attempts to characterize the special features of manifold learning on image data sets, and to highlight the value and limitations of these approaches.
    Download PDF (938K)
  • Yu-Wing Tai, Huixuan Tang, Michael S. Brown, Stephen Lin
    Article type: Special Issue on MIRU2008 Invited Paper
    Subject area: Invited Paper
    2009 Volume 1 Pages 95-104
    Published: 2009
    Released on J-STAGE: March 31, 2009
    JOURNAL FREE ACCESS
    We presented an invited talk at the MIRU-IUW workshop on correcting photometric distortions in photographs. In this paper, we describe our work on addressing one form of this distortion, namely defocus blur. Defocus blur can lead to the loss of fine-scale scene detail, and we address the problem of recovering it. Our approach targets a single-image solution that capitalizes on redundant scene information by restoring image patches that have greater defocus blur using similar, more focused patches as exemplars. The major challenge in this approach is to produce a spatially coherent and natural result given the rather limited exemplar data present in a single image. To address this problem, we introduce a novel correction algorithm that maximizes the use of available image information and employs additional prior constraints. Unique to our approach is an exemplar-based deblurring strategy that simultaneously considers candidate patches from both sharper image regions as well as deconvolved patches from blurred regions. This not only allows more of the image to contribute to the recovery process but inherently combines synthesis and deconvolution into a single procedure. In addition, we use a top-down strategy where the pool of in-focus exemplars is progressively expanded as increasing levels of defocus are corrected. After detail recovery, regularization based on sparsity and contour continuity constraints is applied to produce a more plausible and natural result. Our method compares favorably to related techniques such as defocus inpainting and deconvolution with constraints from natural image statistics alone.
    Download PDF (1597K)
  • Alexander M. Bronstein, Michael M. Bronstein, Yair Carmon, Ron Kimmel
    Article type: Special Issue on MIRU2008 Invited Paper
    Subject area: Invited Paper
    2009 Volume 1 Pages 105-114
    Published: 2009
    Released on J-STAGE: March 31, 2009
    JOURNAL FREE ACCESS
    Partial matching of geometric structures is important in computer vision, pattern recognition and shape analysis applications. The problem consists of matching similar parts of shapes that may be dissimilar as a whole. Recently, it was proposed to consider partial similarity as a multi-criterion optimization problem that tries to simultaneously maximize the similarity and the significance of the matching parts. A major challenge in that framework is providing a quantitative measure of the significance of a part of an object. Here, we define the significance of a part of a shape by its discriminative power with respect to a given shape database, that is, by the uniqueness of the part. We define a point-wise significance density using a statistical weighting approach similar to the term frequency-inverse document frequency (tf-idf) weighting employed in search engines. The significance measure of a given part is obtained by integrating over this density. Numerical experiments show that the proposed approach produces intuitive significant parts and demonstrate an improvement in the performance of partial matching between shapes.
    Download PDF (1018K)
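The tf-idf analogy in the abstract can be sketched directly: treat quantized local shape descriptors as 'words' and shapes as 'documents'. The vocabulary and counts below are made up for illustration:

```python
import math

def tfidf(term_counts, doc_freq, n_docs):
    """tf-idf weights: a local descriptor is significant on a shape if
    it is frequent there (tf) but rare across the database (idf)."""
    total = sum(term_counts.values())
    return {w: (c / total) * math.log(n_docs / doc_freq[w])
            for w, c in term_counts.items()}

# Descriptor counts on one shape, and how many of 100 database shapes
# contain each descriptor. A distinctive "ear" beats a common flat patch.
counts = {"ear": 4, "flat_patch": 16}
df = {"ear": 2, "flat_patch": 95}
weights = tfidf(counts, df, 100)
```

Integrating such point-wise weights over a candidate part yields its significance, the quantity the multi-criterion matching trades off against similarity.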
  • Andrew C. Gallagher, Tsuhan Chen
    Article type: Special Issue on MIRU2008 Invited Paper
    Subject area: Invited Paper
    2009 Volume 1 Pages 115-126
    Published: 2009
    Released on J-STAGE: March 31, 2009
    JOURNAL FREE ACCESS
    Recognizing people in images is one of the foremost challenges in computer vision. It is important to remember that consumer photography has a highly social aspect. The photographer captures images not in a random fashion, but rather to remember or document meaningful events in her life. Understanding images of people necessitates that the context of each person in an image is considered. Context includes information related to the image of the scene surrounding the person, camera context such as location and image capture time, and the social context that describes the interactions between people. The goal of this paper is to provide the computer with the same intuition that humans would use for analyzing images of people. Fortunately, rather than relying on a lifetime of experience, context can often be modeled with large amounts of publicly available data. Probabilistic graph models and machine learning are used to model the relationship between people and context in a principled manner.
    Download PDF (1931K)
  • Hideo Saito, In So Kweon
    Article type: Preface
    Subject area: Preface
    2009 Volume 1 Pages 127
    Published: 2009
    Released on J-STAGE: September 24, 2009
    JOURNAL FREE ACCESS
    Download PDF (30K)
  • Yasuhiro Mukaigawa, Kazuya Suzuki, Yasushi Yagi
    Article type: Special Issue on MIRU2008
    Subject area: Research Paper
    2009 Volume 1 Pages 128-138
    Published: 2009
    Released on J-STAGE: September 24, 2009
    JOURNAL FREE ACCESS
    The scattering effect of incident light, called subsurface scattering, occurs under the surface of translucent objects. In this paper, we present a method for analyzing the subsurface scattering from a single image taken in a known arbitrary illumination environment. In our method, diffuse subsurface reflectance in the subsurface scattering model can be linearly solved by quantizing the distances between each pair of surface points. Then, the dipole approximation is fit to the diffuse subsurface reflectance. By applying our method to real images of translucent objects, we confirm that the parameters of subsurface scattering can be computed for different materials.
    Download PDF (1284K)
  • Ryo Furukawa, Hiroshi Kawasaki, Ryusuke Sagawa, Yasushi Yagi
    Article type: Special Issue on MIRU2008
    Subject area: Research Paper
    2009 Volume 1 Pages 139-157
    Published: 2009
    Released on J-STAGE: September 24, 2009
    JOURNAL FREE ACCESS
    Shape acquisition of moving deformable objects with little texture is important for applications such as motion capture of human facial expressions. Several techniques using structured light have been proposed. These techniques can be largely categorized into two main types. The first type temporally encodes the positional information of a projector's pixels using multiple projected patterns, and the second spatially encodes positional information into areas or color spaces. Although the former technique allows dense reconstruction with a sufficient number of patterns, it has difficulty in scanning objects in rapid motion. The latter technique uses only a single pattern, so it is more suitable for capturing dynamic scenes; however, it often uses complex patterns with various colors, which are susceptible to noise, pattern discontinuities caused by edges, or textures. Thus, achieving dense and stable 3D acquisition of fast-moving and deformable objects remains an open problem. We propose a technique that achieves dense shape reconstruction from only a single-frame image of a grid pattern, based on coplanarity constraints. With our technique, positional information is not encoded in local regions of the projected pattern, but is distributed over the entire grid pattern, which results in robust image processing and 3D reconstruction. The technique also has the advantage of low computational cost due to its efficient formulation.
    Download PDF (7459K)
  • Kazuaki Kondo, Yasuhiro Mukaigawa, Yasushi Yagi
    Article type: Special Issue on MIRU2008
    Subject area: Research Paper
    2009 Volume 1 Pages 158-173
    Published: 2009
    Released on J-STAGE: September 24, 2009
    JOURNAL FREE ACCESS
    A considerable issue in designing catadioptric imaging systems is what shape the component mirrors should take. In this paper, we propose a new algorithm for designing a catadioptric imaging system that achieves a desired projection using a free-form mirror. A free-form mirror, expressed as an assembly of gradients, is a flexible surface representation that can form various shapes, including non-smooth surfaces. We adapt the shape reconstruction framework of the photometric stereo scheme to design free-form mirrors. An optimal mirror shape is formed to produce the desired projection under the integrability condition, which requires the result to be a consistent surface. We assume various catadioptric configurations, for which actual free-form mirrors are designed. The design experiments confirm that the resulting free-form mirrors can approximate the desired projections, including non-smooth ones.
    Download PDF (2134K)
  • Chika Takada, Yasuyuki Sugaya
    Article type: Special Issue on MIRU2008
    Subject area: Research Paper
    2009 Volume 1 Pages 174-182
    Published: 2009
    Released on J-STAGE: September 24, 2009
    JOURNAL FREE ACCESS
    We present a new method for detecting incorrect feature point tracking. In this paper, we detect incorrect feature point tracking by imposing the constraint that, under the affine camera model, feature trajectories should lie in an affine space in the parameter space. Introducing a statistical model of image noise, we test whether detected partial trajectories are sufficiently reliable and thereby detect incorrect partial trajectories. Using real video images, we demonstrate that our proposed method can detect incorrect feature point tracking fairly well.
    Download PDF (2323K)
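The affine-space constraint can be checked with a plain SVD: stack each trajectory as a 2F-vector, fit a 3-D affine subspace, and flag trajectories with large residuals. A sketch without the paper's statistical reliability test; the data below are synthetic:

```python
import numpy as np

def affine_residuals(W, dim=3):
    """Per-trajectory distance to the best-fit dim-D affine subspace.

    W is (2F, N): N feature trajectories over F frames, x and y stacked.
    Under the affine camera model, correct trajectories lie in a 3-D
    affine subspace, so a large residual suggests a tracking failure.
    """
    centered = W - W.mean(axis=1, keepdims=True)
    U, _, _ = np.linalg.svd(centered, full_matrices=False)
    B = U[:, :dim]                          # basis of the fitted subspace
    R = centered - B @ (B.T @ centered)     # components outside the subspace
    return np.sqrt((R ** 2).sum(axis=0))

rng = np.random.default_rng(0)
F, N = 10, 20
basis = np.linalg.qr(rng.normal(size=(2 * F, 3)))[0]
W = basis @ (20.0 * rng.normal(size=(3, N))) + 5.0  # clean affine-space data
W[:, 0] += rng.normal(size=2 * F)                   # corrupt one trajectory
res = affine_residuals(W)
outlier = int(np.argmax(res))
```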
  • Masashi Sugiyama, Takafumi Kanamori, Taiji Suzuki, Shohei Hido, Jun Se ...
    Article type: Special Issue on MIRU2008
    Subject area: Research Paper
    2009 Volume 1 Pages 183-208
    Published: 2009
    Released on J-STAGE: September 24, 2009
    JOURNAL FREE ACCESS
    In statistical pattern recognition, it is important to avoid density estimation, since density estimation is often more difficult than pattern recognition itself. Following this idea, known as Vapnik's principle, a statistical data processing framework that employs the ratio of two probability density functions has been developed recently and is attracting a great deal of attention in the machine learning and data mining communities. The purpose of this paper is to introduce to the computer vision community recent advances in density ratio estimation methods and their usage in various statistical data processing tasks such as non-stationarity adaptation, outlier detection, feature selection, and independent component analysis.
    Download PDF (999K)
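One density ratio estimator in this line of work, unconstrained least-squares importance fitting (uLSIF), fits the ratio p_nu/p_de directly by solving a single linear system. A compact sketch with fixed, hand-picked hyperparameters (the actual methods select them in a principled way):

```python
import numpy as np

def ulsif(x_nu, x_de, centers, sigma=0.5, lam=1e-3):
    """Model the density ratio p_nu(x)/p_de(x) as a sum of Gaussian
    kernels and solve the ridge-regularized least-squares problem
    (H + lam*I) alpha = h. Returns the fitted ratio function.
    """
    def K(x, c):
        return np.exp(-((x[:, None] - c[None, :]) ** 2) / (2 * sigma ** 2))
    Phi_de, Phi_nu = K(x_de, centers), K(x_nu, centers)
    H = Phi_de.T @ Phi_de / len(x_de)       # denominator second moments
    h = Phi_nu.mean(axis=0)                 # numerator first moments
    alpha = np.linalg.solve(H + lam * np.eye(len(centers)), h)
    return lambda x: np.maximum(K(x, centers) @ alpha, 0.0)

rng = np.random.default_rng(1)
x_de = rng.normal(0.0, 1.0, 500)            # denominator samples
x_nu = rng.normal(0.5, 1.0, 500)            # numerator samples
ratio = ulsif(x_nu, x_de, centers=np.linspace(-2.0, 3.0, 20))
# The true ratio N(0.5,1)/N(0,1) = exp(0.5*x - 0.125) grows with x.
r_lo, r_hi = ratio(np.array([-1.0]))[0], ratio(np.array([1.5]))[0]
```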
  • Bo Zheng, Ryo Ishikawa, Takeshi Oishi, Jun Takamatsu, Katsushi Ikeuchi
    Article type: Special Issue on MIRU2008
    Subject area: Research Paper
    2009 Volume 1 Pages 209-219
    Published: 2009
    Released on J-STAGE: September 24, 2009
    JOURNAL FREE ACCESS
    This paper presents a fast registration method based on solving an energy minimization problem derived from implicit polynomials (IPs). Once a target object is encoded by an IP, it is driven rapidly towards a corresponding source object along the IP's gradient flow, without using point-wise correspondences. This registration process is accelerated by a new IP transformation method: instead of applying a time-consuming transformation to a large discrete data set, the method transforms the polynomial coefficients so as to realize the same Euclidean transformation. Its computational efficiency enables a new application, real-time ultrasound (US) pose estimation. The reported experimental results demonstrate the capability of our method to overcome the limitations of noisy, unconstrained, freehand US images, resulting in fast and robust registration.
    Download PDF (2989K)
  • Yousun Kang, Koichiro Yamaguchi, Takashi Naito, Yoshiki Ninomiya
    Article type: Special Issue on MIRU2008
    Subject area: Research Paper
    2009 Volume 1 Pages 220-230
    Published: 2009
    Released on J-STAGE: September 24, 2009
    JOURNAL FREE ACCESS
    This paper presents a new image segmentation method for the recognition of texture-based objects in a road environment scene. Using the proposed method, we can classify texture-based objects three-dimensionally using a structure-from-motion (SfM) module and higher-order local autocorrelation (HLAC) features. By estimating the vehicle's ego-motion, the SfM module can reconstruct the three-dimensional structure of the road scene. Texture features of the input images are extracted with HLAC functions according to their depth, as obtained from the SfM module. The proposed method can effectively recognize texture-based objects in a road scene by considering their three-dimensional structure in a perspective 2D image. Experimental results show that the proposed method can not only effectively classify the texture patterns of structures in a 2D road scene, but can also represent the classified texture patterns as three-dimensional structures. The proposed system can serve as the basis of a three-dimensional scene understanding system for vehicle environment perception.
    Download PDF (3031K)
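A few of the HLAC features used above can be sketched directly: order 0 averages the image, and order-1 features autocorrelate each pixel with a neighbor at a fixed displacement. Only 5 of the 25 standard 3x3 masks are shown, on a toy texture:

```python
import numpy as np

def hlac_low_order(img):
    """Order-0 and four order-1 HLAC features of a 2-D image.

    Order 0 is the mean of f(r); each order-1 feature is the mean of
    f(r) * f(r + a) for one displacement a inside a 3x3 window.
    """
    f = img.astype(float)
    c = f[1:-1, 1:-1]                        # valid central region
    feats = [c.mean()]
    for dy, dx in [(0, 1), (1, 0), (1, 1), (1, -1)]:
        nbr = f[1 + dy:f.shape[0] - 1 + dy, 1 + dx:f.shape[1] - 1 + dx]
        feats.append((c * nbr).mean())
    return np.array(feats)

# Horizontal stripes: strong autocorrelation along x, none along y.
stripes = np.tile(np.array([[1.0], [0.0]]), (4, 8))   # 8x8 binary texture
feats = hlac_low_order(stripes)
```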
  • Xu Qiao, Rui Xu, Yen-Wei Chen, Takanori Igarashi, Keisuke Nakao, Akio ...
    Article type: Special Issue on MIRU2008
    Subject area: Research Paper
    2009 Volume 1 Pages 231-241
    Published: 2009
    Released on J-STAGE: September 24, 2009
    JOURNAL FREE ACCESS
    This paper introduces a framework called generalized N-dimensional principal component analysis (GND-PCA) for statistical appearance modeling of facial images with multiple modes, including different people, viewpoints, and illuminations. Facial images with multiple modes can be considered high-dimensional data, and GND-PCA can represent such data efficiently. We conduct extensive experiments on the MaVIC database (KAO-Ritsumeikan Multi-angle View, Illumination and Cosmetic Facial Database) to evaluate the effectiveness of the proposed algorithm and compare it with conventional ND-PCA in terms of reconstruction error. The results indicate that feature extraction is computationally more efficient with GND-PCA than with PCA and ND-PCA.
    Download PDF (1491K)
  • Shohei Nobuhara, Yoshiyuki Tsuda, Iku Ohama, Takashi Matsuyama
    Article type: Regular Paper
    Subject area: Research Paper
    2009 Volume 1 Pages 242-259
    Published: 2009
    Released on J-STAGE: September 24, 2009
    JOURNAL FREE ACCESS
    This paper presents a novel approach for simultaneous silhouette extraction from multi-viewpoint images. The main contribution of this paper is a new algorithm for 1) 3D-context-aware error detection and correction of 2D multi-viewpoint silhouette extraction, and 2) 3D-context-aware classification of cast shadow regions. Our method takes both the monocular image segmentation and the background subtraction of each viewpoint as its inputs, but does not assume that they are correct. Inaccurate segmentation and background subtraction are corrected by our iterative method based on inter-viewpoint checking. Experiments quantitatively demonstrate the advantages of our method over previous approaches.
    Download PDF (6070K)
  • Ngo Trung Thanh, Hajime Nagahara, Ryusuke Sagawa, Yasuhiro Mukaigawa, ...
    Article type: Regular Paper
    Subject area: Research Paper
    2009 Volume 1 Pages 260-276
    Published: 2009
    Released on J-STAGE: November 16, 2009
    JOURNAL FREE ACCESS
    The latest robust estimators usually take advantage of density estimation, such as kernel density estimation, to improve the robustness of inlier detection. However, the challenging problem for these estimators is choosing a suitable smoothing parameter: a poor choice can cause the population of inliers to be over- or under-estimated, which in turn reduces the robustness of the estimation. To solve this problem, we propose a robust estimator that estimates an accurate inlier scale. The proposed method first derives the residual distribution model from the case-dependent constraint given by the residual function. The proposed inlier scale estimator then performs a global search for the scale whose residual distribution best fits the residual distribution model. Knowledge of the residual distribution model provides a major advantage that allows us to estimate the inlier scale correctly, thereby improving the robustness of the estimation. Experiments with various simulations and real data validate our algorithm, which shows clear benefits over several of the latest robust estimators.
    Download PDF (2985K)
  • Kiyotaka Watanabe, Yoshio Iwai, Tetsuji Haga, Koichi Takeuchi, Masahik ...
    Article type: Regular Paper
    Subject area: Research Paper
    2009 Volume 1 Pages 277-287
    Published: 2009
    Released on J-STAGE: December 14, 2009
    JOURNAL FREE ACCESS
    There are two major problems with learning-based super-resolution algorithms. One is that they require a large amount of memory to store examples; the other is the high computational cost of finding the nearest neighbors in the database. To alleviate these problems, it is helpful to reduce the dimensionality of the examples and to store only the small number of examples that contribute to the synthesis of a high-quality video. Based on these ideas, we have developed an efficient algorithm for learning-based video super-resolution and introduce several strategies for constructing an efficient database. Evaluation experiments demonstrate the efficiency of our approach in improving super-resolution algorithms.
    Download PDF (1364K)
  • Daisuke Miyazaki, Mahdi Ammar, Rei Kawakami, Katsushi Ikeuchi
    Article type: Regular Paper
    Subject area: Technical Note
    2009 Volume 1 Pages 288-300
    Published: 2009
    Released on J-STAGE: December 14, 2009
    JOURNAL FREE ACCESS
    In outdoor scenes, the polarization of the sky provides a significant clue to understanding the environment. The polarized state of light conveys information about the orientation of the sun. Robot navigation, sensor planning, and many other application areas benefit from using this navigation mechanism. Unlike previous investigations, we analyze sky polarization patterns when the fish-eye lens is not vertical, since a camera in a general pose is useful for outdoor measurements. We tilted a measurement system consisting of a fish-eye lens, a CCD camera, and a linear polarizer in order to analyze the transition of the 180-degree sky polarization pattern as the system tilts. We also compared our results, measured under overcast skies, with the corresponding celestial polarization patterns calculated using the single-scattering Rayleigh model.
    Download PDF (3371K)
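The single-scattering Rayleigh model used for comparison predicts the degree of polarization from the angular distance to the sun alone. A minimal sketch of that formula (dop_max = 1 is the ideal clear-sky value; real skies are lower):

```python
import math

def rayleigh_dop(gamma, dop_max=1.0):
    """Degree of polarization of skylight under single-scattering
    Rayleigh theory, where gamma is the angular distance (radians)
    between the viewing direction and the sun: zero toward the sun,
    maximal at 90 degrees from it.
    """
    s, c = math.sin(gamma), math.cos(gamma)
    return dop_max * s * s / (1.0 + c * c)

dop_sun = rayleigh_dop(0.0)           # looking at the sun: unpolarized
dop_90 = rayleigh_dop(math.pi / 2.0)  # 90 degrees away: maximum DoP
```

The polarization angle is perpendicular to the sun direction, which is what makes the pattern usable as a compass for navigation.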