ICP is a well-known method that is widely used for 3D point cloud registration. However, ICP has some problems: the success or failure of registration depends strongly on the initial alignment of the point clouds, and the processing time is long. In this report, we investigate how the initial alignment of the target and source point clouds influences the success or failure, and the processing time, of 3D point cloud registration by ICP. We then suggest some approaches to solving these problems based on this examination.
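The dependence on the initial alignment can be seen in a minimal point-to-point ICP loop. The sketch below (pure NumPy with brute-force nearest neighbours; all names are ours, not taken from the report) alternates correspondence search with a closed-form Kabsch alignment:

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Closed-form (Kabsch) rotation R and translation t minimizing ||R src + t - dst||."""
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    H = (src - cs).T @ (dst - cd)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))        # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, cd - R @ cs

def icp(source, target, iters=50, tol=1e-10):
    """Iterate nearest-neighbour matching and rigid re-alignment; return aligned cloud and MSE."""
    src, prev_err = source.copy(), np.inf
    for _ in range(iters):
        # brute-force nearest neighbour in the target for every source point
        d2 = ((src[:, None, :] - target[None, :, :]) ** 2).sum(-1)
        idx = d2.argmin(axis=1)
        err = float(d2[np.arange(len(src)), idx].mean())
        if prev_err - err < tol:
            break
        R, t = best_rigid_transform(src, target[idx])
        src = src @ R.T + t
        prev_err = err
    return src, err
```

With a small initial misalignment the loop converges toward the true pose; with a large one, the nearest-neighbour correspondences are wrong from the first iteration and the loop stalls in a local minimum, which is exactly the initial-alignment sensitivity examined here.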
Registration of multi-view point clouds is a task in 3D computer vision used for surface reconstruction, whose efficiency and accuracy researchers have continually improved. In spite of these efforts and successes, only a few approaches address the specific problem of registering point clouds with a low overlapping ratio. In this report, we present a method that focuses on finding the rigid transformation that leads to the optimum registration between a pair of unorganized point clouds with low overlap. This method exploits the subset-membership characteristic of the well-known ICP registration algorithm and, through a Hough transform-like search, looks for the points whose neighboring-point registration leads to a rigid transformation close to the optimum.
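The Hough transform-like search can be illustrated in a reduced setting. The sketch below (our illustration, not the authors' implementation) votes over candidate translations only: every source-target point pair proposes a translation, the proposals are accumulated in a quantized grid, and the densest bin survives even when only a small subset of the points overlaps:

```python
import numpy as np
from collections import Counter

def hough_translation(source, target, bin_size=0.05):
    """Vote every pairwise translation candidate into a quantized accumulator."""
    votes = Counter()
    for s in source:
        for t in target:
            key = tuple(np.round((t - s) / bin_size).astype(int))
            votes[key] += 1
    best = max(votes, key=votes.get)       # densest bin wins
    return np.array(best) * bin_size
```

Overlapping pairs all propose the same translation and pile into one bin, while non-overlapping and clutter pairs scatter their votes, so a peak in the accumulator corresponds to a transformation close to the optimal one.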
In order to locate the damaged position of a gas pipe embedded in the ground by a non-excavation method, this paper proposes, as one of the element technologies for acquiring the absolute coordinates of the damaged position from images acquired by a gas pipe exploration robot, a method for estimating the distance the robot travels straight in the gas pipe. The proposed method applies, as a first step, a geometric transformation to each frame of the video images and, as a second step, estimates the distance the robot travels straight using the transformed images. More specifically, in the first step, an image acquired under the condition in which the central distance (the distance between the camera's optical axis and the central line of the gas pipe) is not 0[mm] is converted to an image whose central distance is 0[mm]. In the second step, the trajectory of the texture (feature points) on the wall surface of the gas pipe is obtained as optical flows computed from the converted images. By converting the length of the trajectory to an absolute distance, the distance the camera (robot) advances is estimated. In the experiments, videos are acquired by the camera attached to the robot, which travels straight for 100[mm] while the central distance is changed from 0[mm] to 3[mm], and the proposed method is applied to each of the videos. Good estimation accuracies are obtained.
In our research, we tried to embed nearly invisible information in cloth (fiber material). To realize this, we utilize the "artificial fiber pattern," which is designed to embed information while suppressing the uncomfortable feeling it causes to human eyes. However, this method was developed only for paper. In this study, we verify that the artificial fiber pattern can be applied to cloth using a scanner, still images, and video images. It turned out that the extraction characteristics of fiber material differ from those of paper. After applying a special noise filter, we obtain sufficient information extraction accuracy even from the video image, which is considered the most difficult case.
For learning human voice production mechanisms, and for the future development of speech utterance robots that emulate human voice production, a pharynx figure model with movable vocal fold parts was developed. Currently, there are many human figure models for learning the anatomy of the human body, but it is hard to understand the mechanism of voice production by the vocal folds from them. Our model has movable vocal fold parts for a better understanding of the vocal organs. A rubber balloon, resin clay, and stainless wires are used to make the model. The thyroid cartilage, cricoid cartilage, and arytenoid cartilages are made with clay. The vocal folds are made of thin rubber cut from a rubber balloon. These parts are connected using stainless wires. By pulling and pushing the cartilages, the vocal folds open and close. When the vocal folds are closed, air flow from the bottom tube makes sound.
We have demonstrated the effectiveness of shortening the display latency of a head tracking system, and of displaying visual information at the moment the motion direction changes, for stable depth perception from monocular motion parallax. We have also shown that smooth motion parallax, even with the head fixed, improves the perceived depth degradation caused by decreased visual acuity in one eye. In this paper, we discuss the importance of motion parallax information for both monocular and binocular depth perception, and how motion parallax can be utilized to realize more effective 3D display methods.
Lung cancer is the leading cause of cancer-related mortality in Japan. Screening for lung cancer with low-dose CT has led to increased recognition of small lung cancers and is expected to increase the rate of detection of early-stage lung cancer. A major concern in implementing CT screening for large populations is determining the appropriate management of pulmonary nodules found on a scan. In this report, we present our approaches to developing CADe/CADx systems for low-dose CT screening for lung cancer.
We propose and evaluate a technique for controlling the perceived depths of virtual objects in mixed reality accomplished by optical see-through displays. In this technique, the effect of the perception of contact with objects on depth perception is utilized to control the perceived depths of virtual objects; i.e., the virtual objects are presented at the same depths as real objects by utilizing the perception of contact between the real objects and the virtual objects. In the evaluation, the perceived depths of the virtual objects are measured, and the results indicate that the virtual objects are perceived at the same depth as the real objects owing to the perception of contact between them. These findings demonstrate the feasibility of the proposed technique.
Visualization has been widely used for the analysis of various numerical datasets, and it is crucial to determine appropriate parameter values for obtaining good visualizations. However, the evaluation of a visualization result is mostly carried out in a subjective manner, on the basis of whether the necessary information can be seen in the resulting image, and no way to quantitatively assess whether it is optimal is available. We therefore focus on volume rendering, a traditional visualization technique for three-dimensional datasets, and propose a new index for quantitatively evaluating the quality of the resulting images. Specifically, Shannon's information entropy, which defines the amount of information, was extended into a function that evaluates the quality of volume-rendered images. The validity of the proposed metric was demonstrated through application to multiple visualization cases.
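As a baseline for the idea, Shannon's entropy of a rendered image's intensity histogram is a simple information measure. The sketch below is only this plain-entropy starting point with names of our own choosing; the proposed index extends it into a dedicated quality function:

```python
import numpy as np

def image_entropy(img, bins=256):
    """Shannon entropy (bits) of the intensity histogram of an image in [0, 1]."""
    hist, _ = np.histogram(img, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]                          # 0 log 0 is taken as 0
    return float(-(p * np.log2(p)).sum())
```

A flat image scores 0 bits, while a rendering that spreads intensity over many levels scores higher, which is the intuition a quality index for volume-rendered images can build on.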
Most of the stereoscopic display methods adopted for movies and games use binocular parallax. On the other hand, a class of videos called "Split Depth GIFs," which give a stereoscopic effect with either the left or the right eye alone, has been proposed and is attracting attention. The method used in these videos is a simple one, mainly displaying two vertical lines on the screen in the movie, and is relatively new, having appeared around 2014. However, there has not been sufficient study of how to create content with a strong three-dimensional effect by this method. In this study, we examine methods to enhance the strength of the illusion used in Split Depth GIFs and produce our own animation based on the results.
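The occlusion trick behind the two vertical lines is easy to state in code: static white bars are drawn over each frame everywhere except where the foreground object's mask covers them, so the object appears to pass in front of the bars. A minimal sketch (our illustration, with hypothetical names):

```python
import numpy as np

def split_depth_frame(frame, fg_mask, bar_cols, bar_width=6, bar_value=255):
    """Overlay vertical white bars, letting the foreground mask occlude them."""
    out = frame.copy()
    bars = np.zeros(frame.shape[:2], dtype=bool)
    for x in bar_cols:
        bars[:, x:x + bar_width] = True
    out[bars & ~fg_mask] = bar_value      # bars hidden where the object is
    return out
```

Applying this per frame to an object whose mask grows as it "approaches the camera" produces the monocular depth illusion the videos rely on.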
In this paper, a support tool for generating a water flow animation from a single water flow image, without requiring special skill to use, is proposed. A method for creating a water flow animation by compositing water flow videos had been proposed by Okabe et al. However, this method requires some knowledge of image processing for specifying the direction of water flow and for searching for similar images. If this problem is solved, the range of users of the method is expected to expand. As a solution to this problem, an interactive support tool for generating a water flow video based on Okabe's method is proposed, and its usefulness is confirmed with some examples.
Wild animals such as deer and boars cause serious damage to agriculture; the amount of damage in FY2017 is reported to be 16.4 billion yen nationwide. Aiming at the development of countermeasures against damage by wild animals based on image recognition technology, we construct a dataset of animal images for machine learning and perform a trial of recognition by deep learning. By clipping and labeling images taken by field cameras, a dataset of 14,226 animal images in total is constructed and disclosed for public use. Classifiers based on deep learning are constructed and trained with images of deer, boars, and monkeys from the dataset, together with images of raccoons and raccoon dogs taken in a zoo and with images from CIFAR-10. Experiments demonstrate that the classifiers achieve about 80% recognition accuracy.
A large-scale natural disaster leaves its traces on the map. Wide and precise analysis of the temporal changes of maps gives detailed and quantitative information for disaster prevention research. Aiming to develop an automatic or computer-aided map analysis system, this manuscript proposes an automatic method for classifying modern Japanese maps by land use, such as farmland, urban area, and forestry. For maps drawn manually and printed on paper, map symbols are recognized by template matching, broken border lines are completed by an opening operation, and regions are partitioned by labeling. For regions whose boundary lines are omitted, the boundary is inferred based on the distance to the map symbols. Urban areas are deduced from dense drawing and excluded from the symbol-based area classification. A demonstration of the proposed method on a plain region of Osaka in a 1:25,000 map surveyed in the Taisho era showed a true positive rate of about 74% and a false positive rate of about 13%.
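The map-symbol recognition step can be sketched with plain normalized cross-correlation template matching (pure NumPy, our illustration; the actual templates and thresholds for the historical map symbols are not specified here):

```python
import numpy as np

def match_template(image, template, threshold=0.9):
    """Return (row, col) positions where the normalized cross-correlation
    between the template and an image window exceeds the threshold."""
    th, tw = template.shape
    t = template - template.mean()
    tn = np.sqrt((t ** 2).sum())
    hits = []
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            w = image[r:r + th, c:c + tw]
            w = w - w.mean()
            wn = np.sqrt((w ** 2).sum())
            if wn > 0 and tn > 0 and (w * t).sum() / (wn * tn) >= threshold:
                hits.append((r, c))
    return hits
```

Normalizing by each window's mean and energy makes the match tolerant to the uneven ink density and paper tone typical of scanned paper maps.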
Oracle bone inscriptions (OBIs) are characters that were used in ancient Chinese dynasties more than 3,000 years ago. Deciphering OBIs is very important for the study of ancient China, the history of Kanji, and so on. Hence, various computer-based methods for OBI recognition have been proposed recently. However, these methods require that the OBI characters be extracted from the OBI rubbing images beforehand. In this paper, we investigate an automatic OBI character extraction method and develop a support environment for deciding the various parameters. We first apply global binarization, an opening operation, and local binarization to damaged oracle bone images. Next, character string regions are extracted using an oval Gaussian filter. Then, the regions are segmented using a histogram. Finally, we evaluate this character extraction method using the F-measure.
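The global-binarization-plus-opening front end can be sketched with Otsu's threshold followed by a binary opening (erosion then dilation) to drop small noise, in plain NumPy (our illustration; the paper's parameter-decision environment and local binarization are not reproduced):

```python
import numpy as np

def otsu_threshold(gray):
    """Otsu's method: threshold maximizing between-class variance (gray in 0..255)."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist / hist.sum()
    omega = np.cumsum(p)                     # class-0 probability
    mu = np.cumsum(p * np.arange(256))       # class-0 cumulative mean
    mu_t = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    return int(np.nanargmax(sigma_b))

def binary_opening(mask, k=1):
    """Erosion then dilation with a (2k+1) x (2k+1) square structuring element."""
    def erode(m):
        out = m.copy()
        for dr in range(-k, k + 1):
            for dc in range(-k, k + 1):
                out &= np.roll(np.roll(m, dr, 0), dc, 1)
        return out
    def dilate(m):
        out = m.copy()
        for dr in range(-k, k + 1):
            for dc in range(-k, k + 1):
                out |= np.roll(np.roll(m, dr, 0), dc, 1)
        return out
    return dilate(erode(mask))
```

The opening removes isolated speckles, such as the pitting of damaged oracle bone surfaces, while leaving character-sized strokes mostly intact.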
In recent years, surveillance systems have advanced, and river surveillance systems, backed by much research on water level detection, help prevent water disasters on many rivers. However, there are few studies on water level detection using river surveillance camera images. In this paper, we present a method to extract the water level fluctuation from river surveillance camera images. First, we summarize the input videos with shake correction and reduce the number of colors in the images with Meanshift. Next, we segment the river parts with the k-means method. We then identify the river widths using the segmented river parts and a manually selected line. Finally, we judge whether the water level is increasing using the identified river width and a border line at which we assume the river floods. As a result of judging many images, problems remain: riversides are sometimes identified as river parts, and the river parts are imperfectly segmented.
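The k-means segmentation step can be sketched as Lloyd's algorithm on pixel color vectors (pure NumPy, our illustration; the shake correction and Meanshift stages are omitted):

```python
import numpy as np

def kmeans_pixels(pixels, k=2, iters=20, seed=0):
    """Lloyd's algorithm: cluster N x C pixel vectors into k color clusters."""
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), k, replace=False)].astype(float)
    for _ in range(iters):
        # assign every pixel to its nearest center, then recompute the centers
        d2 = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean(axis=0)
    return labels, centers
```

With k=2 the water surface and the bank tend to fall into separate color clusters, which is the river-part mask the width measurement then operates on.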
Spectral rendering makes it possible to visualize wavelength-dependent phenomena such as the dispersion of light, and renders a greater variety of scenes than RGB rendering, which calculates the intensity of light in only three components. Spectral rendering, however, has the problem of expensive computational cost, because the intensity of light must be calculated at every wavelength over the wide range of the visible spectrum. To address this problem, we have developed a spectral rendering method combined with image-based lighting (IBL), in which the ambient light is recorded in a spectral image and the spectral intensity of light is calculated in a global illumination environment. Several examples demonstrate that the proposed method efficiently renders various kinds of scenes with wavelength-dependent phenomena.
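The per-wavelength cost comes from integrating the spectrum against sensor response curves at every sample. The sketch below shows this basic spectrum-to-tristimulus integration; note that the Gaussian responses are placeholders of our own, not the real CIE color matching functions:

```python
import numpy as np

# visible-range wavelength samples in nanometres
lam = np.arange(380.0, 781.0, 5.0)

def gaussian(mu, sigma):
    return np.exp(-0.5 * ((lam - mu) / sigma) ** 2)

# placeholder response curves standing in for the CIE color matching functions
responses = np.stack([gaussian(600.0, 40.0),   # "red-like"
                      gaussian(550.0, 40.0),   # "green-like"
                      gaussian(450.0, 30.0)])  # "blue-like"

def spectrum_to_tristimulus(spd):
    """Integrate a spectral power distribution against each response curve."""
    return (responses * spd).sum(axis=1) * 5.0   # 5 nm sample spacing
```

Every light path must carry the full sampled spectrum through this integration rather than three numbers, which is why combining it with a spectral IBL image, evaluated once per environment lookup, pays off.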
This paper describes a technique that can invisibly attach information to the surface of real objects by using illumination light whose color intensity is modulated temporally and spatially, and extract the information from video captured with a video camera. We used a string pattern as the information in this technique. We first assessed the invisibility, in terms of color, of a temporally and spatially intensity-modulated pattern in the light. We also examined the readability of a pattern invisibly attached to real objects and captured with a video camera. The results of these experiments revealed that the invisibility of the string pattern was higher when it was embedded in the blue color component than in the red or green components. Moreover, we confirmed a range in which invisibility and readability were compatible when the pattern was embedded in the blue color component of the light.
Illuminant color estimation under multiple illuminants in an image is proposed. In conventional methods for estimating multiple illuminants, a single-illuminant estimation method is applied under the assumption that the color of the illuminant in each small region of the image is constant. Several single-illuminant estimation methods are used; however, there is no evidence of which methods are appropriate for the estimation, and each method may be applied under inadequate conditions. The proposed method focuses on the gray-world-assumption-based method, because the algorithm is simple and easy to implement, and improved versions have been proposed recently. By dividing the image into small regions of several sizes and applying the gray-world-assumption method to each small region that satisfies the method's conditions, the local illuminant color in each small region is estimated. By unifying the local illuminants derived from the different region sizes, the scene illuminants are estimated. Mondrian pattern images are used in the experiments, and the validity of the proposed method is shown by the experimental results.
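The gray-world assumption itself is one line: the average scene color is assumed achromatic, so the channel means of a region estimate its illuminant color. A minimal per-region version (our sketch, without the condition check and multi-scale unification steps of the proposed method):

```python
import numpy as np

def gray_world_illuminant(region):
    """Estimate the illuminant color of an H x W x 3 region as its channel means,
    normalized to a unit vector so only the color direction remains."""
    mean_rgb = region.reshape(-1, 3).mean(axis=0)
    return mean_rgb / np.linalg.norm(mean_rgb)

def local_illuminants(image, block=16):
    """Apply the gray-world estimate to every block x block tile of the image."""
    h, w, _ = image.shape
    return [gray_world_illuminant(image[r:r + block, c:c + block])
            for r in range(0, h - block + 1, block)
            for c in range(0, w - block + 1, block)]
```

Running this at several block sizes and merging the per-tile estimates is what yields the multiple scene illuminants.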
Mathematical statistics, especially hypothesis testing, is used in various fields. In many cases, however, people do not easily understand the essence of statistics. In this research, we propose a method for understanding the essence of hypothesis testing easily through visualization, and we build an interface for education. Our visualization policy is that users should be able to understand hypothesis testing intuitively. Following this policy, we propose a method in which the graphics change shape and color according to the user's numerical input, and we design an educational interface as interactive teaching material. We conducted an experiment comparing an existing method with ours. As a result, our method enhances the learning effect and provides a good UI.
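The quantity being visualized is just the test statistic and its tail probability. A minimal two-sided one-sample z-test, using only the standard library (our illustration; the interactive graphics themselves are not reproduced):

```python
import math

def z_test(sample_mean, mu0, sigma, n):
    """Two-sided one-sample z-test: returns the z statistic and its p-value."""
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # two-sided p-value from the standard normal CDF, Phi(x) = (1 + erf(x/sqrt(2)))/2
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    return z, p
```

In an interface of this kind, each numerical input moves the statistic along the sampling distribution and re-shades the rejection region as z and p change, which is the intuition the visualization aims to convey.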
As an example of the use of panoramic images, there is what is called a route panorama, in which the landscape seen along a certain route is made into one panoramic image. We propose a new method for generating a route panorama using the landscape visible in the direction of travel. In addition, characteristics of people who get lost include an ambiguous grasp and memory of landmarks, and poor use of maps. We therefore aim to create a route guidance map that can display the route between specific points by using a route panorama image in the direction of travel, allowing the landscape of the whole route to be seen.
This paper proposes a method for estimating the stress caused by changes in the pace of meal assistance, using multiple types of biometric information obtained by facial image analysis. First, six kinds of biological information, such as heart rate, heartbeat interval, heart rate variability (HF, LF, LF/HF), and pupil size, are detected from the facial image. Second, we compare four machine learning methods, the naive Bayes classifier, SVM, multi-layer perceptron, and LSTM, for detecting stress from the six kinds of biometric information. In experiments in which calculation tasks are the stress stimulus, LSTM achieves the best stress detection accuracy. On the other hand, in experiments in which the stimulus is a changing pace of meal assistance, high stress estimation accuracy was not obtained.
As image processing technologies that can contribute to the realization of a dental assistant robot, this paper proposes a method for detecting the lips in facial images and recognizing the inner-mouth area, which is needed to enable the robot to insert a vacuum into the patient's mouth. Specifically, the lips are detected by a linear SVM that utilizes HOG features. The area of the mouth into which the vacuum can be inserted is recognized by approximating the lips with an ellipse and performing clustering based on the HSV pixel values in that area. Experiments show that the lips and the insertion area can be detected with quite high accuracy. However, if the lips are occluded, the accuracy of lip detection is lowered. Concerning the recognition of the insertion area, the recall is smaller than the precision.
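The ellipse-approximation step can be sketched from point moments: fit an ellipse to detected lip points via their mean and covariance eigen-structure, then keep only pixels inside it for the later HSV clustering (our simplified illustration; the HOG+SVM lip detector and the clustering itself are not reproduced here):

```python
import numpy as np

def moment_ellipse(points, scale=2.0):
    """Fit an ellipse to 2-D points from their mean and covariance eigen-structure."""
    c = points.mean(axis=0)
    cov = np.cov((points - c).T)
    evals, evecs = np.linalg.eigh(cov)
    axes = scale * np.sqrt(evals)        # semi-axes along the eigenvectors
    return c, axes, evecs

def inside_ellipse(pts, c, axes, evecs):
    """Boolean mask: which points fall inside the fitted ellipse."""
    local = (pts - c) @ evecs            # rotate into the ellipse frame
    return ((local / axes) ** 2).sum(axis=1) <= 1.0
```

Restricting the HSV clustering to this elliptical region keeps teeth, tongue, and shadow pixels in play while excluding the surrounding skin.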
Gaze information such as gaze points, fixations points, saccades are valuable information that provide clues to reveal visual characteristics of subjects on proceeding psychological experiments, marketing, and visual field inspections. However, the gaze data derived from today’s eye trackers is limited to two-dimensional form, whereas studies on higher order visual processing need three-dimensional of gaze data to enable observations on processing spatial location and its relationship to the visual components. In this study, we have developed an eyeglass-type eye tracker based on vergence eye movements, which enable three-dimensional gaze measurements and investigate the spatial characteristics of the gaze for given points in a three-dimensional space. To ensure vergence eye-movements during gaze calibration, our eye tracker adopted gaze calibration in a three-dimensional virtual space displayed on a 3D display. Our experiment results show significant correlation between gaze points in the real and the virtual space. Moreover, gaze calibration using 3D display is found to be effective on forcing vergence eye-movements to yield reliable three-dimensional gaze measurements.
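Vergence-based 3-D gaze estimation ultimately reduces to triangulating the two visual axes. The sketch below (our illustration, with hypothetical eye positions and directions) returns the midpoint of the shortest segment between the two gaze rays:

```python
import numpy as np

def triangulate_gaze(p_l, d_l, p_r, d_r):
    """Midpoint of the shortest segment between rays p + s*d (left/right eye)."""
    d_l = d_l / np.linalg.norm(d_l)
    d_r = d_r / np.linalg.norm(d_r)
    w = p_l - p_r
    a, b, c = d_l @ d_l, d_l @ d_r, d_r @ d_r
    d, e = d_l @ w, d_r @ w
    denom = a * c - b * b                 # ~0 for (near-)parallel rays
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    return 0.5 * ((p_l + s * d_l) + (p_r + t * d_r))
```

The smaller the vergence angle (the more parallel the rays), the more ill-conditioned this triangulation becomes, which is why calibration must actively force vergence eye movements.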