Planes are simple shapes often found in urban environments, and their detection is crucial for several applications of 3D point clouds. Previous efforts have focused on detecting planes in low-cost and short-range scenarios. In Terrestrial Laser Scanning (TLS), however, the range extends to a few hundred meters. This long range introduces large variations in point density, which makes it difficult to define proper thresholds for conventional methods. We therefore propose to apply a sliding-voxel plane detector over multiple voxel sizes to estimate hypothetical planes in a coarse-to-fine manner, followed by a merging Non-Maximum Suppression (NMS) to detect planes robustly. Experimental results show that the proposed method achieves superior precision, efficiency, and scalability on TLS point clouds.
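The merging NMS step can be illustrated with a minimal sketch: given plane hypotheses collected across voxel sizes, keep the best-supported plane and greedily suppress near-duplicates with similar normals and offsets. The greedy scheme and the thresholds below are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def plane_nms(planes, scores, angle_thresh=np.deg2rad(5), dist_thresh=0.05):
    """Greedy non-maximum suppression over plane hypotheses.

    planes: (N, 4) array of [nx, ny, nz, d] with unit normals (n.p + d = 0).
    scores: (N,) support scores (e.g. inlier counts across voxel sizes).
    Returns indices of the kept planes, best-supported first.
    """
    order = np.argsort(scores)[::-1]  # process hypotheses by descending support
    keep = []
    for i in order:
        n_i, d_i = planes[i, :3], planes[i, 3]
        duplicate = False
        for j in keep:
            n_j, d_j = planes[j, :3], planes[j, 3]
            # Similar orientation (sign-invariant) and similar offset -> suppress.
            angle = np.arccos(min(1.0, abs(n_i @ n_j)))
            if angle < angle_thresh and abs(d_i - d_j) < dist_thresh:
                duplicate = True
                break
        if not duplicate:
            keep.append(i)
    return keep
```

A real implementation would also merge the inlier sets of suppressed hypotheses into the kept plane before refitting it.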
We propose an efficient semantic segmentation method for large-scale point clouds. Applying previous point-based semantic segmentation methods to large-scale point clouds has been difficult because those methods infer semantic labels for all the points used for feature extraction, and large-scale point clouds easily exceed their capacity. To solve this problem, we propose a novel point-based approach that predicts class labels for a downsampled point cloud and expands the labels to the whole point cloud by nearest-neighbor interpolation. The key idea of our approach is to assign each sampled point local features derived from the whole point cloud, using the newly developed Aggregative Input Convolution (AIC), and to convert those features into wider-context features with a point-based model designed for small-scale point clouds. AIC was experimentally confirmed to improve semantic segmentation accuracy on a large-scale dataset.
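The label-expansion step is plain nearest-neighbor interpolation. A brute-force sketch (a KD-tree would be used at scale; all names here are hypothetical):

```python
import numpy as np

def expand_labels(sample_pts, sample_labels, full_pts):
    """Assign each point in the full cloud the label of its nearest sampled point.

    sample_pts: (N, 3) downsampled points with predicted labels sample_labels: (N,).
    full_pts: (M, 3) whole point cloud. Returns (M,) expanded labels.
    """
    # (M, N) pairwise squared distances between full points and sample points.
    d2 = ((full_pts[:, None, :] - sample_pts[None, :, :]) ** 2).sum(-1)
    return sample_labels[d2.argmin(axis=1)]  # nearest sample's label per point
```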
Registration of histopathological images obtained with different staining techniques is very challenging because of the large differences in their color information. In this study, we propose a promising image registration method that overcomes the color difference between H&E- and EVG-stained images by means of GAN-based color conversion. Our proposed method consists of two main parts: a GAN-based unsupervised domain adaptation network that converts an H&E-stained image into an EVG-stained image whose distribution is similar to that of the original EVG-stained image, and a SURF-feature-based registration framework that produces the registered EVG-stained image by leveraging the generated image from the domain adaptation network. The experimental results show that our proposed method provides better registration results than the conventional method, which does not incorporate domain adaptation.
Hyper-spectral images are used in a wide range of fields such as industry, medicine, and remote sensing. They are also used in computer graphics as light-probe images and textures in spectral rendering. The acquisition of spectral images is, however, costly in terms of equipment and time, which hinders their acquisition and use. Conventional deep-learning-based spectral super-resolution methods adopt direct end-to-end learning from RGB to hyper-spectral images. In contrast, we focus on the fact that hyper-spectral images can be decomposed into luminance and chrominance components, and we propose a novel spectral super-resolution method that uses deep learning to estimate each component separately. Finally, the proposed method reconstructs a hyper-spectral image by combining the estimated luminance and chrominance components.
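One plausible way to realize such a luminance/chrominance decomposition (the paper's exact definition may differ) is to take the per-pixel mean across spectral bands as luminance and per-band ratios as chrominance, so that the two components multiply back to the original cube:

```python
import numpy as np

def decompose(hsi, eps=1e-8):
    """Split a hyper-spectral cube (H, W, B) into luminance and chrominance.

    Luminance: per-pixel mean intensity across the B bands -> (H, W, 1).
    Chrominance: per-band ratios to the luminance -> (H, W, B).
    An illustrative decomposition, not necessarily the authors' formulation.
    """
    lum = hsi.mean(axis=2, keepdims=True)
    chrom = hsi / (lum + eps)
    return lum, chrom

def reconstruct(lum, chrom):
    """Recombine the two components into a hyper-spectral cube."""
    return lum * chrom
```

In the paper's setting, each component would be predicted by its own network from the RGB input and then recombined this way.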
In digital pathology, digitized histological slides enable automatic procedures in histopathology, such as the automatic quantification of the rate of sclerosed glomeruli, which can be used to order biopsy slides so that the most serious cases are identified more quickly. In this work, we evaluate YOLOv3 as a deep neural network for identifying glomeruli in whole-slide images (WSIs) and classifying them as functional or sclerosed. We used the YOLOv3 framework, with its 53-layer convolutional neural network, and 30 complete slides from the Bio-Atlas repository (Pennsylvania State University), which yielded 2,448 images of 1024x1024 pixels containing one or more glomeruli, used for training and performance evaluation. A total of 585 sclerosed glomeruli and 3,383 functional glomeruli were labeled. In our experiments, we achieved high performance in the identification and classification of glomeruli (recall of 96.8%, precision of 95.9%, accuracy of 98.1%, and an F1 score of 96.3%). The method can identify and report the location of glomeruli on the slide, classify them as functional or sclerosed, and precisely provide the percentage of sclerosed glomeruli, supporting histopathological studies of kidney diseases in the medical routine.
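As a sanity check on the reported scores, the standard detection metrics relate as follows (a precision of 95.9% and recall of 96.8% indeed give an F1 of about 96.3%); the counts below are illustrative, not the paper's:

```python
def detection_metrics(tp, fp, fn):
    """Precision, recall, and F1 from true-positive, false-positive,
    and false-negative detection counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1
```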
Eyeliner is a makeup technique that enhances the attractiveness of females through a geometric illusion that makes the eyes appear larger. Eyeliner has no fixed shape, and its thickness and length can be freely adjusted. In this paper, we experimentally verify the relationship between eyeliner thickness and perceived eye size. In addition, by examining the sense of incongruity caused by the eyeliner, we clarify the optimal eyeliner thickness for making the eyes appear larger. The results showed that the thicker the eyeliner, the larger the eyes were perceived to be. However, excessive eyeliner thickness increased the sense of incongruity and reduced the illusion effect. There were gender differences in these results: for male participants, eyeliner thickness had a significant effect on the perception of eye size, while for female participants it had no significant effect.
This study aims to design apparel items by converting a polygon model of the human body obtained through a 3D scan into a volume model. From this model, we can quickly generate a curved surface that maintains a constant distance from the surface of the body; by defining a distance field around the body surface, we can design an apparel item that ensures an accurate allowance. The curved surface, which serves as an ideal garment enveloping the body, is generated by adjusting the threshold value of an isosurface of the distance from the body surface. Furthermore, by drawing design lines on the curved surface, the pattern required for actual sewing can be designed in 3D space. The curved surface trimmed by the design lines is converted into a polygon mesh, and a dynamics simulation is applied to smooth the curvatures so that the desired flat surface pattern can be obtained. By actually sewing the virtually designed pattern and trying it on, we showed that clothes with appropriate allowance and proportions can be generated.
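The constant-distance garment surface is the isosurface of the body's distance field at the chosen offset. A brute-force sketch on a sampled grid (illustrative only; a real pipeline would build a proper distance field and extract the isosurface with marching cubes):

```python
import numpy as np

def offset_shell(body_pts, grid_pts, offset, tol):
    """Mark grid points lying near the isosurface dist(body) == offset.

    body_pts: (N, 3) samples of the body surface; grid_pts: (M, 3) volume samples.
    Returns an (M,) boolean mask of points within tol of the offset surface.
    """
    # Distance from each grid point to its nearest body-surface sample.
    d = np.sqrt(((grid_pts[:, None, :] - body_pts[None, :, :]) ** 2).sum(-1)).min(1)
    return np.abs(d - offset) < tol  # thin shell around the isosurface
```

Raising or lowering `offset` corresponds to tweaking the isosurface threshold, i.e. to loosening or tightening the garment.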
The demand for privacy protection has been increasing with the widespread use of devices that can easily capture high-resolution images, such as digital cameras and smartphones. Fingerprint information in particular is a target for privacy protection, yet to the best of our knowledge there is no research that specifically deals with fingerprint information removal. In this paper, we propose a method for reversibly replacing fingerprints in an image with fake fingerprints. The method automatically removes the original fingerprint information from the input image and generates a natural-looking image. Moreover, the input image's fingerprint information can be restored from the output image only by specific persons who know the key used in the image generation process. We confirmed the effectiveness of the fingerprint removal and restoration methods using a fingerprint authentication model.
Accurate typing from the home position is recommended for efficient typing on a PC, and much software has been developed for learning to type from the home position. However, few programs can determine the actual fingering; most simply present the correct finger locations. We therefore developed a system that automatically determines whether the fingering is in the correct home position using a Leap Motion sensor. Using this discrimination system, we constructed a Typing Learning Game that corrects subjects to type from the home position; by analyzing the subjects' fingering data, efficient learning can be achieved. In the experiment, 15 subjects typed 1,800 characters each; the discrimination accuracy of the Fingering Discrimination System was 98.8%, and fingering improved by 25.6% through the Typing Learning Game. The results show that the proposed automatic Fingering Discrimination System can accurately determine a subject's fingering.
The distribution of easily imitated counterfeit products, such as food packaging, brand tags, and pharmaceutical labels, has become a serious economic and safety concern. To address this issue, we propose a system that judges the authenticity of genuine and counterfeit products with high speed and accuracy, focusing on the physically unclonable function (PUF) of an inkjet-printed code and on locally likely arrangement hashing (LLAH), which performs high-speed image retrieval. In this study, we verified that the proposed system has high discriminability and stability, based on highly accurate results obtained from a dataset of up to 4,000 sheets. In addition, the effectiveness of the system was confirmed by validating it on multiple printers and by comparing it, in terms of discriminability and speed, with Oriented FAST and Rotated BRIEF (ORB), a typical feature-matching method.
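The authenticity judgment can be sketched generically as threshold-based nearest-neighbor matching against enrolled PUF codes; note that LLAH itself hashes local keypoint arrangements for retrieval and is not reproduced here, and the binary-code representation below is an illustrative assumption:

```python
import numpy as np

def authenticate(query_code, enrolled_codes, thresh):
    """Judge authenticity by nearest Hamming distance to enrolled codes.

    query_code: (L,) binary code extracted from the printed pattern.
    enrolled_codes: (K, L) codes registered from genuine products.
    Returns (is_genuine, index_of_best_match).
    """
    dists = (query_code[None, :] != enrolled_codes).sum(axis=1)  # Hamming
    best = dists.argmin()
    # Genuine prints re-image close to an enrolled code; copies drift far away.
    return (dists[best] <= thresh), best
```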
3D face reconstruction and face alignment are two closely related topics in face research. For these tasks, computational complexity is an important consideration alongside model accuracy. Our goal is to regress the 3D facial geometry and dense correspondence information from a given 2D image. In this paper, we therefore fit a 3D morphable model with a lightweight convolutional neural network built on the ShuffleNetV2 Plus series and a channel-wise attention module, which improves the representation ability of the network and the performance of the 3D face reconstruction task without increasing the number of network parameters. Evaluations on test datasets show that our approach achieves significant performance improvements on both 3D face reconstruction and dense face alignment. On AFLW2000-3D, our method obtains a lower mean Normalized Mean Error (NME) of 3.694%.
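A channel-wise attention module of the squeeze-and-excitation style (a common design; the paper's exact module may differ) reweights channels with a gated bottleneck MLP and adds only a handful of parameters:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Squeeze-and-excitation style channel attention, as a numpy sketch.

    feat: (C, H, W) feature map.
    w1: (C // r, C) and w2: (C, C // r) bottleneck MLP weights (reduction r).
    """
    squeeze = feat.mean(axis=(1, 2))                    # global average pool -> (C,)
    excite = sigmoid(w2 @ np.maximum(w1 @ squeeze, 0))  # ReLU bottleneck + gate
    return feat * excite[:, None, None]                 # reweight each channel
```

Since the gate lies in (0, 1), each channel is only scaled, never amplified, which is why the module can sharpen the representation without growing the parameter count meaningfully.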
This study investigates a new design method rooted in the history and culture of a region through photogrammetry and the reconstruction of shapes, expanding the possibilities of photogrammetric technology, which has advanced rapidly in recent years. Specifically, as an example of something rooted in local history and culture, we focused on the Maruko-bune, boats unique to Lake Biwa in Japan that are no longer built or used. We measured a Maruko-bune using photogrammetry and constructed its 3D model. We then attempted to design architecture by decomposing and reconstructing the shape of the 3D model. As a result, we found that photogrammetry and 3D modeling have reached the point where the shape of the Maruko-bune can be measured and reproduced using software and a smartphone camera, without special calibration, by architects and designers who are not experts in photogrammetry, and that its curves and surfaces can be used for architectural design.
This paper proposes a method that improves the quality of omnidirectional free-viewpoint images using generative adversarial networks. Omnidirectional images are a popular way of obtaining three-dimensional (3D) visual information, while free-viewpoint images are essential to Virtual Reality (VR) and Mixed Reality (MR) applications. We therefore generate free-viewpoint images from 3D information estimated from the captured omnidirectional images. The quality of the generated images deteriorates due to 3D reconstruction errors caused by occlusion and mis-correspondences. In this work, we propose a method that uses Generative Adversarial Networks (GANs) to solve this problem. We focus on the structural information of the various perspectives and apply a "divide and conquer" approach, separating the images into perspectives before training and recombining them at a later stage. We also conducted a comprehensive, multi-faceted evaluation of the proposed method to verify its effectiveness in improving image quality, analyzing the adaptability of different image quality evaluation methods based on the actual information distribution in the equirectangular images. After careful assessment, we conclude that the proposed method can generate highly accurate omnidirectional free-viewpoint images.
The purpose of this study was to develop a computerized classification method for 1p/19q codeletion in low-grade gliomas (LGGs) from brain MRI (magnetic resonance imaging) images using three-dimensional (3D) radiomics features. Our database consisted of brain T2-weighted MRI images (102 LGGs with 1p/19q codeletion and 57 LGGs without it) obtained from 159 patients. In the proposed method, 107 3D radiomics features were extracted from the LGG region in the T2-weighted MRI images. Feature selection was performed with the least absolute shrinkage and selection operator (LASSO) to reduce redundancy among the extracted 3D radiomics features. A support vector machine (SVM) with the selected 3D radiomics features evaluated the likelihood of 1p/19q codeletion in each LGG. A three-fold cross-validation method was employed to train and test the proposed method. The classification accuracy, sensitivity, specificity, and area under the receiver operating characteristic curve of the proposed method were 80.5%, 83.3%, 75.4%, and 0.836, respectively, an improvement over an SVM using 2D radiomics features (74.2%, 78.4%, 66.7%, and 0.783; p = 0.03). The proposed method with 3D radiomics features achieved high classification accuracy for 1p/19q codeletion in LGGs from brain MRI images and would be useful for determining patient management.
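The selection-and-classification pipeline can be sketched with scikit-learn on synthetic stand-in data for the 107 radiomics features; the alpha value, the all-features fallback, and the data are illustrative assumptions, not the paper's settings:

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def radiomics_classifier(X, y, alpha=0.05):
    """LASSO feature selection followed by an SVM with three-fold CV.

    X: (n_cases, n_features) radiomics feature matrix; y: (n_cases,) labels.
    Returns the selected feature indices and the per-fold accuracies.
    """
    # LASSO drives redundant features' coefficients to exactly zero.
    lasso = Lasso(alpha=alpha).fit(StandardScaler().fit_transform(X), y)
    selected = np.flatnonzero(lasso.coef_)
    if selected.size == 0:            # guard: fall back to all features
        selected = np.arange(X.shape[1])
    # SVM on the surviving features, evaluated with three-fold cross-validation.
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
    scores = cross_val_score(clf, X[:, selected], y, cv=3)
    return selected, scores
```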
The facial alignment task has been studied extensively and has achieved significant progress in recent years. However, it remains challenging due to the ambiguity of invisible landmarks under extreme conditions (e.g., large pose and expression). This paper proposes a novel dense network, built on the coordinate regression approach, to improve facial landmark localization in extreme environments. An attention mechanism can better guide the model on which information to emphasize or suppress. In our network, we embed a Convolutional Block Attention Module (CBAM) in each of three stages of the densely connected convolutional network (DenseNet) for the face alignment task. We then concatenate the landmark location labels with the bounding box location and head pose values as guide information for our network. We demonstrate the performance of our network on the AFLW2000-3D, AFLW2000-3D-Reannotated, and Menpo-3D test datasets. Comparative experiments reveal that our network, with a lower mean NME (3.2%), outperforms the baseline DenseNet (3.66%), ShuffleNet (4.39%), and DenseNet+SE (3.93%) on AFLW2000-3D-Reannotated. We conclude that our network achieves improved performance for facial landmark prediction even in extreme conditions.
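CBAM pairs channel attention with a spatial attention branch: the latter pools across channels (average and max), mixes the two maps with a small convolution, and gates every channel by the resulting spatial map. A numpy sketch with untrained, illustrative weights (not the module's trained form):

```python
import numpy as np

def spatial_attention(feat, kernel):
    """CBAM-style spatial attention as a numpy sketch.

    feat: (C, H, W) feature map; kernel: (2, k, k) conv weights mixing the
    channel-averaged and channel-maxed maps.
    """
    pooled = np.stack([feat.mean(0), feat.max(0)])       # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    p = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))  # same-size conv
    H, W = feat.shape[1:]
    att = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            att[i, j] = (p[:, i:i + k, j:j + k] * kernel).sum()
    gate = 1.0 / (1.0 + np.exp(-att))                     # sigmoid in (0, 1)
    return feat * gate[None]                              # gate every channel
```

Because the gate depends only on where activations are strong, not which channel they are in, the branch highlights facial regions while adding a single k x k x 2 kernel of parameters.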