NIHON GAZO GAKKAISHI (Journal of the Imaging Society of Japan)
Online ISSN : 1880-4675
Print ISSN : 1344-4425
ISSN-L : 1344-4425
Volume 62, Issue 6
Displaying 1-10 of 10 articles from this issue
Regular Paper
  • Achmad Rofi IRSYAD, Kazuyoshi FUSHINOBU, Masami KADONAGA
    2023 Volume 62 Issue 6 Pages 568-577
    Published: December 10, 2023
    Released on J-STAGE: December 10, 2023
    JOURNAL FREE ACCESS

    Understanding the evaporation behavior of ink would be beneficial for realizing inkjet printers with a smaller carbon footprint. In this paper, the ink is represented by a ternary mixture of water, PG (propylene glycol), and GL (glycerol). Evaporation of droplets with a volume of 100 pL was observed. The experimental results for all mixture compositions show a similar pattern of evaporation stages. Water evaporates rapidly in the initial 1∼3 seconds, at around 6.2×10⁻³ kg/(m²·s), and the rate then decreases sharply to 3.9×10⁻⁶ kg/(m²·s) owing to interaction with the relative humidity of the surrounding environment. Deviations in the empirical calculation are corrected by reducing the water and PG fractions to account for evaporation at the nozzle tip during the idle time before ejection, and by introducing an activity coefficient of 2.5 for PG to accommodate the non-ideality of the mixture. A detailed numerical investigation is left for future research.
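
    As a rough illustration of where such an activity coefficient enters (a generic sketch under a quasi-steady, diffusion-limited evaporation assumption, not necessarily the exact model used in the paper), the partial vapor pressure of component i at the droplet surface and the corresponding evaporative mass flux can be written as

        \[
          p_{i,\mathrm{s}} = \gamma_i\, x_i\, p_i^{\mathrm{sat}}(T), \qquad
          J_i \approx \frac{D_i M_i}{R\,T\,\delta}\,\bigl(p_{i,\mathrm{s}} - p_{i,\infty}\bigr),
        \]

    where x_i is the mole fraction, γ_i the activity coefficient (γ_i = 1 for an ideal mixture; the paper takes γ ≈ 2.5 for PG), D_i the vapor diffusivity, M_i the molar mass, δ an effective diffusion length, and p_{i,∞} the ambient partial pressure (set by the relative humidity for water, essentially zero for PG and GL). In such a model, a larger γ for PG raises its surface vapor pressure and hence its predicted evaporation rate.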

    Download PDF (1485K)
Special Topic
  • Motoi IWATA, Hiroyuki ARAI, Shuichi MAEDA, Nobuyuki NAKAYAMA
    2023 Volume 62 Issue 6 Pages 578
    Published: December 10, 2023
    Released on J-STAGE: December 10, 2023
    JOURNAL RESTRICTED ACCESS
    Download PDF (22K)
  • Takehiro AOSHIMA, Takashi MATSUBARA
    2023 Volume 62 Issue 6 Pages 579-587
    Published: December 10, 2023
    Released on J-STAGE: December 10, 2023
    JOURNAL RESTRICTED ACCESS

    The creation of images and other data is one of the ultimate goals of computer vision research. For this purpose, various deep learning methods have been proposed, such as variational autoencoders, generative adversarial networks, and diffusion models. These methods learn the distributions of photographs and illustrations and reproduce them. The generated image is determined by the coordinates provided in the latent space. Therefore, several studies have manipulated these coordinates to edit the generated images. However, existing methods frequently produce unintended or low-quality editing results because, among other reasons, the coordinate system in the latent space is not properly learned. In this study, we focus on the coordinate system in the representation space and introduce deep curvilinear editing. In particular, we propose a method for editing representation vectors using a representation space with a curvilinear coordinate system. The method was also combined with generative adversarial networks, and the results demonstrate that the proposed method enables high-quality editing of generated images.
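
    To make the distinction between linear and curvilinear editing concrete, the following toy sketch (hypothetical code, not the authors' implementation) contrasts a conventional straight-line latent edit with a curvilinear edit that integrates a position-dependent direction field through the latent space:

        import numpy as np

        def linear_edit(z, d, alpha):
            """Straight-line edit: move the latent code z along a fixed direction d."""
            return z + alpha * d

        def curvilinear_edit(z, field, alpha, n_steps=100):
            """Curved edit: integrate dz/dt = field(z) with forward Euler, so the
            editing direction can change as z moves through the latent space."""
            dt = alpha / n_steps
            for _ in range(n_steps):
                z = z + dt * field(z)
            return z

        rng = np.random.default_rng(0)
        z = rng.standard_normal(8)                  # latent code of a generated image
        d = np.eye(8)[0]                            # fixed attribute direction
        field = lambda x: d + 0.1 * np.roll(x, 1)   # toy position-dependent direction
        print(linear_edit(z, d, 1.0))
        print(curvilinear_edit(z, field, 1.0))

    The toy field above is arbitrary; the point is only that a curvilinear edit is defined by a path through the space rather than by a single fixed direction.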

    Download PDF (2429K)
  • Duc Minh VO, Quoc-An LUONG, Akihiro SUGIMOTO, Hideki NAKAYAMA
    2023 Volume 62 Issue 6 Pages 588-598
    Published: December 10, 2023
    Released on J-STAGE: December 10, 2023
    JOURNAL RESTRICTED ACCESS

    In this review, we introduce a novel image captioning task, called Anticipation Captioning, which generates a caption for an unseen image given a sparsely temporally-ordered set of images. Our task emulates the human capacity to reason about the future based on a sparse collection of visual cues acquired over time. To address this novel challenge, we introduce a model, namely A-CAP, that predicts the caption by incorporating commonsense knowledge into a pre-trained vision-language model. Our method outperforms image captioning methods and provides a solid baseline for the anticipation captioning task, as shown by both qualitative and quantitative evaluations on a customized visual storytelling dataset. We also discuss the potential applications, challenges, and future directions of this novel task.

    Download PDF (2261K)
  • Yuta NAKASHIMA, Yusuke HIROTA, Yankun WU, Noa GARCIA
    2023 Volume 62 Issue 6 Pages 599-609
    Published: December 10, 2023
    Released on J-STAGE: December 10, 2023
    JOURNAL RESTRICTED ACCESS

    Vision-and-Language is now a popular research area lying between computer vision and natural language processing. Researchers have been tackling various tasks offered by dedicated datasets, such as image captioning and visual question answering, and have built a variety of models achieving state-of-the-art performance. At the same time, people are aware of the bias in these models, which can be especially harmful when the bias involves demographic attributes. This paper introduces two of our recent works presented at the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023. The first work sheds light on social bias in a large-scale, uncurated dataset, which is indispensable for training recent models. The second work presents a model-agnostic framework to mitigate gender bias for arbitrary image captioning models. This paper gives only the high-level ideas of these works; interested readers may refer to the original papers.12,16)

    Download PDF (1360K)
  • Ying JI, Yu WANG, Jien KATO
    2023 Volume 62 Issue 6 Pages 610-621
    Published: December 10, 2023
    Released on J-STAGE: December 10, 2023
    JOURNAL RESTRICTED ACCESS

    Providing explanations and interpretability for CNNs has received considerable interest in recent years. Due to the high computational cost and complexity of video data, the explanation of 3D video recognition CNNs is relatively less studied. Moreover, existing 3D explanation methods are not able to produce a high-level explanation. In this paper, we provide a comprehensive introduction to a 3D explanation model that is not only capable of producing a human-understandable, high-level explanation for 3D CNNs, but is also applicable to real-world applications. The Spatial-Temporal Concept-based Explanation (STCE) framework is composed of two steps: (1) the videos are segmented into multiple supervoxels, and similar supervoxels are clustered into high-level concepts; and (2) the interpreting framework calculates a score for each concept, with a high score indicating that the network gives the concept more attention. STCE's success in video recognition enables its application to real-world tasks, such as social relation atmosphere recognition.
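
    The two steps can be sketched as follows (a hypothetical Python simplification with a placeholder segmentation and occlusion-style scoring; the actual STCE segmentation and scoring are more involved):

        import numpy as np
        from sklearn.cluster import KMeans

        def segment_supervoxels(video, n_segments=50, seed=0):
            """Placeholder segmentation: random labels stand in for a real
            space-time supervoxel algorithm."""
            t, h, w, _ = video.shape
            return np.random.default_rng(seed).integers(0, n_segments, size=(t, h, w))

        def concept_scores(video, model, n_concepts=5):
            labels = segment_supervoxels(video)
            seg_ids = np.unique(labels)
            # Step 1: embed each supervoxel (toy feature: its mean colour) and
            # cluster similar supervoxels into high-level concepts.
            feats = np.stack([video[labels == s].mean(axis=0) for s in seg_ids])
            concepts = KMeans(n_clusters=n_concepts, n_init=10).fit_predict(feats)
            # Step 2: score each concept by how much removing its supervoxels
            # changes the recognition score (a large drop = an important concept).
            base = model(video)
            scores = []
            for c in range(n_concepts):
                masked = video.copy()
                masked[np.isin(labels, seg_ids[concepts == c])] = 0.0
                scores.append(base - model(masked))
            return scores

        # Dummy usage: a random "video" and a toy recognition score.
        video = np.random.rand(8, 32, 32, 3)
        print(concept_scores(video, model=lambda v: float(v.mean())))

    A real system would embed supervoxels with features from the recognition network rather than mean colour; the sketch only fixes the overall segment → cluster → score structure.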

    Download PDF (1975K)
  • Yusuke YOSHIYASU, Louise ALLAIN
    2023 Volume 62 Issue 6 Pages 622-632
    Published: December 10, 2023
    Released on J-STAGE: December 10, 2023
    JOURNAL RESTRICTED ACCESS

    In this review paper, we report on our model for recovering a 3D human mesh from a single monocular image, called Deformable mesh transFormer (DeFormer),1) which was published at the CVPR 2023 conference. While current state-of-the-art models achieve good performance by taking advantage of the transformer architecture to model long-range dependencies among input tokens, they suffer from a high computational cost because the standard transformer attention mechanism has complexity quadratic in the input sequence length. Therefore, we developed DeFormer, a human mesh recovery method equipped with two computationally efficient attention modules: 1) body-sparse self-attention and 2) Deformable Mesh cross-Attention (DMA). Experimental results show that DeFormer is able to efficiently leverage multi-scale feature maps and a dense mesh, which was not possible with previous transformer approaches. As a result, DeFormer achieves state-of-the-art performance on the Human3.6M and 3DPW benchmarks. Code is available at https://github.com/yusukey03012/deformer.
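
    For context on the quadratic cost mentioned above (a standard property of full attention rather than anything specific to DeFormer): with N input tokens of dimension d, standard attention forms an N×N score matrix,

        \[
          \mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d}}\right) V,
          \qquad Q, K, V \in \mathbb{R}^{N \times d},
        \]

    so time and memory scale as O(N²d). Restricting which token pairs may attend to each other, or letting each query attend to only a small set of k sampled keys, reduces the cost to roughly O(Nkd); this is the general idea behind sparse and deformable attention modules such as the two used in DeFormer.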

    Download PDF (2193K)
  • Ryo KAWAHARA, Meng-Yu Jennifer KUO, Shohei NOBUHARA
    2023 Volume 62 Issue 6 Pages 633-639
    Published: December 10, 2023
    Released on J-STAGE: December 10, 2023
    JOURNAL RESTRICTED ACCESS

    This paper proposes a practical method for microscale 3D shape capture using a teleidoscopic imaging system. The main challenge in microscale 3D shape reconstruction is to capture the target from multiple viewpoints with a sufficiently large depth of field. Our idea is to employ a teleidoscopic measurement system consisting of three planar mirrors and a monocentric lens. The planar mirrors virtually define multiple viewpoints through multiple reflections, and the monocentric lens realizes high magnification with less blur and a surround view even in close-up imaging. Our contributions include a structured ray-pixel camera model that handles refractive and reflective projection rays efficiently, an analytical evaluation of the depth of field of our teleidoscopic imaging system, and a practical calibration algorithm for it. Evaluations with real images prove the concept of our measurement system.
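
    As a brief illustration of how planar mirrors provide virtual viewpoints (a textbook reflection relation, not the paper's full ray-pixel model), a mirror plane {x : nᵀx + c = 0} with unit normal n maps a camera center o and a viewing ray direction v to

        \[
          o' = o - 2\,(n^{\top} o + c)\,n, \qquad v' = v - 2\,(n^{\top} v)\,n,
        \]

    so each reflection, and each sequence of reflections off the three mirrors, defines an additional virtual camera observing the target from a different direction; the ray-pixel model in the paper handles these reflections together with the refraction introduced by the monocentric lens.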

    Download PDF (1580K)
Imaging Highlight
  • Koromo SHIROTA
    2023 Volume 62 Issue 6 Pages 640-645
    Published: December 10, 2023
    Released on J-STAGE: December 10, 2023
    JOURNAL RESTRICTED ACCESS

    Digital textiles are a method of printing digital data (images) directly on fabric without using screens or other plates. Owing to the recent rise in global environmental awareness and the changes in the social environment surrounding consumers caused by the COVID-19 pandemic, apparel using digital textiles is becoming increasingly popular around the world. However, digital textiles are not yet well recognized in the Japanese fashion industry, and a cross-industry community had been lacking. With the aim of promoting education on and the development of digital textiles in Japan, we established the Digital Textile Technical Committee in 2017. This paper reports on the activities of this committee to date.

    Download PDF (1973K)
Lectures in Science