Reports of the Technical Conference of the Institute of Image Electronics Engineers of Japan
Online ISSN : 2758-9218
Print ISSN : 0285-3957
Displaying 1-21 of 21 articles from this issue
  • Motomasa HONGO, Naoki KITA, Takafumi SAITO
    Session ID: 24-03-01
    Published: 2025
    Released on J-STAGE: January 20, 2026
    CONFERENCE PROCEEDINGS RESTRICTED ACCESS
    Existing algorithms can automatically generate traditional interlocking puzzles, but they only allow the designer to specify the puzzle's shape and number of pieces, offering no control over the solution. In this study, we propose a method that allows designers to specify the movement direction of the pieces, enabling the creation of puzzles more aligned with the designer's vision. Previous research achieved this for two-dimensional puzzles with a frame. This paper focuses on exploring the disassembly procedure for three-dimensional interlocking puzzles. By repeatedly generating pieces to sandwich the previously created piece, the desired movement direction for the pieces is ensured.
    Download PDF (560K)
  • Minori MICHIMOTO, Takafumi SAITO
    Session ID: 24-03-02
    Published: 2025
    Released on J-STAGE: January 20, 2026
    CONFERENCE PROCEEDINGS RESTRICTED ACCESS
    Moiré animation is a method of animation expression realized by the continuous movement of interference fringes. One of the challenges in the field of moiré animation is that only a limited range of motion is realized, resulting in a narrow range of expression. In this study, we propose a method to create moiré animation using video as input. The proposed method integrates moiré animation with slit animation, which is a similar animation expression method, to achieve an animation expression that combines the advantages of both animation methods. As a result, moiré animations can be created for movements that were difficult to create using existing methods.
    Download PDF (686K)
  • Huadong ZHU, Hironobu ABE
    Session ID: 24-03-03
    Published: 2025
    Released on J-STAGE: January 20, 2026
    CONFERENCE PROCEEDINGS RESTRICTED ACCESS
    In the production of 2D character animation, it is necessary to segment the original 2D illustration into individual parts, such as eyes and hair. However, manual segmentation poses challenges due to the significant increase in processing time. This study proposes a segmentation method specifically targeting hair parts using Mask2Former, a variant of Vision Transformer (ViT). Evaluation experiments conducted with a dataset demonstrated the effectiveness of the proposed method.
    Download PDF (743K)
  • Yifeng ZHOU, Hironobu ABE
    Session ID: 24-03-04
    Published: 2025
    Released on J-STAGE: January 20, 2026
    CONFERENCE PROCEEDINGS RESTRICTED ACCESS
    Developing games with moving characters requires a large number of face images. In recent years, an increasing number of services have been using AI to automatically generate face images, but a certain number of abnormal images are found in the generated images. In this study, we report on a method for classifying normal images and abnormal images using machine learning in an automatic facial image generation system for game characters based on StyleGAN2, which can automatically generate facial images with expression variations.
    Download PDF (944K)
  • Shogo MURATA, Naoki KITA, Takafumi SAITO
    Session ID: 24-03-05
    Published: 2025
    Released on J-STAGE: January 20, 2026
    CONFERENCE PROCEEDINGS RESTRICTED ACCESS
    In this study, we developed a run-action game that dynamically generates stages using heart rate, aiming to improve UX and investigate the effect on play. Stages were generated by placing blocks in corridors and rooms, and adjusted according to parameters such as heart rate. As an experiment, four patterns were tested twice for a total of eight participants: manual placement (no adjustment), adjustment using in-game data, adjustment using heart rate, and adjustment using both. The results showed that automatic generation provided more stable enjoyment than manual placement, as scores did not vary. In addition, the study showed that there were large individual differences in heart rate changes, and that it may be more practical to combine this with other factors such as in-game data.
    Download PDF (569K)
  • Ruibo Hou, Yinhao Li, Yen-Wei Chen
    Session ID: 24-03-06
    Published: 2025
    Released on J-STAGE: January 20, 2026
    CONFERENCE PROCEEDINGS RESTRICTED ACCESS
    Pathological image classification is a crucial method in the diagnosis of cancer and inflammation. Classification accuracy has significantly improved with the advancement of deep learning. This paper systematically organizes the progression from linear classification to pre-trained vision models, and further to vision-language models (VLM) using prompt learning. We report on high-precision pathological image classification that efficiently utilizes domain knowledge through the integration of vision and language by VLM and prompt learning.
    Download PDF (282K)
  • Jihong Hu, Yinhao Li, Yen-wei Chen
    Session ID: 24-03-07
    Published: 2025
    Released on J-STAGE: January 20, 2026
    CONFERENCE PROCEEDINGS RESTRICTED ACCESS
    Accurate segmentation of medical images often requires substantial expert annotations, making the labeling process costly and time-consuming. In this study, we present a parameter-efficient fine-tuning strategy for the Segment Anything Model (SAM) to enable label-efficient interactive segmentation in medical imaging. Unlike conventional full-model fine-tuning, our approach uses a lightweight adaptation technique, the spatial prior adapter (SPA), that can be trained with only a small fraction of additional parameters. By leveraging SAM’s powerful, general-purpose segmentation capabilities and tailoring them to domain-specific characteristics through minimal parameter updates, we achieve high-quality interactive segmentation results with significantly reduced labeling effort. Experimental evaluations on two medical imaging datasets demonstrate that our method significantly enhances SAM’s adaptability to medical images. Even with a limited number of labels, it achieves superior performance compared to other specialized interactive segmentation models.
    Download PDF (494K)
  • Mao MITSUI, Sora TAKAHASHI, Sora ICHIKAWA, Keiichiro YOSHIDA, Hirokazu ...
    Session ID: 24-03-08
    Published: 2025
    Released on J-STAGE: January 20, 2026
    CONFERENCE PROCEEDINGS RESTRICTED ACCESS
    In this study, we explored the modification of a general-purpose digital camera into a near-infrared (NIR) camera to capture the skin’s surface. By capturing reflectance images of vascular regions, such as the forearm or the side of the foot, and referencing a gray card, changes in reflectance are analyzed to infer blood flow dynamics. This non-contact method has potential applications in first-response medical technologies, such as heatstroke prevention. In this experiment, we compare the proposed method with existing methods such as blood flow velocity measurement and investigate the effect of camera settings on image quality to identify optimal conditions for accurate results.
    Download PDF (442K)
  • Meng LIU, Makoto J. HIRAYAMA
    Session ID: 24-03-09
    Published: 2025
    Released on J-STAGE: January 20, 2026
    CONFERENCE PROCEEDINGS RESTRICTED ACCESS
    We created an emotionally intelligent chatbot system based on large language models (LLMs). This system can switch between multiple characters, and users can choose to interact with the appropriate character according to their needs. The use of the LangChain framework has simplified the development process, and in combination with prompt technology, the chatbot's emotional expression and understanding capabilities have been greatly improved. The system realizes character switching through different prompt designs to meet the individualized needs of users, supports multimodal inputs to enable more natural interactions, and introduces a function for generating narrative content based on photos. With these characteristics, the system promotes natural interaction between humans and computers and contributes to enhanced emotional understanding and response.
    Download PDF (762K)
  • JOHN SILAS Okello, Rahul KUMAR JAIN, Yen Wei CHEN
    Session ID: 24-03-10
    Published: 2025
    Released on J-STAGE: January 20, 2026
    CONFERENCE PROCEEDINGS RESTRICTED ACCESS
    Hepatic tumors pose significant diagnostic challenges due to their diverse biological behaviors and the inherent limitations of traditional imaging methods. These limitations often result in delayed or imprecise diagnoses, especially in resource-limited settings where advanced tools are not readily available. This study investigates the application of radiomics and Machine Learning (ML) to address these challenges, focusing on the classification of hepatic tumors and the prediction of early recurrence in Hepatocellular Carcinoma (HCC). Radiomic features were extracted from multi-phase CT images and used to train machine learning models, achieving an accuracy of 80% in tumor classification and 71% in HCC recurrence prediction. To ensure clinical relevance and usability, the models were integrated into a user-friendly web application. This integration emphasizes the feasibility of applying advanced machine learning techniques in practical, real-world medical contexts, highlighting their potential to enhance diagnostic precision, especially in resource-constrained environments.
    Download PDF (397K)
  • Shiori KAWASAKI, Takafumi SAITO
    Session ID: 24-03-11
    Published: 2025
    Released on J-STAGE: January 20, 2026
    CONFERENCE PROCEEDINGS RESTRICTED ACCESS
    We propose a method for easily obtaining the desired formations in scenes by constructing a system that can automatically extract formations from dance videos. In this research, we created ‘Formation Diagrams’ that show the dancers’ positions and, when necessary, shaped the formations so that they align symmetrically along a line. Furthermore, in experiments using real dance videos, we were able to extract the formation of all dancers in most scenes of performances with a small number of dancers.
    Download PDF (427K)
  • Ryo ISHIBASHIRI, Hideaki MAEHARA
    Session ID: 24-03-12
    Published: 2025
    Released on J-STAGE: January 20, 2026
    CONFERENCE PROCEEDINGS RESTRICTED ACCESS
    Sheep casings are used as sausage casings, and factories manage their diameter and length to improve quality. However, variations in diameter and visual instability make manual inspection difficult. Using image processing, we measured the diameter and found that smaller intestines showed greater shrinkage. The RMSE between the estimated and actual diameters was 0.95 mm, which meets the target of 1 mm, making practical application feasible.
    Download PDF (584K)
  • Yuki KOBAYASHI, Yasushi YAMAGUCHI
    Session ID: 24-03-13
    Published: 2025
    Released on J-STAGE: January 20, 2026
    CONFERENCE PROCEEDINGS RESTRICTED ACCESS
    The application of deep learning techniques to areas of responsibility requires the development of interpretable models. In this study, we focus on PIP-Net, one of the most promising interpretable image classification models, and aim to verify and improve its behavior. Observations have shown that PIP-Net has a phenomenon where most of the image domain is encompassed in a single concept (prototype) and treated as such. In this paper, we point out the problems with this phenomenon and report that we succeeded in controlling the phenomenon by improving the loss function.
    Download PDF (544K)
  • Shunya KATO, Ikuma SATO, Yasuo ISHIGURE
    Session ID: 24-03-14
    Published: 2025
    Released on J-STAGE: January 20, 2026
    CONFERENCE PROCEEDINGS RESTRICTED ACCESS
    In recent years, the number of people suffering from and at risk of suffering from lifestyle-related diseases has been increasing. Exercise is believed to be effective in preventing lifestyle-related diseases, and increasing and maintaining physical activity is thought to contribute to the prevention of lifestyle-related diseases and reduction of mortality risk. Although ICT-based assistive technologies exist for running, which is effective in maintaining and improving health and physical fitness, only a limited number of them are effective in improving and maintaining the rate of running. In this study, we investigate running motivation support technology using a virtual running partner with AR glasses, with the aim of providing more effective running motivation. In this report, we describe a method of displaying a 3D virtual partner on AR glasses while running in real space, and the details of an experiment to evaluate its effectiveness and impact.
    Download PDF (470K)
  • Tatsuki UCHIDA, Munetoshi IWAKIRI, Takumi FUJIWARA
    Session ID: 24-03-16
    Published: 2025
    Released on J-STAGE: January 20, 2026
    CONFERENCE PROCEEDINGS RESTRICTED ACCESS
    Simultaneous Localization and Mapping (SLAM) is a fundamental technology for enabling robots to autonomously navigate in unknown environments. Among the various SLAM approaches, 3D LiDAR (Light Detection and Ranging) based SLAM methods have gained significant attention for their ability to achieve high-accuracy localization and mapping in specific conditions. In the future, it will be necessary to develop methods that can handle challenging environments, such as hallways and staircases, which involve elevation changes. Upon investigating reports on existing SLAM methods, it was found that current evaluation techniques have limitations in identifying specific issues inherent in these methods. In this report, we propose a new evaluation method that focuses on analyzing the time-varying changes in six components related to the robot's position and orientation, aiming to clarify the challenges of 3D LiDAR based SLAM methods. As a demonstration, we evaluated two 3D LiDAR based SLAM methods, Fast-LIO and NV-LIOM, using the SubT-MRS dataset. The results revealed that our proposed evaluation method successfully identified challenges in each SLAM method, which were unclear with conventional evaluation approaches.
    Download PDF (795K)
  • Michiru SUGIMOTO, Munetoshi IWAKIRI, Takumi FUJIWARA
    Session ID: 24-03-17
    Published: 2025
    Released on J-STAGE: January 20, 2026
    CONFERENCE PROCEEDINGS RESTRICTED ACCESS
    During disasters such as the Noto Peninsula Earthquake in 2024, support for victims is provided based on the extent of damage to residential buildings. However, in large-scale disasters, damage assessment surveys are prolonged, and the issuance of disaster certificates is delayed. This study investigates a method to improve the accuracy of building inclination angle estimation using handheld photos and statistical techniques. By generating multiple 3D models and applying a statistical approach, the accuracy was improved, and the target error range (less than ±0.172°) was achieved even with handheld photos. This method contributes to the efficiency and labor reduction of damage assessment surveys.
    Download PDF (5618K)
  • Ryoto TSURUDA, Mutsuo SANO, [in Japanese], [in Japanese]
    Session ID: 24-03-18
    Published: 2025
    Released on J-STAGE: January 20, 2026
    CONFERENCE PROCEEDINGS RESTRICTED ACCESS
    Download PDF (737K)
  • Haruto OBA, Munetoshi IWAKIRI, Kiyoshi TANAKA
    Session ID: 24-03-19
    Published: 2025
    Released on J-STAGE: January 20, 2026
    CONFERENCE PROCEEDINGS RESTRICTED ACCESS
    As an approach to generating 3D point clouds, the use of video as input for SfM-MVS (Structure from Motion-Multi View Stereo) is being investigated. In this approach, it is important to select suitable frames for SfM-MVS from the input video, and methods using optical flow and pHash have each been proposed and shown to be effective. However, these methods do not take into account the quality of the frame itself. Therefore, if the input video contains blurred or shaky frames, point cloud reconstruction by SfM-MVS could be adversely affected. To solve this problem, this study proposes a frame selection method that aims to remove blurred/shaky frames and verifies its effectiveness through experiments.
    Download PDF (994K)
  • Kazufumi KANEDA
    Session ID: 24-03-20
    Published: 2025
    Released on J-STAGE: January 20, 2026
    CONFERENCE PROCEEDINGS RESTRICTED ACCESS
    I will take a bird’s-eye view of two major research streams in which I have been involved over the past 40 years. I started my research in the Electrical Machinery Laboratory, where research in two fields, electromagnetic field analysis and computer graphics, had been conducted. The laboratory has now become the Visual Information Science Laboratory, which conducts research on computer graphics, computer vision, machine learning, and medical image processing. Up until now, I have mainly been involved in two research projects on visualization and spectral rendering.
    Download PDF (254K)
  • Mei KODAMA
    Session ID: 24-03-21
    Published: 2025
    Released on J-STAGE: January 20, 2026
    CONFERENCE PROCEEDINGS RESTRICTED ACCESS
    In recent years, online viewing has been accelerating, and viewing patterns are changing dramatically. In particular, network stability and the control of traffic volume are important issues for video distribution systems. However, previous studies have only focused on improving the efficiency of systems for specific viewing services, and have not yet developed traffic countermeasures that fully take the effects of overall viewing transitions into account. Therefore, we have studied a method for generating viewing models based on viewing trends by age and gender, capturing users’ viewing behavior as a whole. However, since the different viewing patterns of users (TV, recording, and online) were simply added up, there were issues such as the daily viewing limit conditions of users and the combination of different viewing patterns in the current situation. This paper proposed a new method for generating a viewing model based on these viewing conditions. Based on the viewing model, this paper considered the case where online viewing time is extended in the future by shifting from the current TV viewing and recording to online viewing, predicted the resulting future online viewing traffic, and discussed the period of traffic increase.
    Download PDF (189K)
  • Ryo YAMAMOTO, Mei KODAMA
    Session ID: 24-03-22
    Published: 2025
    Released on J-STAGE: January 20, 2026
    CONFERENCE PROCEEDINGS RESTRICTED ACCESS
    In recent years, video distribution services have been rapidly expanding, and in particular, the explosion of data volume and the increase of power consumption in video distribution systems have become issues. In this study, we focus on cache server (CS)-based video distribution systems and propose a CS control method using a new CS controller to reduce power consumption. The proposed system consists of a distribution server (DS), cache servers (CSs) comprising variable-activity CSs and an always-on CS, user terminals, and the CS controller. The system allocates accesses in areas with a low number of viewers to the always-on CS and suspends the variable-activity CSs to reduce power consumption. A user viewing model was defined. Evaluation experiments showed that the proposed method can reduce power consumption by approximately 66% compared to the conventional system, and the effectiveness of the proposed method was discussed.
    Download PDF (337K)