Reports of the Technical Conference of the Institute of Image Electronics Engineers of Japan
Online ISSN : 2758-9218
Print ISSN : 0285-3957
Reports of the 308th Technical Conference of the Institute of Image Electronics Engineers of Japan
Session ID : 23-04-085

A Fundamental Study on 3D CG Image Quality Assessment in Vision & Language Based on Stable Diffusion
*Norifumi KAWABATA
CONFERENCE PROCEEDINGS RESTRICTED ACCESS

Abstract
GPT-4, a multimodal large language model, was released on March 14, 2023. GPT-4 is built on the Transformer, a machine learning model for natural language processing; a large neural network is first trained through unsupervised learning and then refined by reinforcement learning from human feedback (RLHF). Although GPT-4 is a research achievement in the field of natural language processing (NLP), its technology can be applied not only to natural language generation but also to image generation. However, the specifications of GPT-4 have not been made public, which makes it difficult to use for research purposes. In this study, we first generated an image database by adjusting parameters in Stable Diffusion, a deep learning model that generates images from text and image input. We then carried out experiments to evaluate the quality of the images in the generated database and discussed the quality assessment of the image generation model.
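As a rough illustration of what building an image database by adjusting generation parameters could look like, the following minimal sketch enumerates parameter combinations into a manifest, one record per image to be generated. The abstract does not state which parameters or values were varied, so the prompt, the guidance scales, the step counts, the seeds, and the helper `build_manifest` are all hypothetical; the Stable Diffusion call itself is deliberately left out.

```python
from itertools import product

# Hypothetical parameter ranges (assumptions for illustration only);
# the abstract does not say which parameters or values were used.
PROMPTS = ["a 3D CG rendering of a teapot"]
GUIDANCE_SCALES = [5.0, 7.5, 10.0]
INFERENCE_STEPS = [20, 50]
SEEDS = [0, 1]


def build_manifest(prompts, guidance_scales, steps, seeds):
    """Return one record per parameter combination.

    Each record describes a single image that a text-to-image model
    would be asked to generate; no image is generated here.
    """
    manifest = []
    combos = product(prompts, guidance_scales, steps, seeds)
    for idx, (prompt, scale, n_steps, seed) in enumerate(combos):
        manifest.append({
            "image_id": f"img_{idx:04d}",
            "prompt": prompt,
            "guidance_scale": scale,
            "num_inference_steps": n_steps,
            "seed": seed,
        })
    return manifest


manifest = build_manifest(PROMPTS, GUIDANCE_SCALES, INFERENCE_STEPS, SEEDS)
```

Keeping the parameters alongside each image identifier means that quality scores collected later can be related back to the settings that produced each image.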
© 2024 by The Institute of Image Electronics Engineers of Japan