Image Generation Reflecting the Meaning of Language That Expresses Attributes

渡邊 清子; カナシロ ペレイラ リズ; 小林 一郎

doi:10.11517/pjsai.JSAI2021.0_2Yin522

Abstract

Although recent text-to-image models have achieved great success in generating images from a description of an object, such as "a bird with brown and black striped wings and a yellow beak", these models may still struggle to generate images that reflect an understanding of the object's attributes. We propose a text-to-image model that better reflects the meaning of words that express an object's attributes (i.e., adjectives). More specifically, we consider the case where the vector representations of shoe images are modified according to four adjectives (sporty, comfortable, pointy, and open), and we generate images that better reflect the meaning of these adjectives.
