Transactions of Japan Society of Kansei Engineering
Online ISSN : 1884-5258
ISSN-L : 1884-0833
Original Articles
What is the Difference between Image Cognition in Humans and Computer Vision?
Hiroshi OMORI, Kazunori HANYU
JOURNAL FREE ACCESS

2024 Volume 23 Issue 2 Pages 107-117

Abstract

Computer Vision Models (CVMs) such as CNNs, Vision Transformer (ViT), and CLIP are pre-trained on very large amounts of data and achieve very high image cognition performance. In our environmental cognition research using photos, we have measured inter-photo visual similarity manually. Our previous study found that CVM-based photo similarity and visual similarity were quite similar when compared through MDS of the photos. However, that study also suggested that the difference in image cognition between humans and CVMs is related to the representations that humans hold. Here we investigated the difference between CVM-based photo similarity and visual similarity numerically and in detail, using six types of photo sets. The influence of representation could be evaluated by the cluster size on the MDS. The results showed that representation influences the cognition of shrines and temples, foods, insects, buildings, greenery, garden styles, perspective views, night views, and the symbol tree, among others.
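
To make the comparison described above more concrete, the following is a minimal sketch of one way such an analysis could be set up. It is not the authors' procedure: the abstract does not state how features were extracted, which dissimilarity measure or MDS variant was used, or how cluster size was defined. The sketch assumes cosine dissimilarity between feature vectors, classical (Torgerson) MDS, and cluster size taken as the mean distance of a group's points from their centroid. The function names and the toy feature vectors are hypothetical.

```python
import numpy as np


def cosine_dissimilarity(features):
    """Pairwise (1 - cosine similarity) between the rows of `features`."""
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    return 1.0 - unit @ unit.T


def classical_mds(dissim, n_dims=2):
    """Classical (Torgerson) MDS: embed a dissimilarity matrix in n_dims."""
    n = dissim.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared dissimilarities to obtain a Gram matrix.
    gram = -0.5 * centering @ (dissim ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the n_dims largest eigenvalues; clip small negative values to zero.
    order = np.argsort(eigvals)[::-1][:n_dims]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


def cluster_size(coords, member_idx):
    """Assumed definition: mean distance of the members from their centroid."""
    pts = coords[member_idx]
    return float(np.linalg.norm(pts - pts.mean(axis=0), axis=1).mean())


# Toy stand-in for CVM feature vectors (hypothetical values, not real data):
# photos 0-2 are nearly identical, photos 3-5 are more varied.
features = np.array([
    [1.0, 0.00, 0.00],
    [1.0, 0.01, 0.00],
    [1.0, 0.00, 0.01],
    [0.0, 1.00, 0.00],
    [0.0, 0.60, 0.80],
    [0.0, 0.00, 1.00],
])

dissim = cosine_dissimilarity(features)
coords = classical_mds(dissim, n_dims=2)
tight = cluster_size(coords, [0, 1, 2])
loose = cluster_size(coords, [3, 4, 5])
print("cluster size, photos 0-2:", tight)
print("cluster size, photos 3-5:", loose)
```

Under these assumptions, the same three functions could be applied to a matrix of human-judged dissimilarities in place of `dissim`, and the cluster sizes of the same photo group compared between the two MDS configurations.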

© 2023 Japan Society of Kansei Engineering