A Study of Generative Automatic Data Augmentation Based on Text-Driven Attribute Manipulation

関根 理敏; 新原 敦介; 明神 智之; 今谷 恵理

doi:10.11517/pjsai.JSAI2024.0_2L6OS19b01

Abstract

Unlike conventional software, AI software is developed inductively from training data, so preparing high-quality training data is crucial. Conventional automatic data augmentation methods mainly augment data either by directly manipulating the original data, for example through rotation and cropping, or by manipulating latent variables that correspond to the original data. These methods do not optimize augmentation by manipulating the various interpretable attributes contained in the dataset. In this paper, we propose an automatic data augmentation method that represents the attribute values of the original dataset as text and generates new data by manipulating those attribute values, so that the data are sufficient in quantity and provide coverage for each attribute value. The proposed method optimizes data augmentation by learning how to manipulate textual attributes so as to maximize both the classification accuracy for each attribute value and the naturalness of the textual data. This approach is expected to improve the overall quality of the dataset. We plan to implement and evaluate the proposed method to verify its effectiveness.
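The idea of expressing attribute values as text and steering augmentation toward under-covered values can be illustrated with a minimal sketch. It is not the proposed method: the abstract describes learning the manipulations by optimizing classification accuracy and naturalness, whereas this sketch substitutes a fixed count threshold. The attribute schema (`weather`, `time`), the helper names, and `min_count` are all hypothetical, and the generative model that would consume the text descriptions is omitted.

```python
from collections import Counter
from itertools import product

# Hypothetical attribute schema; the abstract does not name concrete attributes.
ATTRIBUTES = {
    "weather": ["sunny", "rainy"],
    "time": ["day", "night"],
}


def to_text(sample):
    """Render one sample's attribute values as a text description."""
    return ", ".join(f"{name}: {sample[name]}" for name in ATTRIBUTES)


def underrepresented(dataset, min_count):
    """Return attribute-value combinations seen fewer than min_count times."""
    counts = Counter(tuple(s[name] for name in ATTRIBUTES) for s in dataset)
    combos = product(*ATTRIBUTES.values())
    return [
        dict(zip(ATTRIBUTES, combo))
        for combo in combos
        if counts[combo] < min_count
    ]


def augmentation_prompts(dataset, min_count=2):
    """Text descriptions for combinations that need more data.

    A fixed threshold stands in for the learned manipulation policy
    (assumption made for illustration only).
    """
    return [to_text(target) for target in underrepresented(dataset, min_count)]


dataset = [
    {"weather": "sunny", "time": "day"},
    {"weather": "sunny", "time": "day"},
    {"weather": "rainy", "time": "day"},
]
prompts = augmentation_prompts(dataset)
for prompt in prompts:
    print(prompt)
```

In this toy dataset only the sunny daytime combination reaches the threshold, so the remaining three combinations are emitted as text descriptions that a text-conditioned generator could take as input.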
