The AI Alignment Language Align: Toward Building an AI Alignment Paradigm

田森 佳秀; 吉澤 駿; 茂木 健一郎

doi:10.11517/pjsai.JSAI2024.0_4M1GS1004

Abstract

AI alignment is a field of research that aims to make AI operate in accordance with human ethics, values and goals. We are developing a programming language, an 'alignment language', for designing AI that operates according to specified goals and ethics. The language provides explicit rules and structures for aligning an AI's behaviour and decision criteria with human ethics and goals. AI developers can use it to define an AI's goals and behaviours clearly and to minimise the risk of the AI acting contrary to human intentions. The language can also be used to design prompts that let an AI acquire the ability to adapt to its environment and situation. We are currently designing and implementing the language and face several open challenges: how to incorporate the diversity of human ethics and values into the AI, how flexible the AI's decision criteria should be, and how the AI should respond to unknown situations. The talk will present and discuss the structure of an alignment language for designing alignments that address these issues.
