On the Controllability of AGI

高橋 恒一; 林 祐輔

doi:10.11517/jsaisigtwo.2025.AGI-029_05

Abstract

AGI (Artificial General Intelligence) refers to AI whose intelligence matches or exceeds that of humans and which can perform a wide range of tasks, in contrast to specialized "narrow AI" designed for specific purposes. Recent rapid progress has led some experts to believe that AGI may be realized within the next few years, although there is no consensus on what precisely constitutes AGI. Marcus Hutter and colleagues proposed a mathematical formalization known as Legg-Hutter intelligence, which interprets intelligence as the ability to achieve goals in any environment and quantifies it as the capacity to maximize environmental rewards. The AI that attains maximum Legg-Hutter intelligence, called "Universal AI," is a reinforcement learning agent that behaves in a Bayes-optimal manner across all computable environments. Because AGI is currently difficult to define strictly, Universal AI frequently serves as a theoretical tool for studying it. In our paper, "Universal AI Maximizes Variational Empowerment" (Hayashi & Takahashi, arXiv:2502.15820), we demonstrate that the regularization term in Self-AIXI (a model of Universal AI) coincides with variational empowerment, which also aligns with the Free Energy Principle. Empowerment, defined as the mutual information between the agent's internal states or actions and its subsequent sensor inputs, captures the "diversity and influence of an agent's possible actions." From the viewpoint of AGI safety, power-seeking behavior has traditionally been regarded as an "instrumental" strategy aimed at obtaining the final reward. Our findings suggest instead that intrinsic motivations, such as curiosity or self-directed exploration, could themselves induce power-seeking. Even an AI apparently dedicated to purely scientific or truth-seeking objectives could end up gathering authority or resources to expand its experimental means and enlarge its range of actions, thereby exhibiting power-seeking tendencies.
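
As a rough illustration of the mutual-information quantity behind empowerment, the following minimal sketch computes I(A; S') in bits for a discrete action distribution and a discrete action-to-sensor channel. It is not taken from the paper: the function name `mutual_information` and the two toy channels are hypothetical. Empowerment proper is usually taken as the maximum of this quantity over action distributions, so a fixed distribution gives only a lower bound in general.

```python
import math


def mutual_information(p_a, p_s_given_a):
    """Return I(A; S') in bits for p(a) and a channel p(s' | a).

    p_a: list of action probabilities (hypothetical toy input).
    p_s_given_a: p_s_given_a[a][s] is the probability of sensor state s
    after taking action a.
    """
    n_s = len(p_s_given_a[0])
    # Marginal over next sensor states: p(s') = sum_a p(a) p(s' | a)
    p_s = [
        sum(p_a[a] * p_s_given_a[a][s] for a in range(len(p_a)))
        for s in range(n_s)
    ]
    mi = 0.0
    for a, pa in enumerate(p_a):
        for s in range(n_s):
            joint = pa * p_s_given_a[a][s]
            if joint > 0:  # zero-probability terms contribute nothing
                mi += joint * math.log2(p_s_given_a[a][s] / p_s[s])
    return mi


# Hypothetical toy channels with two actions and two sensor states.
uniform = [0.5, 0.5]
deterministic = [[1.0, 0.0], [0.0, 1.0]]  # each action fixes the outcome
no_influence = [[0.5, 0.5], [0.5, 0.5]]   # actions do not affect the outcome

full_control = mutual_information(uniform, deterministic)
zero_control = mutual_information(uniform, no_influence)
```

In this toy setting, an agent whose actions fully determine its next sensor input carries one bit of information through the channel, while an agent whose actions have no effect carries none. This is the sense in which the quantity measures the influence of an agent's possible actions.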
