JSAI Technical Report, Type 2 SIG
Online ISSN : 2436-5556
On the Controllability of AGI
Koichi TAKAHASHI, Yusuke HAYASHI
RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

2025 Volume 2025 Issue AGI-029 Pages 05-

Abstract

AGI (Artificial General Intelligence) refers to an AI that possesses intelligence on par with or exceeding that of humans and can perform a wide range of tasks, contrasting with the specialized "narrow AI" designed for specific purposes. Recent rapid progress has led some experts to believe AGI may be realized within the next few years, although there is no universal consensus on what precisely constitutes AGI. Marcus Hutter and colleagues proposed the mathematical formalization known as Legg-Hutter intelligence, which interprets intelligence as the ability to achieve goals in any environment and quantifies it as the capacity to maximize environmental rewards. The AI that attains maximum Legg-Hutter intelligence, called "Universal AI," is a reinforcement learning agent behaving in a Bayes-optimal manner across all computable environments. Given the current difficulty of strictly defining AGI, Universal AI frequently serves as a theoretical tool for its study. In our paper, "Universal AI Maximizes Variational Empowerment" (Hayashi & Takahashi, arXiv:2502.15820), we demonstrate that the regularization term in Self-AIXI (a model of Universal AI) coincides with variational empowerment, which also aligns with the Free Energy Principle. Empowerment, defined as the mutual information between the agent's internal states or actions and its subsequent sensor inputs, captures the "diversity and influence of an agent's possible actions." Traditionally, from the viewpoint of AGI safety, power-seeking behavior has been regarded as an "instrumental" strategy aimed at obtaining the final reward, but our findings newly suggest that intrinsic motivations, such as curiosity or self-directed exploration, could themselves induce power-seeking. Even an AI apparently dedicated to pure scientific or truth-seeking objectives could end up gathering authority or resources to expand its experimental means and enhance its range of actions, thus exhibiting power-seeking tendencies.
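As an illustration of the empowerment quantity described in the abstract, the following minimal Python sketch computes the mutual information I(A; S') between an agent's actions and its next sensor input in a small discrete setting. It is not taken from the paper: the function name `mutual_information`, the two toy transition tables, and the fixed uniform action distribution are all assumptions made for illustration. Empowerment is commonly defined as the maximum of this quantity over action distributions, and variational empowerment as a tractable lower bound on it; the sketch only evaluates the mutual information for one fixed action distribution.

```python
import math


def mutual_information(p_a, p_s_given_a):
    """Return I(A; S') in bits for a discrete action distribution p_a
    and a transition table p_s_given_a[a][s] = p(s' = s | a)."""
    n_s = len(p_s_given_a[0])
    # Marginal over next sensor states: p(s') = sum_a p(a) * p(s' | a)
    p_s = [
        sum(p_a[a] * p_s_given_a[a][s] for a in range(len(p_a)))
        for s in range(n_s)
    ]
    mi = 0.0
    for a, pa in enumerate(p_a):
        for s in range(n_s):
            joint = pa * p_s_given_a[a][s]
            if joint > 0:  # zero-probability terms contribute nothing
                mi += joint * math.log2(p_s_given_a[a][s] / p_s[s])
    return mi


# Hypothetical toy worlds with two actions and two sensor states.
deterministic = [[1.0, 0.0], [0.0, 1.0]]  # each action fixes the next state
no_influence = [[0.5, 0.5], [0.5, 0.5]]   # actions do not affect the next state
uniform = [0.5, 0.5]                      # assumed fixed action distribution

print(mutual_information(uniform, deterministic))  # 1.0
print(mutual_information(uniform, no_influence))   # 0.0
```

In the first toy world the agent's choice fully determines what it senses next, giving one bit; in the second its actions have no effect, giving zero. This matches the reading of empowerment as the "diversity and influence of an agent's possible actions."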

© 2025 Authors