An artificial brain (OS) that imitates the behavior of the human brain has a thinking algorithm. The intent of the artificial brain (OS) takes part in the thinking process and can reflect its will in the result of thinking. The goals for the artificial brain (OS) to reach are: (1) voluntary and self-directed speech and motion; (2) understanding of a field of view and of conversation, so that an action taken after seeing or hearing, based on the result of thinking, can be judged appropriate; (3) acquiring skills and knowledge through image training, such that even when the image training is rough in its details, the artificial brain (OS) can manage to use those skills and knowledge in unexpected scenes; and (4) standardization of the artificial brain (OS).
In reinforcement learning, it is difficult to solve high-dimensional multi-objective decision problems. In this paper, we propose a novel hierarchical deep reinforcement learning architecture with an Accumulator-Based Arbitration Model (ABAM). To solve high-dimensional multi-objective decision problems, the architecture organizes groups of deep Q-network (DQN) modules into layers. ABAM arbitrates the outputs of the modules in one layer using the outputs of the modules in the layer above, which has a more abstract objective. With this architecture, the agent can solve a difficult maze task that a simple DQN cannot solve.
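The abstract does not specify ABAM's internals, but the name suggests an accumulator race: evidence for each action, weighted by the more abstract upper layer, is integrated until one action crosses a threshold. The following is a minimal sketch under that assumption; the function name, gating scheme, and constants are all illustrative, not the paper's specification.

```python
import numpy as np

def abam_arbitrate(lower_q, upper_weights, threshold=1.0, step=0.1, max_steps=1000):
    """Accumulator-based arbitration (sketch, not the paper's algorithm).

    lower_q:       (n_modules, n_actions) Q-values from lower-layer DQN modules.
    upper_weights: (n_modules,) gating signal from the more abstract upper layer.

    Evidence for each action accumulates at a rate given by the gated,
    normalized Q-values; the first action whose accumulator crosses the
    threshold is selected.
    """
    drift = upper_weights @ lower_q                 # gated evidence per action
    drift = drift / (np.abs(drift).max() + 1e-8)    # normalize the drift rates
    acc = np.zeros(lower_q.shape[1])
    for _ in range(max_steps):
        acc += step * drift
        if acc.max() >= threshold:
            break
    return int(acc.argmax())
```

For example, if the upper layer strongly weights a module that prefers action 0, that action's accumulator wins the race even when another module mildly prefers action 1.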
Dennett identifies three stances by which people understand an object: the physical stance, the design stance, and the intentional stance. In human-robot interaction, it has been claimed that communication is smooth when a person takes the intentional stance toward a robot. In this research, we hypothesized that showing users the experiences through which a robot's behavioral principles were formed contributes to their adoption of the intentional stance, and we verified this hypothesis through experiments.
The speaker is developing AGI agents based on the assumption that mature AGI can be achieved by combining "the most general AI, efficiency aside" for generality with "incremental learning" that specializes for efficiency. He applied his AGI agent to Round 1 of the General AI Challenge held in 2017 and received the joint 2nd-place qualitative prize. In this talk, he explains how the AGI agent is implemented, the devices used and difficulties encountered when applying it to Challenge Round 1, and some thoughts on the General AI Challenge series.
Humans can set suitable subgoals in order to achieve a purpose and, if needed, can set sub-subgoals recursively; the depth of this recursion appears to be unlimited. Inspired by this behavior, we have designed a new hierarchical reinforcement learning architecture, the RGoal architecture. The algorithm solves an MDP on an augmented state-action space. Thanks to value-function decomposition, the action-value function becomes shareable across tasks, and this sharing accelerates learning in multi-task settings. A mechanism named "think-mode", a kind of model-based reinforcement learning, combines learned simple tasks to solve inexperienced complicated tasks quickly, in some cases in zero shot. The algorithm is realized with a flat table and the repetition of simple operations, without a stack. In future work, we will extend this architecture and build a model of the information-processing mechanism of the prefrontal cortex in the brain.
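The key idea of a flat, shareable table over an augmented state-action space can be sketched as a Q-table indexed by (state, subgoal, action): values learned while pursuing a subgoal in one task are directly reusable by any other task that sets the same subgoal. This is a minimal tabular sketch under that reading; the class name, update rule details, and environment interface are illustrative assumptions, not the RGoal paper's specification.

```python
from collections import defaultdict

class SubgoalQTable:
    """Flat Q-table on the augmented (state, subgoal, action) space (sketch)."""

    def __init__(self, actions, alpha=0.5, gamma=0.9):
        self.q = defaultdict(float)   # (state, subgoal, action) -> value
        self.actions = actions
        self.alpha = alpha            # learning rate
        self.gamma = gamma            # discount factor

    def update(self, s, g, a, r, s2, done):
        # One-step Q-learning, conditioned on the current subgoal g.
        target = r if done else r + self.gamma * max(
            self.q[(s2, g, a2)] for a2 in self.actions)
        self.q[(s, g, a)] += self.alpha * (target - self.q[(s, g, a)])

    def greedy(self, s, g):
        # Greedy action for state s while pursuing subgoal g.
        return max(self.actions, key=lambda a: self.q[(s, g, a)])
```

Because the table is keyed by subgoal rather than by task, two tasks that decompose into the same subgoal share every entry learned for it, which is one plausible source of the multi-task speed-up the abstract describes.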
A Bayesian network is a promising model of the cerebral cortex. However, in ordinary Bayesian networks the number of parameters in each conditional probability table (CPT) grows exponentially with the number of parent nodes, which prohibits employing large-scale Bayesian networks. Restricting CPTs is one approach to scaling up Bayesian networks. In this paper, we restrict CPTs to noisy-OR and noisy-AND gates, both of which have O(N) parameters for N parent nodes. To investigate the representational power of this Bayesian network, we construct a network that pools the features of the input data, mimicking the early visual cortex. In this model, a gating mechanism is realized by the noisy-AND gates. We show that the model can acquire translation-invariant responses by the standard gradient ascent method.
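The O(N) parameter count comes from the gates' factorized form: a noisy-OR child is on unless every active parent's cause independently fails, so the CPT is determined by one weight per parent (plus an optional leak) instead of 2^N entries. A small sketch of the standard noisy-OR, with a noisy-AND in one common parameterization (the paper's exact noisy-AND form is not given here, so that part is an assumption):

```python
import numpy as np

def noisy_or(parents, weights, leak=0.0):
    """P(child=1 | parents) for a noisy-OR CPT.

    parents: 0/1 array of parent states; weights: per-parent causal
    probabilities. Only N parameters, versus 2^N for a full CPT.
    """
    prob_all_fail = (1.0 - leak) * np.prod(1.0 - weights * parents)
    return float(1.0 - prob_all_fail)

def noisy_and(parents, weights):
    """P(child=1 | parents) for one common noisy-AND parameterization
    (an assumption here): each parent that is OFF independently vetoes
    the child with its weight, so the child stays on only if no veto fires.
    """
    return float(np.prod(1.0 - weights * (1 - parents)))
```

With two active parents of weight 0.5, noisy-OR gives 1 - 0.5 * 0.5 = 0.75; the noisy-AND above returns 1.0 when all parents are on, which is the gating behavior (an inactive gate parent suppresses the output) that the pooling model exploits.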
Reinforcement learning that combines deep learning with Monte Carlo tree search, as used in AlphaZero and related systems, has been reported to be extremely effective and widely applicable to various games. Since this method is essentially an algorithm that solves search problems efficiently, it can address general combinatorial optimization problems as well as games. Therefore, in order to deepen the understanding of this method, we applied it experimentally to combinatorial optimization problems, and the results are reported. The relationship between this method and the frame problem is also described.
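To illustrate the point that tree search carries over from games to combinatorial optimization, here is a minimal plain-UCT sketch (no neural network, unlike AlphaZero) applied to a toy 0/1 knapsack instance. The problem instance, constants, and extraction strategy are all illustrative assumptions.

```python
import math
import random

# Toy knapsack: pick a 0/1 assignment over three items to maximize value
# subject to a weight budget. Optimum: items 2 and 3 (value 22, weight 5).
VALUES = [6, 10, 12]
WEIGHTS = [1, 2, 3]
BUDGET = 5

def reward(bits):
    w = sum(wi for wi, b in zip(WEIGHTS, bits) if b)
    v = sum(vi for vi, b in zip(VALUES, bits) if b)
    return v if w <= BUDGET else 0   # infeasible assignments score 0

class Node:
    def __init__(self, bits):
        self.bits = bits        # partial 0/1 assignment (a "move history")
        self.children = {}      # action (0 or 1) -> Node
        self.n = 0              # visit count
        self.total = 0.0        # accumulated rollout reward

def uct_search(iters=500, c=20.0):
    root = Node([])
    best_bits, best_val = None, -1
    for _ in range(iters):
        node, path = root, [root]
        # Selection + expansion down to a leaf or unexpanded node.
        while len(node.bits) < len(VALUES):
            if len(node.children) < 2:
                a = 0 if 0 not in node.children else 1
                node.children[a] = Node(node.bits + [a])
                node = node.children[a]
                path.append(node)
                break
            parent = node
            node = max(parent.children.values(),
                       key=lambda ch: ch.total / ch.n
                       + c * math.sqrt(math.log(parent.n) / ch.n))
            path.append(node)
        # Random rollout to a complete assignment, then backpropagate.
        full = node.bits + [random.randint(0, 1)
                            for _ in range(len(VALUES) - len(node.bits))]
        r = reward(full)
        if r > best_val:
            best_val, best_bits = r, full
        for v in path:
            v.n += 1
            v.total += r
    return best_bits, best_val
```

The game-playing loop is unchanged; only the "position" (a partial assignment) and the terminal evaluation (the objective value) are swapped in, which is exactly the sense in which the method generalizes beyond games.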