JSAI Technical Report, Type 2 SIG

Embedding Cognitive Map in Neural Episodic Control

Shoya MATSUMORI, Takuma SENO, Toshiki KIKUCHI, Yusuke TAKIMOTO, Masahi ...

Article type: SIG paper
2017 Volume 2017 Issue AGI-007 Pages 01-
Published: November 21, 2017
Released on J-STAGE: September 16, 2021

DOI: https://doi.org/10.11517/jsaisigtwo.2017.AGI-007_01

RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

In reinforcement learning, environments with sparse reward signals are particularly difficult to model. In particular, learning actions in a 3D environment from a first-person view is regarded as a partially observable Markov decision process (POMDP), which potentially enlarges the state space. Large environments with sparse rewards therefore need a learning process that is efficient in a large state space. In this paper, we propose a deep reinforcement learning method that uses the memory module proposed in Neural Episodic Control and adds cognitive information to that memory module to improve performance.
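
For orientation, the memory module of Neural Episodic Control estimates an action value as a kernel-weighted average over the stored keys nearest to the current embedding. The sketch below shows only that lookup in plain Python. The names `dnd_lookup`, `sq_dist`, `p`, and `delta` are illustrative assumptions, and the cognitive information that this paper adds to the memory is not described in the abstract, so it is not modeled here.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def dnd_lookup(memory, query, p=2, delta=1e-3):
    """Kernel-weighted average of the values of the p keys nearest to query.

    memory is a list of (key, value) pairs. Each neighbour is weighted by
    the inverse-distance kernel 1 / (squared distance + delta).
    """
    neighbours = sorted(memory, key=lambda kv: sq_dist(kv[0], query))[:p]
    weights = [1.0 / (sq_dist(key, query) + delta) for key, _ in neighbours]
    total = sum(weights)
    return sum(w * value for w, (_, value) in zip(weights, neighbours)) / total


# Hypothetical memory: embeddings paired with value estimates.
memory = [([0.0, 0.0], 1.0), ([1.0, 0.0], 3.0), ([5.0, 5.0], 100.0)]
estimate = dnd_lookup(memory, [0.5, 0.0], p=2)
```

Because the weights fall off with distance, a query that sits close to one stored key returns almost exactly that key's value, while a query between keys returns a blend.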

Experiments on Motion Learning of Humanoid Robot with Reinforcement Learning by Policy Optimization

Satoshi HIKIDA

Article type: SIG paper
2017 Volume 2017 Issue AGI-007 Pages 02-
Published: November 23, 2017
Released on J-STAGE: September 16, 2021

DOI: https://doi.org/10.11517/jsaisigtwo.2017.AGI-007_02

RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

Experiments on reinforcement learning were conducted on OpenAI Gym games and on robot simulators using the method of "Proximal Policy Optimization Algorithms", which is considered suitable for motion learning of humanoid robots. The results confirmed that reinforcement learning is possible with the implementation of the algorithm published by OpenAI. Moreover, an experiment with a real robot confirmed that motions learned on the robot simulator can be executed on the real robot.

Agile development for IoT ecosystem with reference to General AI Challenge

Kaisei REIO

Article type: SIG paper
2017 Volume 2017 Issue AGI-007 Pages 03-
Published: November 23, 2017
Released on J-STAGE: September 16, 2021

DOI: https://doi.org/10.11517/jsaisigtwo.2017.AGI-007_03

RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

I propose a new framework for cooperation between AGI research and IoT/AI developers for an IoT ecosystem. The framework combines state-of-the-art technologies such as the Semantic Web of Things, machine learning platforms, and a cognition model, with reference to the GoodAI AGI roadmap. I also propose a way to monetize the IoT ecosystem.

[title in Japanese]

[in Japanese]

Article type: SIG paper
2017 Volume 2017 Issue AGI-007 Pages 04-
Published: November 23, 2017
Released on J-STAGE: September 16, 2021

DOI: https://doi.org/10.11517/jsaisigtwo.2017.AGI-007_04

RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

[in Japanese]

A Selection Method of Data Adaptive Learner from Multiple Deep Learners Using Bayesian Networks

Shusuke KOBAYASHI, Susumu SHIRAYAMA

Article type: SIG paper
2017 Volume 2017 Issue AGI-007 Pages 05-
Published: November 23, 2017
Released on J-STAGE: September 16, 2021

DOI: https://doi.org/10.11517/jsaisigtwo.2017.AGI-007_05

RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

This paper proposes a new method of time series prediction using multiple deep learners and a Bayesian network. We first suggest two approaches. In the first, the explanatory variables of the input data are nodes of a Bayesian network and are associated with learners. In the second, the outputs of all the learners are made nodes of the Bayesian network and the outputs are integrated. This paper describes the first approach in detail. The training data is divided into clusters with K-means clustering, and the deep learners are trained on their respective clusters. A Bayesian network is used to determine which deep learner is in charge of predicting a time series. The proposed method is applied to financial time series data, and prediction results for the return of the Nikkei 225 are presented.
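
The cluster-then-select pipeline described above can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the authors' method: nearest-centroid routing stands in for the Bayesian network selector, and a per-cluster mean stands in for each deep learner. All function names (`kmeans`, `nearest`, `fit_experts`, `predict`) are hypothetical.

```python
def nearest(centroids, point):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(centroids[j], point)),
    )


def kmeans(points, k, iters=20):
    """Lloyd's algorithm; centroids are initialised to the first k points."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centroids, p)].append(p)
        for j, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


def fit_experts(xs, ys, k):
    """Cluster the inputs, then fit one stand-in learner (a mean) per cluster."""
    centroids = kmeans(xs, k)
    sums, counts = [0.0] * k, [0] * k
    for x, y in zip(xs, ys):
        j = nearest(centroids, x)
        sums[j] += y
        counts[j] += 1
    experts = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    return centroids, experts


def predict(centroids, experts, x):
    """Stand-in selector: route x to the learner of its nearest cluster."""
    return experts[nearest(centroids, x)]
```

In this sketch the selector only looks at distance to the cluster centres. A Bayesian network over the explanatory variables, as in the abstract, would replace `nearest` inside `predict`.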

Code Division Multiple Acquisition

Katsuto SATO, Hiroaki AKUTSU, Yuki KONDOH

Article type: SIG paper
2017 Volume 2017 Issue AGI-007 Pages 06-
Published: November 23, 2017
Released on J-STAGE: September 16, 2021

DOI: https://doi.org/10.11517/jsaisigtwo.2017.AGI-007_06

RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

Backpropagation is widely used for deep learning; however, it requires a white-box cost function that is explicitly formulated and differentiable. It is difficult for non-experts to build a model for a problem for which no effective cost function is known. In this report, we propose a gradient estimation method based on code-division multiplexing that can calculate the gradients of the weights in a neural network by using multiple forward propagations. The proposed method enables machine learning for problems with black-box cost functions that cannot be formulated but whose cost value can be calculated. The proposed method is evaluated on the MNIST problem. The evaluation results show that the proposed method can build a model that recognizes MNIST digits, and that the appropriate spreading-code length is small in the early phase of training and large in the final phase.
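
One way to read the idea of estimating gradients from forward evaluations alone is the following minimal sketch. It is an assumption about the general technique, not the authors' algorithm: each weight is given its own orthogonal ±1 spreading code (a row of a Hadamard matrix), all weights are perturbed at once on each code chip, and the black-box cost values are correlated with each code to separate the per-weight gradients. The names `hadamard`, `estimate_gradient`, and `eps` are hypothetical.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def estimate_gradient(cost, weights, eps=1e-3):
    """Estimate d(cost)/d(weights) using only forward evaluations of cost.

    The code length n is the smallest power of two that leaves one
    non-constant Hadamard row per weight; cost is evaluated n times.
    """
    n = 1
    while n < len(weights) + 1:
        n *= 2
    codes = hadamard(n)[1:len(weights) + 1]  # skip the all-ones row
    grad = [0.0] * len(weights)
    for t in range(n):  # one forward evaluation per code chip
        perturbed = [w + eps * code[t] for w, code in zip(weights, codes)]
        c = cost(perturbed)
        for i, code in enumerate(codes):
            grad[i] += c * code[t]  # correlate the cost with each code
    return [g / (n * eps) for g in grad]


# Hypothetical black-box cost: only its value is observable.
def black_box_cost(w):
    return sum((a - 1.0) ** 2 for a in w)


gradient = estimate_gradient(black_box_cost, [0.5, -1.0, 2.0])
```

Because the codes are mutually orthogonal and each sums to zero, the constant term and the other weights' first-order contributions cancel in the correlation. Second-order cross terms do not cancel in general, so for a non-separable cost the result is an approximation whose error depends on `eps`.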

Learning machine that uses context structure to search policy

Seisuke YANAGAWA

Article type: SIG paper
2017 Volume 2017 Issue AGI-007 Pages 07-
Published: November 23, 2017
Released on J-STAGE: September 16, 2021

DOI: https://doi.org/10.11517/jsaisigtwo.2017.AGI-007_07

RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

A system that transitions from an initial state to a target state is assumed. The process of state transition is represented by time series data. Unlike a computer program, the time series data is not given to the system but is acquired by trial and error. To combine and search time series data, the context structure inherent in the time series data is used. For example, even if the details of the time series data leading to the target state cannot be determined at search time, the time series data immediately before reaching the target state and the time series data indicating the movement from the initial state can be linked at the upper level of the context. In other words, if there is an overlap in the tree structure, it becomes a search candidate. It has previously been reported that a hierarchical structure is inherent in time series data and that the basic sequences making up the time series data can naturally correspond to activation areas in a neural network.

Theory of artificial intelligence

[in Japanese]

Article type: SIG paper
2017 Volume 2017 Issue AGI-007 Pages 08-
Published: November 23, 2017
Released on J-STAGE: September 16, 2021

DOI: https://doi.org/10.11517/jsaisigtwo.2017.AGI-007_08

RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

Artificial intelligence is expected to be the next form of the computer. This paper discusses a theory of artificial intelligence. The theory is based on the foundations of mathematics, and thus on the necessary and sufficient conditions of intelligence, ethics, and safety.
