We propose a method that enables a mobile robot to acquire purposive behavior for shooting a ball into a goal by using Q-learning, a reinforcement learning method. The mobile robot (the agent) does not need to know any parameters of the 3-D environment or of its own kinematics/dynamics. The only information it receives about changes in the environment is the image captured by a single TV camera mounted on the robot. The image positions of the ball and the goal serve as the state variables that reflect the effect of each action the robot takes during learning. After the learning process, the robot can carry the ball near the goal and shoot it. Computer simulation is used not only to check the validity of the method but also to shorten learning time on the real robot. The real robot succeeded in shooting the ball into the goal using the learned policy transferred to it.
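The Q-learning update underlying this kind of method can be sketched in a few lines. This is a generic tabular sketch, not the paper's implementation; the state tuples, action names, and learning constants below are illustrative assumptions.

```python
import random

# Tabular Q-learning sketch. ALPHA/GAMMA/EPSILON and the action set
# are illustrative assumptions, not values from the paper.
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2
ACTIONS = ["forward", "turn_left", "turn_right"]

def q_update(Q, s, a, r, s_next):
    """One step: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(Q.get((s_next, a2), 0.0) for a2 in ACTIONS)
    old = Q.get((s, a), 0.0)
    Q[(s, a)] = old + ALPHA * (r + GAMMA * best_next - old)
    return Q[(s, a)]

def epsilon_greedy(Q, s):
    """Explore with probability EPSILON, otherwise exploit the table."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q.get((s, a), 0.0))
```

The state here would be a discretized pair of image positions (ball, goal), matching the abstract's idea that only image coordinates, not 3-D parameters, drive learning.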
This paper describes experimental results based on our previously proposed scheme: learning of sensory-based, goal-directed navigation. The central idea is that learning a task-based behavior can be formulated as an embedding problem of dynamical systems: desired trajectories in a task space should be embedded into an adequate sensory-based internal state space so that a unique mapping from the internal state space to the motor command can be established. In this study, two types of neural architecture were tested for constructing such an internal state space from sensory information. A simple homing task was conducted with a feedforward neural network fed with a time-delayed regression of the sensory sequence, and a complex cyclic routing task was conducted with a recurrent neural network (RNN). In both cases, sufficient supervised training generated a rigid internal structure of attractor dynamics that realized the desired navigation robustly, even under limited environmental changes and various noise in the real world. It was also confirmed that the RNN architecture adapts more flexibly to various tasks.
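The recurrent internal state the abstract refers to is maintained by an update of the form used in Elman-style networks. A minimal sketch, assuming plain-list vectors and illustrative weights (not the paper's trained network):

```python
import math

# Elman-style RNN state update: h' = tanh(W_in @ x + W_rec @ h).
# W_in, W_rec, and the dimensions are illustrative assumptions;
# sensory input x updates the internal state h at every step.
def rnn_step(W_in, W_rec, h, x):
    n = len(h)
    return [math.tanh(sum(W_in[i][j] * x[j] for j in range(len(x)))
                      + sum(W_rec[i][j] * h[j] for j in range(n)))
            for i in range(n)]
```

Because h feeds back through W_rec, trained weights can form the attractor dynamics that make the navigation robust to noise, as the abstract reports.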
For learning behaviors of real robotic systems, we propose a method that extends reinforcement learning to the domain of continuous robot control problems. Reinforcement learning methods are applicable to a variety of fields; however, they generally handle only discrete states and actions, and they sometimes require an impractical amount of time to learn practical problems. In the proposed method, CMAC, a connectionist model, is employed for the utility networks in order to represent continuous states and control variables. To fully utilize experience and accelerate learning, experience sequences are stored during actual actuation and replayed afterward with priorities. As a testbed, the learning system is applied in simulation to the control of the swing amplitude of a two-link brachiation robot, which is severely constrained by its dynamics.
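The store-then-replay-by-priority idea can be sketched with a priority queue. This is a generic sketch of prioritized replay, not the paper's implementation; the class and method names are assumptions.

```python
import heapq

# Prioritized experience replay sketch: transitions recorded during
# actuation are replayed later, highest priority first (e.g. priority
# proportional to TD-error magnitude).
class ReplayBuffer:
    def __init__(self):
        self._heap = []     # max-heap via negated priority
        self._counter = 0   # tie-breaker so heapq never compares transitions

    def store(self, transition, priority):
        heapq.heappush(self._heap, (-priority, self._counter, transition))
        self._counter += 1

    def replay(self):
        """Yield stored transitions in decreasing priority order."""
        while self._heap:
            _, _, transition = heapq.heappop(self._heap)
            yield transition
```

Replaying the most informative transitions first is what lets a slow physical system, such as the brachiation robot, amortize each real-world trial over many learning updates.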
Many studies on learning control of robot arms have been conducted using neural networks. Methods that use an inverse kinematics model of the arm acquired by learning are popular. However, acquiring the inverse kinematics model has a number of drawbacks. Furthermore, a limited-scale neural network system has only limited precision, so errors remain in the output of the learned inverse kinematics model. In this paper, a new method for solving the inverse kinematics problem is proposed that uses a learned inverse model of the linearized model in an output feedback system. Two possible configurations of the system are presented. The use of linear adaptive systems, including the Kalman filter, is also proposed for higher accuracy. The performance of the proposed methods is shown by numerical simulations.
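The reason an imperfect inverse model still works inside an output feedback loop can be illustrated on a one-dimensional toy problem. This is an illustrative sketch under assumed toy dynamics, not the paper's system: the residual task-space error is fed back through the same approximate inverse each iteration, so the error shrinks even though the inverse model alone is inexact.

```python
import math

def forward(theta):
    """True nonlinear plant, 1-DOF toy: task position = sin(theta)."""
    return math.sin(theta)

def approx_inverse(x):
    """Imperfect inverse of the linearized model: sin(theta) ~ theta near 0."""
    return x

def solve_ik(x_des, iters=20):
    theta = approx_inverse(x_des)        # feedforward guess (inexact)
    for _ in range(iters):
        err = x_des - forward(theta)     # output feedback: measured residual
        theta += approx_inverse(err)     # correct through the same inverse model
    return theta
```

Each pass multiplies the residual by roughly (1 - cos theta), so the loop converges to the exact solution despite the crude inverse, which is the basic argument for using the learned inverse as a feedback element rather than open loop.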
This paper introduces a method to extract task knowledge from an example shown by an operator or generated by a planner, and to apply that knowledge to plan a path in similar environments. Our aim is to make efficient use of learned task knowledge and to plan paths in complex environments easily. Our method is based on the A* algorithm. We developed a technique that generates a suboptimal path while expanding far fewer search nodes than the traditional A* algorithm, and that builds a heuristic function incorporating the task knowledge. Examples are shown to verify the effectiveness of our method.
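A standard way to obtain a suboptimal path with far fewer expansions than plain A* is to inflate the heuristic (weighted A*). The sketch below is that generic technique, offered as an analogue of the abstract's trade-off rather than its actual heuristic; the grid, weight, and function names are assumptions.

```python
import heapq

# Weighted A* on a 4-connected grid: f = g + w*h with w > 1 trades
# optimality (cost <= w * optimal) for far fewer node expansions.
def weighted_astar(start, goal, passable, w=2.0):
    def h(p):  # Manhattan distance, inflated by w
        return w * (abs(p[0] - goal[0]) + abs(p[1] - goal[1]))
    frontier = [(h(start), 0, start, [start])]
    seen = set()
    expanded = 0
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, expanded
        if node in seen:
            continue
        seen.add(node)
        expanded += 1
        x, y = node
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if passable(nxt) and nxt not in seen:
                heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt, path + [nxt]))
    return None, expanded
```

In the paper's setting, the extracted task knowledge would be folded into h as well, biasing the search toward regions the demonstrated example passed through.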
This paper examines the feasibility of a space robot system that captures a free-flying target. First, to examine it experimentally, we develop a berthing dynamics simulator (BDS) on which the space robot and the target behave as if they were on orbit. The BDS measures the force and torque acting on the real hardware, calculates the resulting motion of the robot and the target from the detected force, and then drives the servo mechanisms so that they conform to the calculated posture. Next, to examine feasibility by numerical simulation alone, we propose a mathematical model, whose validity is verified by comparison with the experiments performed on the BDS. Finally, the experiments and the numerical simulation confirm that the space robot can automatically capture the free-flying target without producing excessive force or oscillation.
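One cycle of this hardware-in-the-loop scheme, reduced to a single translational axis, can be sketched as follows. The function name, the 1-D restriction, and the explicit integration are illustrative assumptions, not the BDS implementation.

```python
# One BDS-style cycle: measure the contact force on the hardware,
# integrate the free-flying body's equation of motion (only the
# measured contact force acts on it), and return the pose/velocity
# the servo stage should track for the next cycle.
def bds_step(x, v, f_measured, mass, dt):
    a = f_measured / mass
    v_new = v + a * dt
    x_new = x + v_new * dt
    return x_new, v_new
```

Repeating this loop fast enough makes the grounded hardware reproduce orbital (force-driven, undamped) motion, which is what lets capture experiments run on the ground.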
This paper describes a high-performance robot vision system that performs real-time tracking of moving objects and real-time optical flow computation. The system is implemented as a transputer-based vision system augmented with a high-speed correlation processor. The vision unit is equipped with three image frame memories, which can be used simultaneously for image input, image processing, and image display. Thus, the system can devote all its computation power to image processing without waiting for image input or display. The vision board is also equipped with a correlation processor; using this chip, we have developed a very fast correlation-based robot vision system. The system can also be used in a multi-processor configuration to greatly increase performance.
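The correlation operation such a processor accelerates amounts to sliding a template over a search window and scoring each offset. A pure-Python sketch using sum of absolute differences (one common correlation-style score; the real system does this in dedicated hardware, and the function names here are assumptions):

```python
# Template matching by sum of absolute differences (SAD): return the
# (x, y) offset in `image` where `template` matches best (smallest SAD).
def best_match(image, template):
    th, tw = len(template), len(template[0])
    best = None
    for y in range(len(image) - th + 1):
        for x in range(len(image[0]) - tw + 1):
            sad = sum(abs(image[y + i][x + j] - template[i][j])
                      for i in range(th) for j in range(tw))
            if best is None or sad < best[0]:
                best = (sad, (x, y))
    return best[1]
```

Tracking repeats this search every frame around the last known position; optical flow applies the same matching to many small patches at once, which is why a hardware correlator helps both tasks.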
The number of people requiring rehabilitation for bone fractures and joint diseases caused by traffic accidents, cerebral apoplexy, and similar conditions will increase as society ages. For functional restoration exercise in rehabilitation, some automatic recovery machines are already in use, but only for relatively simple exercises. For complex exercises and for the evaluation of muscle strength, a multi-d.o.f. manipulator is required; however, few such robots have been developed, because they demand functions such as safety and flexibility that are not common in general industrial robots. In this study, we focus on a rubber artificial muscle and apply a 2-d.o.f. manipulator driven by this actuator to a rehabilitation robot for functional restoration exercise. By applying an impedance control scheme containing a two-degree-of-freedom PID control system to the positioning loop, a unified controller realizes the several exercise motion modes required for exercise and evaluation. Experiments on trajectory tracking, compliance control, and damping control show satisfactory control performance, and the proposed control system realizes all the exercise motion modes.
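The outer impedance law that unifies these modes can be sketched on one axis. This is a generic impedance-control sketch under assumed gains and a simplified integration scheme, not the paper's controller: the measured interaction force is turned into a reference motion obeying a chosen mass-damper-spring law, which the inner position loop then tracks.

```python
# One-axis impedance law: target dynamics m*a = f_ext - b*v - k*(x - x_ref).
# Integrating it yields the reference trajectory for the position loop.
# m, b, k, dt are illustrative assumptions; changing them switches the
# apparent stiffness/damping the patient feels (compliance vs damping modes).
def impedance_step(x_ref, x, v, f_ext, m=1.0, b=8.0, k=40.0, dt=0.01):
    a = (f_ext - b * v - k * (x - x_ref)) / m
    v_new = v + a * dt            # semi-implicit Euler integration
    x_new = x + v_new * dt
    return x_new, v_new
```

Under a constant patient force f_ext the reference settles at x_ref + f_ext / k, so a soft spring (small k) gives a compliant exercise mode and a large b gives a resistive, damping-dominated one, all within the same control structure.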
This paper proposes a new active sensor system (Active Antenna) motivated by the antenna of an insect. Insects use their antennae skillfully to avoid hitting objects while crawling, running, and even while standing still. The antenna is characterized by its "flexibility" and "active motions". Flexibility may reduce the impulsive force that arises when the antenna unexpectedly contacts an object, while active motions may extend the sensing range in 3D space. Exploiting these inherent advantages, we propose a new active sensing system in which both flexibility and active motion contribute essentially to sensing the contact point between the antenna and an obstacle. The sensor system consists of only four components: an insensitive flexible beam, a position sensor, a torque sensor, and an actuator. We show that this simple sensor system can locate any contact point between the insensitive flexible beam and an object by applying an active motion to the beam. We also show that, mounted on the front part of a mobile robot, the sensor enables detection of the shape of an object.
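The reason one rotation plus one torque reading suffices can be seen from small-deflection beam theory. For a uniform cantilever rotated by a small angle dtheta at its base against a point contact at distance d, the restoring base torque is 3*E*I*dtheta / d, so d follows directly from the measured torque. The sketch below states that relation; the numerical values in the usage are illustrative assumptions.

```python
# Contact-distance estimate for an active-antenna-style sensor:
#   torque = 3*E*I*dtheta / d   =>   d = 3*E*I*dtheta / torque
# where E is Young's modulus, I the beam's second moment of area,
# dtheta the applied base rotation, and torque the sensed reaction.
def contact_distance(E, I, dtheta, torque):
    return 3.0 * E * I * dtheta / torque
```

Sweeping the beam and re-estimating d at successive headings is what lets the robot-mounted sensor trace out an object's shape from repeated contacts.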
This paper describes a path planning method for a 2-link planar manipulator in tasks where its end-effector is commanded to move one end of an elastic plate spring, whose other end is fixed to a wall, to a desired position by bending the spring. The method uses "a compliance ellipse of manipulation" defined in this paper. Computer simulations show that the proposed path planning method gives a satisfactory end-effector trajectory, similar to that given by human operation, in the sense that the joint torque needed to perform the task is reduced.
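An ellipse of this kind is conventionally obtained from the eigenstructure of the 2x2 end-effector compliance matrix: the eigenvalues give the principal compliances and the eigenvectors the ellipse's orientation. The closed-form 2x2 solver below sketches that standard computation; it is not the paper's specific definition, and the matrix entries in the usage are assumptions.

```python
import math

# Axes of a compliance ellipse from a symmetric 2x2 compliance matrix C:
# eigenvalues = principal compliances, angle = orientation of the major axis.
def compliance_ellipse(C):
    a, b, d = C[0][0], C[0][1], C[1][1]   # symmetric: C[1][0] == b
    tr, det = a + d, a * d - b * b
    disc = math.sqrt(max(tr * tr / 4 - det, 0.0))
    lam1, lam2 = tr / 2 + disc, tr / 2 - disc   # lam1 >= lam2
    angle = 0.5 * math.atan2(2 * b, a - d)      # major-axis direction (rad)
    return lam1, lam2, angle
```

Since the compliance matrix varies with arm configuration, planning along directions where the ellipse is favorably oriented is one way such a method can reduce the joint torque needed to bend the spring.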