Abstract
In recent years, driven by the development of 3D computer graphics and video processing, graphics boards have come to outperform CPUs, and they have become widely used with the growth of computer entertainment. General-purpose computing on GPUs (GPGPU) has become easier to implement thanks to CUDA, the integrated development environment distributed by NVIDIA. A GPU has from dozens to about a hundred arithmetic units, whose allocation is controlled by CUDA. Previous research has studied neural network implementations using GPGPU; however, the learning of networks was not addressed, because GPU performance is high for linear algebra but low for conditional processing. We therefore propose two methods. In the first, an entire network is implemented as a single thread, and several networks are trained in parallel to shorten the time needed to find the optimal weight coefficients. The second introduces parallelism within the neural network structure itself: the computations of neurons in the same layer can be performed in parallel, and the training processes of the same network on different patterns are likewise independent of one another. As a result, the second method is 20 times faster than the CPU and about 6 times faster than the first proposed method.
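The independence that the second method relies on can be illustrated with a minimal sketch. This is not the implementation evaluated in this paper: it covers only the forward pass of one layer, assumes a sigmoid activation, and uses hypothetical names (`neuron_output`, `layer_forward`). A plain Python loop over a flat index stands in for the GPU threads; each index handles one (pattern, neuron) pair and writes only its own output cell, so the iterations could run in any order or concurrently.

```python
import math


def sigmoid(x):
    # Assumed activation function; the abstract does not name one.
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(weights, bias, inputs):
    # Weighted sum plus bias for one neuron on one input pattern.
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def layer_forward(patterns, weights, biases):
    # patterns: list of input vectors, weights: one weight vector per neuron.
    n_patterns, n_neurons = len(patterns), len(weights)
    out = [[0.0] * n_neurons for _ in range(n_patterns)]
    # tid plays the role of a global thread index on the GPU.
    for tid in range(n_patterns * n_neurons):
        p, j = divmod(tid, n_neurons)  # pattern index, neuron index
        # Reads shared inputs, writes only out[p][j]: no dependency
        # between different values of tid.
        out[p][j] = neuron_output(weights[j], biases[j], patterns[p])
    return out
```

For example, `layer_forward([[1.0, 0.0]], [[1.0, 5.0]], [-1.0])` evaluates a single neuron on a single pattern. In a CUDA version of this sketch, the loop body would become the kernel and `tid` would be derived from the block and thread indices.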