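Transactions of the Japanese Society for Artificial Intelligence

Regular Papers

Original Papers

Robust Probability Density Estimation with the Student-t VAE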

高橋大志, 岩田具治, 山中友貴, 山田真徳, 八木哲志, 鹿島久嗣

Article type: Original Paper
2021, Volume 36, Issue 3, pp. A-KA4_1-9
Publication date: 2021/05/01
Release date: 2021/05/01

DOI: https://doi.org/10.1527/tjsai.36-3_A-KA4

Journal, free access

We propose the Student-t variational autoencoder (VAE), a robust multivariate density estimator based on the VAE. The VAE is a powerful deep generative model and is used for multivariate density estimation. In the original VAE, the distribution of observed continuous variables is assumed to be Gaussian, with its mean and variance modeled by deep neural networks that take latent variables as their inputs. This distribution is called the decoder. However, the training of the VAE often becomes unstable. One reason is that the decoder of the VAE is sensitive to the error between a data point and its estimated mean when its estimated variance is almost zero. To solve this instability problem, our Student-t VAE uses a Student-t distribution as the decoder. This distribution is heavy-tailed: its probability in the tail region is higher than that of a light-tailed distribution such as a Gaussian. Therefore, the Student-t decoder is robust to the error between a data point and its estimated mean, which makes the training of the Student-t VAE stable. Numerical experiments with various datasets show that training of the Student-t VAE is robust and that the Student-t VAE achieves high density estimation performance.
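
The sensitivity described in this abstract can be illustrated with a small numerical sketch. It compares the negative log-likelihood of a one-dimensional Gaussian with that of a location-scale Student-t as the estimated variance shrinks toward zero while the error stays fixed. The function names, the scalar setting, and the choice of 3 degrees of freedom are assumptions made for illustration only; this is not the authors' implementation.

```python
import math


def gaussian_nll(x, mean, var):
    """Negative log-density of a Gaussian N(mean, var) at x."""
    return 0.5 * math.log(2.0 * math.pi * var) + (x - mean) ** 2 / (2.0 * var)


def student_t_nll(x, mean, scale2, dof):
    """Negative log-density of a location-scale Student-t at x.

    scale2 is the squared scale (it plays the role of the variance) and
    dof is the degrees of freedom; both are illustrative parameters.
    """
    log_norm = (
        math.lgamma((dof + 1.0) / 2.0)
        - math.lgamma(dof / 2.0)
        - 0.5 * math.log(dof * math.pi * scale2)
    )
    log_kernel = -(dof + 1.0) / 2.0 * math.log1p((x - mean) ** 2 / (dof * scale2))
    return -(log_norm + log_kernel)


if __name__ == "__main__":
    error = 1.0  # fixed gap between the data point and the estimated mean
    for var in (1e-1, 1e-3, 1e-6):
        g = gaussian_nll(error, 0.0, var)
        t = student_t_nll(error, 0.0, var, dof=3.0)  # dof=3 is an assumed value
        print(f"var={var:g}  gaussian_nll={g:.2f}  student_t_nll={t:.2f}")
```

In the Gaussian case the squared error is divided by the variance, so the loss grows like 1/variance; in the Student-t case the same ratio sits inside a logarithm, so the loss grows only logarithmically. This matches the abstract's explanation of why a heavy-tailed decoder is less sensitive to the error.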

Factor Analysis of Video Advertising Effectiveness by Integrating Ensemble Learning and LDA

崎濱栄治, 川崎泰一, 本橋永至

Article type: Original Paper (Practical AI Systems Paper)
2021, Volume 36, Issue 3, pp. B-K91_1-8
Publication date: 2021/05/01
Release date: 2021/05/01

DOI: https://doi.org/10.1527/tjsai.36-3_B-K91

Journal, free access

With the widespread use of highly functional smartphones and the improvement of communication environments, video advertising is becoming widely used in the mobile advertising domain. If creators of video advertisements know in advance the most effective components and combinations, they are more likely to be able to produce them efficiently. For mobile ad images, [Sakihama 19b] interpreted the results of a click-rate prediction model using Gradient Boosted Decision Trees (GBDT) and Interpretable Trees (inTrees) [Deng 19].
In this paper, we propose a multimodal approach to analyzing the factors of advertising effectiveness that combines ad delivery logs, components of video ads, and text information. Specifically, we propose a method for verifying the effectiveness of video advertisements in mobile advertising based on computer vision, and a method for supporting the production of video advertisements using the modeling results of Latent Dirichlet Allocation (LDA), XGBoost [Chen 16], and defragTrees [Hara 18]. This method is expected to be faster and simpler than the one proposed by [Sakihama 19b], and is likely to enable rule extraction. Computer vision and machine learning will enable automatic feature extraction, identification of effective components and interactions, and contribution measurement. The approach is expected to be applicable to a wide range of fields other than video advertising.

Recommendation System Using a Conditional Variational Autoencoder That Considers Action Time

保住純, 岩澤有祐, 松尾豊

Article type: Original Paper
2021, Volume 36, Issue 3, pp. C-KB7_1-10
Publication date: 2021/05/01
Release date: 2021/05/01

DOI: https://doi.org/10.1527/tjsai.36-3_C-KB7

Journal, free access

In this study, we propose a method for adding time-of-action information to a Variational Auto-encoder (VAE)-based recommendation system. Since the time of an action is important information for improving recommendation accuracy, many methods have been proposed that use the time of actions, such as the purchase or review of a product, for recommendation. VAE-based recommendation systems have also been reported to be more accurate and more robust on small data sets than traditional deep learning-based recommendation systems. Existing research on introducing time information into VAEs includes a method that incorporates the order in which products are preferred by passing it through an encoding layer consisting of an RNN, but the time at which each product was preferred is not considered. If absolute time information is not taken into account when recommending a product, then, for example, when a temporary boom causes many users to prefer a particular product, that preference may be judged to reflect the user's own tastes, which may adversely affect the recommendation results. To address these problems, this study examines a VAE-based recommendation system that improves recommendation accuracy by adding the time information of each action to the input, and finally proposes the Time-Sequential VAE (TSVAE) and confirms its accuracy. In addition, to verify how time information should be added to improve accuracy, we conducted experiments using multiple models with and without absolute time information and with different encoders of time-interval information, and evaluated their accuracy.

Non-stationary Multi-armed Bandit Problem Considering External Information on Temporal Changes

難波博之

Article type: Original Paper
2021, Volume 36, Issue 3, pp. D-K84_1-11
Publication date: 2021/05/01
Release date: 2021/05/01

DOI: https://doi.org/10.1527/tjsai.36-3_D-K84

Journal, free access

The multi-armed bandit problem is a fundamental mathematical problem in sequential optimization and reinforcement learning that has a variety of applications, such as online recommendation systems and clinical trial design. It describes a situation in which a player sequentially selects from given candidate choices, trying to maximize the cumulative reward. In this paper, we consider non-stationary multi-armed bandit problems, in which the reward distribution of each arm varies with time. We point out that in some real applications, we can utilize information on the changes of the reward distribution. In particular, we consider the type of information that may restrict the rounds at which the reward distribution changes. For such scenarios, we propose a novel strategy called the PM policy. The proposed policy is based on the existing CUSUM-UCB and M-UCB policies, which do not consider external information. Whereas these existing policies monitor all arms to detect changes of the reward distribution, our policy monitors only important arms and rounds. As a result, the ratio of unnecessary monitoring is reduced, and an efficient search can be performed. The regret bound of the proposed policy is described. We also show the effectiveness of the proposed method by numerical experiments.
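
The non-stationary setting in this abstract can be made concrete with a small simulation sketch. It is not the PM policy, CUSUM-UCB, or M-UCB: it runs the standard UCB1 index on a piecewise-stationary Bernoulli bandit and, as a stand-in for external information, optionally clears the statistics at change rounds that are assumed to be known exactly. The function names, the arm probabilities, and the reset rule are all assumptions made for illustration.

```python
import math
import random


def ucb1_index(mean, pulls, t):
    """UCB1 index: empirical mean plus an exploration bonus."""
    if pulls == 0:
        return math.inf  # force every arm to be tried once
    return mean + math.sqrt(2.0 * math.log(t) / pulls)


def run_ucb1(means_by_phase, change_rounds, horizon, reset_on_change, seed=0):
    """Play UCB1 on a piecewise-stationary Bernoulli bandit.

    means_by_phase[k][a] is the success probability of arm a in phase k.
    change_rounds[k] is the round at which phase k + 1 starts.
    If reset_on_change is True, statistics are cleared at those rounds,
    i.e. the change rounds are treated as externally known (an assumption).
    Returns the cumulative expected regret against the best arm per round.
    """
    rng = random.Random(seed)
    n_arms = len(means_by_phase[0])
    pulls = [0] * n_arms
    sums = [0.0] * n_arms
    phase = 0
    local_t = 0  # rounds since the last reset
    regret = 0.0
    for t in range(1, horizon + 1):
        if phase < len(change_rounds) and t == change_rounds[phase]:
            phase += 1
            if reset_on_change:
                pulls = [0] * n_arms
                sums = [0.0] * n_arms
                local_t = 0
        local_t += 1
        probs = means_by_phase[phase]
        indices = [
            ucb1_index(sums[a] / pulls[a] if pulls[a] else 0.0, pulls[a], local_t)
            for a in range(n_arms)
        ]
        arm = max(range(n_arms), key=indices.__getitem__)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        pulls[arm] += 1
        sums[arm] += reward
        regret += max(probs) - probs[arm]
    return regret


if __name__ == "__main__":
    phases = [[0.9, 0.1], [0.1, 0.9]]  # the best arm swaps at round 501
    for reset in (False, True):
        r = run_ucb1(phases, [501], 1000, reset_on_change=reset, seed=1)
        print(f"reset_on_change={reset}  regret={r:.1f}")
```

Running the script prints the cumulative regret with and without the reset, which gives a simple way to experiment with how knowledge of change rounds affects a UCB-style player in this toy setting.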
