Submodular functions are set functions that can be regarded as a discrete analogue of convex functions, and they appear frequently across mathematical engineering, including combinatorial optimization, information theory, queueing theory, and game theory. Cut capacity functions of networks, rank functions of matroids, and entropy functions of multiple information sources all possess submodularity. This talk surveys optimization problems involving submodular functions, from the fundamentals to the latest results.
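For reference, a set function f on a ground set V is submodular when it satisfies the standard exchange inequality (stated here for context, not taken from the talk):

\[
f(X) + f(Y) \ge f(X \cup Y) + f(X \cap Y) \qquad \text{for all } X, Y \subseteq V .
\]

Equivalently, f has diminishing returns: the marginal gain of adding an element to a set never increases as the set grows.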
We address a novel and realistic Label Reliability Problem in supervised learning, where the confidence of the labeling differs across training sets. Our main idea is to build more precise classifiers by treating reliably and unreliably labeled sets separately. We focus on a novel boosting method that exploits reliably labeled data. A theoretical investigation of the method clarifies its relation to the soft-margin approach, cost-sensitive learning, and semi-supervised learning. We perform detailed experiments covering the boosting method and eight related methods. The results suggest the superiority of our approach, which explicitly accounts for unreliable labels.
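As one concrete reading of the idea, the sketch below (illustrative only, not the authors' algorithm) modifies AdaBoost so that each example's initial weight is scaled by a per-example reliability coefficient, making unreliably labeled points exert less influence:

```python
# A minimal sketch, assuming a reliability coefficient r_i in (0, 1] per
# example that scales its initial boosting weight; the weighting scheme
# and all names are assumptions, not the authors' method.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def reliability_adaboost(X, y, r, n_rounds=50):
    """y in {-1, +1}; r[i] is the labeling reliability of example i."""
    w = r / r.sum()                        # reliability-scaled initial weights
    stumps, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = w[pred != y].sum()
        if err <= 0 or err >= 0.5:         # stop on perfect or too-weak stump
            break
        alpha = 0.5 * np.log((1 - err) / err)
        w = w * np.exp(-alpha * y * pred)  # usual multiplicative reweighting
        w /= w.sum()
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, np.array(alphas)
```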
We extend the BDD-EM algorithm, an expectation-maximization (EM) algorithm that works on binary decision diagrams (BDDs), to shared BDDs (SBDDs) with negative edges. BDDs are a compact representation of Boolean formulas, and the use of SBDDs with negative edges is expected to further reduce time and space when the formulas contain similar partial structures. We show that the proposed algorithm, which applies SBDDs with negative edges to bipartite noisy-OR networks, reduces the time and space needed to execute the EM algorithm.
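For context, the sketch below shows textbook shared-BDD machinery with complemented ("negative") edges; it is generic background, not the BDD-EM implementation. Nodes are hash-consed in a single table so that equal subformulas are shared, and a negative node id denotes the complement of the function the node represents:

```python
# Generic SBDD node table with complemented edges (background sketch only).
class SBDD:
    TRUE = 1                      # terminal; -1 represents FALSE

    def __init__(self):
        self.table = {}           # (var, low, high) -> node id (sharing)
        self.nodes = [None, 'T']  # id 1 is the TRUE terminal

    def mk(self, var, low, high):
        if low == high:           # both branches agree: node is redundant
            return low
        neg = high < 0            # normalize: keep the high edge regular
        if neg:
            low, high = -low, -high
        key = (var, low, high)
        if key not in self.table:
            self.nodes.append(key)
            self.table[key] = len(self.nodes) - 1
        nid = self.table[key]
        return -nid if neg else nid   # negative id = complemented edge
```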
We consider anomaly detection from noisy sensor data, where detecting anomalies that appear in the dependencies between sensors is a practically important and difficult problem. The difficulty has roughly two sources. First, inter-sensor correlations are extremely fragile against noise, so it is hard to separate signs of an anomaly from noise. Second, even when some anomaly is observed over multiple variable pairs, it is not straightforward to attribute that information to anomaly scores of individual sensors. As a solution to the former, this paper proposes the use of sparse structure learning; for the latter, we propose a correlation anomaly score that is derived information-theoretically from the Gaussian graphical model.
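A minimal sketch along these lines, assuming graphical-lasso structure learning and a simplified per-sensor change score; the paper's information-theoretic score derived from the Gaussian graphical model is more refined than the heuristic used here:

```python
# Fit sparse Gaussian graphical models to a reference window and a test
# window, then score each sensor by the change in its sparse neighborhood.
import numpy as np
from sklearn.covariance import GraphicalLasso

def correlation_anomaly_scores(X_ref, X_test, alpha=0.1):
    """X_ref, X_test: (samples, sensors) arrays from reference/test windows."""
    P_ref = GraphicalLasso(alpha=alpha).fit(X_ref).precision_
    P_test = GraphicalLasso(alpha=alpha).fit(X_test).precision_
    scores = np.empty(P_ref.shape[0])
    for i in range(len(scores)):
        # a large change in row i of the precision matrix (sensor i's
        # neighborhood) suggests a correlation anomaly at that sensor
        scores[i] = np.abs(P_ref[i] - P_test[i]).sum()
    return scores
```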
The demand for learning machines that can adapt to concept change, that is, change over time in the statistical properties of the target variable, has become more urgent. We propose a system in which multiple online and offline classifiers are used to learn changing concepts. Experiments with synthetic concept-drifting and concept-shifting datasets show that clustering the classifiers enables the proposed system to capture the sequence and similarity of past concepts.
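One way to realize the clustering step, sketched here under assumed details (the distance measure and clustering method are illustrative, not necessarily those of the proposed system), is to group previously trained classifiers by the similarity of their predictions on a reference batch, so that recurring past concepts cluster together:

```python
# Cluster classifiers by pairwise disagreement on a reference batch.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def cluster_classifiers(classifiers, X_ref, n_groups=3):
    preds = np.array([clf.predict(X_ref) for clf in classifiers])
    n = len(classifiers)
    # pairwise disagreement rate as a distance between classifiers
    dist = [(preds[i] != preds[j]).mean()
            for i in range(n) for j in range(i + 1, n)]
    return fcluster(linkage(dist, method='average'),
                    t=n_groups, criterion='maxclust')
```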
Research on consumer behavior models aims to explain common regularities of human behavior in consumption. Recently, large-scale records of daily-life and shopping behavior, such as POS data, have become observable thanks to the development of sensor networks and ubiquitous systems. However, the affinity between consumer behavior models and such large-scale data remains poor. This paper surveys the evolution of consumer behavior models and outlines prospects for the effective use of large-scale data together with them.
In this paper, we propose the CF-Suffix Trie for mining frequent moving patterns from spatiotemporal data, together with an online algorithm for constructing the trie. Our method can discover patterns and their related spatial regions automatically with only a single scan of the data. We evaluate the method experimentally using datasets of artificial object trajectories. The performance experiments show that our method is more than 1000 times faster than naive methods and achieves a precision above 95%.
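For intuition, the sketch below shows the single-scan suffix counting that such a trie performs over a stream of discretized region symbols; a flat dictionary stands in for the trie, and the CF-Suffix Trie's clustering-feature statistics are omitted (all details here are illustrative):

```python
# Online counting of length-bounded suffix patterns over a symbol stream.
from collections import defaultdict

class SuffixCounter:
    def __init__(self, max_len=5):
        self.counts = defaultdict(int)
        self.max_len = max_len
        self.recent = []

    def update(self, region):
        # append the newest region id and keep a bounded window
        self.recent.append(region)
        self.recent = self.recent[-self.max_len:]
        # every suffix of the window ends at the new symbol: count each once
        for i in range(len(self.recent)):
            self.counts[tuple(self.recent[i:])] += 1

    def frequent(self, min_count):
        return {p: c for p, c in self.counts.items() if c >= min_count}
```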
In this paper, we propose an efficient algorithm for mining frequent right-closed sequences from a single long sequence without candidate generation. The purpose is to compress the huge set of frequent sequences extracted during the mining process. To measure the frequency of each sequence, we adopt the head frequency, which counts multiple occurrences of each subsequence without irrational duplication. The search space can be reduced substantially by exploiting the right-anti-monotonicity of the head frequency.
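A minimal sketch of the head frequency, assuming its common definition: the number of positions at which the pattern's first symbol occurs and from which the whole pattern can be completed as a subsequence:

```python
# Head frequency of a pattern in a single long sequence (assumed definition).
def head_frequency(seq, pattern):
    def occurs_from(start):
        j = 0
        for x in seq[start:]:          # greedily match the rest of the pattern
            if x == pattern[j]:
                j += 1
                if j == len(pattern):
                    return True
        return False
    return sum(1 for i, x in enumerate(seq)
               if x == pattern[0] and occurs_from(i))

# e.g. head_frequency("abcab", "ab") == 2: heads at positions 0 and 3
```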
We propose a fast online approximation algorithm for extracting frequent subsequences from a data stream. In an online algorithm, suppressing memory consumption is very important, so online algorithms often take the form of approximation algorithms in which the error ratio is guaranteed to stay below a user-specified threshold. Our algorithm is based on the well-known Lossy Counting algorithm [1, 4], which extracts frequent items from a data stream. We extend Lossy Counting to extract frequent subsequences from a data stream using the head frequency, a measure of subsequence frequency. We analyze the approximation accuracy and the space complexity of the proposed algorithm. The memory consumption is of order (M/ε) log N, where M is the maximum number of subsequences obtained in each window, ε is a user-specified error ratio, and N is the length of the data stream. Experiments show that the proposed algorithm scales well with the length of the data stream and keeps memory consumption below the estimated bound.
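For context, the sketch below shows the classic Lossy Counting algorithm for frequent items (the basis cited as [1, 4]); the proposed method extends this idea to subsequences scored by the head frequency:

```python
# Classic Lossy Counting for frequent items in a stream (Manku-Motwani style).
import math

def lossy_counting(stream, eps):
    counts, deltas = {}, {}
    width = math.ceil(1 / eps)           # bucket width
    for n, item in enumerate(stream, 1):
        bucket = math.ceil(n / width)
        if item in counts:
            counts[item] += 1
        else:
            counts[item], deltas[item] = 1, bucket - 1
        if n % width == 0:               # end of bucket: prune rare items
            for key in [k for k in counts
                        if counts[k] + deltas[k] <= bucket]:
                del counts[key], deltas[key]
    return counts                        # undercounts by at most eps * n
```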
This paper presents a new method for extracting important words from a newspaper corpus based on the temporal dependency between word occurrences. Such word extraction plays an important role in event-sequence mining. TF·IDF is a well-known method for ranking the importance of a word in a document. We previously proposed TF·IDayF, an improvement of TF·IDF that considers temporal information about word occurrences and can extract important or characteristic words describing sequential events. However, TF·IDayF does not consider the temporal dependency between word occurrences, which can be regarded as reflecting causal relationships. In this paper, we propose a novel method for extracting important words using temporal co-occurrence information of words in a newspaper corpus.
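For reference, the standard TF·IDF weight that TF·IDayF builds on is (the usual textbook definition, not a formula from the paper):

\[
\mathrm{tfidf}(t, d) = \mathrm{tf}(t, d) \cdot \log \frac{N}{\mathrm{df}(t)},
\]

where tf(t, d) is the frequency of term t in document d, N is the number of documents, and df(t) is the number of documents containing t.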
Incremental and decremental algorithms for the Support Vector Machine (SVM) [1, 2, 3] efficiently update trained SVM parameters whenever a data point is added to or removed from the training set. When many data points must be added or removed, the computational cost of these methods becomes prohibitive because the update has to be applied repeatedly, once per data point. In this paper, we generalize the existing decremental algorithm for Support Vector Regression (SVR) [2, 3] so that several data points can be removed more efficiently. Our approach, which we call generalized decremental SVR (GDSVR), formulates the update as a path-following problem in a multi-dimensional parameter space. The experimental results show that GDSVR can reduce the computational cost of leave-m-out cross-validation (m > 1). In particular, we observed that the number of breakpoints, which dominates the cost of the path following, was reduced from O(m) to O(√m).
We propose an approximate calculation of the Generalized Information Criterion (GIC) in which the influence function is computed approximately via cross-validation. With this method, we can estimate the GIC for the L1-regularized log-linear model and the Support Vector Machine, models for which the GIC has never been computed exactly. Experiments show that the proposed approximate GIC is effective for choosing a valid regularization parameter for these models.
The Bethe approximation, or loopy belief propagation algorithm, is a successful method for approximating partition functions of probabilistic models associated with a graph. Chertkov and Chernyak derived an interesting formula called the "loop series expansion", an expansion of the partition function whose main term is the Bethe approximation and whose other terms are labeled by subgraphs called generalized loops. In a recent paper, we derived the loop series expansion in the form of a polynomial with positive integer coefficients and extended the result to an expansion of marginals. In this paper, we give a clearer derivation of these results and discuss the properties of the polynomial introduced in the earlier paper.
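For reference, the Chertkov-Chernyak result has the schematic form (a standard statement; the weights depend on the BP beliefs):

\[
Z = Z_{\mathrm{Bethe}} \Bigl( 1 + \sum_{C} r(C) \Bigr),
\]

where Z is the partition function, Z_Bethe is its Bethe (loopy BP) approximation, and the sum runs over the generalized loops C of the graph with weights r(C) computed from the BP fixed point.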
We consider the problem of minimizing the spread of undesirable things, such as computer viruses or malicious rumors, by blocking a limited number of links in a network; this is a converse of the influence maximization problem, in which the most influential nodes for information diffusion are sought in a social network. This minimization problem offers another approach to preventing the spread of contamination, complementing methods that remove nodes from a network. We propose a method for efficiently finding a good approximate solution to this problem based on a natural greedy strategy. Using large real networks, we demonstrate experimentally that the proposed method significantly outperforms conventional link-removal methods. We also show that, unlike the case of blocking a limited number of nodes, the strategy based on removing high-out-degree nodes is not necessarily effective for our problem.
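A minimal sketch of the natural greedy strategy: repeatedly block the link whose removal most reduces an estimated contamination spread. The spread estimator here is a simple Monte Carlo independent-cascade simulation from random start nodes, assumed for illustration; the paper's estimation method may differ:

```python
import random
import networkx as nx

def estimate_spread(G, p=0.1, trials=100):
    total = 0
    for _ in range(trials):
        seed = random.choice(list(G.nodes))
        active, frontier = {seed}, [seed]
        while frontier:
            u = frontier.pop()
            for v in G.successors(u):       # G is a networkx DiGraph
                if v not in active and random.random() < p:
                    active.add(v)
                    frontier.append(v)
        total += len(active)
    return total / trials

def greedy_block_links(G, k, p=0.1, trials=100):
    blocked = []
    for _ in range(k):
        best_edge, best_spread = None, float('inf')
        for e in list(G.edges):             # try blocking each remaining link
            G.remove_edge(*e)
            s = estimate_spread(G, p, trials)
            G.add_edge(*e)
            if s < best_spread:
                best_edge, best_spread = e, s
        G.remove_edge(*best_edge)           # commit the best block
        blocked.append(best_edge)
    return blocked
```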
Characteristic rule induction usually produces a large number of rules, and it is difficult for a user to inspect them all. This paper describes a method that assigns a priority index to rules based on their supporting instances and guides the user to inspect the most useful rule at each step. The priority index is recalculated dynamically at each step of rule-set inspection, using the instances already covered by the adopted rules, and the resulting rule group gives a concise understanding of the data of the target class.
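One plausible reading of the dynamic priority index, sketched here as a greedy set-cover style heuristic (assumed for illustration; the paper's exact index may differ): at each step, rank rules by how many still-uncovered instances of the target class they support:

```python
def inspect_rules(rule_support, n_steps):
    """rule_support: dict mapping rule -> set of supporting instance ids."""
    covered, order = set(), []
    for _ in range(n_steps):
        rule = max(rule_support,
                   key=lambda rr: len(rule_support[rr] - covered))
        order.append(rule)                 # present this rule to the user
        covered |= rule_support[rule]      # its instances no longer count
    return order
```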
We propose several simple numerical indices for representing the structural difference between molecular graphs, based on vertex differences and edge differences. In this work, we consider only molecular graphs whose simple graph representations are all isomorphic. The vertex difference describes the difference in atom types on the same simple-graph molecular framework, and the edge difference describes the difference in bond types. In addition, we define a chemical structure difference that captures differences in both atom type and bond type. We employed these indices for similar-structure searching. The results show that difference-based searching retrieves similar structures in a manner considerably different from conventional methods.
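A minimal sketch of vertex and edge difference counts for two molecules sharing the same simple-graph framework, with atoms aligned by the isomorphism; treating the combined chemical difference as a simple sum is an assumption made here for illustration:

```python
def structure_difference(atoms_a, atoms_b, bonds_a, bonds_b):
    """atoms_*: aligned lists of atom-type labels;
       bonds_*: dicts (i, j) -> bond-type label over the same key set."""
    vdiff = sum(a != b for a, b in zip(atoms_a, atoms_b))   # atom-type changes
    ediff = sum(bonds_a[e] != bonds_b[e] for e in bonds_a)  # bond-type changes
    return vdiff, ediff, vdiff + ediff
```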
Solving the subgraph isomorphism problem has emerged as a major part of graph mining research, although its computational complexity is known to be NP-complete. To address this complexity issue, a subgraph isomorphism checking approach between two graphs has been assessed by applying Cauchy's interlace theorem to symmetric matrices, such as the adjacency matrices representing the graphs, because of its low computational complexity of O(n^3). However, the accuracy of this approach is known to be low when edge label IDs in the graphs are simply assigned to the elements of their adjacency matrices. In this paper, we propose a novel approach called OPTSPEC (OPTimized SPECtra for subgraph isomorphism checking), which optimizes the mapping from substructures of the graphs to elements of their adjacency matrices so as to maximize the effectiveness of the interlace theorem. We experimentally evaluated our approach using artificial graph data and confirmed the high accuracy of its subgraph isomorphism checking.
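A minimal sketch of the spectral filter underlying this approach: if a graph H with m nodes is an induced subgraph of G with n nodes, H's adjacency matrix is a principal submatrix of G's, so by Cauchy's interlace theorem the descending eigenvalues must satisfy lambda_i(G) >= lambda_i(H) >= lambda_{i+n-m}(G). A violation proves non-containment, while passing the test is only a necessary condition:

```python
import numpy as np

def interlace_possible(A_g, A_h, tol=1e-9):
    n, m = len(A_g), len(A_h)
    eg = np.sort(np.linalg.eigvalsh(A_g))[::-1]   # descending eigenvalues
    eh = np.sort(np.linalg.eigvalsh(A_h))[::-1]
    return all(eg[i] + tol >= eh[i] and eh[i] + tol >= eg[i + n - m]
               for i in range(m))
```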