JSAI Technical Report, Type 2 SIG

[title in Japanese]

Hiroki ARIMURA

Article type: SIG paper
2006 Volume 2006 Issue DMSM-A601 Pages 01-
Published: July 11, 2006
Released on J-STAGE: August 28, 2021

DOI: https://doi.org/10.11517/jsaisigtwo.2006.DMSM-A601_01

RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

[in Japanese]

Statistical Causal Inference and Causal Discovery

Yutaka KANO, Masashi MIYAMURA

Article type: SIG paper
2006 Volume 2006 Issue DMSM-A601 Pages 02-
Published: July 11, 2006
Released on J-STAGE: August 28, 2021

DOI: https://doi.org/10.11517/jsaisigtwo.2006.DMSM-A601_02

RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

The statistical methodology for causal inference, and its difficulties, are first reviewed for experimental, quasi-experimental, and observational studies. The review covers path analysis, graphical modeling, Rubin's counterfactual models, and propensity scores. We then examine how these statistical methods relate to causal discovery in data mining in computational science.

Dimension Reduction for Supervised Ordering

Toshihiro KAMISHIMA, Shotaro AKAHO

Article type: SIG paper
2006 Volume 2006 Issue DMSM-A601 Pages 03-
Published: July 11, 2006
Released on J-STAGE: August 28, 2021

DOI: https://doi.org/10.11517/jsaisigtwo.2006.DMSM-A601_03

RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

Ordered lists of objects are widely used as representational forms. Such ordered objects include Web search results and best-seller lists. Techniques for processing such ordinal data are being developed, particularly methods for the supervised ordering task: i.e., learning functions used to sort objects from sample orders. In this article, we propose dimension reduction methods specifically designed to improve prediction performance in supervised ordering tasks.

Optimization Approaches for Large-Scale Semi-Supervised Learning

Yasutoshi YAJIMA

Article type: SIG paper
2006 Volume 2006 Issue DMSM-A601 Pages 04-
Published: July 11, 2006
Released on J-STAGE: August 28, 2021

DOI: https://doi.org/10.11517/jsaisigtwo.2006.DMSM-A601_04

RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

We present optimization approaches to semi-supervised classification based on the Support Vector Machine (SVM) formulations for the conventional supervised setting. We first introduce the Laplacian of a graph and the associated graph kernels, which are exploited in many semi-supervised classification methods. We show that these methods can be derived naturally from the conventional SVM formulations with graph kernels. The proposed optimization problems fully exploit the sparse structure of the graph Laplacian, which enables us to solve problems with a large number of data points in a practical amount of computational time. Numerical results indicate that our approaches achieve fairly high performance on large-scale problems.

Semi-structure Mining for Tree Structure with Sequence Node

Issei SATO, Hiroshi NAKAGAWA

Article type: SIG paper
2006 Volume 2006 Issue DMSM-A601 Pages 05-
Published: July 11, 2006
Released on J-STAGE: August 28, 2021

DOI: https://doi.org/10.11517/jsaisigtwo.2006.DMSM-A601_05

RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

In text mining, word frequency is an important element. Moreover, when extracting relationships between words, it is important to extract the dependency structure of a sentence. This paper proposes a semi-structure mining method that extracts frequent words together with their dependency structure from a large amount of text data. Our method represents dependencies as a tree structure whose nodes are sequences. In this way, the proposed method can extract patterns that conventional methods cannot.

Investigation of the method for searching and deciding the order in ARX model

Kenta FUKATA, Takashi WASHIO, Katutoshi YADA, Hiroshi MOTODA

Article type: SIG paper
2006 Volume 2006 Issue DMSM-A601 Pages 06-
Published: July 11, 2006
Released on J-STAGE: August 28, 2021

DOI: https://doi.org/10.11517/jsaisigtwo.2006.DMSM-A601_06

RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

An Auto-Regressive eXogenous input (ARX) model has been widely used in engineering fields to model the dynamic response of a system to exogenous factors. A difficulty in this modeling is determining an appropriate model order for given data. In this paper, we develop a new and practical approach to determining the appropriate order. Moreover, we apply the developed technique to real marketing data and analyze the dynamic response characteristics of sales revenue to advertisement and sales promotion. In marketing studies, the static response of sales to exogenous factors such as advertisement and sales promotion has been analyzed. However, if the dynamic response of sales to exogenous factors can be modeled, more precise sales strategies can be designed.

Mining DAG Patterns from MicroArray Data to Discover Gene Interaction Networks

Alexandre TERMIER, Yoshinori TAMADA, Seiya IMOTO, Takashi WASHIO, Tomo ...

Article type: SIG paper
2006 Volume 2006 Issue DMSM-A601 Pages 07-
Published: July 11, 2006
Released on J-STAGE: August 28, 2021

DOI: https://doi.org/10.11517/jsaisigtwo.2006.DMSM-A601_07

RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

We present a new method to extract frequent patterns from gene networks. The distinguishing feature of this method is that it can extract embedded sub-DAGs from the data, whereas previous methods were limited to extracting induced sub-DAGs. Our algorithm builds upon our Dryade closed frequent embedded attribute sub-tree mining algorithm and, by postprocessing its outputs, discovers closed frequent embedded attribute sub-DAGs with one root in the data. We have tested our method on real gene network data and confirmed the existence of specific embedded sub-DAGs that could not be found with previous algorithms limited to extracting induced sub-DAGs.

An experimental study of metadata extraction using named entity algorithm

Keiko SHIMAZU, Isao SAITOH, Tatsuya ARISAWA, Saori YOSHINAGA, and KOIC ...

Article type: SIG paper
2006 Volume 2006 Issue DMSM-A601 Pages 08-
Published: July 11, 2006
Released on J-STAGE: August 28, 2021

DOI: https://doi.org/10.11517/jsaisigtwo.2006.DMSM-A601_08

RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

In this paper, we note the importance of the proper noun extraction techniques developed in the text mining community and apply them to build a sophisticated text retrieval engine. More concretely, we extract proper nouns from the target contents with those techniques and attach them, together with their categories, as metadata to the corresponding documents. Furthermore, we provide selected metadata as additional keywords during the retrieval session to reduce the number of documents retrieved. Finally, we conduct experimental studies to demonstrate the feasibility of our approach to effective content retrieval.

Statistical Learning with Reproducing Kernel Hilbert Spaces

Kenji FUKUMIZU

Article type: SIG paper
2006 Volume 2006 Issue DMSM-A601 Pages 09-
Published: July 11, 2006
Released on J-STAGE: August 28, 2021

DOI: https://doi.org/10.11517/jsaisigtwo.2006.DMSM-A601_09

RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

This paper reviews recent approaches to the "kernel method", viewed as a transformation of data into reproducing kernels.

Nonparametric Bayes and its Application to Data Mining

Naonori UEDA

Article type: SIG paper
2006 Volume 2006 Issue DMSM-A601 Pages 10-
Published: July 11, 2006
Released on J-STAGE: August 28, 2021

DOI: https://doi.org/10.11517/jsaisigtwo.2006.DMSM-A601_10

RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

This lecture reviews the nonparametric Bayesian approach to data partitioning for complex data analysis. Bayesian modeling gives us a principled approach to clustering a set of complex data into an unknown number of disjoint or overlapping groups, each of which can be represented by some simple distribution. Nonparametric Bayes, specifically Dirichlet process mixture (DPM) models, enables us to define distributions over the countably infinite sets that arise in partitioning problems. The Infinite Relational Model (IRM), which is based on DPM, is also presented as a real application of DPM to relational data mining.

[title in Japanese]

[in Japanese]

Article type: SIG paper
2006 Volume 2006 Issue DMSM-A601 Pages c01-
Published: July 11, 2006
Released on J-STAGE: August 28, 2021

DOI: https://doi.org/10.11517/jsaisigtwo.2006.DMSM-A601_c01

RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS
