JSAI Technical Report, Type 2 SIG
Online ISSN : 2436-5556
Volume 2006, Issue DMSM-A601
The 1st SIG-DMSM
  • Hiroki ARIMURA
    Article type: SIG paper
    2006 Volume 2006 Issue DMSM-A601 Pages 01-
    Published: July 11, 2006
    Released on J-STAGE: August 28, 2021
    RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS
  • Yutaka KANO, Masashi MIYAMURA
    Article type: SIG paper
    2006 Volume 2006 Issue DMSM-A601 Pages 02-
    Published: July 11, 2006
    Released on J-STAGE: August 28, 2021
    RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

    We first review statistical methodology for causal inference, and its difficulties, in experimental, quasi-experimental, and observational studies. The review covers path analysis, graphical modeling, Rubin's counterfactual models, and propensity scores. We then examine how these statistical methods relate to causal discovery in data mining in computational science.

    Download PDF (224K)
  • Toshihiro KAMISHIMA, Shotaro AKAHO
    Article type: SIG paper
    2006 Volume 2006 Issue DMSM-A601 Pages 03-
    Published: July 11, 2006
    Released on J-STAGE: August 28, 2021
    RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

    Ordered lists of objects are widely used as representational forms. Such ordered objects include Web search results and best-seller lists. Techniques for processing such ordinal data are being developed, particularly methods for the supervised ordering task: i.e., learning functions used to sort objects from sample orders. In this article, we propose dimension reduction methods specifically designed to improve prediction performance in supervised ordering tasks.

    Download PDF (309K)
  • Yasutoshi YAJIMA
    Article type: SIG paper
    2006 Volume 2006 Issue DMSM-A601 Pages 04-
    Published: July 11, 2006
    Released on J-STAGE: August 28, 2021
    RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

    We present optimization approaches to semi-supervised learning for classification, based on the formulations of Support Vector Machines (SVMs) for the conventional supervised setting. We first introduce the Laplacian of a graph and the associated graph kernels, which are exploited in many semi-supervised classification methods. We show that these methods can be naturally derived from the conventional formulations of SVMs with the graph kernels. The proposed optimization problems fully exploit the sparse structure of the graph Laplacian, which enables us to solve problems with a large number of data points in a practical amount of computational time. Numerical results indicate that our approaches achieve fairly high performance on large-scale problems.

    Download PDF (317K)
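    As a rough illustration of the graph Laplacian mentioned in the abstract above, the following minimal sketch builds the unnormalized Laplacian L = D - W in a sparse form and evaluates the smoothness term f^T L f. The function names, the toy path graph, and the label vector are assumptions made for illustration; they are not taken from the paper, and the sketch does not reproduce its SVM formulations.

```python
from collections import defaultdict


def graph_laplacian(edges):
    """Build the unnormalized Laplacian L = D - W of an undirected graph.

    `edges` is a list of (i, j, weight) triples. Only nonzero entries are
    stored, keyed by (row, column), so the sparsity of the graph is kept.
    """
    L = defaultdict(float)
    for i, j, w in edges:
        L[(i, i)] += w  # degree contribution for node i
        L[(j, j)] += w  # degree contribution for node j
        L[(i, j)] -= w  # off-diagonal entries are minus the edge weight
        L[(j, i)] -= w
    return dict(L)


def quadratic_form(L, f):
    """Compute f^T L f by summing over the stored nonzero entries only."""
    return sum(v * f[i] * f[j] for (i, j), v in L.items())


# Hypothetical toy data: a path graph 0 - 1 - 2 - 3 with unit weights.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
L = graph_laplacian(edges)

# A labeling that changes sign across exactly one edge (between nodes 1 and 2).
f = [1.0, 1.0, -1.0, -1.0]
smoothness = quadratic_form(L, f)
print(smoothness)  # 4.0, i.e. the sum over edges of w * (f_i - f_j)^2
```

    The quadratic form is small when connected nodes receive similar values, which is the property that graph-based semi-supervised methods typically use as a regularizer.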
  • Issei SATO, Hiroshi NAKAGAWA
    Article type: SIG paper
    2006 Volume 2006 Issue DMSM-A601 Pages 05-
    Published: July 11, 2006
    Released on J-STAGE: August 28, 2021
    RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

    In text mining, word frequency is an important element. Moreover, when we extract relationships between words, it is important to extract the dependency structure in a sentence. This paper proposes a semi-structure mining method that extracts frequent words together with their dependency structure from a large amount of text data. Our method represents dependency as a tree structure whose nodes are sequences. In this way, the proposed method can extract patterns that conventional methods cannot.

    Download PDF (359K)
  • Kenta FUKATA, Takashi WASHIO, Katutoshi YADA, Hiroshi MOTODA
    Article type: SIG paper
    2006 Volume 2006 Issue DMSM-A601 Pages 06-
    Published: July 11, 2006
    Released on J-STAGE: August 28, 2021
    RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

    The Auto-Regressive eXogenous input (ARX) model has been widely used in engineering fields to model the dynamic response of a system to exogenous factors. A difficulty in this modeling is the determination of an appropriate model order for given data. In this paper, we develop a new and practical approach to determine the appropriate order. Moreover, we apply the developed technique to real marketing data, and analyze the dynamic response characteristics of sales revenue to advertisement and sales promotion. In marketing studies, the static response of sales to exogenous factors such as advertisement and sales promotion has been analyzed. However, if we can model the dynamic response of sales to exogenous factors, more precise sales strategies can be designed in marketing.

    Download PDF (257K)
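    To make the ARX model in the abstract above concrete, here is a minimal sketch that simulates a first-order model y[t] = a*y[t-1] + b*u[t-1] and recovers a and b by ordinary least squares. The fixed order (1, 1), the coefficient values, and the input series are assumptions chosen for illustration; the sketch does not implement the order-determination approach proposed in the paper.

```python
def simulate_arx(a, b, u, y0=0.0):
    """Simulate a noise-free ARX(1, 1) model: y[t] = a*y[t-1] + b*u[t-1]."""
    y = [y0]
    for t in range(1, len(u)):
        y.append(a * y[t - 1] + b * u[t - 1])
    return y


def fit_arx_1_1(y, u):
    """Least-squares estimate of (a, b) via the 2x2 normal equations."""
    ts = range(1, len(y))
    syy = sum(y[t - 1] * y[t - 1] for t in ts)
    syu = sum(y[t - 1] * u[t - 1] for t in ts)
    suu = sum(u[t - 1] * u[t - 1] for t in ts)
    ry = sum(y[t] * y[t - 1] for t in ts)
    ru = sum(y[t] * u[t - 1] for t in ts)
    det = syy * suu - syu * syu  # nonzero when the regressors are not collinear
    a_hat = (ry * suu - ru * syu) / det
    b_hat = (ru * syy - ry * syu) / det
    return a_hat, b_hat


# Hypothetical exogenous input, e.g. promotion intensity per day.
u = [1.0, 0.0, 0.0, 2.0, 0.0, 1.0, 0.0, 0.0, 3.0, 0.0]
y = simulate_arx(a=0.5, b=2.0, u=u)

a_hat, b_hat = fit_arx_1_1(y, u)
print(a_hat, b_hat)  # close to 0.5 and 2.0, since the data are noise-free
```

    With noisy data, fits for several candidate orders would have to be compared, which is where the order-determination problem described in the abstract arises.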
  • Alexandre TERMIER, Yoshinori TAMADA, Seiya IMOTO, Takashi WASHIO, Tomo ...
    Article type: SIG paper
    2006 Volume 2006 Issue DMSM-A601 Pages 07-
    Published: July 11, 2006
    Released on J-STAGE: August 28, 2021
    RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

    In this article, we present a new method to extract frequent patterns from gene networks. The distinguishing feature of this method is its ability to extract embedded sub-DAGs from the data, whereas previous methods were limited to extracting induced sub-DAGs. Our algorithm builds upon our Dryade closed frequent embedded attribute sub-tree mining algorithm and, by postprocessing its outputs, discovers closed frequent embedded attribute sub-DAGs with one root in the data. We have tested our method on real gene network data and confirmed the existence of specific embedded sub-DAGs that could not be found with previous algorithms limited to extracting induced sub-DAGs.

    Download PDF (151K)
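    The difference between induced and embedded patterns in the abstract above can be illustrated with a small sketch: an induced match requires every pattern edge to be a direct edge in the data, while an embedded match only requires an ancestor-descendant path. The gene names and helper functions are hypothetical, and pattern nodes are assumed to map to data nodes with the same name; this is not the Dryade algorithm or the authors' mining procedure.

```python
def has_edge(dag, u, v):
    """True if the data DAG contains the direct edge u -> v."""
    return v in dag.get(u, ())


def has_path(dag, u, v):
    """True if v is reachable from u through one or more edges."""
    stack = list(dag.get(u, ()))
    seen = set()
    while stack:
        node = stack.pop()
        if node == v:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(dag.get(node, ()))
    return False


def matches(pattern_edges, dag, relation):
    """Check every pattern edge against the data under the given relation."""
    return all(relation(dag, u, v) for u, v in pattern_edges)


# Hypothetical data network: g1 regulates g2, which regulates g3.
data = {"g1": ["g2"], "g2": ["g3"]}
# Pattern that skips the intermediate gene.
pattern = [("g1", "g3")]

is_induced = matches(pattern, data, has_edge)   # False: no direct edge g1 -> g3
is_embedded = matches(pattern, data, has_path)  # True: g3 is a descendant of g1
print(is_induced, is_embedded)
```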
  • Keiko SHIMAZU, Isao SAITOH, Tatsuya ARISAWA, Saori YOSHINAGA, and KOIC ...
    Article type: SIG paper
    2006 Volume 2006 Issue DMSM-A601 Pages 08-
    Published: July 11, 2006
    Released on J-STAGE: August 28, 2021
    RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

    In this paper, we focus on the importance of proper noun extraction techniques developed in the text mining community and apply them to realize a sophisticated text retrieval engine. More concretely, we extract proper nouns from the target contents by applying those techniques and attach them as metadata to the corresponding documents together with their categories. Furthermore, we provide selected metadata as added keywords during the retrieval session to reduce the number of documents retrieved. Finally, we conduct experimental studies to demonstrate the feasibility of our approach for effective content retrieval.

    Download PDF (408K)
  • Kenji FUKUMIZU
    Article type: SIG paper
    2006 Volume 2006 Issue DMSM-A601 Pages 09-
    Published: July 11, 2006
    Released on J-STAGE: August 28, 2021
    RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

    This paper reviews recent approaches to the "kernel method", viewed as a transform of data into reproducing kernels.

    Download PDF (129K)
  • Naonori UEDA
    Article type: SIG paper
    2006 Volume 2006 Issue DMSM-A601 Pages 10-
    Published: July 11, 2006
    Released on J-STAGE: August 28, 2021
    RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS

    This lecture reviews the nonparametric Bayesian approach to data partitioning for complex data analysis. Bayesian modeling gives us a principled approach for clustering a set of complex data into an unknown number of disjoint or overlapping subsets, each of which can be represented by some simple distribution. Nonparametric Bayes, that is, Dirichlet process mixture (DPM) models, enables us to define distributions over the countably infinite sets that arise in partitioning problems. The Infinite Relational Model (IRM), based on DPM, is also presented as a real application of DPM to relational data mining.

    Download PDF (68K)
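    One common way to see how a Dirichlet process yields a partition into an unknown number of groups is the Chinese restaurant process, sketched below. The function name, the concentration parameter value, and the seed are assumptions made for illustration; the sketch samples a partition only and does not implement the DPM or IRM inference discussed in the lecture.

```python
import random


def chinese_restaurant_process(n, alpha, seed=0):
    """Sample a random partition of n items from a Chinese restaurant process.

    Item i (0-based) joins an existing group k with probability
    size_k / (i + alpha) and opens a new group with probability
    alpha / (i + alpha), so the number of groups is not fixed in advance.
    """
    rng = random.Random(seed)
    group_sizes = []
    assignments = []
    for i in range(n):
        r = rng.uniform(0, i + alpha)
        cumulative = 0.0
        for k, size in enumerate(group_sizes):
            cumulative += size
            if r < cumulative:
                group_sizes[k] += 1
                assignments.append(k)
                break
        else:
            # r fell in the last alpha-sized slice: open a new group.
            group_sizes.append(1)
            assignments.append(len(group_sizes) - 1)
    return assignments, group_sizes


assignments, group_sizes = chinese_restaurant_process(n=20, alpha=1.0, seed=0)
print(len(group_sizes), group_sizes)
```

    Larger values of alpha tend to produce more groups; the first item always opens the first group.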
  • [in Japanese]
    Article type: SIG paper
    2006 Volume 2006 Issue DMSM-A601 Pages c01-
    Published: July 11, 2006
    Released on J-STAGE: August 28, 2021
    RESEARCH REPORT / TECHNICAL REPORT FREE ACCESS
    Download PDF (228K)