When we analyze spatial data, we generally conduct a global spatial analysis. However, when spatial correlations are higher in local areas than globally, we expect local prediction to outperform global prediction. The present paper examines whether local prediction outperforms global prediction when correlation is high in a local area, based on the AMSE (average mean squared error) statistic. To show the usefulness of the proposed method, we perform a small simulation study and present an empirical example with real transaction data for apartments in Korea.
This paper focuses on the role of the "Like" button on "Facebook Pages" and proposes an analysis approach for increasing the number of "Likes" on a "Facebook Page." The paper uses latent class models to analyze the relationship between the number of "Likes" and the contents of "Posts". After the latent class analysis, the average number of "Likes" is compared across the class groupings. The data taken from Facebook focus on the case of one company, Satisfaction Guaranteed. The results show how the proposed approach, which measures patterns in the independent variables through latent class analysis, accounts for increases in the number of "Likes" on the "Facebook Page" of Satisfaction Guaranteed.
In predicting prices at auctions, we often use linear regression methods in which the objective variable is the price. To estimate the price, we apply regularization methods such as the ridge, the lasso, and their relatives. In used car auctions, these methods provide very similar accuracy in terms of the RMSE, the root mean squared error. However, we have found that accuracy improves when we apply k-nearest-neighbor (k-NN) regression, with variables selected via the linear regression methods, to this kind of auction.
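As a minimal sketch of the k-NN regression step described above: a plain NumPy implementation, not the authors' code, where the toy data and the assumption that lasso-based variable selection has already reduced the features are hypothetical.

```python
import numpy as np

def knn_predict(X_train, y_train, x_new, k=3):
    """Predict by averaging the targets of the k nearest training points."""
    dists = np.linalg.norm(X_train - x_new, axis=1)  # Euclidean distances
    nearest = np.argsort(dists)[:k]                  # indices of the k closest rows
    return y_train[nearest].mean()

# Toy data: price depends on a single (hypothetically pre-selected) feature.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
y = np.array([10.0, 20.0, 30.0, 40.0, 50.0])
print(knn_predict(X, y, np.array([3.1]), k=3))  # averages y at x = 2, 3, 4 -> 30.0
```

In practice one would tune k and compare the RMSE of this predictor against the regularized regressions, as the abstract describes.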
Social network analysis is a statistical method that analyzes social structure according to the flow of information between observations. In this study, we used pass data between players in a soccer game. The analysis addresses the following questions: (1) who is the team leader, and how large a role do they play, and (2) which players play an important role in the game, identified by making many passes or passing among many players. The purpose of this study is to generate baseline data for the team's future play strategy by evaluating the role of each player within the team. We conducted the social network analysis both without separating positions and separately by position (defenders and non-defenders). The results are as follows. First, according to the available data, the players who performed the role of leader were Jungwoo Kim, Sungyueng Ki, and Chungyong Lee, and the sub-leader was Jeongsu Lee. By position, among defenders the leader was Jeongsu Lee, while among non-defenders every player performed so well that each could be considered a leader.
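A pass network of the kind described above can be summarized by simple degree centrality. This is an illustrative sketch with a hypothetical 3-player pass matrix, not the study's actual data or method.

```python
import numpy as np

# Hypothetical pass counts: entry (i, j) is the number of passes
# from player i to player j.
passes = np.array([[0, 5, 2],
                   [3, 0, 4],
                   [1, 2, 0]])

out_degree = passes.sum(axis=1)  # passes made by each player
in_degree = passes.sum(axis=0)   # passes received by each player
total = out_degree + in_degree   # overall involvement in the passing network
print(total)                     # the largest value marks the most central player
```

More refined leader measures (e.g. betweenness or eigenvector centrality) follow the same idea of ranking players by their position in the pass network.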
A candlestick chart is generally used by investors when analyzing portfolios. It consists of daily or weekly opening, high, low, and closing prices, and investors use candlestick charts on the basis of industrial categories. However, since such industrial categories are not based on share prices, the price movement of one candlestick chart can at times differ from that of another even when both charts belong to the same industrial category. Therefore, categories should also be created according to share prices. One study that classifies brands by share price proposes to use the closing price (Wittman, 2002). However, a method that uses only the closing price lacks other trade information. Thus, in this study, we propose a brand classification method that uses all four pieces of trading information: the opening, high, low, and closing prices. As an example, we evaluate similarity using artificial data.
This study introduces a new type of symbolic data, namely candle chart valued time series, and presents new approaches to forecasting the direction of a stock index (i.e., up or down) based on the forecast candle chart form. Building on approaches for interval valued time series, we propose forecasting methods for candle chart valued time series based on a combination of two mid-points and two half-ranges: between the highest and lowest index, and between the opening and closing index. We also propose a new sum of squares for candle chart valued time series. To evaluate the proposed methods, we report forecasting results for a real data set consisting of the stock market indexes of five major Asian countries. The forecasting results show that the new approaches and the sum of squares based on the interval valued time series approach outperform the others in forecasting candle charts.
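The mid-point/half-range representation of a candle described above can be sketched as follows; this is an assumed reading of the two intervals (high-low and open-close), not the authors' exact definition.

```python
def candle_features(open_, high, low, close):
    """Represent one candle by two mid-points and two half-ranges:
    one pair for the high-low interval, one for the open-close interval."""
    mid_hl = (high + low) / 2.0
    half_hl = (high - low) / 2.0
    mid_oc = (open_ + close) / 2.0
    half_oc = abs(close - open_) / 2.0
    return mid_hl, half_hl, mid_oc, half_oc

# One hypothetical candle: open 100, high 110, low 95, close 105.
print(candle_features(100.0, 110.0, 95.0, 105.0))  # (102.5, 7.5, 102.5, 2.5)
```

Each candle in the series then becomes a four-dimensional point, to which interval-valued forecasting methods can be applied component-wise.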
A method for simultaneously performing exploratory factor analysis and k-means clustering is proposed. This is achieved as an extension of the factor analysis model in which both common and specific factors are fixed. Our strategy avoids the tandem analysis problem, which makes it impossible to interpret the effect of the cluster structure. An efficient alternating least squares algorithm is developed. To illustrate its usefulness, some numerical analyses are conducted.
The purpose of classification for metabolomics data is to find a subset of metabolites, called marker candidates, that separates the groups efficiently as well as discriminating between them. We evaluate and compare five classification methods on 26 real datasets and provide guidelines for finding marker candidates with an appropriate classification method. Although this study shows that the predictive accuracies of the five methods are sufficiently high (more than 90%) in 19 of the 26 datasets, PLSDA and SDA outperform the other methods in terms of classification accuracy and metabolite selection.
A search system for VOD lectures is more useful if it goes beyond searching text alone. To facilitate better searching for movie segments of VOD lectures with Japanese subtitles, we propose a method that uses the subtitles and solves a maximum likelihood detection problem for a mixture of normal distributions. The detection is performed statistically using the EM algorithm, which allows us to estimate the parameters of each normal distribution and the number of components. In addition, to provide movie segment rankings, we rank each normal distribution in a mixture of normals approximating the frequency distribution of a search word. Rankings are computed from the distance between the full mixture and the mixture obtained by removing one normal distribution from it.
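The EM fitting step mentioned above can be sketched for the one-dimensional, two-component case; this is a generic textbook EM implementation under hypothetical data, not the paper's system.

```python
import numpy as np

def em_gmm_1d(x, n_iter=50):
    """Fit a two-component 1-D Gaussian mixture by the EM algorithm."""
    mu = np.array([x.min(), x.max()])   # spread the initial means apart
    var = np.full(2, x.var())
    w = np.full(2, 0.5)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point
        dens = w * np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances
        nk = resp.sum(axis=0)
        w = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return mu, var, w

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(10.0, 1.0, 200)])
mu, var, w = em_gmm_1d(x)
print(np.sort(mu).round(2))  # means recover the two cluster centers
```

For word-frequency data along the lecture timeline, each fitted component would correspond to a candidate movie segment.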
I will present three topics from my research interests: randomization, quantification, and visualization. First, I report the lack of randomness in the shuffling of HwaTu or Hanafuda cards (Huh and Lee, 2010). Second, I describe a multidimensional scaling procedure for asymmetric distance matrices (Huh and Lee, 2011). Lastly, nonparametric classifiers produced by the support vector machine are visualized in reduced dimensions (Huh and Park, 2010).
In this study we use spatio-temporal small area data on suicide in Japan. In particular, we focus on the municipality, a political unit such as a city, ward, town, or village incorporated for local self-government. We used line charts of time series of suicide rates and choropleth maps of suicide rates to detect temporal trends and spatial transitions. Furthermore, in order to reduce the difficulties of parameter selection and of detecting connections between the two graphs, we developed a system to visualize the spatio-temporal small area suicide data for Japan.
It has been found that male mice emit ultrasonic vocalizations (USVs) towards females during male-female interaction. The purpose of this paper is to classify the waveforms of mouse USV data. The data are transformed by the FFT (Fast Fourier Transform). Because the USV data are very noisy, it is impossible to analyze them with existing software. We first smooth the USV waveforms by a moving average method and then fit them with polynomial regression. After that, we classify the obtained USV curves by a functional clustering method. This analysis can also help us to find rules (or a grammar) of USV communication between mice.
The Center for Statistics and Information at Rikkyo University has developed an e-learning course on multivariate analysis. The course is designed for students in arts departments and has two features. First, its contents are based on examples of real data analysis rather than on mathematical aspects. Second, the course provides interactive materials for learning multivariate analysis. These two features enable students to learn multivariate analysis without struggling with the mathematics.
The study of human gait is important in biometrics and in sports/health management for planning optimal training. Gait analysis is mainly based on motion capture systems and video data. However, from the standpoint of gait recognition, motion capture is impractical for biometrics, and a video camera based approach is more realistic. Yet a video camera is highly visible in a monitoring environment, and if subjects notice the camera system they may change their behavior. In this study, we therefore focus on Doppler sensor based gait recognition. The purpose of this study is human gait modeling and parameter estimation based on a Doppler sensor system.
In recent years, interest in monitoring systems for the elderly has been increasing because of the aging society. Non-contact sensors have attracted attention because such systems must not interfere with the user's daily life. Many sensors (e.g., infrared, sound, and Doppler sensors) have been used in such systems; in particular, a microwave Doppler sensor is more robust to noise, light, and temperature than other sensors. In this paper, with a view to monitoring the elderly, we focus on the detection of heartbeat and respiration, since these ultimately allow a judgment of whether a person is alive. As an initial stage of the system, this paper proposes a method for detecting the respiration and heartbeat components in a low-disturbance environment using a microwave Doppler sensor.
Recently, human body modeling and human pose modeling have become hot topics in many studies. Several statistical methods have been proposed for biometric analysis and computer graphics, but few for apparel. In this study, we propose a statistical method that reconstructs human body shapes from various semantic values, e.g., height, waist girth, and chest girth, and takes the dispersion of human body shapes into account using principal component analysis and a regression model.
Selecting a landfill site is an important component of the waste management process. Inappropriate selection of a site can engender environmental damage, economic inefficiency, and social and political conflict. These concerns indicate that environmental, economic, and social factors should be considered simultaneously when selecting landfill sites. Landfill site selection is a complex, multicriteria decision making process that requires the evaluation of several factors and takes many different attributes into account. The purpose of this study is to examine a decision making process for site selection. First, we identified potential sites through preliminary screening based on exclusionary criteria. Second, data layers were created by collecting data and estimating the spatial distribution of environmental, economic, and social factors. Finally, after evaluating them against siting criteria, the data layers were combined by a fuzzy gamma operator to select candidate sites. The fuzzy analytic hierarchy process (FAHP) was also used to make pairwise comparisons and assign weights to the decision criteria.
When a ship moves across the sea from point A to point B, its direction of motion is affected by the tidal current vector. Dynamic programming is usually used to find the optimal headings for the ship, but it requires many tidal vector data points between A and B and solves the problem sequentially. Here we use an affine transformation to transform the tidal vectors at points A and B into position coordinates, and use a differential equation to sequentially solve for the optimal headings of the ship.
We consider the over-constrained airport gate assignment problem, where the number of flights exceeds the number of available gates and the objectives are to minimize the number of ungated flights and the total walking distance or connection times. We use a greedy algorithm to solve the problem and compare it with other scheduling methods. Actual and forecasted data are simulated in the experiment. The greedy algorithm minimizes ungated flights while providing initial feasible solutions that allow flexibility in seeking good solutions.
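A minimal greedy assignment of the kind described can be sketched as follows; the first-free-gate rule and the toy flight times are illustrative assumptions, not the paper's exact algorithm.

```python
def greedy_gate_assignment(flights, n_gates):
    """Assign flights (arrival, departure) to gates greedily:
    process flights by arrival time and give each the first gate
    that is free at its arrival; otherwise the flight stays ungated."""
    gate_free = [0.0] * n_gates          # time at which each gate becomes free
    assignment, ungated = {}, []
    for i, (arr, dep) in sorted(enumerate(flights), key=lambda t: t[1][0]):
        for g in range(n_gates):
            if gate_free[g] <= arr:
                gate_free[g] = dep       # gate g is occupied until departure
                assignment[i] = g
                break
        else:
            ungated.append(i)            # no gate was free: flight is ungated
    return assignment, ungated

# Four hypothetical flights as (arrival, departure), two gates.
flights = [(0, 2), (1, 3), (2, 4), (2.5, 5)]
assignment, ungated = greedy_gate_assignment(flights, n_gates=2)
print(assignment, ungated)  # flight 3 finds no free gate
```

A walking-distance objective would refine the inner loop to pick the best free gate rather than the first one.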
This study was undertaken to verify the effects of GGH(1) on obesity using high fat diet induced male mice. Eight-week-old C57BL/6N mice were used for all experiments. Standard chow diet fed mice were used as the lean control, and high fat diet induced obese mice were randomly divided into 4 groups: obese control, GGH(1)-125 mg/kg, GGH(1)-250 mg/kg, and GGH(1)-500 mg/kg. After the mice were treated by oral administration for 8 weeks, body weight, feeding efficiency ratio, plasma triglyceride level, and visceral adipose tissue weights were measured. Compared with obese controls, mice treated with GGH(1) at 125, 250, or 500 mg/kg had significantly lower body weight gain and feeding efficiency ratio. Consistent with the effects on body weight gain, all three doses decreased the weights of visceral adipose tissues and significantly decreased plasma triglyceride levels. Consistent with the effects on feeding efficiency ratio, all three doses decreased plasma leptin concentrations. Plasma AST and ALT were within the physiological range, and organs did not differ following GGH(1) treatment compared with obese controls, indicating that GGH(1) has no toxic effects on the liver. These results suggest that GGH(1) reduces obesity by regulating appetite and visceral lipid metabolism in C57BL/6N mice. Of the three GGH(1) doses, 500 mg/kg appears most effective in improving obesity and visceral lipid disorders.
In this paper, we analyze the report "Living activities area in Okayama Prefecture" at the town level. In the analysis, we evaluate the importance of distance to consumers using the distance decay parameter of the Huff model. We not only confirm the well known trend that consumers weigh distance heavily for convenience goods such as groceries and lightly for leisure, but also find a recent tendency to weigh distance heavily for all products. Furthermore, interesting differences were observed between urban and rural areas, and between households with and without cars.
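The Huff model referenced above gives patronage probabilities proportional to attractiveness divided by distance raised to the decay parameter. A minimal sketch with hypothetical store data:

```python
import numpy as np

def huff_probabilities(attractiveness, distances, decay):
    """Huff model: probability a consumer patronizes each store,
    proportional to attractiveness / distance**decay."""
    utility = attractiveness / distances ** decay
    return utility / utility.sum()

A = np.array([100.0, 100.0])   # two equally attractive stores
d = np.array([1.0, 2.0])       # the second store is twice as far
print(huff_probabilities(A, d, decay=2.0))  # probabilities 0.8 and 0.2
```

A larger decay parameter shifts probability toward nearby stores, which is how "importance of distance" is read off the fitted model.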
The drug-approval process in Japan lags far behind, and the number of approved drugs is far smaller than in other countries. This lag has made it difficult for Japanese doctors to keep up with global standards for the treatment of various diseases, a problem that should be solved immediately. To overcome the drug lag and have new drugs approved without delay in Japan, it is essential to join multi-national clinical trials and apply for approval using their results. Therefore, several university hospitals with excellent performance in clinical trials have established the UHCT Alliance to improve the trial environment, especially for multi-national trials, and to implement them more efficiently and safely.
In Nippon Professional Baseball, it is important for players to receive useful information. It has been reported that pitchers' records were better this year than last year; on the other hand, batters' records have worsened overall over the same period. Therefore, we created a pitching model using a multinomial logit model to provide useful information to batters. First, we justify the use of the multinomial logit model and explain the relevant terms used in it. Second, we define a pitching prediction model employing the multinomial logit model and the variables used in this paper, after which we describe the applied data. Finally, we present our conclusions.
In recent years, studies using football data have proliferated. Most existing studies focus on match results; in contrast, few have considered the events recorded successively during matches. Therefore, in this paper we characterize and compare football clubs using such information. We trace the ball's movement among players or field areas up to important events such as shots, and define these sequences collectively as attack patterns. We analyze the attack patterns using social network analysis, building digraphs whose nodes are players or areas, in order to characterize and compare clubs.
In various sports such as baseball, football, and volleyball, league systems are organized with the same teams, and the teams compete against each other every season. In this article, we consider modeling the winning percentage using a state space model. We apply the models to data from the Central League of Nippon Professional Baseball (NPB) for the period 1950-2004.
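One common state space specification for a slowly drifting winning percentage is the local level model, filtered with the Kalman recursions. This is an illustrative sketch with hypothetical noise variances and data, not the article's specific model.

```python
import numpy as np

def local_level_filter(y, sigma_obs=0.05, sigma_state=0.02):
    """Kalman filter for a local level model:
    y_t = mu_t + obs noise,  mu_t = mu_{t-1} + state noise."""
    mu, P = y[0], 1.0                 # initial state and its variance
    filtered = []
    for obs in y:
        P = P + sigma_state ** 2      # predict: state variance grows
        K = P / (P + sigma_obs ** 2)  # Kalman gain
        mu = mu + K * (obs - mu)      # update with the new observation
        P = (1 - K) * P
        filtered.append(mu)
    return np.array(filtered)

# Hypothetical season-by-season winning percentages.
y = np.array([0.55, 0.60, 0.52, 0.48, 0.58])
print(local_level_filter(y).round(3))
```

The filtered series smooths season-to-season noise, separating a team's underlying strength from year-by-year fluctuation.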
In sparse regression modeling via regularization such as the lasso, elastic net and bridge regression, it is important to select appropriate values of tuning parameters including regularization parameters. The choice of tuning parameters can be viewed as a model selection and evaluation problem. Mallows' C_p type criterion may be used to choose the tuning parameters, for which the concept of degrees of freedom plays a key role. In the present paper, we propose an efficient algorithm which computes the degrees of freedom sequentially by extending the generalized path seeking algorithm. Monte Carlo simulations demonstrate that our methodology performs well in various situations.
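The C_p-type criterion above trades off residual error against degrees of freedom; for the lasso, a standard choice of df is the number of nonzero coefficients. A minimal sketch with hypothetical fitted values:

```python
import numpy as np

def cp_criterion(y, y_hat, df, sigma2):
    """Mallows' C_p-type criterion: RSS / sigma^2 - n + 2 * df."""
    rss = np.sum((y - y_hat) ** 2)
    return rss / sigma2 - len(y) + 2 * df

y = np.array([1.0, 2.0, 3.0, 4.0])
y_hat = np.array([1.1, 1.9, 3.2, 3.8])     # hypothetical lasso fit
beta = np.array([0.7, 0.0, -1.2])          # lasso estimate with one zero
df = np.count_nonzero(beta)                # degrees of freedom = 2
print(cp_criterion(y, y_hat, df, sigma2=0.04))
```

Minimizing this criterion over a path of regularization parameters selects the tuning parameter, which is why computing df efficiently along the path matters.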
Sensitivity analysis based on influence functions has been widely studied in the field of statistics. In particular, this evaluation approach has been applied to statistical methods such as principal component analysis, correspondence analysis, and linear discriminant analysis. However, its study for discriminant methods in pattern recognition is less advanced. Against this background, we previously focused on the subspace method, a discriminant method in pattern recognition, and proposed a method for evaluating the influence of training samples on the result of analysis using influence functions. However, the performance and effectiveness of our method were not well illustrated. In this study, we apply our single-case diagnostics to a representative subspace method and show good results. Specifically, in situations with mislabeled samples in the training data, we were able to detect such samples using our approach and then delete them from the training data to enhance the performance of the target classifier.
This paper discusses a symbolic clustering method for distribution valued dissimilarities. Symbolic Data Analysis (SDA) is an approach to data analysis proposed by Diday in the 1980s; a clustering method for symbolic data is called "symbolic clustering". There are many studies, including hierarchical clustering by Bock (2001) and Chavent & Lechevallier (2002), but few deal with distribution valued dissimilarities. This paper proposes a new method for symbolic clustering using distribution valued dissimilarities.
Recently, studies that recognize human activity from acceleration and angular velocity sensors have been actively pursued. Applications of these studies extend to medical, sports, security, and various other fields. For recognizing human activity, the support vector machine (SVM) is considered one of the best learning machines currently known, because it offers high recognition accuracy and low computation time. However, SVM has difficulty handling outliers and missing values. Therefore, we focus on the Conditional Random Field (CRF), which recognizes activity while maximizing the likelihood over an interval. CRF has often been used in fields such as natural language processing, where the data are limited to one-dimensional, categorical data. In this paper, we propose a method that transforms multidimensional time series data into data that can be analyzed by a CRF, and we evaluate feature selection.
In this study we discuss Tamhane and Logan's (2002) multivariate one-sided test for comparing two normal mean vectors under the assumption that the common covariance matrix is unknown. Although they specified a statistic for the test, it is difficult to derive its distribution, so they derived its asymptotic distribution under the null hypothesis using a moment matching method. Although the critical value satisfies a specified significance level approximately, the closeness of the approximation does not seem to have been investigated in detail. In this study we give numerical examples of the actual Type I error in various cases for this test in order to investigate the closeness of the approximation.
The problem of classifying a new observation vector into one of two known groups, distributed as multivariate normal with a common covariance matrix, is considered. In this paper, we handle the situation in which the dimension, p, of the observation vectors is less than the total number, N, of observation vectors from the two groups, but both p and N tend to infinity at the same rate. Since the inverse of the sample covariance matrix is close to ill-conditioned in this situation, it may be better to replace it with the inverse of a ridge-type estimator of the covariance matrix in linear discriminant analysis (LDA). The resulting rule is called ridge-type linear discriminant analysis (RLDA). The second-order expansion of the expected probability of misclassification (EPMC) for RLDA was derived by Kubokawa, Hyodo and Srivastava (2011), who also gave a second-order unbiased estimator of the EPMC. In this study, the estimation accuracy of the second-order unbiased estimator of the EPMC is investigated by Monte Carlo simulation.
We consider multiple comparisons among mean vectors for high-dimensional data under the multivariate normality. The statistic based on Dempster trace criterion is given, and also its approximate upper percentile is derived by using Bonferroni's inequality. Finally, the accuracy of its approximate value is evaluated by Monte Carlo simulation.
We consider a two-sample test for the mean vectors of high-dimensional data when the dimension is large compared to the sample size. In this talk, we discuss the multivariate Behrens-Fisher problem, that is, we assume that the variance-covariance matrices are not homogeneous across groups. For this situation, we propose a Dempster-type test statistic. We also derive the asymptotic null distribution and an asymptotic expansion for the upper percentiles of this statistic when both the sample size and the dimension tend to infinity. Finally, we evaluate the accuracy of the approximation by Monte Carlo simulation.
The lasso is a simultaneous variable selection and parameter estimation procedure for linear regression models. The estimates can be interpreted as a Bayesian posterior mode when independent Laplace prior distributions are placed on the regression coefficients. Park and Casella (2008) extended the Bayesian lasso linear regression model by placing prior distributions on the hyperparameters of the independent Laplace distributions. It should be noted, however, that the point estimate of the Bayesian lasso is not sparse. In the present paper, we propose an efficient algorithm that modifies the Bayesian lasso estimates so as to be sparse. Monte Carlo simulations are conducted to investigate the efficiency of the proposed algorithm.
Data with a hierarchical structure are observed in many fields, such as sociology, psychology, and clinical trials. Hierarchical Generalized Linear Models (HGLMs) are applied to such data to carry out analyses that take the data structure into account. Likelihood (and approximate likelihood) approaches based on asymptotic theory are the most widely used in current hierarchical analyses; a Bayesian approach is one alternative. As is well known, the Bayesian approach is quite robust even when the data size is small. The purpose of this research is to compare Bayesian and likelihood-based approaches for fitting hierarchical generalized linear models.
From the precision medicine point of view, it is an interesting problem to search for subsets with a large treatment difference between test drugs and placebo based on patient background information. Many methods, such as classification and regression trees (CART) and the active region finder method (ARF), can be used to find subsets that influence the response variable. However, these methods evaluate only the influence on the response variable and do not consider the treatment difference. Therefore, it is necessary to develop methods that find subsets based on treatment difference information. In addition, there is the common difficulty of the curse of dimensionality when a subset is identified in a high dimensional explanatory variable space. In this paper, we propose two methods. One is a revised ARF that searches for subsets by measuring the treatment difference directly. The other combines ARF with relative projection pursuit (RPP) to find the subset with the largest treatment difference in a one-dimensional space reduced from the raw high dimensional space. Analyses of simulated data show that our methods can detect the subset with the largest treatment difference as designed.
ANP (Analytic Network Process) was developed from AHP (Analytic Hierarchy Process) to handle network structure. ANP has been used in the domain of decision making and is useful for solving problems with network structure or dependency between elements. An eigenvector of a pairwise comparison matrix is often employed as an element of a super matrix in ANP. We propose a sensitivity analysis for a pairwise comparison matrix because data often lose reliability; in other words, a comparison matrix does not always have sufficient consistency. In this case, a fuzzy representation of the weights is useful. We propose a fuzzy representation of the components of a super matrix, using the results of the sensitivity analysis. This enables us to obtain the composite weights of ANP as fuzzy numbers when the comparison matrix does not have good consistency.
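The priority weights taken from a pairwise comparison matrix are its principal eigenvector, which power iteration recovers. A minimal sketch with a hypothetical, fully consistent 3x3 matrix:

```python
import numpy as np

def priority_vector(A, n_iter=100):
    """Principal eigenvector of a pairwise comparison matrix,
    normalized to sum to one (the AHP/ANP priority weights)."""
    w = np.ones(A.shape[0]) / A.shape[0]
    for _ in range(n_iter):
        w = A @ w
        w = w / w.sum()
    return w

# Consistent comparison matrix: item 1 is twice item 2 and four times item 3.
A = np.array([[1.0, 2.0, 4.0],
              [0.5, 1.0, 2.0],
              [0.25, 0.5, 1.0]])
print(priority_vector(A).round(4))  # weights proportional to 4 : 2 : 1
```

In ANP these eigenvectors fill the columns of the super matrix; when the comparison matrix is inconsistent, the fuzzy representation proposed in the abstract expresses the uncertainty in these weights.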
We consider a parallel profile model for several groups when the data have two-step monotone missing observations. For two-step monotone missing data, Anderson and Olkin (1985) obtained the MLEs of the mean vector and covariance matrix for the one sample problem. In the same way, the MLEs for the two sample problem have been obtained (see, e.g., Shutoh, Hyodo and Seo (2011)). Profile analysis of several groups was discussed by Srivastava (1987). In this paper, we construct a test statistic for the parallelism hypothesis based on the likelihood ratio with two-step monotone missing data. Finally, in order to investigate the accuracy of the null distribution of the proposed statistic, we perform Monte Carlo simulations for selected parameter values.
The EM algorithm is a parameter estimation method for missing data. Srivastava (1985) derived likelihood equations and a likelihood ratio test without conditions on the missing patterns. Srivastava and Carter (1986) proposed a numerical solution of the likelihood equations by the Newton-Raphson method. We propose an estimator of the kurtosis parameter for missing data, without conditions on the missing patterns, in an elliptical population. In order to evaluate the accuracy of the kurtosis parameter estimator, numerical results from Monte Carlo simulations for selected parameter values are presented. We confirm that it is better to utilize samples that include missing data than to discard them.
In clinical studies, correlated binary response data are frequently collected. Although various methods for the analysis of correlated data have been proposed, their evaluation is insufficient for binary responses under various missing mechanisms. We therefore investigated the performance of six statistical methods (last observation carried forward (LOCF), complete case analysis (CC), conventional generalized estimating equations (GEE), weighted GEE (WGEE), multiple imputation (MI), and generalized linear mixed-effects models (GLMM)) for correlated binary responses with missing data. The continuous variables underlying the binary responses were used to impute missing values in MI and to calculate the weights for WGEE. The evaluation used actual data from a clinical study that compared two antidepressants.
EM algorithms for maximum likelihood factor analysis were proposed by Rubin and Thayer (1982). In this paper, it is proved that their algorithms always produce proper solutions, with positive unique variances and factor correlations whose absolute values do not exceed one.