Customer datasets usually provide only surface-level personal information such as gender, age, and hometown. However, internal personal information can be obtained by analyzing questionnaire responses. In this paper, we propose a visualization method that combines association analysis with correspondence analysis; it can reveal differences in the internal characteristics of six layers defined by gender and age.
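The association-analysis half of such a method can be illustrated with the usual rule measures (support, confidence, lift) for rules of the form "layer → answer". The sketch below is only an illustration under stated assumptions: the records, the layer labels, and the function name `rule_metrics` are hypothetical and are not taken from the paper, which does not specify its measures in this abstract.

```python
# Hypothetical toy records: (gender-age layer, questionnaire answer).
# Neither the labels nor the values come from the paper.
records = [
    ("F-young", "price"), ("F-young", "design"), ("F-young", "design"),
    ("M-young", "price"), ("M-young", "price"), ("M-old", "quality"),
    ("F-old", "quality"), ("F-old", "design"),
]


def rule_metrics(records, layer, answer):
    """Return (support, confidence, lift) for the rule layer -> answer."""
    n = len(records)
    layer_count = sum(1 for l, _ in records if l == layer)
    answer_count = sum(1 for _, a in records if a == answer)
    both = sum(1 for l, a in records if l == layer and a == answer)

    support = both / n
    confidence = both / layer_count if layer_count else 0.0
    # Lift compares the rule's confidence with the answer's overall rate.
    lift = confidence / (answer_count / n) if answer_count else 0.0
    return support, confidence, lift


support, confidence, lift = rule_metrics(records, "F-young", "design")
print(f"support={support:.3f} confidence={confidence:.3f} lift={lift:.3f}")
```

A lift above 1 suggests that the answer is over-represented in that layer. Pairs selected this way could then be arranged in a layer-by-answer table and passed to correspondence analysis for plotting.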
In the presidential address at the annual meeting of the Psychometric Society, held in Banff, Canada, Nishisato (1996) presented his view of quantification theory under the title "Gleaning in the field of dual scaling." He identified a number of problems that were unsolved at that time. Almost twenty years have passed since then. Have those problems been satisfactorily solved? Yes and no. Starting from the current stage of progress, this paper presents his view of the future prospects of quantification theory. Given that the main purpose of quantifying categorical data is to analyze both row variables and column variables on an equal footing, an immediate concern is the current practice of multidimensional joint graphical display. This paper directs attention to the difficulty underlying the joint graphical display of row and column variables, in which the principal coordinates of both sets of variables need to be plotted. The solution rests in the use of a doubled multidimensional space, and the paper recommends a step forward to total information analysis and the use of cluster analysis in place of the highly problematic multidimensional joint graph, an approach that most researchers, under the strong doctrine of "correspondence analysis," have pursued without success. The discussion states that the current practices of symmetric scaling, non-symmetric scaling, and the biplot all fail to satisfy the basic premise of joint graphical display, namely a multidimensional plot of the principal coordinates of both row and column variables. The paper emphasizes the importance of understanding the basic objective of the joint analysis of row and column variables on an equal footing.
We proposed a method for transforming solutions provided by multiple correspondence analysis (MCA) into the form of principal component analysis (PCA), in order to justify the exploratory factor analysis of Likert-type items and to extend it. We began by reformulating MCA as the maximization of the sum of variances of quantified variables, defined as the sum of quantified scores for each categorical variable. Next, we obtained a PCA formulation that yielded the same quantified scores as MCA by orthonormalizing the quantified scores through singular value decomposition of each block of a matrix of quantification weights. Because the quantified scores for each variable are completely indeterminate under orthonormal transformations, we proposed a way of assigning metrics to ordered categories by means of orthonormal polynomials. We also proposed a method for computing a component pattern matrix after rotating a matrix of weights for PCA. The method can be viewed as Harris-Kaiser's independent cluster rotation. Finally, we demonstrated the application of the proposed procedures and interpreted the output using a real data set consisting of university students' responses to Likert-type items asking about experiences of positive and negative emotions in academic situations.
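To make the orthonormal-polynomial step more concrete, the following sketch builds polynomial contrasts for k ordered categories by Gram-Schmidt orthonormalization of the powers of the category codes 1..k. It is a generic construction, not necessarily the authors' procedure: the function name `orthonormal_polynomials`, the default equal weights, and the option of passing category proportions as weights are all assumptions made for illustration.

```python
import math


def orthonormal_polynomials(k, weights=None, degree=None):
    """Polynomial contrasts for categories 1..k, orthonormal under `weights`.

    `weights` defaults to equal weights 1/k (an assumption; observed
    category proportions could be supplied instead).
    Returns a list of vectors for degrees 1..degree, each of length k.
    """
    x = [float(i) for i in range(1, k + 1)]
    w = weights or [1.0 / k] * k
    degree = k - 1 if degree is None else degree

    def inner(u, v):
        # Weighted inner product over the k categories.
        return sum(wi * ui * vi for wi, ui, vi in zip(w, u, v))

    basis = []
    for d in range(degree + 1):
        v = [xi ** d for xi in x]
        # Remove the components along all lower-degree polynomials.
        for b in basis:
            c = inner(v, b)
            v = [vi - c * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(inner(v, v))
        basis.append([vi / norm for vi in v])
    return basis[1:]  # drop the constant (degree-0) term


# Linear and quadratic scores for a five-point Likert item.
linear, quadratic = orthonormal_polynomials(5, degree=2)
print([round(v, 3) for v in linear])
print([round(v, 3) for v in quadratic])
```

With equal weights the linear term is proportional to the centered category codes, so it reproduces the familiar equal-interval scoring of a Likert item, while higher-degree terms capture departures from it.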
A number of studies have attempted to predict future events from information sources on the internet, and most of them use a single source or only a few sources for prediction. One reason for the small number of sources is that it is not necessarily appropriate to assume that multiple information sources have identical features in terms of content, which prevents treating all sources equally. In order to use as wide a variety of internet information sources as possible, it is essential to extract the differential features of the sources and to evaluate how they relate to one another. This paper assumes that news from major Japanese newspaper publishers represents business facts, and that blogs and message boards (hereafter "social media") contain evaluations of those business facts. It then introduces an analytical framework for extracting the differential features of each information source. Specifically, this paper hypothesizes that differential features are added to the original news when social media quote it, and discusses how to extract these features numerically. It also discusses how features can exist at the level of the information sources per se as well as at a more granular taxonomy level. Entropy term-weighting and a Naive Bayes classifier are adopted to categorize content, and singular value decomposition is adopted to extract the differential features. These procedures are applied to data relating to Toyota Motors, and the result of the analysis not only supports the hypothesis but also exemplifies how differential features at the granular taxonomy level represent the company's current characteristics.
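As an illustration of entropy term-weighting, the sketch below computes a global weight per term from its distribution over documents, assuming the common log-entropy form (1 minus the normalized entropy). The abstract does not state which variant the paper uses, so the formula, the function name `entropy_weights`, and the toy counts are assumptions.

```python
import math


def entropy_weights(term_doc_counts):
    """Global entropy weight per term.

    term_doc_counts maps each term to a list of its counts per document.
    A term concentrated in one document gets a weight near 1; a term
    spread evenly over all documents gets a weight near 0.
    """
    weights = {}
    for term, counts in term_doc_counts.items():
        n_docs = len(counts)
        total = sum(counts)
        if total == 0 or n_docs < 2:
            weights[term] = 0.0
            continue
        # Shannon entropy of the term's distribution over documents.
        entropy = -sum(
            (c / total) * math.log(c / total) for c in counts if c > 0
        )
        weights[term] = 1.0 - entropy / math.log(n_docs)
    return weights


# Hypothetical counts over four documents (not data from the paper).
counts = {"recall": [4, 0, 0, 0], "car": [1, 1, 1, 1], "hybrid": [3, 1, 0, 0]}
for term, weight in entropy_weights(counts).items():
    print(f"{term}: {weight:.3f}")
```

Weights of this kind would typically be multiplied into a term-document matrix before classification or singular value decomposition, so that evenly spread terms contribute little to the extracted features.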