Sociological Theory and Methods
Online ISSN : 1881-6495
Print ISSN : 0913-1442
ISSN-L : 0913-1442
Volume 19, Issue 2
Special Section: Advances in the Analyses of Non-Fixed-Form data
  • Hiroshi TAROHMARU
    2004 Volume 19 Issue 2 Pages 131-133
    Published: September 30, 2004
    Released on J-STAGE: December 22, 2008
    JOURNAL FREE ACCESS
  • Noboru OHSUMI, Akio YASUDA
    2004 Volume 19 Issue 2 Pages 135-159
    Published: September 30, 2004
    Released on J-STAGE: December 22, 2008
    JOURNAL FREE ACCESS
         The objective of this paper is to give an overview of text mining (textual data mining) in Japan from a practical standpoint. First, we explain that text mining is defined as a branch of data mining (in particular, KDD: knowledge discovery and data mining) applied to large text datasets, and that its goal is to objectively discover and extract knowledge, facts, and meaningful relationships from text documents. We also briefly outline the related disciplines and the application fields of text mining.
         In addition, we discuss the applicability of text mining to qualitative research and examine how to solve some problems encountered when using text mining techniques. Computer programs for conducting text mining are summarized in tables.
         As concrete examples, using a data set of open-ended questions obtained from a Web-based survey, we illustrate several analyses: segmentation of Japanese responses to the open-ended questions, visualization of mining results, and statistical significance tests based on the frequencies of characteristic words and the corresponding test-values obtained from the aggregated lexical table of “words by gender-age variable” with 12 categories, generated by cross-tabulating gender (two categories) and age (six categories).
         Finally, we propose a perspective on text mining, namely how to determine which knowledge is needed and how to suitably gather the text data sets required for understanding the target phenomena. From the point of view of data science, the question of how a kind of “acquisition system” for obtaining appropriate data sets can be integrated will have to be examined in the future.
  • Koichi HIGUCHI
    2004 Volume 19 Issue 2 Pages 161-176
    Published: September 30, 2004
    Released on J-STAGE: December 22, 2008
    JOURNAL FREE ACCESS
         Since the first application of content analysis in social research, newspaper articles have been an important subject of such quantitative analysis, and a variety of studies have been conducted. Today it is easier to analyze newspaper articles by computer, because processing power has grown rapidly and many newspaper databases are available. In addition, computer-assisted methods for analyzing textual data such as newspaper articles are frequently proposed. The author therefore examines the two questions below through an experimental analysis of articles that mentioned “salaried men” and were published in The Mainichi Shimbun between 1991 and 2002. First, are there any significant differences between the results derived from a computer-assisted analysis and those derived from a classical analysis that does not make use of an automated process? Second, are the methods proposed for analyzing general “textual data” really suitable for the analysis of newspaper articles? In other words, what are the specific advantages and disadvantages of using computers to analyze newspaper articles as compared with other kinds of textual data?
  • Automatic Occupation Coding Methods
    Kazuko TAKAHASHI, Hiroya TAKAMURA, Manabu OKUMURA
    2004 Volume 19 Issue 2 Pages 177-195
    Published: September 30, 2004
    Released on J-STAGE: December 22, 2008
    JOURNAL FREE ACCESS
         We apply both a rule-based method and a machine learning method to occupation coding, the task of categorizing answers to open-ended questions about the respondent's occupation. Specifically, we use Support Vector Machines (SVMs). Manual occupation coding is expensive and sometimes leads to inconsistent results when the coders are not experts in occupation coding. For this reason, a rule-based automatic method has been developed and used. However, its categorization performance is not satisfactory. We therefore adopt SVMs, which show high performance in various fields, and compare them with the rule-based method. We show empirically that SVMs outperform the rule-based method in occupation coding on the JGSS (Japanese General Social Surveys) data set. Both methods can be extended to responses to open-ended questions similar to occupation data.
  • Toward a Statistical Analysis of Japanese Historical Population Registers
    Hideki NAKAZATO
    2004 Volume 19 Issue 2 Pages 197-212
    Published: September 30, 2004
    Released on J-STAGE: December 22, 2008
    JOURNAL FREE ACCESS
         The purpose of this article is to introduce the effective use of a relational database management system (RDBMS) in statistical analysis of Shumon Aratame Cho (population registers = SAC).
         First, the data structure of SACs is compared with that of the questionnaires for the Third National Survey on Household Changes. Second, I explain the features of the RDBMS and examine some early uses of the system in historical studies, combining data from various sources. The main body of the paper is a detailed explanation of the SQL statements used to obtain the variables necessary for estimating the proportion and transition rates of coresidence of people with their children, using the SAC database. I also suggest the potential of RDBMS for achieving effective use of questionnaires in sociological research.
  • Possibility of Optimal Matching Analysis
    Tsutomu WATANABE
    2004 Volume 19 Issue 2 Pages 213-234
    Published: September 30, 2004
    Released on J-STAGE: December 22, 2008
    JOURNAL FREE ACCESS
         In this paper, I analyze job career data and examine the merits and demerits of Optimal Matching Analysis, a type of sequence analysis that has attracted the interest of many sociologists. In Japan, there are few papers on career patterns other than Hara (1979) and Seiyama (1988). I examine the job career dataset of the 1995 SSM (Social Stratification and Mobility) survey using Optimal Matching Analysis. First, I analyze the ten-year and thirty-year career line datasets and calculate the pairwise distances between career lines. I then classify these career lines using clustering techniques and find that they cluster into six general types in each of the ten-year and thirty-year datasets. Moreover, I examine the relations among the first job, the job after 30 years, education, and career pattern using Qualitative Comparative Analysis (QCA). On the basis of these analyses, I conclude that Optimal Matching Analysis is a new method capable of identifying patterns in job careers.
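         As an illustration of the pairwise distances mentioned in this abstract, the following is a minimal sketch of an optimal matching distance, assuming the common formulation as a weighted edit distance between state sequences. The function names, the cost values (`indel`, `sub`), and the single-letter job codes are hypothetical choices for the example and are not taken from the paper, which may use different costs or state definitions.

```python
from typing import List, Sequence


def om_distance(a: Sequence[str], b: Sequence[str],
                indel: float = 1.0, sub: float = 2.0) -> float:
    """Minimum total cost of turning sequence a into sequence b using
    insertions/deletions (cost `indel`) and substitutions (cost `sub`).
    The cost values are illustrative assumptions."""
    n, m = len(a), len(b)
    # prev[j] = cost of matching the first i-1 states of a to the first j of b
    prev = [j * indel for j in range(m + 1)]
    for i in range(1, n + 1):
        curr = [i * indel] + [0.0] * m
        for j in range(1, m + 1):
            match_cost = 0.0 if a[i - 1] == b[j - 1] else sub
            curr[j] = min(prev[j] + indel,           # delete a[i-1]
                          curr[j - 1] + indel,       # insert b[j-1]
                          prev[j - 1] + match_cost)  # keep or substitute
        prev = curr
    return prev[m]


def pairwise_distances(careers: List[Sequence[str]]) -> List[List[float]]:
    """Symmetric matrix of optimal matching distances between careers."""
    return [[om_distance(x, y) for y in careers] for x in careers]


# Hypothetical yearly job states: M = manual, C = clerical, P = professional
careers = [list("MMMCC"), list("MMCCC"), list("PPPPP")]
distances = pairwise_distances(careers)
```

         A distance matrix of this kind is what a clustering step would take as input; the choice of substitution and indel costs strongly affects the resulting clusters, so the defaults above should be read only as placeholders.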
Articles
  • Multi-dimensional Scaling Analyses of Cognitive Maps of Nations
    Shunsuke TANABE
    2004 Volume 19 Issue 2 Pages 235-249
    Published: September 30, 2004
    Released on J-STAGE: December 22, 2008
    JOURNAL FREE ACCESS
         This paper investigates how Japanese people see other nations and peoples of foreign countries by representing their perceptions on cognitive maps. I collected several types of data from university students: 1. Judged similarity among 26 nations/peoples (pile sort), 2. Preference data on 10 nations/peoples (paired comparisons), and 3. Exposure to foreign nations/people such as experiences of foreign travel and having foreign friends.
         Analysis of the judged similarity data using Multidimensional Scaling showed that the respondents categorized nations/peoples along three dimensions: “Western vs. non-Western (White vs. non-White),” “psychological distance,” and “geography.” Those who had traveled abroad tended to place less weight on the “Western vs. non-Western (White vs. non-White)” dimension in their cognition. Preference data showed that some preferred Asian people, while others favored Western people; however, almost every respondent revealed his/her unfamiliarity with Islamic and African people.
  • An Application of Binary Logistic Regression Model
    Nobuo KANOMATA
    2004 Volume 19 Issue 2 Pages 251-264
    Published: September 30, 2004
    Released on J-STAGE: December 22, 2008
    JOURNAL FREE ACCESS
         This paper presents a method that uses a binary logistic regression model to estimate the degree of unequal opportunity in intergenerational mobility and its changes. In this model, the independent variables are the time variables, a dummy variable indicating that the father's class is the same category as the dependent variable on the son's class, and the interaction variables. The parameter estimated by this model parallels the observed log odds ratio in a mobility table and reflects changes in unequal opportunity caused by time period, birth cohort, and aging. The results of the analysis applied to SSM survey data show that unequal opportunity decreased with aging in three of the six classes and diminished across survey time points in one class, while the other two classes showed no temporal change. The index calculated from the parameters, which measures the total inequality of a society as a whole, suggests that mobility in Japan was equalized drastically from 1955 to 1965, and gradually with the transition of cohorts since 1965.
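         The correspondence between the model parameter and the log odds ratio of a mobility table can be sketched for the simplest case. The example below assumes a single 2x2 table (father in the class or not, son in the class or not) and a logistic model with only the father's-class dummy, for which the slope equals the table's log odds ratio. The counts are invented for illustration and are not SSM data, and the time variables and interaction terms described in the abstract are omitted.

```python
import math


def log_odds_ratio(both: int, father_only: int,
                   son_only: int, neither: int) -> float:
    """Log odds ratio of a 2x2 mobility table.
    both        : father in class, son in class
    father_only : father in class, son not in class
    son_only    : father not in class, son in class
    neither     : father not in class, son not in class"""
    return math.log((both * neither) / (father_only * son_only))


def logit(p: float) -> float:
    """Log odds of a probability."""
    return math.log(p / (1.0 - p))


def dummy_only_logit_slope(both: int, father_only: int,
                           son_only: int, neither: int) -> float:
    """Slope of a logistic regression of 'son in class' on the father's-class
    dummy alone. With one binary predictor the fitted probabilities equal the
    observed proportions, so the slope is a difference of two logits."""
    p_father_in = both / (both + father_only)
    p_father_out = son_only / (son_only + neither)
    return logit(p_father_in) - logit(p_father_out)


# Hypothetical counts, for illustration only
table = dict(both=40, father_only=20, son_only=60, neither=180)
lor = log_odds_ratio(**table)
slope = dummy_only_logit_slope(**table)
```

         In this simplified setting `lor` and `slope` coincide; a larger value indicates that sons of fathers in the class have a stronger advantage in entering it, which is the sense in which the parameter measures unequal opportunity.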
Book Reviews