Journal of Natural Language Processing
Online ISSN : 2185-8314
Print ISSN : 1340-7619
ISSN-L : 1340-7619
Report
An Automatic Occupation and Industry Coding System in Sociology
Kazuko Takahashi, Hirofumi Taki, Shunsuke Tanabe, Li Wei
JOURNAL FREE ACCESS

2017 Volume 24 Issue 1 Pages 135-170

Abstract

In sociology, occupation and industry variables are as important as sex and age variables. For statistical processing, answers collected from open-ended questions in social surveys must be converted into codes, a task that requires considerable time and effort and often produces inconsistencies in large-scale surveys. This work addresses occupation and industry coding. We develop an automatic coding system that combines hand-crafted rules with Support Vector Machines. The system assigns three candidate codes to each answer and estimates the confidence level of the primary predicted code for each of the national and international standard code sets. The system has been released through the website of the Center for Social Research and Data Archives. Users can obtain coding results by uploading a data file in a specified format.

© 2017 The Association for Natural Language Processing