Abstract
This paper provides an overview of the considerations involved in the statistical analysis of real language data. Most of the data used in linguistic research are qualitative, and they must be converted into quantitative data in an appropriate manner before they can be analyzed statistically. I focus on three types of qualitative data used in linguistic research: questionnaires, interviews, and linguistic corpora. Statistical processing of natural language is a procedure that transfers the analog world to the digital world, and the judgment and thinking of the researcher always lie behind that process. It is therefore important to remain aware of the starting point for statistical analysis.