Abstract
Unlike balanced corpora, many learner corpora are neither sufficiently large nor representative. Nevertheless, previous analyses of learner corpora have essentially applied the same method used for balanced corpora, so the present study examined the statistical validity of that existing method. The findings show that each learner tends to use linguistic units (e.g., misused expressions, sentences, and morphemes) in a consistent way. Consequently, the independence assumption required for statistical analyses is not satisfied, and in some cases outliers distort the results. Thus, collecting frequencies for each learner and analyzing learners as the units of observation is a valid alternative to the existing method.
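As a minimal sketch of the contrast drawn above, the following Python snippet compares a pooled, corpus-level rate with rates computed per learner. The function names (`per_learner_rates`, `pooled_rate`) and the toy data are hypothetical illustrations, not the study's implementation or results.

```python
from collections import defaultdict
from statistics import median


def per_learner_rates(observations):
    """Return {learner_id: share of that learner's units flagged as misuse}."""
    counts = defaultdict(lambda: [0, 0])  # learner_id -> [misuse count, total units]
    for learner_id, is_misuse in observations:
        counts[learner_id][0] += int(is_misuse)
        counts[learner_id][1] += 1
    return {lid: misuse / total for lid, (misuse, total) in counts.items()}


def pooled_rate(observations):
    """Treat every linguistic unit as an independent observation (corpus-level rate)."""
    flags = [int(is_misuse) for _, is_misuse in observations]
    return sum(flags) / len(flags)


# Invented toy data (an assumption for illustration only): learners A and B
# each produce 10 units with 1 misuse; learner C produces 50 units with 40 misuses.
toy = (
    [("A", True)] * 1 + [("A", False)] * 9
    + [("B", True)] * 1 + [("B", False)] * 9
    + [("C", True)] * 40 + [("C", False)] * 10
)

rates = per_learner_rates(toy)
print("pooled rate:", pooled_rate(toy))
print("per-learner rates:", rates)
print("median per-learner rate:", median(rates.values()))
```

In this invented example, the pooled rate is 0.6 because one prolific learner contributes most of the units, whereas the median per-learner rate is 0.1. Treating learners as the observation units keeps a single outlying learner from dominating the summary.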