In this paper, we propose and evaluate a novel method, based on Bayesian theory, for finding mistyped characters in documents. We assume that the characters entered just before a mistype are factors that cause input errors. First, key-input log data is acquired for each user. Next, we extract from the log data only the characters entered just before mistyped characters and analyze them. Finally, using these parameters and Bayes' theorem, we calculate, for each character, the probability that the character typed immediately after it is mistyped. With the cooperation of students, we confirmed that each user has keyboard-input habits that cause mistypes, as our method assumes. We verified the method using documents written by the students themselves. Comparing the characters placed just before mistypes found by visual inspection of a document with the characters assigned high probabilities by our method for that student yielded a match rate of 93% in a realistic case.
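The Bayesian step can be illustrated with a minimal sketch. This is not the authors' implementation: the log format (pairs of a previous character and a flag saying whether the next keystroke was a mistype) and the function name `mistype_probability_by_previous_char` are assumptions made for illustration. For each previous character c, it computes P(mistype | c) = P(c | mistype) P(mistype) / P(c) from simple counts over one user's log.

```python
from collections import Counter


def mistype_probability_by_previous_char(events):
    """Estimate P(next keystroke is a mistype | previous character).

    `events` is a hypothetical log format: an iterable of
    (previous_char, next_was_mistyped) pairs for a single user.
    """
    events = list(events)
    total = len(events)
    if total == 0:
        return {}

    # How often each character appears as the "previous" character.
    prev_counts = Counter(prev for prev, _ in events)
    # How often each character appears just before a mistype.
    prev_counts_given_mistype = Counter(prev for prev, bad in events if bad)
    mistypes = sum(prev_counts_given_mistype.values())
    if mistypes == 0:
        return {c: 0.0 for c in prev_counts}

    p_mistype = mistypes / total  # prior P(mistype)
    result = {}
    for c, n in prev_counts.items():
        likelihood = prev_counts_given_mistype[c] / mistypes  # P(c | mistype)
        evidence = n / total                                   # P(c)
        # Bayes' theorem: P(mistype | c)
        result[c] = likelihood * p_mistype / evidence
    return result


if __name__ == "__main__":
    # Toy log, invented for illustration only.
    log = [("a", True), ("a", False), ("b", False),
           ("b", False), ("a", True), ("c", True)]
    probs = mistype_probability_by_previous_char(log)
    # List characters from most to least likely to precede a mistype.
    for char, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        print(char, round(p, 3))
```

Characters with the highest estimated probability would then be the candidates to compare against the characters that precede mistypes found by inspecting a document.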