Abstract
We propose two methods for recognizing unknown strings in dictionary-based natural language processing systems. The first uses statistical information dynamically during processing; the second extracts meaningful strings that should be added to the dictionary. Both methods rely on statistical information drawn from a training corpus and require no part-of-speech tagging or other preprocessing of that corpus. We applied the methods to a Japanese morphological analysis system and obtained good results in reducing both unknown words and over-segmentation.