Abstract
This paper addresses the problem of tokenization and part-of-speech tagging for both segmented and non-segmented languages, and proposes a simple framework that enables efficient and uniform treatment of tokenization for both types of languages. We also report a language-independent morphological analysis system based on the proposed framework, and present running systems for three different languages: English, Japanese, and Chinese.