Abstract
The Internet enables people worldwide to share documents written in various languages. Many of these documents are provided through the WWW, and most of them are marked up with HTML tags. Tags that indicate document elements are very useful for full-text retrieval. The author considers a full-text retrieval system for tagged multilingual documents very important for obtaining useful information. This article describes a multilingual full-text retrieval system for tagged documents. It provides functions to store and retrieve SGML, XML, and HTML documents. The system handles both the ISO-2022-JP-2 and Unicode character code sets for multilingual text. It is developed in Java for portability. This article also discusses the performance issues of the implemented system.