We are conducting feasibility studies for a project on information resource sharing among national research institutes for the humanities in Japan. The project is a collaboration among the National Institute of Japanese Literature, the National Museum of Ethnology, the National Museum of Japanese History, the International Research Center for Japanese Studies, the Historiographical Institute of the University of Tokyo, the Center for Southeast Asian Studies of Kyoto University, the Graduate University for Advanced Studies, and others. Its objective is to develop a system and technologies for seamless access to the individual databases held by these institutes.
We use the Dublin Core Metadata Element Set for the metadata system, the Z39.50 protocol for information retrieval, and HTML for the user interface.
We have photographed all the Buddhist canons held by Kongoji with digital cameras. Although the images are vivid and easy to view, it is hard to find a particular character or phrase in them. In addition, managing tens of thousands of images by file name alone is neither effective nor practical. To overcome these difficulties, we are constructing a database system together with a prototype viewer that offers both viewability and searchability. Searchability is based on existing text files of the canons, since most canons in Kongoji correspond to those of the Taisho Tripitaka, whose text files are available on the Internet. We therefore attempt to recognize the character regions of the image files automatically and to develop interfaces that help relate the location of a character region to the position of the character in the text file.
We constructed a digital archive of the architectural materials of the Seitaro Tsuboi house, which was built in Tokyo in 1923. However, the architectural materials contain many technical terms and a large number of names of people and organizations. Unless the relationships among the original materials are clarified, a digital archive does not make those relationships easy to understand. We therefore developed an ontology of the knowledge described in the architectural documents and created the archive based on that ontology. Ontologies are prominently applied in the industrial, medical, and legal fields, but there are few cases in which they have been applied to historical materials. This paper therefore explains the merits of ontology development in the digital archiving of historical materials.
METS (Metadata Encoding and Transmission Standard) is applied to the construction of a digital library system for the Takagi collection at the Center for Pacific and American Studies, University of Tokyo. The collection was presented to the center by Professor Yasaka Takagi (1889-1984) of the same university, a pioneer of American studies in Japan, for the use of researchers in the same field. The collection contains various types of material and media, for example letters, manuscripts, leaflets, and proceedings. Cataloging of books and periodicals in libraries has become simple and useful thanks to online shared cataloging systems and OPACs, but there is no definitive way of cataloging other types of material. METS is an XML-based standard for encoding "hub" documents for materials whose content is digital. We examine the advantages of METS and analyze its use for cataloging various types of material.
We propose a new method of constructing an information resource sharing system that takes the user's viewpoint into account. The main approach in previous research on sharing different kinds of information resources is to map them onto a controlled element set, such as the Dublin Core metadata, and to search through that set as common access points. However, this approach cannot satisfy users' information needs because the data structure reflects only the system designer's point of view. To resolve this issue, we investigate users' demands and construct a resource sharing system by combining two profiles, one based on users' demands and one based on each database system's requirements.
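The conventional mapping approach described above can be illustrated with a minimal sketch. The database names, local field names, and crosswalk tables below are hypothetical and exist only for illustration; only the Dublin Core element names (title, creator, date) come from the standard. This is not the system proposed in the abstract.

```python
# Hypothetical crosswalks: local field name -> Dublin Core element.
CROSSWALKS = {
    "literature_db": {"shomei": "title", "chosha": "creator", "nendai": "date"},
    "museum_db": {"object_name": "title", "maker": "creator", "period": "date"},
}


def to_dublin_core(source, record):
    """Map one local record onto Dublin Core elements.

    Fields with no entry in the crosswalk are dropped, so the result
    contains only what the designer of the mapping anticipated.
    """
    mapping = CROSSWALKS[source]
    dc = {}
    for field, value in record.items():
        element = mapping.get(field)
        if element is not None:
            dc.setdefault(element, []).append(value)
    return dc


def search(records, element, term):
    """Return the mapped records whose given element contains the term."""
    return [r for r in records if any(term in v for v in r.get(element, []))]


merged = [
    to_dublin_core("literature_db", {"shomei": "Genji monogatari", "kansu": "54"}),
    to_dublin_core("museum_db", {"object_name": "Genji scroll", "maker": "unknown"}),
]
hits = search(merged, "title", "Genji")
```

In this sketch the local field `kansu` has no counterpart in the crosswalk and disappears from the merged view, which is one way a fixed common element set can lose information that a user might want to search.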
Recently, linking systems have been introduced in the area of scholarly information. The link resolver of such a system maintains links between a resource's metadata on one server and its full text on another server. However, the link resolver cannot meet users' information requirements. In this paper, we propose a resource linking system based on semantic information. Our system provides links that connect a data item not only to primary information but also to various related information entities, and achieves a continuous linking process by using the results from external database services.
In Japan, interest in health, home care, and welfare has recently been increasing as we enter a society with more and more elderly people and fewer young citizens. The Ministry of Health, Labor and Welfare designated an instrument for blind and elderly people as a "daily life tool" in April 2003. This instrument converts information printed on paper into speech. Our company developed this instrument, called "SPEECHIO," as well as new software that converts text into a two-dimensional symbol named "SP-CODE." SPEECHIO reads SP-CODE and automatically converts the text into speech. We believe this new barrier-free communication system will greatly help blind people and elderly people who have difficulty reading small print. Continuing from the first report, I describe the spread of printed paper content carrying the two-dimensional SP-CODE.
The growth of unsolicited bulk e-mail (spam) is a serious problem for Internet e-mail. There are many anti-spam tools based on automatic classification by learning, such as Bayesian filters. They depend on the language of the e-mail because they use a lexical analyzer to extract words from it. However, spam is written in various languages, such as English, Japanese, and Chinese. This paper proposes a language-independent method for filtering spam. In this method, e-mails are classified as spam or non-spam by an SVM that uses the frequencies of substrings extracted from the e-mails. This paper also describes the results of testing the method on sample e-mails written in English, Japanese, Chinese, and some other languages, and discusses the results and future work.
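The feature extraction step can be sketched as follows. This is an assumption-laden illustration, not the authors' implementation: the abstract does not say how substrings are chosen, so the sketch simply takes every fixed-length character substring (the function name and the default length `n=3` are hypothetical). Because it slides over raw characters, it needs no word segmentation and treats English and Japanese text the same way. The resulting frequency dictionaries would then be turned into vectors and passed to an SVM, which is not shown here.

```python
from collections import Counter


def substring_frequencies(text, n=3):
    """Return the relative frequency of every length-n character substring.

    No lexical analyzer is involved: the window slides over characters,
    so the same code applies to any language. The choice of fixed-length
    substrings is an assumption made for this sketch.
    """
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    total = len(grams)
    if total == 0:
        return {}
    counts = Counter(grams)
    return {gram: count / total for gram, count in counts.items()}


english = substring_frequencies("free free", n=4)
japanese = substring_frequencies("無料無料", n=2)
```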
In most cases, novels are not assigned subject headings. Consequently, novels have been classified only by period, language, or author, and rarely by subject. In recent years, some studies have tried classifying novels using kansei keywords, but they remained at the experimental level because classification using kansei keywords cannot be applied to a large number of novels. The purpose of this research is to examine methods of classification that use keywords from online book reviews, which are rapidly increasing on the Web. The classification obtained by this method agreed well with the classification based on manually assigned kansei keywords.
Clinical medicine with genomic science is defined as the clinical application of genomic findings. Translational research (TR) is required to narrow the gap between clinical medicine and genomic findings. There are multidisciplinary approaches to understanding the complex biology of whole organisms in their natural environments. These approaches rely on knowledge from the conventional analytical techniques of the life sciences and from bioinformatics and computer science. Sharing all of this knowledge is essential for the multidisciplinary approaches. TR also aims to test, in humans, novel therapeutic strategies developed through experimentation, so ethics is indispensable to the operation of TR. We propose to develop a research environment, namely an intelligent information system, to support TR. In this paper, we discuss a system concept for the intelligent information system from the viewpoints of logic and ethics.
In genomic translational research (genomic-TR), bridging genomic science and clinical medicine is the most important issue. Cross-disciplinary genomic-TR teams also need to share knowledge. Focusing on hematology, we attempted to implement a prototype of e-pathfinder. The prototype provides an integrated knowledge database and represents the knowledge through a GUI (graphical user interface). In this paper, we describe the knowledge information system "e-pathfinder" and discuss future tasks for genomic-TR support systems.
Genomic translational research, which applies genomic findings to clinical science, entails an experimental phase involving human subjects, that is, "experimental care". This phase must be carried out with maximum safety and efficiency, which is essential for the success of translational research. We are developing a protocol management system that supports experimental care. The system helps manage the optimized operation of experimental care by supporting the development of care protocols and compliance with them. This paper presents a basic knowledge representation suitable for the care protocols of translational research.
The importance of web pages as information sources has grown as the number of pages has increased rapidly. Web pages, however, are not considered reliable information sources because they are short-lived: about half of them are deleted or updated within a year. Yet some of the pages considered deleted or updated have in fact been moved to other servers or have only changed their URLs. By analyzing the pages considered missing, we found that about 30 to 40% of them still exist and can be traced and reached. With a program that traces pages using URL structures and keywords in title, center, and selection tags, we located 23.6% of the pages considered deleted.
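One way tracing by URL structure and title keywords might work is sketched below. The abstract does not describe the actual heuristics, so both functions are hypothetical: `parent_urls` walks up the directory structure of a missing URL to find ancestor pages that may still link to the moved page, and `title_overlap` scores a candidate page by how many title words it shares with the missing one. Fetching pages and parsing their tags are left out.

```python
from urllib.parse import urlsplit, urlunsplit


def parent_urls(url):
    """Yield the ancestor directory URLs of `url`, nearest first.

    Hypothetical heuristic: a page that has moved is often still linked
    from an index page higher up in the same site.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    for depth in range(len(segments) - 1, -1, -1):
        path = "/" + "/".join(segments[:depth])
        if depth:
            path += "/"
        yield urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def title_overlap(old_title, new_title):
    """Jaccard similarity between the word sets of two page titles."""
    old = set(old_title.lower().split())
    new = set(new_title.lower().split())
    union = old | new
    return len(old & new) / len(union) if union else 0.0


candidates = list(parent_urls("http://example.org/a/b/page.html"))
score = title_overlap("Lab Home Page", "Home Page of the Lab")
```

Whitespace splitting is used here only for brevity; it would not segment titles in languages written without spaces.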
The analysis of co-authorship in academic papers is a useful approach for objectively evaluating university-industry linkage, which has been strongly promoted in recent years. Bibliometric research of this kind is mostly based on analyses of data extracted from ISI's citation index databases, which are considered insufficient for assessing research activities in Japan. In this study, we discuss the formation of research networks, including university-industry research collaboration, based on the CJP database produced by the National Institute of Informatics.
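The basic counting step behind a co-authorship analysis can be sketched as follows. The input format (one list of author affiliations per paper) and the organization names are hypothetical, and the sketch makes no assumption about how the CJP database is actually structured. Each pair of organizations that appears on the same paper is counted once per paper, which gives the edge weights of a collaboration network.

```python
from collections import Counter
from itertools import combinations


def collaboration_counts(papers):
    """Count how many papers each pair of organizations co-authored.

    `papers` is an iterable of affiliation lists, one list per paper.
    Duplicate affiliations within a paper are counted once.
    """
    counts = Counter()
    for affiliations in papers:
        for pair in combinations(sorted(set(affiliations)), 2):
            counts[pair] += 1
    return counts


# Hypothetical sample data for illustration only.
papers = [
    ["Univ A", "Company X"],
    ["Univ A", "Company X", "Univ B"],
    ["Univ B"],
]
network = collaboration_counts(papers)
```

Restricting the pairs to those that join a university and a company would then isolate the university-industry links discussed in the abstract.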