Incident report systems are widely used to prevent medical accidents in hospitals. An incident report is a document written by a nurse about an occurrence during working hours that might have led to a medical accident. Analyzing incident reports is expected to help prevent medical accidents. However, incident reports have so far been analyzed only statistically, using their metadata, e.g., time of occurrence, category of occurrence, and skill level of the staff. Although this statistical analysis reveals tendencies and classifications, it discards the most important information, which is written in the text parts. This paper proposes a new method for extracting knowledge from the text parts of incident reports using metadata and co-occurrence information in the text data. The proposed method generates a keyword graph with a hierarchical architecture, and one of its features is that analysts can actively explore what they want to know. The proposed method is applied to actual incident reports, and the results show its effectiveness.
Extracting and visualizing Web users' interests from their log data are challenging research topics in Web usage mining. Users' Web-watching behaviors can be represented as a graph if we regard Web sites and search keywords as nodes and their time sequence as edges. We call this graph a site-keyword graph. This paper describes a method for extracting subgraphs that represent users' interests from a site-keyword graph obtained from Web audience measurement data. The subgraphs are visualized in order to assist manual analysis. Our method succeeds in extracting subgraphs composed of about 30 percent of the nodes of the original site-keyword graph. The quality of the subgraphs is evaluated using the number of top-ranking nodes under the PageRank ranking algorithm.
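As an illustration of the site-keyword graph and the PageRank-based evaluation described above, the following is a minimal sketch rather than the authors' implementation. It assumes each session is a time-ordered list of node labels; the `kw:` and `site:` prefixes, the example sessions, the function names, and the damping factor of 0.85 are all assumptions made for the example. The subgraph extraction step itself is not reproduced here.

```python
def build_site_keyword_graph(sessions):
    """Build a directed graph {node: set(successors)} from time-ordered sessions.

    Each session is a list of labels (Web sites or search keywords) in the
    order the user visited or issued them; consecutive labels become an edge.
    """
    graph = {}
    for session in sessions:
        for node in session:
            graph.setdefault(node, set())
        for src, dst in zip(session, session[1:]):
            if src != dst:
                graph[src].add(dst)
    return graph


def pagerank(graph, damping=0.85, iterations=50):
    """Plain power-iteration PageRank; rank of dangling nodes is spread evenly."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        dangling = sum(rank[v] for v in nodes if not graph[v])
        new_rank = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for src in nodes:
            if graph[src]:
                share = damping * rank[src] / len(graph[src])
                for dst in graph[src]:
                    new_rank[dst] += share
        rank = new_rank
    return rank


def top_nodes(rank, k):
    """Return the k highest-ranked nodes (ties broken by label)."""
    return sorted(rank, key=lambda v: (-rank[v], v))[:k]


# Hypothetical sessions, for illustration only.
sessions = [
    ["kw:weather", "site:news.example", "site:forecast.example"],
    ["kw:rain", "site:forecast.example"],
    ["site:news.example", "site:forecast.example"],
]
graph = build_site_keyword_graph(sessions)
rank = pagerank(graph)
print(top_nodes(rank, 2))
```

One way to use such a ranking, in the spirit of the evaluation above, is to count how many of the top-ranked nodes of the full graph also appear in an extracted subgraph.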
A system is proposed for extracting and visualizing trend information about earthquakes from a tagged corpus distributed by the Workshop on Multimodal Summarization for Trend Information (MuST). The topic of earthquakes involves not only temporal trends, which are the main concern of typical topics such as gas prices and stock price movements, but also spatial trends, including the position of the seismic focus and the spatial distribution of seismic intensities. The proposed system uses a map of Japan to visualize spatial trends, and bar and line graphs to visualize temporal trends. The system also focuses on earthquake swarms, which are visualized by combining the map and bar-graph representations. Furthermore, the system provides an interactive facility so that users can obtain various visualization results through intuitive operations such as clicking on the map. A prototype system is implemented, with which the accuracy of trend information extraction from the tagged corpus is evaluated and its functionality is compared with that of an existing earthquake database system. The merit of extracting earthquake trend information from newspaper articles is also considered.
The goal of our research is to develop a multimodal summarization system that provides trend information composed of multiple modalities (e.g., text and visual expressions) based on user interest. To that end, this paper proposes a method for generating statistical charts from multiple documents. Since source documents often lack a sufficient number of explicit expressions (e.g., “$15”) to draw statistical charts correctly, a system for generating such charts must appropriately compensate for the missing information. To satisfy this requirement, our proposed method uses background knowledge and two types of expressions, comparison expressions (e.g., “increase by 30%”) and qualitative expressions (e.g., “stable” and “increase slightly”), along with explicit expressions. Our preliminary study reveals that (1) the information available for drawing a chart can be at least doubled by using background knowledge and comparison expressions and (2) user comprehension of a generated chart is improved by using qualitative expressions.
In this paper, we discuss the effectiveness of “relative expressions” for trend information extraction. Relative expressions, such as “12% increase”, “last year”, and “the first place”, implicitly indicate relative differences and variations in numerical values. Using relative expressions is one possible way to extract more trend information from newspaper articles. We investigated the functions and statistical tendencies of relative expressions in newspaper articles and, based on this, created rules for extracting the basic elements of trend information. To evaluate the effectiveness of the extraction rules, we conducted an experiment on newspaper articles. The results show an F-measure of 0.8 or more, confirming that extraction based on relative expressions performs well.
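To make the notion of a relative expression concrete, here is a small rule-based sketch. It covers only the three example phrases quoted above; the regular expressions, the category names `change`, `time`, and `rank`, and the sample sentence are assumptions for illustration and are not the extraction rules created in the paper.

```python
import re

# Hypothetical patterns, one per kind of relative expression.
PATTERNS = {
    "change": re.compile(r"\b\d+(?:\.\d+)?\s*%\s+(?:increase|decrease)\b"),
    "time": re.compile(r"\b(?:last|previous|next)\s+(?:year|month|week)\b"),
    "rank": re.compile(r"\bthe\s+(?:first|second|third)\s+place\b"),
}


def extract_relative_expressions(text):
    """Return (kind, phrase) pairs in the order they appear in the text."""
    lowered = text.lower()
    found = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(lowered):
            found.append((match.start(), kind, match.group(0)))
    return [(kind, phrase) for _, kind, phrase in sorted(found)]


sentence = "Sales showed a 12% increase over last year and took the first place."
for kind, phrase in extract_relative_expressions(sentence):
    print(kind, "->", phrase)
```

Real newspaper text would need far broader patterns; the point is only that each rule pairs a surface pattern with the kind of relative information it carries.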
A Web-based online browsing system has already been developed, to which content has been uploaded: administrative examples and related considerations on education, research, management, and other university functions. It can be used for what are called FD (Faculty Development) and SD (Staff Development) activities. If the system comes to hold a large amount of content in the future, users will have to retrieve the items they need from the content list. To make retrieval easier, this research visualizes the topics that each content item deals with by allocating the university functions of education, research, society, and management to the vertices of a triangular pyramid. We propose a method of characterizing the contents by visualizing them with a triangular pyramid, and we evaluate how easy the visualization is to understand through an experiment.
When presenting research results, a researcher needs to prepare presentation slides so that the audience can understand the results within a limited time. However, making slides requires a lot of time and effort, so many researchers wish to prepare slides more efficiently. In this research, we propose a method for automatically generating slides from the LaTeX manuscript of a paper, with the aim of reducing researchers' workload. Our method analyzes the LaTeX file of a paper, allocates the content of the paper to slides, and generates itemized lists. In the analysis of the LaTeX file, our method keeps only the information necessary for slide generation and deletes unnecessary information; it determines which information is necessary by using the typical structure of a LaTeX file. In the allocation of content to slides, our method calculates a score for each noun based on frequency, entropy, and idf, uses these scores to extract important sentences from the paper, and allocates the extracted sentences to the slides. In the generation of itemized lists, our method uses conjunctions that signify a parallel relation. The reason is as follows: the paper may contain a sentence that corresponds to a sentence including such a conjunction. Hence, our method generates an itemized list consisting of the sentence including the conjunction and the sentence corresponding to it. We evaluated our method, and the experimental results showed that it is effective for generating slides from the LaTeX file of a paper.
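The sentence-extraction step described above can be sketched roughly as follows. This is a simplified illustration under several assumptions, not the authors' scoring formula: it scores words by frequency times a smoothed idf computed over sections, omits the entropy term, and replaces noun detection with a small stopword filter. The function names, the stopword list, and the sample sections are hypothetical.

```python
import math
import re
from collections import Counter

# Stand-in for noun detection: drop a few common function words (assumption).
STOPWORDS = {"the", "a", "an", "of", "to", "in", "is", "and", "for", "we", "our", "by", "on"}


def tokenize(sentence):
    """Lowercase a sentence and keep alphabetic words that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS]


def word_scores(sections):
    """Score each word as (frequency in the paper) * (smoothed idf over sections).

    `sections` is a list of sections, each a list of sentences.
    """
    tf = Counter()
    df = Counter()
    for sentences in sections:
        words = [w for s in sentences for w in tokenize(s)]
        tf.update(words)
        df.update(set(words))
    n = len(sections)
    return {w: tf[w] * math.log((n + 1) / df[w]) for w in tf}


def top_sentences(sections, k):
    """Return the k sentences with the highest average word score."""
    scores = word_scores(sections)
    ranked = []
    for sentences in sections:
        for s in sentences:
            words = tokenize(s)
            if words:
                ranked.append((sum(scores[w] for w in words) / len(words), s))
    ranked.sort(key=lambda pair: -pair[0])
    return [s for _, s in ranked[:k]]


# Hypothetical two-section paper, for illustration only.
sections = [
    ["Slides summarize the paper.", "Slides help."],
    ["The method parses the file."],
]
print(top_sentences(sections, 1))
```

Averaging over the words of a sentence keeps long sentences from winning on length alone; a sum or a different normalization would be an equally plausible choice.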
Creative activities now require us to manage huge amounts of information. Since not all of the information is useful, we need to actively search for the useful information among all of it. However, it is difficult to check all of the information carefully without prioritizing what to evaluate. We assumed that users are willing to evaluate information with priorities based on their own preferences. In this paper, we propose a keyword arranging interface to support the user's active thinking. A user arranges keywords in order of preference, and the interface outputs a list in which the keywords are written in that order. We aim to support the user's active thinking by showing this keyword list. Experimental results showed that the interface could output keyword lists that reflected the user's preferences and could support the user's active thinking.
For investors, bankruptcy prediction is more important than changes in stock price, because they suffer a large loss if a company they invest in goes bankrupt. Several bankruptcy prediction methods use discriminant analysis, such as Altman's Z-score. However, the precision of these methods decreases when there is a large difference between the number of bankrupt companies and the number of surviving companies. These methods also have difficulty predicting the black-ink bankruptcies seen in recent years, because they rely on general management indices. This paper describes a new bankruptcy prediction system that remains applicable even in such cases by adopting “Cash Flows”, which should be more important indices than profit, sales, and so on in bankruptcy. First, the system selects cash flow indices by factor analysis in order to capture changes in money. Second, the system builds a self-organizing map (SOM) and predicts bankruptcies by evaluating not the indices of individual companies but the interrelationships between companies' indices on the SOM. Third, prediction errors and accuracies are discussed. Finally, we compare the results of our method with those of discriminant analyses, including Altman's Z-score, and show the usefulness of our method.
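For reference, the baseline named above, Altman's Z-score, can be computed as in the sketch below. It uses the coefficients and cutoffs of Altman's original 1968 model for publicly traded manufacturing firms; whether the paper uses this variant is an assumption, and the function names and the sample figures are hypothetical. The proposed cash-flow and SOM-based system is not reproduced here.

```python
def altman_z_score(working_capital, retained_earnings, ebit,
                   market_value_equity, sales,
                   total_assets, total_liabilities):
    """Altman's original (1968) Z-score from five financial ratios."""
    x1 = working_capital / total_assets
    x2 = retained_earnings / total_assets
    x3 = ebit / total_assets
    x4 = market_value_equity / total_liabilities
    x5 = sales / total_assets
    return 1.2 * x1 + 1.4 * x2 + 3.3 * x3 + 0.6 * x4 + 1.0 * x5


def z_zone(z):
    """Classify a Z-score using the conventional 1.81 and 2.99 cutoffs."""
    if z > 2.99:
        return "safe"
    if z < 1.81:
        return "distress"
    return "grey"


# Hypothetical company figures (same currency unit throughout).
z = altman_z_score(working_capital=20, retained_earnings=30, ebit=10,
                   market_value_equity=60, sales=150,
                   total_assets=100, total_liabilities=50)
print(round(z, 2), z_zone(z))
```

All five ratios come from general financial statements, which is consistent with the limitation noted above: a company that reports a profit can still score well shortly before a black-ink bankruptcy.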
In this paper, we obtain inference results for fuzzy quantified and truth-qualified natural language propositions, in particular those qualified by “false”, and discuss the symmetry of the results. We obtain the inference results Q' for the inferences “QA are F is false ↔ Q'A are mF is false” and “QA are F is false → Q'(mA) are F is false” (Q, Q': fuzzy quantifiers; A: fuzzy subject; F: fuzzy predicate; m: modifier). Further, we discuss the symmetry of the inference results for the cases where Q is “most” or “few” and m is “very” or “more or less”.
Collaborative filtering consists of two steps: detecting similar users, and selecting recommended items through the detected users. In this paper, we propose a collaborative filtering method that considers the dates on which the relevant users selected items. The method detects the relevant users through user similarity within a fixed period, and determines the recommended items from the sets of items selected in another fixed period. We examine the influence on recommendation accuracy of changing these two periods, using ringing tone download history data from about 250 thousand users over 11 months. The influence is evaluated using recall while changing the period for similar-user detection and the period for recommended-item selection. The results show that the influence of the period for similar-user detection is low, whereas that of the period for item selection is not low.
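The two-period scheme described above can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the use of Jaccard similarity, the neighbor count `k_users`, the similarity-weighted voting, the function names, and the sample download history are all assumptions.

```python
from collections import defaultdict
from datetime import date


def items_in_period(history, start, end):
    """Map each user to the set of items selected between start and end (inclusive)."""
    selected = defaultdict(set)
    for user, item, day in history:
        if start <= day <= end:
            selected[user].add(item)
    return selected


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(history, target, sim_period, item_period, k_users=2, n_items=3):
    """Recommend items using one period for similarity and another for candidates."""
    # Step 1: detect similar users from selections in the similarity period.
    sim_sets = items_in_period(history, *sim_period)
    target_items = sim_sets.get(target, set())
    neighbors = sorted(
        ((jaccard(target_items, items), user)
         for user, items in sim_sets.items() if user != target),
        reverse=True,
    )[:k_users]
    # Step 2: collect what those users selected in the item-selection period.
    candidate_sets = items_in_period(history, *item_period)
    votes = defaultdict(float)
    for similarity, user in neighbors:
        if similarity <= 0:
            continue
        for item in candidate_sets.get(user, set()) - target_items:
            votes[item] += similarity
    return sorted(votes, key=lambda item: (-votes[item], item))[:n_items]


# Hypothetical download history: (user, item, date of selection).
history = [
    ("u1", "a", date(2024, 1, 5)), ("u1", "b", date(2024, 1, 9)),
    ("u2", "a", date(2024, 1, 6)), ("u2", "b", date(2024, 1, 20)),
    ("u2", "c", date(2024, 2, 3)),
    ("u3", "d", date(2024, 1, 7)), ("u3", "e", date(2024, 2, 4)),
]
january = (date(2024, 1, 1), date(2024, 1, 31))
february = (date(2024, 2, 1), date(2024, 2, 29))
print(recommend(history, "u1", sim_period=january, item_period=february))
```

Keeping the two periods as separate arguments makes it straightforward to vary each one independently and measure the effect on recall, as in the experiment described above.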