Compiling Bayesian Networks (BNs) into Zero-Suppressed BDDs (ZDDs) to perform efficient exact inference has attracted much attention. With ZDDs, the computation time for exact inference is linear in the size of the ZDDs, and caching techniques accelerate inference further. However, as the BN grows, compilation becomes unacceptable in both time consumption and ZDD size, which hinders practical applications of BNs. In this paper, we aim to improve the conventional ZDD-based method by partitioning BNs and compiling the parts separately. For a given BN, serial-pattern d-separation sets are found and used to partition the BN into conditionally independent components. Compiling these components separately into ZDDs is more efficient than generating one giant ZDD for the whole network. However, partitioning a BN into too many components may incur considerable time consumption, which grows exponentially with the number of vertices in the serial-pattern d-separation sets. To trade off the off-line time consumption (for finding d-separations and compiling ZDDs) against the on-line time consumption (for inference using ZDDs), the d-separations used to partition BNs are restricted to single vertices and are found using Tarjan’s vertex-cut algorithm, which runs in time linear in the number of BN vertices. The experiments show that one-vertex d-separations exist in most BNs. Partitioning BNs with one-vertex d-separations makes both compilation and inference considerably faster than the conventional ZDD-based method. To validate partitioning with one-vertex d-separations, we also conduct experiments on partitioning with two-vertex d-separations and comparative experiments with jointree algorithms.
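As a rough illustration of the vertex-cut step, the sketch below finds the cut vertices (articulation points) of the undirected skeleton of a small directed graph with a depth-first search in the style of Tarjan's algorithm. It is not the paper's implementation: the function name `articulation_points`, the edge-list input, and the toy network are assumptions made for this example, and the check that a cut vertex actually forms a serial-pattern d-separation is not shown.

```python
from collections import defaultdict


def articulation_points(edges):
    """Return the cut vertices of the undirected skeleton of a directed graph.

    `edges` is an iterable of (parent, child) pairs; edge direction is
    ignored here. Hypothetical helper for illustration only.
    """
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)

    disc = {}   # DFS discovery time of each vertex
    low = {}    # lowest discovery time reachable from the vertex's subtree
    cut = set()
    timer = 0

    def dfs(u, parent):
        nonlocal timer
        disc[u] = low[u] = timer
        timer += 1
        children = 0
        for w in adj[u]:
            if w == parent:
                continue
            if w in disc:
                # Back edge: u can reach an earlier vertex directly.
                low[u] = min(low[u], disc[w])
            else:
                children += 1
                dfs(w, u)
                low[u] = min(low[u], low[w])
                # A non-root u is a cut vertex if w's subtree cannot
                # reach any vertex discovered before u.
                if parent is not None and low[w] >= disc[u]:
                    cut.add(u)
        # The DFS root is a cut vertex only if it has several DFS children.
        if parent is None and children > 1:
            cut.add(u)

    for v in list(adj):
        if v not in disc:
            dfs(v, None)
    return cut


# Toy network (assumed for this example): A -> C <- B, C -> D, D -> E -> G, D -> F -> G
toy_edges = [("A", "C"), ("B", "C"), ("C", "D"),
             ("D", "E"), ("D", "F"), ("E", "G"), ("F", "G")]
print(sorted(articulation_points(toy_edges)))  # ['C', 'D']
```

The recursive search is adequate for small networks; a large BN would need an iterative version to stay within Python's recursion limit.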
We propose a statistical model for relevance-dependent biclustering to analyze relational data. The proposed model factorizes relational data into a bicluster structure with two features: (1) each object in a cluster has a relevance value, which indicates how strongly the object relates to the cluster, and (2) every cluster is related to at least one dense block. These features simplify the task of understanding the meaning of each cluster because only a few highly relevant objects need to be inspected. We introduce the Relevance-Dependent Bernoulli Distribution (R-BD) as a prior for relevance-dependent binary matrices and propose the novel Relevance-Dependent Infinite Biclustering (R-IB) model, which automatically estimates the number of clusters. Posterior inference can be performed efficiently using a collapsed Gibbs sampler because the parameters of the R-IB model can be fully marginalized out. Experimental results show that the R-IB extracts more essential bicluster structure with better computational efficiency than conventional models. We further observed that the biclustering results obtained by R-IB facilitate interpretation of the meaning of each cluster.
Precise tracking of abnormalities in the gastrointestinal tract is useful for preparing sample image sequences for educational training in endoscopic diagnosis. However, the gastrointestinal wall deforms continuously and unpredictably, and abnormalities without distinctive features are difficult to track over consecutive frames. To address this problem, the proposed method employs a convolutional neural network (CNN) to track the lesion area. Conventionally, a CNN for tracking requires a large amount of sample data for preliminary learning. State-of-the-art CNN-based tracking methods presuppose preliminary learning on data similar to the target images, with a large number of ground-truth labels. In contrast, the proposed method does not require preliminary learning on similar data. The image components in the marked region of the starting frame are similar only to the components at the same position, and differ from those at other positions according to the degree of overlap. Furthermore, in the successive frame, the components in the previous region are similar to those in the identified area. Therefore, similarity can be learned within the previous frame, which we call intra-frame training. This paper describes a method for tracking an abnormal region using a CNN trained on the overlap rates between the abnormal region and a local scanning region of the same size in the starting frame. Furthermore, the network parameters are updated by additionally training on the similar regions in the successive frames. We demonstrate the effectiveness of the proposed approach on eight common types of gastrointestinal abnormality.
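To make the idea of intra-frame training targets concrete, the following sketch labels same-size scanning windows in the starting frame with their overlap rate against the marked region. It rests on assumptions not stated in the abstract: the overlap rate is taken to be intersection-over-union, boxes are `(x, y, w, h)` tuples, and the names `overlap_rate` and `scan_targets` and the stride parameter are hypothetical. The CNN itself is not shown.

```python
def overlap_rate(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x, y, w, h).

    Using IoU as the "overlap rate" is an assumption for this sketch.
    """
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the intersection rectangle (zero if disjoint).
    iw = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def scan_targets(marked, frame_w, frame_h, stride):
    """Slide a window the size of `marked` over the frame.

    Returns a list of (window, overlap_rate) pairs that could serve as
    regression targets for intra-frame training.
    """
    _, _, w, h = marked
    samples = []
    for y in range(0, frame_h - h + 1, stride):
        for x in range(0, frame_w - w + 1, stride):
            window = (x, y, w, h)
            samples.append((window, overlap_rate(marked, window)))
    return samples


# Hypothetical 12x12 frame with a 4x4 marked region at (4, 4), scanned with stride 2.
samples = scan_targets((4, 4, 4, 4), frame_w=12, frame_h=12, stride=2)
```

A window that coincides with the marked region receives the target 1.0, and the target falls toward 0.0 as the window moves away, which matches the abstract's description of similarity depending on the degree of overlap.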
Multi-agent-based traffic simulation is useful for evaluating traffic policies at a detailed resolution. To evaluate them properly, not only the validity of the simulation model but also the accuracy of the input data is important. Traffic demand is one of the important inputs; it is described as a set of Origin-Destination (OD) traffic volumes and is estimated by OD estimation. In OD estimation, the locations of the traffic counting points play an important role and strongly affect the estimation results, so methods for optimizing traffic counting locations have been developed. Existing methods aim to capture more information for OD estimation; that is, they select the locations where the most OD pairs can be captured. However, since they do not consider the difficulty of the estimation, the traffic volumes reproduced at the selected locations are not always accurate. Although this difficulty has been hard to evaluate so far, the development of estimation methods that consider stochastic properties now allows the uncertainty to be used as an indirect measure of that difficulty. In this research, we conduct uncertainty quantification (UQ) in OD estimation and propose a new method for optimizing the locations of traffic counting points that takes the UQ results into account.
A medical diagnostic support system is an automatic system that helps prevent doctors from unknowingly misinterpreting medical results. However, automating the procedure with high accuracy is not an easy task. Our goal is to construct a medical diagnostic support system that can improve the overall accuracy of medical diagnoses. As a pilot study, we previously built a program that automatically answers the medical licensing examination (MLE). The MLE involves questions that require examinees to pick answers, such as disease names or drug names, from multiple choices, given the patient information. In our previous study, the program was developed to answer only disease-related questions, but we realized that the study would not be complete without deciding the optimal drug for patients. For this reason, in the current research we extend the program to answer drug-related questions. The major improvements are vectorizing words and automating the construction of the rule base. This eliminates the tedious task of inputting drug information manually, and the vectorization of words makes it possible to avoid the influence of inconsistent spelling and synonyms. We increased the accuracy relative to the previous study, reaching up to 56.1%.