Content distribution networks (CDNs) have recently attracted attention as a new network paradigm that can improve latency for Web access. In CDNs, the content location strategy and request routing techniques are important technical issues. In general, the two should be used in an integrated manner, but the performance of a CDN that applies both has not been evaluated in detail. In this paper, we investigate the effect of integrating these techniques. For request routing, we focus on Active Anycast, a request routing technique that applies active network technology and improves both network delay and server processing delay. For content distribution, we propose a new strategy, Popularity-Probability, whose aim corresponds with that of Active Anycast. Performance evaluation results show that integrating Active Anycast and Popularity-Probability maintains stable delay characteristics.
Information protection on mobile phones has become an important challenge because mobile phones hold many types of private information. In general, user authentication and anomaly detection are effective in preventing attacks by illegal users. However, user authentication can be applied only at the beginning of use, and conventional anomaly detection is designed for computer systems and is not suited to mobile phones. In this paper, we propose a simple and easy-to-use anomaly detection scheme for mobile phones. The scheme records keystrokes as the mobile phone is operated, and an anomaly detection algorithm calculates a similarity score to detect illegal users. We implemented a prototype system on the BREW (Binary Run-time Environment for Wireless) emulator and evaluated error rates using results from 15 testers. The experimental results show that the proposed scheme can perform anomaly detection by checking the similarity score several times.
In an ad hoc network, we cannot assume the trusted certificate authority and centralized repository that are used in an ordinary Public-Key Infrastructure (PKI). Hence, a web-of-trust-type PKI system, in which each node can issue certificates to others in a self-organizing manner, has been studied. Although this system is useful for ad hoc networks, it has the problem that, for authentication, a node needs to find a certificate chain to the destination node. In this paper, we formally model a web-of-trust-type PKI system, define the certificate-chain discovery problem, and propose a new distributed algorithm and a modification of it that solve the problem. Furthermore, we propose a measure of communication cost and, according to this measure, compare our algorithm with an existing method by numerical computation for large networks and by simulation on randomly generated unit disk graphs for moderate-size networks. The simulation results show that the communication cost of the proposed method is less than 10% of that of the existing method.
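As an illustration of the certificate-chain discovery problem only, the following minimal sketch treats issued certificates as edges of a directed graph and searches for a chain with a breadth-first search. It assumes a centralized view of all certificates, which is precisely what an ad hoc network lacks, so it is not the distributed algorithm proposed in the paper. The names `find_certificate_chain` and `certs` are hypothetical.

```python
from collections import deque


def find_certificate_chain(certs, source, target):
    """Return a list of nodes forming a certificate chain, or None.

    certs maps each node to the list of nodes it has issued certificates to.
    This is an assumed, centralized representation used for illustration.
    """
    parent = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the parent links back to the source to rebuild the chain.
            chain = []
            while node is not None:
                chain.append(node)
                node = parent[node]
            return chain[::-1]
        for issued_to in certs.get(node, ()):
            if issued_to not in parent:
                parent[issued_to] = node
                queue.append(issued_to)
    return None


# Hypothetical web of trust: A certifies B and C, B and C certify D, D certifies E.
certs = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["E"]}
chain = find_certificate_chain(certs, "A", "E")
```

Because certificates are directed, a chain from A to E does not imply a chain from E to A.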
Collision Warning Systems (CWS) can help reduce the probability and severity of car accidents by providing an appropriate warning to the driver through Inter-Vehicle Communication (IVC). In particular, the CWS can help avoid collisions at intersections, where traffic accidents are frequent (Study Group for Promotion of ASV; Traffic Bureau, 2007). A vehicle equipped with the CWS periodically broadcasts its information, and the CWS on other vehicles uses the received information to alert drivers, helping them become aware of the existence of other vehicles. To avoid collisions, the CWS places concrete requirements on IVC, i.e., the CWS should receive useful information accurately and in time. Many IVC protocols, including our previously proposed relay control protocol (Motegi, et al., 2006), have been developed and evaluated using traditional metrics. However, instead of using such traditional metrics directly, the requirements of the intersection CWS must be considered to judge the feasibility and practicability of IVC protocols. This paper presents a performance evaluation of our previous IVC protocol developed for the CWS. To study the behavior of IVC protocols, we first describe a simulation methodology, including performance metrics defined in terms of reliable and timely communication. We then use these metrics to compare our IVC protocol with the flooding protocol in large-scale simulated networks. The simulation results show that our previously proposed protocol is a good candidate for real implementation because it satisfies all requirements of the intersection CWS.
Program transformation by templates (Huet and Lang, 1978) is a technique to improve the efficiency of programs. In this technique, programs are transformed according to a given program transformation template. To enhance the variety of program transformations, it is important to introduce new transformation templates. To our knowledge, however, few works discuss the construction of transformation templates. Chiba, et al. (2006) proposed a framework of program transformation by templates based on term rewriting and automated verification of its correctness. Based on this framework, we propose a method that automatically constructs transformation templates from similar program transformations. The key idea of our method is a second-order generalization, which is an extension of Plotkin's first-order generalization (1969). We give a second-order generalization algorithm and prove the soundness of the algorithm. We then report on an implementation of the generalization procedure and an experiment on the construction of transformation templates.
Rewriting induction (Reddy, 1990) is a method to prove inductive theorems of term rewriting systems automatically. Koike and Toyama (2000) extracted an abstract principle of rewriting induction in terms of abstract reduction systems. Based on their principle, the soundness of the original rewriting induction system can be proved. It is not known, however, whether such an approach can also be adapted to more powerful rewriting induction systems. In this paper, we give a new abstract principle that extends Koike and Toyama's abstract principle. Using this principle, we show the soundness of a rewriting induction system extended with an inference rule of simplification by conjectures. Inference rules of simplification by conjectures have been used in many rewriting induction systems. Replacing the underlying rewriting mechanism with ordered rewriting is an important refinement of rewriting induction; with this refinement, rewriting induction can handle non-orientable equations. We show that, based on the introduced abstract principle, a variant of our rewriting induction system based on ordered rewriting is sound, provided that its base order is ground-total. In our system based on ordered rewriting, the simplification rule extends those of the equational fragment of some major systems from the literature.
We present LCP Merge, a novel algorithm for merging two ordered sequences of strings. By utilizing the longest common prefixes (LCP) between the strings, LCP Merge substitutes integer comparisons for string comparisons whenever possible, reducing both the number of character-wise comparisons and the number of key accesses. As one application of LCP Merge, we built a string merge sort based on recursive merge sort by replacing the merging algorithm with LCP Merge; we call it LCP Merge sort. When sorting strings, the computational complexity of recursive merge sort tends to be greater than O(n lg n) because string comparisons are generally not constant time and depend on the properties of the strings. LCP Merge sort, however, improves on recursive merge sort to the extent that its computational complexity remains O(n lg n) on average. We performed a number of experiments comparing LCP Merge sort with other string sorting algorithms to evaluate its practical performance, and the experimental results showed that LCP Merge sort is efficient on real-world data as well.
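The idea of replacing string comparisons with integer comparisons can be sketched as follows. This is a minimal illustration written from the description above, not the authors' implementation. Each sorted run carries an array holding the LCP of every string with its predecessor, and the function names (`compare_from`, `lcp_merge`, `lcp_merge_sort`) are assumptions. During a merge, each head string's LCP with the last output string is tracked. If the two values differ, the head with the larger LCP is the smaller string, so no characters are inspected.

```python
def compare_from(a, b, start):
    """Compare a and b, which are known to agree on their first `start` chars.

    Returns (a <= b, length of the longest common prefix of a and b).
    """
    n = min(len(a), len(b))
    i = start
    while i < n and a[i] == b[i]:
        i += 1
    if i == n:  # one string is a prefix of the other
        return len(a) <= len(b), i
    return a[i] < b[i], i


def lcp_merge(a, lcp_a, b, lcp_b):
    """Merge sorted runs a and b; lcp_x[k] is the LCP of x[k - 1] and x[k]."""
    out, out_lcp = [], []
    i = j = 0
    la = lb = 0  # LCP of each current head with the last output string
    while i < len(a) and j < len(b):
        if la > lb:      # integer comparison only: a[i] < b[j]
            take_a, h = True, lb
        elif la < lb:    # integer comparison only: b[j] < a[i]
            take_a, h = False, la
        else:            # tie: compare characters, skipping the shared prefix
            take_a, h = compare_from(a[i], b[j], la)
        if take_a:
            out.append(a[i])
            out_lcp.append(la)
            i += 1
            la = lcp_a[i] if i < len(a) else 0
            lb = h       # LCP of b[j] with the string just output
        else:
            out.append(b[j])
            out_lcp.append(lb)
            j += 1
            lb = lcp_b[j] if j < len(b) else 0
            la = h       # LCP of a[i] with the string just output
    if i < len(a):
        out.extend(a[i:])
        out_lcp.append(la)
        out_lcp.extend(lcp_a[i + 1:])
    elif j < len(b):
        out.extend(b[j:])
        out_lcp.append(lb)
        out_lcp.extend(lcp_b[j + 1:])
    return out, out_lcp


def lcp_merge_sort(strings):
    """Recursive merge sort that returns (sorted strings, LCP array)."""
    if len(strings) <= 1:
        return list(strings), [0] * len(strings)
    mid = len(strings) // 2
    left, left_lcp = lcp_merge_sort(strings[:mid])
    right, right_lcp = lcp_merge_sort(strings[mid:])
    return lcp_merge(left, left_lcp, right, right_lcp)


words = ["banana", "band", "apple", "ban", "app", "bandana", "apple"]
sorted_words, lcps = lcp_merge_sort(words)
```

The merged run also carries its own LCP array, which is what allows the merge to be applied recursively.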
In this paper, we propose d-ACTM/VT, a network-based worm detection method that effectively detects hit-list worms using distributed virtual AC tree detection. d-ACTM has been proposed to detect, in a distributed manner, a kind of hit-list worm called Silent worms. d-ACTM detects the existence of worms by detecting tree structures whose edges are infection connections. Some undetected infection connections, however, can divide the tree structures into small trees and degrade the detection performance. To address this problem, d-ACTM/VT aggregates the divided trees in a distributed manner into a single tree called a Virtual AC tree and utilizes the tree size for detection. Simulation results show that d-ACTM/VT reduces the number of infected hosts before detection by 20% compared to d-ACTM.
Previous research examined how extrinsic and intrinsic factors influence customers to shop online. However, the impact of these factors on customer retention in Internet shopping has not been examined. This study is one of the few attempts to investigate the perceived benefit factors affecting customers' continuance of purchasing items through the Internet. A multiple regression analysis of an online questionnaire filled out by 1,111 online customers shows that extrinsic benefits, measured in terms of time and money savings, social adjustment, and self-enhancement, as well as intrinsic benefits, measured in terms of pleasure, novelty, and fashion involvement, have strong effects on the continuance of purchasing. Our findings indicate that customer retention in Internet shopping must be promoted by guaranteeing not only extrinsic benefits but also intrinsic benefits. This study discusses the relevant techniques for providing those benefits to customers, as well as guidelines for future research.
Since the Semantic Web is increasing in size and in the variety of its resources, it is difficult for users to find the information that they really need. Therefore, it is necessary to provide an efficient and precise categorization method that does not require explicit specification of the Web resources. In this paper, we propose a novel approach that integrates four processes for Web resource categorization. The processes extract both the explicit relations obtained from the ontologies in the traditional way and the potential relations inferred from existing ontologies, by addressing new challenges such as extracting important class names, using WordNet relations, and detecting the methods used to describe the Web resources. We evaluated the effectiveness of the approach by applying the categorization method to a Semantic Web search system, and confirmed that our proposed method achieves a notable improvement in categorizing valuable Web resources based on incomplete ontologies.
Future networks everywhere will be connected to innumerable Internet-ready home appliances. A device accepting connections over a network must be able to verify the identity of a connecting device in order to prevent device spoofing and other malicious actions. In this paper, we propose a security mechanism for inter-device communication. We argue that a mechanism for distinguishing and binding a device's identity and its ownership information is important for realizing practical inter-device authentication. In many conventional authentication systems, the relationship between the device's identity and the ownership information is not considered. Therefore, we propose a novel inter-device authentication framework that guarantees this relationship. Our prototype implementation employs a smart card to securely maintain the device's identity, the ownership information, and the access control rules. Our framework efficiently achieves secure inter-device authentication based on the device's identity, and authorization based on the ownership information related to the device. We also show how to apply our smart card system for inter-device authentication to existing standard security protocols.
Peer-to-Peer multimedia streaming is expected to grow rapidly in the near future. Packet losses during transmission are a serious problem for streaming media because they degrade the quality of service (QoS). Forward Error Correction (FEC) is a promising technique for recovering lost packets and improving the QoS of streaming media. However, as the number of streaming sessions increases, FEC may degrade the QoS of all streams because of the additional congestion caused by the FEC overhead. Although streaming media can be categorized into live and on-demand streaming contents, conventional FEC methods apply the same FEC scheme to both without distinguishing them. In this paper, we clarify the effective ranges in which the conventional FEC and retransmission schemes each work well. Then, we propose a novel FEC method that distinguishes the two types of streaming media and is applied to on-demand streaming contents. It can overcome the adverse effect of the FEC overhead in on-demand streaming contents during media streaming and therefore reduce the packet loss due to the FEC overhead. As a result, the packet loss ratios of both live and on-demand streaming contents are improved. Moreover, it provides QoS according to the requirements and environments of users by using layered coding of FEC. Thus, packet losses are recovered at each end host and do not affect the next-hop streaming. Numerical analyses show that our proposed method substantially improves the packet loss ratio compared to the conventional method.
The performance of a network server is directly influenced by its network I/O management architecture, i.e., its network I/O multiplexing mechanism. Existing benchmark tools focus on the evaluation of high-level service performance of network servers that implement specific application-layer protocols or the evaluation of low-level communication performance of network paths. However, such tools are not suitable for performance evaluation of server architectures. In this study, we developed a benchmark tool for network I/O management architectures. We implemented five representative network I/O management mechanisms as modules: multi-process, multi-thread, select, poll, and epoll. This modularised implementation enabled quantitative and fair comparisons among them. Our experimental results on Linux 2.6 revealed that the select-based and poll-based servers had no performance advantage over the others and the multi-process and multi-thread servers achieved a high performance almost equal to that of the epoll-based server.
As increasing clock frequency approaches its physical limits, a good approach to enhance performance is to increase parallelism by integrating more cores as coprocessors to general-purpose processors in order to handle the different workloads in scientific, engineering, and signal processing applications. In this paper, we propose a many-core matrix processor model consisting of a scalar unit augmented with b×b simple cores tightly connected in a 2D torus matrix unit to accelerate matrix-based kernels. Data load/store is overlapped with computing using a decoupled data access unit that moves b×b blocks of data between memory and the two scalar and matrix processing units. The operation of the matrix unit is mainly processing fine-grained b×b matrix multiply-add (MMA) operations. We formulate the data alignment operations including matrix transposition and skewing as MMA operations in order to overlap them with data load/store. Two fundamental linear algebra algorithms are designed and analytically evaluated on the proposed matrix processor: the Level-3 BLAS kernel, GEMM, and the LU factorization with partial pivoting, the main step in solving linear systems of equations. For the GEMM kernel, the maximum speed of computing measured in FLOPs/cycle is approached for different matrix sizes, n, and block sizes, b. The speed of the LU factorization for relatively large values of n ranges from around 50-90% of the maximum speed depending on the model parameters. Overall, the analytical results show the merits of using the matrix unit for accelerating the matrix-based applications.
Skeletal parallel programming makes both parallel program development and parallelization easier. The idea is to abstract generic and recurring patterns within parallel programs as skeletons and provide them as a library whose parallel implementations are transparent to the programmer. SkeTo is a parallel skeleton library that enables programmers to write parallel programs in C++ in a sequential style. However, SkeTo's matrix skeletons assume that a matrix is dense, so they cannot deal efficiently with a sparse matrix, which has many zeros, because of duplicated computations and communications of identical values. We solve this problem by re-formalizing the matrix data type to cope with sparse matrices and by implementing a new C++ class in SkeTo with efficient sparse matrix skeletons based on this new formalization. Experimental results show that the new skeletons for sparse matrices perform well compared to the existing skeletons for dense matrices.
We study the control operators “control” and “prompt”, which manage part of continuations, that is, delimited continuations. They are similar to the well-known control operators “shift” and “reset”, but differ in that the former are dynamic, while the latter are static. In this paper, we introduce a static type system for “control” and “prompt” that does not use recursive types. We design our type system based on the dynamic CPS transformation recently proposed by Biernacki, Danvy and Millikin. We also introduce let-polymorphism into our type system, and show that our type system satisfies several important properties such as strong type soundness.
We present a novel algorithm to predict transmembrane regions from a primary amino acid sequence. Previous studies have shown that the Hidden Markov Model (HMM) is one of the most powerful tools known for predicting transmembrane regions; however, one conceptual drawback of the standard HMM is that the state duration, i.e., the duration for which the hidden dynamics remains in a particular state, follows a geometric distribution. Real data, however, do not always follow such a geometric distribution. The proposed algorithm utilizes a Generalized Hidden Markov Model (GHMM), an extension of the HMM, to cope with this problem. In the GHMM, the state duration probability can be any discrete distribution, including a geometric distribution. The proposed algorithm employs a state duration probability based on a Poisson distribution. We consider the two-dimensional vector trajectory consisting of the hydropathy index and charge associated with the amino acids, instead of the 20-letter symbol sequences. A Monte Carlo method (the Forward/Backward Sampling method) is also adopted for the transmembrane region prediction step. Prediction accuracies on publicly available data sets show that the proposed algorithm yields reasonably good results when compared against some existing algorithms.
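The difference between the two duration models can be made concrete with a short sketch. In a standard HMM, a state with self-transition probability a is occupied for exactly d steps with probability a^(d-1)(1-a), which always peaks at d = 1. The abstract does not give the exact form of the Poisson-based duration, so the shifted Poisson below (a Poisson distribution on d - 1) is an assumption for illustration, as are the function names and the parameter values.

```python
import math


def geometric_duration_pmf(d, self_transition):
    """P(duration = d), d >= 1, implied by a standard HMM self-transition."""
    return self_transition ** (d - 1) * (1.0 - self_transition)


def shifted_poisson_duration_pmf(d, mean_extra):
    """P(duration = d), d >= 1, with d - 1 ~ Poisson(mean_extra), mean_extra > 0.

    Assumed form for illustration; computed in log space to avoid overflow.
    """
    k = d - 1
    return math.exp(-mean_extra + k * math.log(mean_extra) - math.lgamma(k + 1))


# Both models below have a mean duration of 20 steps (hypothetical values).
durations = range(1, 400)
geo = [geometric_duration_pmf(d, 0.95) for d in durations]
poi = [shifted_poisson_duration_pmf(d, 19.0) for d in durations]
```

With the same mean, the geometric model places its highest probability on the shortest duration, while the Poisson-based model concentrates probability around the mean, which is closer to a segment with a typical length.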
This paper proposes a novel clustering method based on graph theory for the analysis of biological networks. In this method, each biological network is treated as an undirected graph, and edges are weighted based on the similarities of nodes. Then, maximal components, which are defined based on edge connectivity, are computed, and the nodes are partitioned into clusters by selecting disjoint maximal components. The proposed method was applied to the clustering of protein sequences and was compared with conventional clustering methods. The obtained clusters were evaluated using P-values for GO (Gene Ontology) terms. The average P-values for the proposed method were better than those for the other methods.
Protein-protein interactions play an important role in a number of biological activities. We developed two methods of predicting protein-protein interaction site residues. One method uses only sequence information, and the other uses both sequence and structural information. We used a support vector machine (SVM) with a position-specific scoring matrix (PSSM) as sequence information and the accessible surface area (ASA) of polar and non-polar atoms as structural information. The SVM is used in two stages. In the first stage, an interaction residue is predicted by taking as features either the PSSMs of sequentially neighboring residues or the PSSMs and ASAs of spatially neighboring residues. The second stage acts as a filter to refine the prediction results. The recall and precision of the predictor using both sequence and structural information are 73.6% and 50.5%, respectively. We found that using the PSSM instead of the frequency of amino acid appearance was the main factor in the improvement achieved by our methods.
Comparative analysis of organisms using metabolic pathways gives important information about functions within organisms. In this paper, we propose a new method for comparing metabolic pathways using reaction structures that include important enzymes. In this method, subgraphs of pathways that include a “choke point” or “load point” are extracted as important “reaction structures,” and a “reaction structure profile,” which represents whether the extracted reaction structures are observed in the metabolic pathways of other organisms, is created. A functional distance between species is then defined using the “reaction structure profile.” By applying the proposed method to the metabolic networks of 64 representative organisms selected from Archaea, Eubacteria and Eukaryote in the KEGG database, we succeed in reconstructing a phylogenetic tree and confirm the effectiveness of the method.
Chemical and biological activities of compounds provide valuable information for discovering new drugs. Compound fingerprints, which represent structural information related to these activities, are used to investigate the similarity of candidate compounds. However, predictions that rely on the structural similarity of compounds have several accuracy problems. Although the amount of compound data is growing rapidly, the number of well-annotated compounds, e.g., those in the MDL Drug Data Report (MDDR) database, has not increased quickly. Since compounds that are known to have some activity in the biological class of the target are rare in the drug discovery process, the accuracy of the prediction should be increased as the activity decreases, or the false positive rate should be maintained, in databases that have a large number of un-annotated compounds and a small number of compounds annotated with the biological activity. In this paper, we propose a new similarity scoring method that combines the Tanimoto coefficient and the proximity measure of random forest. The score contains two properties that are derived from unsupervised and supervised methods of partial dependence for compounds. Thus, the proposed method is expected to indicate compounds that have accurate activities. By comparing its prediction performance with that of the two individual scores, the Tanimoto coefficient and the proximity measure, using the Linear Discriminant Analysis (LDA) method, we demonstrate that the proposed scoring method gives better prediction results than either of the two. We estimate the prediction accuracy on compound datasets extracted from MDDR using the proposed method. It is also shown that the proposed method can identify active compounds in datasets that include several un-annotated compounds.
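For reference, the Tanimoto coefficient used as one component of the score is the ratio of shared to total "on" bits of two fingerprints. The sketch below covers only this component, with fingerprints represented as sets of bit positions. The abstract does not specify how the coefficient is combined with the random forest proximity measure, so that step is not shown. The function name, the example bit positions, and the convention of returning 0.0 for two empty fingerprints are assumptions.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient |A & B| / |A | B| of two fingerprints.

    Each fingerprint is an iterable of the bit positions that are set.
    Returns 0.0 when both fingerprints are empty (an assumed convention).
    """
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical fingerprints: two shared bits out of five distinct bits.
score = tanimoto({1, 2, 3, 4}, {3, 4, 5})
```

The coefficient is 1.0 for identical non-empty fingerprints and 0.0 for fingerprints with no bits in common.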
The number of biological databases has been increasing rapidly as a result of progress in biotechnology. As the amount and heterogeneity of biological data increase, it becomes more difficult to manage the data in a few centralized databases. Moreover, the number of sites storing these databases is getting larger, and the geographic distribution of these databases has become wider. In addition, biological research tends to require a large amount of computational resources, i.e., a large number of computing nodes. As such, the computational demand has been increasing with the rapid progress of biological research. Thus, the development of methods that enable computing nodes to use such widely-distributed database sites effectively is desired. In this paper, we propose a method for providing data from the database sites to computing nodes. Since it is difficult to decide which program runs on a node and which data are requested as their inputs in advance, we have introduced the notion of “data-staging” in the proposed method. Data-staging dynamically searches for the input data from the database sites and transfers the input data to the node where the program runs. We have developed a prototype system with data-staging using grid middleware. The effectiveness of the prototype system is demonstrated by measurement of the execution time of similarity search of several-hundred gene sequences against 527 prokaryotic genome data.
We accelerate the time-consuming iterations for projective reconstruction, a key component of self-calibration for computing 3-D shapes from feature point tracking over a video sequence. We first summarize the algorithms of the primal and dual methods for projective reconstruction. Then, we replace the eigenvalue computation in each step by the power method. We also accelerate the power method itself. Furthermore, we introduce the SOR method for accelerating the subspace fitting involved in the iterations. Using simulated and real video images, we demonstrate that the computation sometimes becomes several thousand times faster.
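As background for the replacement of the eigenvalue computation, the basic power method can be sketched as follows. This is the textbook iteration for the dominant eigenpair of a symmetric matrix, written in plain Python. It does not include the paper's acceleration of the power method or the SOR-based subspace fitting, and the abstract does not say which matrices of the reconstruction it is applied to. The function name, defaults, and example matrix are assumptions.

```python
import math


def power_method(matrix, iterations=200, tol=1e-12):
    """Estimate the dominant eigenvalue and eigenvector of a symmetric matrix.

    matrix is a list of rows. The start vector is uniform, so the method
    assumes it is not orthogonal to the dominant eigenvector.
    """
    n = len(matrix)
    v = [1.0 / math.sqrt(n)] * n
    eigenvalue = 0.0
    for _ in range(iterations):
        # Multiply by the matrix and renormalize.
        w = [sum(matrix[r][c] * v[c] for c in range(n)) for r in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # v lies in the null space; no further progress is possible
        w = [x / norm for x in w]
        # Rayleigh quotient of the unit vector w gives the eigenvalue estimate.
        mw = [sum(matrix[r][c] * w[c] for c in range(n)) for r in range(n)]
        new_eigenvalue = sum(w[r] * mw[r] for r in range(n))
        converged = abs(new_eigenvalue - eigenvalue) < tol
        v, eigenvalue = w, new_eigenvalue
        if converged:
            break
    return eigenvalue, v


# Hypothetical 2x2 example whose largest eigenvalue is (5 + sqrt(5)) / 2.
value, vector = power_method([[2.0, 1.0], [1.0, 3.0]])
```

Each iteration costs one matrix-vector product, which is why the method can be much cheaper than a full eigendecomposition when only one eigenpair is needed.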
This paper proposes a novel method, Hierarchical Importance Sampling (HIS), that can be used instead of population convergence in evolutionary optimization based on probability models (EOPM), such as estimation of distribution algorithms and cross entropy methods. In HIS, multiple populations are maintained simultaneously such that they have different diversities, and the probability model of one population is built through importance sampling by mixing with the other populations. This mechanism allows populations to escape from local optima. Experimental comparisons reveal that HIS outperforms general EOPM.