This paper proposes a method to mine rules from a software project data set that contains a number of quantitative attributes such as staff months and SLOC. The proposed method extends conventional association analysis methods to treat quantitative variables in two ways: (1) the distribution of a given quantitative variable is described in the consequent part of a rule by its mean value and standard deviation, so that conditions producing distinctive distributions can be discovered; and (2) to discover optimized conditions, quantitative values appearing in the antecedent part of a rule are divided into contiguous fine-grained partitions in preprocessing, and rules are merged after mining so that adjacent partitions are combined. The paper also describes a case study using the proposed method on a software project data set collected by Nihon Unisys Ltd. In this case study, the method mined rules that can be used for better planning and estimation of the integration and system testing phases, along with criteria or standards that help with planning of outsourcing resources.
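As a rough illustration of idea (1), the sketch below summarizes a quantitative attribute over the projects that match a rule's antecedent and reports its mean and standard deviation as the consequent. The records, the attribute names (`outsourced`, `test_staff_months`) and the helper `describe_consequent` are hypothetical; they are not taken from the paper or from the Nihon Unisys data set.

```python
from statistics import mean, pstdev

# Hypothetical project records; the values are made up for illustration only.
PROJECTS = [
    {"outsourced": True, "test_staff_months": 10.0},
    {"outsourced": True, "test_staff_months": 14.0},
    {"outsourced": False, "test_staff_months": 4.0},
    {"outsourced": False, "test_staff_months": 6.0},
]


def describe_consequent(records, antecedent, target):
    """Return (support, mean, std) of `target` over records matching `antecedent`."""
    values = [r[target] for r in records if antecedent(r)]
    if not values:
        return 0, None, None
    return len(values), mean(values), pstdev(values)


# Rule: outsourced == True  =>  test_staff_months ~ (mean, std)
rule = describe_consequent(PROJECTS, lambda r: r["outsourced"], "test_staff_months")
# Baseline: the same attribute over all projects, for comparison.
overall = describe_consequent(PROJECTS, lambda r: True, "test_staff_months")
print("rule:", rule)
print("overall:", overall)
```

A rule is interesting in this sense when its mean and standard deviation differ clearly from the baseline over all projects.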
We propose a low-stretch scheme for locating mobile objects in wide-area computing environments. Locating mobile objects in distributed computing systems is a non-trivial problem and has been investigated for decades. The forwarding address algorithm, perhaps the most popular algorithm, requires the previous holder of the object to point to the next holder, and to forward all requests along this pointer. However, this approach cannot provide any access stretch bounds for wide-area settings, and can incur unbounded communication overhead. This is unacceptable when a large number of objects move simultaneously or when numerous referencers attempt to access an object that has moved. We propose an active update method in which nodes in the vicinity of the object's location are notified of its new location via localized update messages. Moreover, we utilize the overlay topology information to minimize these messages. Referencers beyond the scope of the update are still able to safely access the object. We demonstrate that these updates keep the access stretch low even in wide-area settings.
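The baseline forwarding address scheme criticized above can be sketched as a chain of pointers. This is only a toy model of the baseline, not the proposed active update method; the node names and the helpers `move` and `locate` are hypothetical.

```python
def move(forward, old_holder, new_holder):
    """Record that the object moved: the old holder now points to the new one."""
    forward.pop(new_holder, None)  # the new holder holds the object, so it forwards nothing
    forward[old_holder] = new_holder


def locate(forward, start):
    """Follow forwarding pointers from `start`; return (current_holder, hops)."""
    node, hops = start, 0
    seen = {start}
    while node in forward:
        node = forward[node]
        hops += 1
        if node in seen:
            raise ValueError("forwarding cycle detected")
        seen.add(node)
    return node, hops


forward = {}
for old, new in [("a", "b"), ("b", "c"), ("c", "d")]:
    move(forward, old, new)

# A referencer that still knows only the first holder pays one hop per past move.
print(locate(forward, "a"))
```

The hop count grows with the number of moves regardless of how close the referencer is to the object's current holder, which is why the scheme gives no stretch bound.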
In large-scale networks, users want to be able to communicate securely with each other over a channel that is unreliable. When the existing 2- and 3-party protocols are realized in this situation, there are several problems: a client must hold many passwords, and the load on the server concerning password management is heavy. In this paper, we define a new ideal client-to-client general authenticated key exchange functionality, where arbitrary 2-party key exchange protocols are applicable to protocols between the client and server and between servers. We also propose a client-to-client general authenticated key exchange protocol, C2C-GAKE, as a general form of the client-to-client model, and a client-to-client hybrid authenticated key exchange protocol, C2C-HAKE, as an example protocol of C2C-GAKE to solve the above problems. In C2C-HAKE, each server shares passwords only with clients in its own realm, public/private keys are used between servers, and two clients in different realms share a final session key via their respective servers. Thus, with regard to password management in C2C-HAKE, the load on the server can be distributed to several servers. In addition, we prove that C2C-HAKE securely realizes the above functionality. C2C-HAKE is the first client-to-client hybrid authenticated key exchange protocol that is secure in a universally composable framework with a security-preserving composition property.
Though the anonymity of ring signature schemes has been studied in many publications, these papers gave different definitions and there has been no consensus. Recently, Bender et al. proposed two new anonymity definitions for ring signature schemes that are stronger than the previous definitions, called anonymity against attribution attacks/full key exposure. In addition, ring signature schemes have two levels of unforgeability definitions, i.e., existential unforgeability and strong existential unforgeability. In this paper, we redefine anonymity and unforgeability within the universally composable (UC) security framework. First, we give new ideal functionalities of ring signature schemes for each security level separately. Next, we show the relations between game-based security definitions and our UC definitions. Finally, we give another proof of the security of Bender et al.'s ring signature scheme within the UC framework. The simulator we construct in this proof can easily simulate an adversary of existential unforgeability, and can be adapted to the case of strong existential unforgeability if we assume that the underlying signature scheme is a standard strongly existentially unforgeable signature scheme.
Knudsen and Meier applied the χ²-attack to RC6. The χ²-attack recovers a key by using high correlations measured by the χ²-value. The best χ²-attack on RC6 whose security is theoretically guaranteed works on 16-round RC6 with 192- and 256-bit keys but only on 8-round RC6 with a 128-bit key, because it recovers keys of RC6 symmetrically, which requires a time complexity of #plaintexts × 2^54 and a memory complexity of 2^80 for recovering one key. In this paper, we improve the χ²-attack to reduce the time complexity. We give a theorem that evaluates the success probability of the χ²-attack on RC6 without using any experimental results. Our key recovery attack recovers keys asymmetrically, which requires a time complexity of #plaintexts × 2^31 and a memory complexity of 2^52 for recovering one key. As a result, our key recovery attack works on 16-round RC6 with 192- and 256-bit keys and on 12-round RC6 with a 128-bit key. For both 192- and 256-bit keys, our attack surprisingly reduces the time and memory complexity compared with the previous attack. We also demonstrate our theorem on RC6-8/4/8 and confirm its accuracy by comparing our approximation with experimental results.
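For readers unfamiliar with the χ²-value, the sketch below computes the standard Pearson χ² statistic of observed counts against a uniform distribution. It shows only the statistic; which output bits of RC6 are counted and how key candidates are ranked are not described in the abstract and are not modeled here. The function name is hypothetical.

```python
def chi_square_uniform(counts):
    """Pearson chi-square statistic of observed counts against a uniform distribution."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no observations")
    expected = total / len(counts)  # each bin is equally likely under uniformity
    return sum((c - expected) ** 2 / expected for c in counts)


# Perfectly uniform counts give 0; a skewed histogram gives a larger value.
print(chi_square_uniform([25, 25, 25, 25]))
print(chi_square_uniform([40, 20, 20, 20]))
```

A larger value indicates a stronger deviation from uniformity, which is the kind of correlation a χ²-based distinguisher looks for.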
This paper describes how the bootstrap approach to statistics can be applied to the evaluation of IR effectiveness metrics. More specifically, we describe straightforward methods for comparing the discriminative power of IR metrics based on Bootstrap Hypothesis Tests. Unlike the somewhat ad hoc Swap Method proposed by Voorhees and Buckley, our Bootstrap Sensitivity Methods estimate the overall performance difference required to achieve a given confidence level directly from Bootstrap Hypothesis Test results. We demonstrate the usefulness of our methods using four different data sets (i.e., test collections and submitted runs) from the NTCIR CLIR track series for comparing seven IR metrics, including those that can handle graded relevance and those based on the Geometric Mean. We also show that the Bootstrap Sensitivity results are generally consistent with those based on the more ad hoc methods.
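The following is a minimal sketch of a generic paired bootstrap hypothesis test on per-topic scores of two systems, using a studentized mean difference. It is meant to illustrate the general technique, not to reproduce the exact procedure or the sensitivity estimation in the paper; the function names, the number of resamples, and the example scores are assumptions.

```python
import math
import random


def _t_stat(values):
    """Studentized mean: mean divided by its estimated standard error."""
    n = len(values)
    m = sum(values) / n
    var = sum((v - m) ** 2 for v in values) / (n - 1)
    if var == 0.0:
        return 0.0 if m == 0.0 else math.copysign(math.inf, m)
    return m / math.sqrt(var / n)


def paired_bootstrap_asl(x, y, resamples=1000, seed=0):
    """Achieved significance level for H0: the two systems have equal mean score."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length >= 2")
    rng = random.Random(seed)
    z = [a - b for a, b in zip(x, y)]      # per-topic differences
    t_obs = abs(_t_stat(z))
    z_mean = sum(z) / len(z)
    w = [v - z_mean for v in z]            # shift so that H0 holds for w
    count = 0
    for _ in range(resamples):
        sample = [rng.choice(w) for _ in w]  # resample topics with replacement
        if abs(_t_stat(sample)) >= t_obs:
            count += 1
    return count / resamples


# Made-up per-topic scores for two hypothetical systems.
sys_a = [0.52, 0.61, 0.70, 0.55, 0.66, 0.60, 0.71, 0.50, 0.63, 0.58]
sys_b = [0.31, 0.42, 0.49, 0.37, 0.44, 0.41, 0.52, 0.28, 0.40, 0.39]
print(paired_bootstrap_asl(sys_a, sys_b))
```

A metric's discriminative power can then be compared by counting, over many system pairs, how often the test reports a difference at a chosen significance level.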
We model the relationships between the message formats of a business system and their semantics in a machine-processable knowledge base. We describe a message-mapping technique that extracts the relationships between the message formats of several systems semiautomatically by using the class characteristics of the semantics and stores these relationships as past system design knowledge. In addition, we propose process-mapping, which is a technique that discovers suitable software components for system cooperation. We evaluate these methods using the interface specifications of actual service management systems and show that the frequency of interface adjustment can be reduced.
In this paper, we describe the Maestro dynamic offloading mechanism (MDO), a software environment architecture for offloading user-defined software modules to the Maestro2 cluster network. Maestro2 is a high-performance network for clusters. The network interface and the switch of Maestro2 have a general-purpose processor tightly coupled with dedicated communication hardware. MDO enables users to offload software modules to both the network interface and the switch. MDO includes a wrapper library with which offload modules can be executed on a host machine without rewriting the program. The overhead and the effectiveness of MDO are evaluated by offloading collective communications.
The matching of a bipartite graph is a structure that can be seen in various assignment problems and has long been studied. The semi-matching is an extension of the matching for a bipartite graph G = (U ∪ V, E). It is defined as a set of edges, M ⊆ E, such that each vertex in U is an endpoint of exactly one edge in M. The load-balancing problem is the problem of finding a semi-matching such that the degrees of the vertices in V are balanced. This problem has been studied in the context of task scheduling to find a "balanced" assignment of tasks to machines, and an O(|E||U|) time algorithm has been proposed. On the other hand, in some practical problems, balanced assignments alone are not sufficient, e.g., the assignment of wireless stations (users) to access points (APs) in wireless networks. In wireless networks, the quality of the transmission depends on the distance between a user and its AP; shorter distances are more desirable. In this paper, we formulate the min-weight load-balancing problem of finding a balanced semi-matching that minimizes the total weight for weighted bipartite graphs. We then give an optimality condition for weighted semi-matchings and propose an O(|E||U||V|) time algorithm.
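The definitions above can be made concrete with a small sketch that checks whether an edge set is a semi-matching and reports the load on each vertex of V and the total weight. It only illustrates the definitions; it is not the O(|E||U||V|) algorithm of the paper. The graph, the weights, and the helper names are hypothetical.

```python
from collections import Counter


def is_semi_matching(u_vertices, edges, matching):
    """True if `matching` is a subset of `edges` covering each vertex of U exactly once."""
    edge_set = set(edges)
    if any(e not in edge_set for e in matching):
        return False
    covered = Counter(u for u, _ in matching)
    return set(covered) <= set(u_vertices) and all(covered[u] == 1 for u in u_vertices)


def loads(v_vertices, matching):
    """Degree of each vertex of V in the semi-matching (its load)."""
    load = Counter(v for _, v in matching)
    return {v: load[v] for v in v_vertices}


def total_weight(weights, matching):
    """Sum of edge weights, e.g. user-to-AP distances."""
    return sum(weights[e] for e in matching)


# Hypothetical instance: three users (U) and two access points (V).
U = ["u1", "u2", "u3"]
V = ["a", "b"]
W = {("u1", "a"): 1, ("u1", "b"): 4, ("u2", "a"): 2, ("u3", "a"): 3, ("u3", "b"): 1}

unbalanced = [("u1", "a"), ("u2", "a"), ("u3", "a")]
balanced = [("u1", "a"), ("u2", "a"), ("u3", "b")]
for m in (unbalanced, balanced):
    print(is_semi_matching(U, W, m), loads(V, m), total_weight(W, m))
```

In this toy instance the second assignment is both better balanced and lighter, but in general the two objectives have to be traded off, which is what the min-weight load-balancing formulation addresses.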
This work proposes a method to control the dominance area of solutions in order to induce appropriate ranking of solutions for the problem at hand, enhance selection, and improve the performance of MOEAs on combinatorial optimization problems. The proposed method can control the degree of expansion or contraction of the dominance area of solutions using a user-defined parameter S. Modifying the dominance area of solutions changes their dominance relation inducing a ranking of solutions that is different to conventional dominance. In this work we use 0/1 multiobjective knapsack problems to analyze the effects on solutions ranking caused by contracting and expanding the dominance area of solutions and its impact on the search performance of a multi-objective optimizer when the number of objectives, the size of the search space, and the feasibility of the problems vary. We show that either convergence or diversity can be emphasized by contracting or expanding the dominance area. Also, we show that the optimal value of the area of dominance depends strongly on all factors analyzed here: number of objectives, size of the search space, and feasibility of the problems.
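The abstract does not give the transformation used to control the dominance area. The sketch below assumes a sine-based rescaling of each objective value, f'_i = r · sin(ω_i + S·π) / sin(S·π), where r is the norm of the objective vector and ω_i is its angle to the i-th axis; under this assumption S = 0.5 leaves the values unchanged. Treat the formula, the single shared S, and the helper names as assumptions to be checked against the full paper.

```python
import math


def modify_objectives(f, s):
    """Rescale objective values with parameter s in (0, 1); s = 0.5 is the identity."""
    if not 0.0 < s < 1.0:
        raise ValueError("s must lie strictly between 0 and 1")
    r = math.sqrt(sum(v * v for v in f))
    if r == 0.0:
        return [0.0] * len(f)
    denom = math.sin(s * math.pi)
    out = []
    for v in f:
        # Angle between the objective vector and this objective's axis.
        omega = math.acos(max(-1.0, min(1.0, v / r)))
        out.append(r * math.sin(omega + s * math.pi) / denom)
    return out


def dominates(a, b):
    """Pareto dominance for maximization."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


p, q = [3.0, 4.0], [4.0, 2.0]
print(dominates(p, q), dominates(q, p))  # mutually non-dominated originally
p25, q25 = modify_objectives(p, 0.25), modify_objectives(q, 0.25)
print(dominates(p25, q25))               # comparable after the rescaling
```

In this example two solutions that are incomparable under conventional dominance become comparable for S = 0.25, which illustrates how changing S induces a different ranking of the same solutions.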
In this study, we use temporal difference learning (TDL) to investigate the ability of 20 different artificial neural network (ANN) architectures to learn Othello game board evaluation functions. The ANN evaluation functions are applied to create a strong Othello player using only 1-ply search. In addition to comparing many of the ANN architectures seen in the literature, we introduce several new architectures that consider the game board symmetry. Both embedding the game board symmetry into the network architecture through weight sharing and the outright removal of symmetry through symmetry removal are explored. Experiments varying the number of inputs per game board square from one to three, the number of hidden nodes, and the number of hidden layers are also performed. We found that it is advantageous to consider game board symmetry in the form of weight sharing, and that an input encoding of three inputs per square outperformed the one-input-per-square encoding that is commonly seen in the literature. Furthermore, architectures with only one hidden layer were strongly outperformed by architectures with multiple hidden layers. A standard weighted-square board heuristic evaluation function from the literature was used to evaluate the quality of the trained ANN Othello players. One of the ANN architectures introduced in this study, an ANN implementing weight sharing and consisting of three hidden layers, using only a 1-ply search, outperformed a weighted-square test heuristic player using a 6-ply minimax search.
Almost all traditional methods of advertisement distribution have been concerned only with primary information distribution via certain kinds of media. However, the rapid growth of the Internet and interactive media has demonstrated the power and efficiency of secondary distribution of information by consumers, such as word of mouth and free-mail. Nevertheless, an advertisement distribution model that can be used to analyze and measure the effectiveness of such secondary distribution has never been discussed. Therefore, in this paper, we propose an advertisement distribution model and show how to use the model to analyze both primary and secondary information distribution. Our experimental and analytical results are also discussed. The experimental results show that the proposed model can be used to measure and analyze the effectiveness of advertisement distribution.
Although there were in-depth discussions in the 1970s on the question of whether the human visual system contains 'curvature detectors' or contour detectors, which respond to the tangents of curves [Blakemore 74], they yielded no definite conclusions. Until now, the end-stopped cell model of curve detection has been the predominant one [Dobbins 87]. However, this model detects curvature with a low degree of accuracy, so a better model is required. It has long been known that the human brain is a network of numerous neurons. The hypothesis that a highly accurate calculation of curvature is achieved through a network composed of biological elements (simple cells) is readily accepted. However, neither Blakemore et al. nor Dobbins et al. explain the function of simple cells in the calculation of curvature. This article illustrates the function of simple cells in calculating curvature. Moreover, in this article we attempt to construct a computational model describing the mechanism for calculating curvature, following the suggestions of Blakemore and Over. This model provides a key to answering the question of why the Helmholtz irradiation disappears when two squares are replaced by two circles.
Traditional information retrieval evaluation relies on both precision and recall. However, modern search environments such as the Web, in which recall is either unimportant or immeasurable, require precision-oriented evaluation. In particular, finding one highly relevant document is very important for practical tasks such as known-item search and suspected-item search. This paper compares the properties of five evaluation metrics that are applicable to the task of finding one highly relevant document in terms of the underlying assumptions, how the system rankings produced resemble each other, and discriminative power. We employ two existing methods for comparing the discriminative power of these metrics: the Swap Method proposed by Voorhees and Buckley at ACM SIGIR 2002, and the Bootstrap Sensitivity Method proposed by Sakai at SIGIR 2006. We use four data sets from NTCIR to show that, while P(+)-measure, O-measure and NWRR (Normalised Weighted Reciprocal Rank) are reasonably highly correlated to one another, P(+)-measure and O-measure are more discriminative than NWRR, which in turn is more discriminative than Reciprocal Rank. We therefore conclude that P(+)-measure and O-measure, each modelling a different user behaviour, are the most useful evaluation metrics for the task of finding one highly relevant document.
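Of the metrics named above, Reciprocal Rank is the simplest to state: it is the inverse of the rank of the first relevant document, or zero if none is retrieved. The sketch below covers only this binary-relevance metric; the graded metrics (P(+)-measure, O-measure, NWRR) are not reproduced here because their definitions are not given in the abstract. The function names and document identifiers are hypothetical.

```python
def reciprocal_rank(ranked_doc_ids, relevant_ids):
    """1 / rank of the first relevant document, or 0.0 if none is retrieved."""
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (ranking, relevant set) pairs, one per topic."""
    if not runs:
        raise ValueError("no topics")
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


topics = [
    (["d3", "d1", "d2"], {"d1"}),  # first relevant document at rank 2
    (["d5", "d6"], {"d5"}),        # first relevant document at rank 1
    (["d7", "d8"], {"d9"}),        # no relevant document retrieved
]
print(mean_reciprocal_rank(topics))
```

Because the score depends only on a single rank and ignores relevance grades, it takes few distinct values per topic, which is one intuition for why it may discriminate less well between systems than graded metrics.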
Searching in a spatial database for 3D objects that are similar to a given object is an important task that arises in a number of database applications, for example, in the medical and CAD fields. Most of the existing similarity searching methods are based on global features of 3D objects. Developing a feature set or a feature vector of a 3D object using its partial features is challenging. In this paper, we propose a novel segment weight vector for matching 3D objects rapidly. We also describe a partial and geometrical similarity based solution to the problem of searching for similar 3D objects. As the first step, we split each 3D object into parts according to its topology. Next, we introduce a new method to extract the thickness feature of each part of every 3D object to generate its feature vector, and a novel searching algorithm using the new feature vector. Finally, we present a novel solution for improving the accuracy of the similarity queries. We also present a performance evaluation of our approach. The experimental results and discussion indicate that the proposed approach offers a significant performance improvement over the existing approach. Since the proposed method is based on partial features, it is particularly suited to searching objects having distinct part structures and is invariant to part architecture.
As an international language, English has become more and more important for non-native speakers. Therefore, writers ought to consider the needs of non-native speakers, i.e., write English in a way that can be readily understood by a non-native audience. In this paper, we investigate the position of six discourse markers within texts whose target audience is intermediate non-native speakers of English. The six discourse markers are: because and since, which represent the "reason" relation; if and when, which represent the "condition" relation; and although and while, which represent the "concession"/"contrast" relation. First, we created a corpus (200,000 words) containing texts (domain: natural and pure science) whose target audience is intermediate non-native speakers. We selected 1072 examples of the six discourse markers from the corpus and annotated them. Second, the machine learning program C4.5 was applied to induce classification models of the position of the discourse markers. We then used a Support Vector Machine (SVM) to verify the experimental results of C4.5. To our knowledge, this study is the first to explore the position of discourse markers within texts whose target audience is intermediate non-native speakers. The experimental results can be applied to text generation and homepage creation for intermediate non-native speakers of English.
We propose an application-independent Sinhala character input method called Sri Shell with a principled key assignment based on phonetic transcription of Sinhala characters. A good character input method should fulfill two criteria, efficiency and user-friendliness. We have introduced several quantification methods to quantify the efficiency and user-friendliness of Sinhala character input methods. Experimental results prove the efficiency and user-friendliness of our proposed method.
AD LOC is a novel system for mobile devices to collaboratively tie persistent virtual notes to physical locations. Notes that are relevant to particular locations can be created and then cached using serendipitously formed one-hop wireless ad hoc network connections. The location provides an address to which the information is relevant and devices attempt to keep the information stored at devices which remain close to this address. In simulation experiments, even under conditions of nodal flux, it is possible to retrieve over 90% of the notes that have been stored.
The current 3G-Cellular radio access network cannot support many concurrent high data rate unicast or multicast flows due to limited radio resources. We have proposed a heterogeneous wireless network architecture intended for point-to-multipoint services, to improve the availability of such services to mobile users. The architecture consists of a 3G-Cellular network, supported by a number of local ad hoc networks that are established on demand. In this framework the 3G multipoint-channel range is reduced while the unicast and signalling connections are maintained. Local ad hoc networks are used to forward the multicast data onto users located outside the shortened 3G multicast-channel range. In this paper we present a performance analysis of multicast streaming on the heterogeneous network architecture. The simulation results are complemented with a sensitivity analysis identifying the impact that parameters like node mobility and traffic patterns will have. The results verify that the architecture and the routing protocol are able to provide multicast services with acceptable quality to the multicast subscribers, while conserving 3G-Cellular radio resources.
Node localization obtained by estimating node positions is an essential technique for wireless multi-hop networks. In this paper, we present an optimized link state routing (OLSR)-based localization (ROULA) that satisfies the following key design requirements: (i) independence from anchor nodes, (ii) robustness to non-convex network topology, and (iii) compatibility with the network protocol. ROULA is independent of anchor nodes and can obtain the correct node positions in non-convex network topology. In addition, ROULA is compatible with the OLSR network protocol, and it uses the inherent distance characteristic of multipoint relay (MPR) nodes. We reveal the characteristics of MPR selection and the farthest 2-hop node selection used in ROULA, and describe how these node selections contribute to reducing the distance error for a localization scheme without using ranging devices. We used a simulation to determine an appropriate MPR_COVERAGE, which is defined to control the number of MPR nodes in OLSR, and give a comparative performance evaluation of ROULA for various scenarios including non-convex network topology and various deployment radii of anchor nodes. Our evaluation shows that ROULA achieves desirable performance in various network scenarios.
This paper proposes human activity recognition based on the actual semantics of the human's current location. Since no predefined semantics of location can adequately identify human activity, we automatically identify the semantics from things by focusing on the association between things and the human activities that involve them. An ontology is used to deal with the various possible representations (terms) of each thing, identified by an RFID tag, and a multi-class Naive Bayesian approach is applied to detect multiple actual semantics from the terms. Our approach is suitable for automatically detecting possible activities even given a variety of object characteristics including multiple representations and variability. Simulations with actual thing datasets and experiments in an actual environment demonstrate its noise tolerance and ability to rapidly detect multiple actual semantics from existing things.