Structured overlay networks that support range queries cannot hash data IDs for load balancing, because hashing would destroy the total order on the IDs. Without hashing, data and queries are not evenly distributed over the ID-space in range-based overlay networks, so uneven loads are imposed on the overlay nodes. Existing load balancing techniques for range-based overlay networks distribute the loads by using data reallocation or node migration, which makes the networks very unstable due to heavy data reallocation or frequent churn. This paper proposes a novel scheme that distributes the loads fairly without node migration and with little data reallocation, by sharing some ID-space regions between neighboring nodes. Our “overlapping” ID-space management scheme derives the optimal overlap based on kernel density estimation; the query loads estimated in this way are used to calculate the best overlap regions. This calculation is executed in a distributed manner with no central coordinator. We conduct thorough computer simulations and show that our scheme reduces the worst node load by 20-90% compared with existing techniques, without node migration and with the least data reallocation.
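As a rough illustration of the kernel density estimation step mentioned above, the following minimal sketch estimates query density over a normalized ID-space with a Gaussian kernel. The sample positions, the bandwidth, and the function name `gaussian_kde` are assumptions made for illustration; the abstract does not specify the kernel, the bandwidth selection, or how the overlap regions are derived from the estimate.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` at a point x.

    Uses a Gaussian kernel; the kernel choice is an assumption of this sketch.
    """
    if not samples or bandwidth <= 0:
        raise ValueError("need at least one sample and a positive bandwidth")
    # Normalizing constant so that the estimate integrates to 1.
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical query positions on a normalized ID-space [0, 1].
queries = [0.10, 0.12, 0.15, 0.50, 0.90]
density = gaussian_kde(queries, bandwidth=0.05)

# A region with many queries gets a higher estimated load than a sparse one.
hot = density(0.12)
cold = density(0.70)
```

A node responsible for an ID-space region could, in principle, integrate such an estimate over its region to approximate its query load; how the proposed scheme turns that into an overlap is not described in the abstract.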
One of the properties of context-oriented programming languages is the composition of partial module definitions. While in most such language extensions the state and behavior introduced by partial definitions are treated equally at the module level, we propose a refinement of that approach to allow for both public and restricted visibility of methods and local and shared visibility of fields in our experimental language L. Furthermore, we propose a new lookup mechanism to reduce the risk of name captures.
Live virtual machine (VM) migration (simply, live migration) is a powerful tool for managing data center resources. Live migration moves a running VM between different physical machines without losing any state, such as network conditions and CPU status. Live migration has attracted the attention of academic and industrial researchers because relocating running VMs inside data centers by live migration makes it easier to manage data center resources. This paper summarizes the basics of live migration and techniques for improving it. Specifically, this survey focuses on software mechanisms for realizing basic live migration, improving its performance, and expanding its applicability. This paper also identifies research opportunities that state-of-the-art live migration techniques have not yet covered.
A linear-time reversible self-interpreter in an r-Turing complete reversible imperative language is presented. The proposed imperative language has reversible structured control flow operators and symbolic tree-structured data (S-expressions). The latter data structures are dynamically allocated and enable reversible simulation of programs of arbitrary size and space consumption. As self-interpreters are used to show a number of fundamental properties in classic computability and complexity theory, the present study of an efficient reversible self-interpreter is intended as a basis for future work on reversible computability and complexity theory as well as programming language theory for reversible computing. Although the proposed reversible interpreter consumes superlinear space, the restriction of the number of variables in the source language leads to linear-time reversible simulation.
Parallel corpora are crucial for statistical machine translation (SMT); however, they are quite scarce for most language pairs and domains. As comparable corpora are far more widely available, many studies have been conducted to extract parallel sentences from them for SMT. Parallel sentence extraction relies heavily on bilingual lexicons, which are also very scarce. We propose a parallel sentence extraction system based on unsupervised bilingual lexicon extraction: it first extracts bilingual lexicons from comparable corpora and then extracts parallel sentences using the lexicons. Our bilingual lexicon extraction method combines topic-model-based and context-based methods in an iterative process. The proposed method does not rely on any prior knowledge, and its performance can be improved iteratively. The parallel sentence extraction method uses a binary classifier for parallel sentence identification. The extracted bilingual lexicons are used by the classifier to improve the performance of parallel sentence extraction. Experiments conducted with Wikipedia data indicate that the proposed bilingual lexicon extraction method greatly outperforms existing methods, and that the extracted bilingual lexicons significantly improve the performance of parallel sentence extraction for SMT.
Traditional machine-learning-based approaches to temporal relation classification use only local features, i.e., those relating to a specific pair of temporal entities (events and temporal expressions), and thus fail to incorporate useful information that could be inferred from nearby entities. In this paper, we use timegraphs and stacked learning to perform temporal inference for classification in the temporal relation classification task. In our model, we predict a temporal relation by considering the consistency of possible relations between nearby entities. Performing 10-fold cross-validation on the Timebank corpus, we achieve an F1 score of 60.25% using a graph-based evaluation, which is 0.90 percentage points higher than that of the local approach, outperforming other proposed systems.
Sharing information can be easy, but sharing emotion is much more difficult. One approach to sharing emotion effectively is to present the receiver with past similar experiences of his/her own. This study proposes an emotion sharing model that identifies the receiver's experience that best matches that of the sender, as indicated by the message being sent. Experiences are found in life log data and structured to permit quantitative comparisons. In this report, the target that the sender wants to share with the receiver consists of “contents”, “experience”, and “emotion”. This paper describes how to structure experiences and identify equivalent experiences. To achieve this, it elucidates the cross-correlation between the equivalency of experiences and the equivalency of emotions. Trials confirmed the model's accuracy in terms of experience matching.
The spinal tree adjoining grammar (TAG) parsing model of [Carreras 08] achieves the current state-of-the-art constituent parsing accuracy on the commonly used English Penn Treebank evaluation setting. Unfortunately, the model has the serious drawback of low parsing efficiency, since its Eisner-CKY style parsing algorithm needs O(n^4) computation time for input length n. This paper investigates a more practical solution and presents a beam search shift-reduce algorithm for spinal TAG parsing. Since the algorithm runs in O(bn) time, where b is the beam width, it can be expected to provide a significant improvement in parsing speed. However, to achieve faster parsing, it needs to prune a large number of candidates in an exponentially large search space and often suffers from severe search errors. In fact, our experiments show that the basic beam search shift-reduce parser does not work well for spinal TAGs. To alleviate this problem, we extend the proposed shift-reduce algorithm with two techniques: Dynamic Programming of [Huang 10a] and Supertagging. The proposed extended parsing algorithm is about 8 times faster than the Berkeley parser, which is well known as fast constituent parsing software, while offering state-of-the-art performance. Moreover, we conduct experiments on the Keyaki Treebank for Japanese to show that the good performance of our proposed parser is language-independent.
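The trade-off described above, where keeping only b candidates per step makes the work linear in the number of steps but discards most of the search space, can be sketched with a generic beam search. This is not the spinal TAG shift-reduce parser itself; the `expand` and `score` callbacks and the toy bit-string task are hypothetical stand-ins for parser actions and model scores.

```python
def beam_search(initial, expand, score, steps, beam_width):
    """Generic beam search: keep at most `beam_width` states after each step."""
    beam = [initial]
    for _ in range(steps):
        # Expand every surviving state by all of its successor states.
        candidates = [nxt for state in beam for nxt in expand(state)]
        if not candidates:
            break
        # Prune: only the highest-scoring candidates survive this step.
        candidates.sort(key=score, reverse=True)
        beam = candidates[:beam_width]
    return beam[0]


# Toy task (an assumption for illustration): build a bit string one symbol
# at a time, scoring a state by the number of ones it contains.
def expand(state):
    return [state + (0,), state + (1,)]


def score(state):
    return sum(state)


best = beam_search((), expand, score, steps=4, beam_width=2)
```

Each step handles at most `beam_width` states times the number of successors per state, which is where a bound proportional to b times n comes from when the number of actions per state is constant. Any state pruned early can never be recovered, which is the source of the search errors mentioned above.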
In this article, we present an incremental dependency parsing algorithm with an arc-eager variant of the left-corner parsing strategy. Our algorithm's stack depth captures the center-embeddedness of the recognized dependency structure. A higher stack depth occurs only when processing deeper center-embedded sentences in which people find difficulty in comprehension. We examine whether our algorithm can capture the syntactic regularity that universally exists in languages through two kinds of experiments across treebanks of 19 languages. We first show through oracle parsing experiments that our parsing algorithm consistently requires less stack depth to recognize annotated trees relative to other algorithms across languages. This result also suggests the existence of a syntactic universal by which deeper center-embedding is a rare construction across languages, a result that has yet to be quantitatively cross-linguistically examined. We further investigate the above claim through supervised parsing experiments and show that our proposed parser is consistently less sensitive to constraints on stack depth bounds when decoding across languages, while the performance of other parsers such as the arc-eager parser is largely affected by such constraints. We thus conclude that the stack depth of our parser represents a more meaningful measure for capturing syntactic regularity in languages than those of existing parsers.
Proximity of query keyword occurrences is an important piece of evidence for effective query-biased document scoring. If a query keyword occurs close to another in a document, it suggests high relevance of the document to the query. The simplest way to measure proximity between keyword occurrences is to use the distance between them, i.e., the difference of their positions. However, most web pages contain a hierarchical structure composed of nested logical blocks with their headings, and this structure affects logical proximity. For example, if a keyword occurs in a block and another occurs in the heading of the block, we should not simply measure their proximity by their distance. This is because a heading describes the topic of the entire corresponding block, and term occurrences in a heading are strongly connected with any term occurrences in its associated block, with less regard for the distance between them. Based on these observations, we developed a heading-aware proximity measure and applied it to three existing proximity-aware document scoring methods: MinDist, P6, and Span. We evaluated these existing methods and our modified methods on the data sets from TREC web tracks. The results indicate that our heading-aware proximity measure is better than the simple distance in all cases, and the method combining it with the Span method achieved the best performance.
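The simple distance that the abstract uses as its baseline, the difference between the positions of two keyword occurrences, can be sketched as follows. The function names and the sample text are assumptions for illustration; this sketch does not implement the heading-aware measure or the MinDist, P6, or Span scoring methods, whose definitions are not given here.

```python
def keyword_positions(tokens, keyword):
    """Return the token positions at which `keyword` occurs."""
    return [i for i, token in enumerate(tokens) if token == keyword]


def min_position_distance(positions_a, positions_b):
    """Smallest absolute position difference between two occurrence lists.

    Returns None if either keyword does not occur.
    """
    if not positions_a or not positions_b:
        return None
    a, b = sorted(positions_a), sorted(positions_b)
    i = j = 0
    best = abs(a[0] - b[0])
    # Walk both sorted lists in step, always advancing the smaller position.
    while i < len(a) and j < len(b):
        best = min(best, abs(a[i] - b[j]))
        if a[i] < b[j]:
            i += 1
        else:
            j += 1
    return best


# Hypothetical document text, tokenized by whitespace.
tokens = "cheap flights to tokyo and cheap hotels in tokyo".split()
distance = min_position_distance(
    keyword_positions(tokens, "cheap"),
    keyword_positions(tokens, "tokyo"),
)
```

Here `distance` is 2, from "tokyo" at position 3 and "cheap" at position 5. A heading-aware measure would treat an occurrence in a block's heading as close to occurrences anywhere in that block, regardless of this raw position difference.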
Chinese word segmentation is an initial and important step in Chinese language processing. Recent advances in machine learning techniques have boosted the performance of Chinese word segmentation systems, yet the identification of out-of-vocabulary words is still a major problem in this field of study. Recent research has attempted to address this problem by exploiting characteristics of frequent substrings in unlabeled data. We propose a simple yet effective approach for extracting a specific type of frequent substrings, called maximized substrings, which provide good estimations of unknown word boundaries. In the task of Chinese word segmentation, we use these substrings which are extracted from large scale unlabeled data to improve the segmentation accuracy. The effectiveness of this approach is demonstrated through experiments using various data sets from different domains. In the task of unknown word extraction, we apply post-processing techniques that effectively reduce the noise in the extracted substrings. We demonstrate the effectiveness and efficiency of our approach by comparing the results with a widely applied Chinese word recognition method in a previous study.
In this paper we describe a generalized dependency tree language model for machine translation. We consider in detail the question of how to define tree-based n-grams, or ‘t-treelets’, and thoroughly explore the strengths and weaknesses of our approach by evaluating the effect on translation quality for nine major languages. In addition, we show that it is possible to attain a significant improvement in translation quality even for non-structured machine translation by reranking filtered parses of k-best string output.
We expect that even elderly people who are novice players will be willing to play a keyboard instrument if they are given appropriate support. We conducted experiments in which participants played a keyboard while watching a model performance video without sheet music. From these experiments, we derived requirements for a practice method that lowers the threshold for elderly novices. In response to these requirements, we present a practice method in which a user plays the keyboard while placing his or her hands over the hands of a CG (computer graphics) model performance using a half-mirror. This practice method allows even elderly novices to easily hit the correct keys as long as the model performance is presented. Elderly novices may then have some room to reflect on their performance independently. The ease of reproducing correct notes with the proper fingering, together with the room to reflect on their performance, can motivate elderly novices to play the keyboard. As a result, they can experience a sense of accomplishment by continuing to play the keyboard.
In this article, we promote the use of four measures for quantifying social bonding between humans and robots. We use these measures to assess social bonding in the presence of verbal and gestural interactions in “proactive” and “reactive” versions of a minimally designed accompanying robot called ROBOMO. The approach aims to measure four factors: “belief”, “attachment”, “commitment”, and “involvement”. This is achieved by assigning a different metric to each of the four factors. Our proposed approach is evaluated in a human-robot interaction (HRI) study in which ROBOMO provides some gestures and produces inarticulate utterances as verbal indicators. As a first step, we try to validate the proposed four measures of social bonding. To do so, we compare the social bonding values in the following conditions: “robot using only gestures”, “robot using only verbal behaviors”, and “robot combining gestural and verbal behaviors”. Since combining verbal and gestural behaviors can be expected to increase the user's preference for the robot and thus social bonding, the values of the proposed metrics should increase in that condition. We show, based on the results, that the four elements measuring social bonding did indeed increase in that condition, which supports the reliability of our measures for quantifying social bonding. In addition, this makes our proposed metrics useful for assessing social bonding when we add future behaviors and modifications to the designed robot. As a second step, we selected two behavior modes, “proactive” and “reactive”, and used our validated metrics to determine which behavior mode leads to higher social bonding. Based on the results, we show that in the context of a minimally designed accompanying robot, a proactive mode adopted by the robot is preferred over a reactive mode and leads to an improvement in social bonding.
Trading dialogs are a kind of negotiation in which an exchange of ownership of items is discussed, and these kinds of dialogs are pervasive in many situations. Recently, there has been an increasing amount of research on applying reinforcement learning (RL) to negotiation dialog domains. However, previous research has focused on negotiation dialog between two participants only, ignoring cases where negotiation takes place between more than two interlocutors. In this paper, as a first study on multi-party negotiation, we apply RL to a multi-party trading scenario where the dialog system (learner) trades with one, two, or three other agents. We experiment with different RL algorithms and reward functions. We use Q-learning with linear function approximation, least-squares policy iteration, and neural fitted Q iteration. In addition, to make the learning process more efficient, we introduce an incremental reward function. The negotiation strategy of the learner is learned through simulated dialog with trader simulators. In our experiments, we evaluate how the performance of the learner varies depending on the RL algorithm used and the number of traders. Furthermore, we compare the learned dialog policies with two strong hand-crafted baseline dialog policies. Our results show that (1) even in simple multi-party trading dialog tasks, learning an effective negotiation policy is not a straightforward task and requires a lot of experimentation; and (2) the use of neural fitted Q iteration combined with an incremental reward function produces negotiation policies that are as effective as, or even better than, the policies of the two strong hand-crafted baselines.
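Of the algorithms listed above, Q-learning with linear function approximation is the simplest to sketch. The following is a minimal, generic version of its update rule, assuming Q(s, a) is a dot product between a weight vector and a feature vector for the state-action pair. The feature vectors, learning rate, and discount factor are hypothetical; the abstract does not describe the features, rewards, or hyperparameters used in the trading dialog experiments.

```python
def q_value(weights, features):
    """Linear action-value estimate: Q(s, a) = w . phi(s, a)."""
    return sum(w * f for w, f in zip(weights, features))


def q_learning_update(weights, features, reward, next_features_per_action,
                      alpha=0.1, gamma=0.9):
    """Return the weights after one Q-learning step.

    `features` is phi(s, a) for the action taken; `next_features_per_action`
    holds phi(s', a') for every action available in the next state, and is
    empty when the next state is terminal.
    """
    # Greedy value of the next state (0.0 for a terminal state).
    next_best = max(
        (q_value(weights, f) for f in next_features_per_action),
        default=0.0,
    )
    td_error = reward + gamma * next_best - q_value(weights, features)
    # Gradient of a linear Q with respect to the weights is the feature vector.
    return [w + alpha * td_error * f for w, f in zip(weights, features)]


# Hypothetical two-feature example: a terminal transition with reward 1.
updated = q_learning_update([0.0, 0.0], [1.0, 0.0], 1.0, [])
```

In the terminal example the temporal-difference error is 1, so only the weight of the active feature moves, by alpha times that error.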