In this challenge, we develop and distribute an integrated environment for flexibly combining multiple text mining techniques. Text mining comprises numerous tasks such as salient sentence extraction, keyword extraction, topic extraction, textual coherence evaluation, multi-document summarization, and text clustering. Although tools that individually perform one or more of these tasks exist, it is difficult to integrate and run multiple tools together for a particular task. Our proposed text mining environment aims to provide the flexibility to integrate the numerous tools that exist in the community. Users can use a customized version of the environment for their specific tasks and thereby concentrate solely on their creative work.
The accuracy of active learning is crucially influenced by noisy labels given by a real-world noisy oracle. In this paper, we propose a novel pool-based active learning framework based on density power divergence. It is known that density power divergences, such as the β-divergence and the γ-divergence, can be accurately estimated even in the presence of outliers (noisy labels) in the data. In addition, we propose an evaluation scheme for these measures based on their asymptotic statistical analysis, which enables us to perform active learning by evaluating an estimation variance. Experiments on artificial and real-world datasets show that our active learning scheme performs better than state-of-the-art methods.
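For reference, the β-divergence mentioned above is commonly written in the following form. The symbols g (data-generating density), f (model density) and the integration variable x are notation assumed here for illustration; the abstract itself does not fix a notation or state which variant the authors use.

```latex
d_\beta(g, f) \;=\; \int f(x)^{1+\beta}\,dx
  \;-\; \Bigl(1 + \tfrac{1}{\beta}\Bigr) \int g(x)\, f(x)^{\beta}\,dx
  \;+\; \tfrac{1}{\beta} \int g(x)^{1+\beta}\,dx ,
  \qquad \beta > 0 .
```

As β tends to 0 this expression approaches the Kullback-Leibler divergence, while larger β down-weights observations to which the model assigns low density, which is the source of the robustness to outliers.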
Learning to rank is a supervised learning problem whose goal is to construct a ranking model. In recent years, online learning to rank algorithms have begun to attract attention because large-scale datasets have become available. We propose a selective pairwise approach to online learning to rank that offers both fast learning and high performance. The basic strategy of our method is to select, for an input query from the training data, the document pair that is most effective for minimizing the objective function, and then to update the current weight vector using only the selected pair instead of all document pairs in the query. The main characteristics of our method are adaptive margin rescaling based on the approximated NDCG to reflect the IR evaluation measure, the max-loss update procedure, and ramp loss to reduce over-fitting. We implement our proposal, PARank-NDCG, in the framework of the Passive-Aggressive algorithm. We conduct experiments on the MSLR-WEB datasets, which contain 10,000 and 30,000 queries. Our experiments show that PARank-NDCG outperforms conventional algorithms on NDCG values, including online learning to rank algorithms such as Stochastic Pairwise Descent and Committee Perceptron and batch algorithms such as RankingSVM. In addition, our method takes only 7 seconds to learn a model on the MSLR-WEB10K dataset. PARank-NDCG offers approximately 63 times faster training than RankingSVM on average.
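The select-one-pair-then-update strategy can be sketched as follows. This is a minimal illustration, not PARank-NDCG itself: it uses a plain hinge loss with a fixed margin and the standard PA-I step size, and omits the NDCG-based margin rescaling and the ramp loss. The function names and the parameters `C` and `margin` are hypothetical.

```python
from itertools import combinations


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def max_loss_pair(w, docs, margin=1.0):
    """Return (loss, diff) for the preference pair with the largest hinge loss.

    docs is a list of (feature_vector, relevance_label) for one query.
    diff is the feature difference (more relevant minus less relevant).
    """
    best_loss, best_diff = 0.0, None
    for (xa, ya), (xb, yb) in combinations(docs, 2):
        if ya == yb:
            continue  # equally relevant documents carry no preference
        hi, lo = (xa, xb) if ya > yb else (xb, xa)
        diff = [p - q for p, q in zip(hi, lo)]
        loss = max(0.0, margin - dot(w, diff))
        if loss > best_loss:
            best_loss, best_diff = loss, diff
    return best_loss, best_diff


def pa_update(w, docs, C=1.0, margin=1.0):
    """One Passive-Aggressive (PA-I) step using only the max-loss pair."""
    loss, diff = max_loss_pair(w, docs, margin)
    if diff is None:
        return w  # every pair already satisfies the margin
    sq_norm = dot(diff, diff)
    if sq_norm == 0:
        return w  # identical feature vectors: no direction to move in
    tau = min(C, loss / sq_norm)  # aggressiveness capped by C
    return [wi + tau * di for wi, di in zip(w, diff)]
```

Because only one pair is touched per query, the cost of the update itself is independent of the number of document pairs; the selection scan above is the naive part that a real implementation would need to make efficient.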
The dynamic constraint satisfaction problem (DynCSP) is a sequence of CSP instances. Introducing the notion of decision transition costs yields a natural optimization problem, in which we search for a sequence of solutions that minimizes the total sum of decision transition costs. We refer to this problem as the dynamic constraint satisfaction problem with decision transition costs (DynCSP-DTC). Previously, Hatano and Hirayama presented an integer linear programming formulation in order to apply Lagrangian decomposition to the SAT version of the problem, called Dynamic SAT with decision change costs (DynSAT-DCC). However, since their linear encoding of decision change costs was specially designed for DynSAT, a new encoding method is required to extend Lagrangian decomposition to general DynCSP-DTC. In this paper, we introduce a quadratic encoding of decision transition costs that enables Lagrangian decomposition to work on general DynCSP-DTC, including DynSAT-DCC. Furthermore, we empirically show that, even on DynSAT-DCC, Lagrangian decomposition with the quadratic encoding performs more efficiently than other methods.
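To make the problem statement concrete, the sketch below solves a tiny DynCSP-DTC instance by exhaustive enumeration followed by dynamic programming over consecutive instances. It is not the Lagrangian decomposition approach of the abstract, and it only scales to toy sizes. Using the Hamming distance between consecutive assignments as the transition cost is an assumption made for illustration; all function names are hypothetical.

```python
from itertools import product


def solutions(domains, constraints):
    """Enumerate all assignments (as tuples) that satisfy every constraint."""
    return [a for a in product(*domains) if all(c(a) for c in constraints)]


def hamming(a, b):
    """Number of variables whose value changes between two assignments."""
    return sum(x != y for x, y in zip(a, b))


def min_transition_sequence(instances, domains, cost=hamming):
    """Return (total_cost, solution_sequence), or None if an instance is unsatisfiable.

    instances is a list of constraint lists over the same variables;
    each constraint is a predicate on a full assignment tuple.
    """
    layers = [solutions(domains, cs) for cs in instances]
    if any(not layer for layer in layers):
        return None
    # best[s] = (cheapest total cost of a sequence ending in s, that sequence)
    best = {s: (0, [s]) for s in layers[0]}
    for layer in layers[1:]:
        nxt = {}
        for s in layer:
            c, path = min(
                ((pc + cost(p, s), ppath) for p, (pc, ppath) in best.items()),
                key=lambda t: t[0],
            )
            nxt[s] = (c, path + [s])
        best = nxt
    return min(best.values(), key=lambda t: t[0])
```

For example, with two Boolean variables and the three instances `x0 != x1`, `x0 == 1`, `x1 == 1`, no single assignment satisfies all three, so at least one variable must change along the sequence.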
Distributed Constraint Optimization Problems (DCOPs) have been studied as a fundamental model of multi-agent cooperation. In traditional DCOPs, all agents cooperate to optimize the sum of their cost functions. However, in practical systems some agents may desire to select the values of their variables without cooperation. In special cases, such agents may take the values with the worst impact on the quality of the result reachable by the optimization process. Similar classes of problems have been studied as Quantified (Distributed) Constraint Problems, where the variables of the CSP have existential/universal quantifiers. All constraints should be satisfied independently of the values taken by universal variables. In this paper, we present the Quantified Distributed Constraint Optimization Problem (QDCOP), which extends the framework of DCOPs. We apply existential/universal quantifiers to distinguish uncooperative variables. A universally quantified variable is left unassigned by the optimization, as the result has to hold when it takes any value from its domain, while an existentially quantified variable takes exactly one of its values for each context. We consider that the QDCOP applies the concept of game tree search to DCOPs. If the original problem is a minimization problem, agents that own universally quantified variables may intend to maximize the cost value in the worst case, while the other agents intend to minimize it as usual. Therefore, only the bounds, especially the upper bounds, of the optimal value are guaranteed. The purpose of the new class of problems is to compute such bounds, as well as to compute sub-optimal solutions. For the QDCOP, we propose solution methods that are based on min-max/alpha-beta and ADOPT algorithms.
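The game-tree reading of quantifiers can be illustrated with a small centralized min-max evaluation. This is only a sketch of the underlying idea for a minimization problem: it is neither distributed nor pruned, and it is not the alpha-beta or ADOPT-based method of the abstract. The function name and the `"exists"`/`"forall"` labels are hypothetical.

```python
def qcop_value(order, domains, cost, assignment=None):
    """Worst-case cost of a quantified minimization problem.

    order   : list of (variable, quantifier) pairs, outermost first,
              with quantifier either "exists" or "forall"
    domains : dict mapping each variable to an iterable of values
    cost    : function taking a complete assignment dict, returning a number
    """
    assignment = dict(assignment or {})
    if len(assignment) == len(order):
        return cost(assignment)
    var, quant = order[len(assignment)]
    values = [
        qcop_value(order, domains, cost, {**assignment, var: v})
        for v in domains[var]
    ]
    # Existential variables cooperate (min); universal ones are adversarial (max).
    return min(values) if quant == "exists" else max(values)
```

With Boolean `x` and `y` and a cost of 1 when `x != y`, quantifying `x` existentially before a universal `y` gives a worst-case cost of 1, whereas putting the universal `y` first lets `x` react and gives 0. The quantifier order therefore changes the bound that can be guaranteed.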
Many real-world problems involve multiple criteria that should be considered separately and optimized simultaneously. A Multi-Objective Constraint Optimization Problem (MO-COP) is the extension of a mono-objective Constraint Optimization Problem (COP). In an MO-COP, it is required to provide the solution most preferred by a user among many optimal solutions. In this paper, we develop a novel Interactive Algorithm for MO-COP (MO-IA). The characteristics of this algorithm are as follows: (i) it is guaranteed to find a Pareto solution, (ii) it gradually narrows the region in which the Pareto front may exist, (iii) it is based on a pseudo-tree, which is a widely used graph structure in COP algorithms, and (iv) its complexity is determined by the induced width of problem instances. In the evaluations, we use an existing model for representing a utility function and show empirically the effectiveness of our algorithm. Furthermore, we propose an extension of MO-IA that finds several Pareto solutions so that we can provide a narrower region in which the Pareto front may exist, i.e., our extended algorithm can provide more detailed information about the Pareto front.
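The notions of Pareto solution and user preference can be sketched on an explicit list of cost vectors. The code below is a brute-force illustration for minimization, not MO-IA: it does not use a pseudo-tree and does not interact with the user. The weighted-sum utility is an assumed stand-in for a preference model, and the function names are hypothetical.

```python
def dominates(a, b):
    """True if cost vector a is no worse than b everywhere and better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(costs):
    """Keep the cost vectors that no other vector dominates."""
    return [c for c in costs if not any(dominates(o, c) for o in costs)]


def most_preferred(front, weights):
    """Pick the Pareto solution minimizing a weighted sum (assumed utility model)."""
    return min(front, key=lambda c: sum(w * x for w, x in zip(weights, c)))
```

For the cost vectors `(1, 4)`, `(2, 2)`, `(3, 3)` and `(4, 1)`, the vector `(3, 3)` is dominated by `(2, 2)` and drops out; the remaining three are mutually incomparable, which is why additional preference information is needed to choose among them.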
Many real-world complex systems can be modeled as networks, and most of them exhibit community structures. Community detection from networks is one of the important topics in link mining. Newman modularity is widely used to evaluate the quality of detected communities. In the real world, however, many complex systems are better modeled as signed networks composed of positive and negative edges. Community detection from signed networks is not an easy task, because the conventional detection methods for unsigned networks cannot be applied directly. In this paper, we extend Newman modularity to signed networks. We also propose a method for optimizing our modularity, which is an efficient hierarchical agglomeration algorithm for detecting communities from signed networks. Our method enables us to detect communities from large-scale real-world signed networks that represent relationships between users on websites such as Wikipedia, Slashdot and Epinions.
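As background, the standard (unsigned) Newman modularity that the abstract takes as its starting point can be computed as below. This is a sketch of the baseline measure only, assuming a simple undirected graph without self-loops or duplicate edges; it is not the signed extension proposed in the paper, whose definition the abstract does not give. The function name and input format are hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of a partition of an unsigned, undirected graph.

    edges     : list of (u, v) pairs, each undirected edge listed once
    community : dict mapping each node to its community label

    Uses Q = sum over communities c of  L_c / m - (d_c / (2m))^2,
    where m is the number of edges, L_c the number of edges inside c,
    and d_c the total degree of the nodes in c.
    """
    m = len(edges)
    degree = defaultdict(int)
    internal = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    total_degree = defaultdict(int)
    for node, k in degree.items():
        total_degree[community[node]] += k
    return sum(
        internal[c] / m - (total_degree[c] / (2 * m)) ** 2 for c in total_degree
    )
```

The null-model term `(d_c / (2m))^2` assumes every edge contributes positively to degree, which is one reason the measure does not carry over directly once negative edges are present.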
In this paper, we propose a new learning model for a decentralized autonomous smart grid involving adaptive trading agents that can sell and buy electric power effectively in a local electric power network. We name the electric power network i-Rene (inter intelligent renewable energy network). The trading agents manage the amount of electric power generated by solar panels or other renewable energy sources by trading electric power stored in a storage battery in a house. Each agent learns a trading strategy by maximizing its utility. Based on the proposed system, we evaluate price formation and the effectiveness of the adaptive trading method through simulations. Additionally, we propose a new variable consumption model for a decentralized autonomous smart grid involving both residents who consume electric power and the adaptive trading agents. Developing a variable consumption model is essential for modeling demand side management, which can control the amount of electric power consumption. We add the variable consumption model to the i-Rene model and evaluate price formation and the effectiveness of the decentralized autonomous smart grid in equalizing fluctuating demand.