Text-based communication in an online community obscures the characteristics of participants that aid social interaction. In this paper, we propose a new method for profiling participants in an online community to help them gain a better grasp of their social milieu, i.e., who the other participants are, what their characteristics are, and what roles they play. The proposed algorithm is based on the Influence Diffusion Model (IDM), a method for discovering influential comments, opinion leaders, and interesting terms from threaded online discussions. We applied the proposed algorithm to eight electronic message boards and confirmed higher precision and coverage values than those of traditional keyword-based profiling methods.
Recently, there has been growing interest in developing evolutionary algorithms based on probabilistic modeling. These are called probabilistic model-building genetic algorithms (PMBGAs) or estimation of distribution algorithms (EDAs). In this scheme, the offspring population is generated according to an estimated probability density model of the parent population instead of using recombination and mutation operators. In this paper, we propose PMBGAs for permutation domains using edge histogram based sampling algorithms (EHBSAs). Two types of sampling algorithms, without template (EHBSA/WO) and with template (EHBSA/WT), are presented. The algorithms were tested on the traveling salesman problem (TSP), and the results showed that EHBSA/WT worked fairly well with a small population size on the test problems used. It also worked better than well-known traditional two-parent recombination operators.
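The sampling idea in the abstract above can be illustrated with a minimal sketch of the without-template variant. It rests on assumptions not stated in the abstract: tours are permutations of 0..n-1, the edge histogram is symmetric, and every edge receives a small bias `eps` so that it stays samplable. The function names, the `b_ratio` parameter, and the exact form of `eps` are hypothetical and are not taken from the paper.

```python
import random

def edge_histogram(population, n, b_ratio=0.04):
    """Count how often each undirected edge (a, b) occurs in the parent tours."""
    # Assumed smoothing term: a small positive bias added to every edge.
    eps = b_ratio * 2 * len(population) / (n - 1)
    hist = [[0.0 if i == j else eps for j in range(n)] for i in range(n)]
    for tour in population:
        for k in range(n):
            a, b = tour[k], tour[(k + 1) % n]  # wrap around to close the tour
            hist[a][b] += 1.0
            hist[b][a] += 1.0
    return hist

def sample_without_template(hist, rng):
    """Build one offspring tour node by node, with no parent template."""
    n = len(hist)
    current = rng.randrange(n)
    tour = [current]
    unvisited = [c for c in range(n) if c != current]
    while unvisited:
        # Next node is drawn in proportion to the histogram count of the edge.
        weights = [hist[current][c] for c in unvisited]
        current = rng.choices(unvisited, weights=weights, k=1)[0]
        unvisited.remove(current)
        tour.append(current)
    return tour

parents = [[0, 1, 2, 3], [0, 2, 1, 3], [3, 2, 1, 0]]
hist = edge_histogram(parents, 4)
child = sample_without_template(hist, random.Random(0))
```

The with-template variant would instead copy part of an existing individual and resample only the remaining segment; that step is omitted here because the abstract does not specify it.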
This paper introduces an agent system that supports a user's daily work on the Internet, much as a secretary would. In this system, an agent is assigned to a user and receives requests from the user or from other agents. Since there are various kinds of requests, it is difficult to prepare a complete set of request-handling rules in advance. To handle various requests, the agent uses Case Based Reasoning (CBR), an approach that solves a problem by referring to old cases. Requests and cases are described as XML documents, which are easy for both the user and the agent to understand, because the agent needs to interact with the user. Describing documents in XML also enables an agent to match a request and a case more exactly. The agent consists of a request receipt module, a planning module, an executing module, and a case storage module. The request receipt module receives a request from the user or other agents. The request is described as an XML document through interaction with the user. The planning module searches for an old case similar to the request and generates a sequence of basic operations as a plan by referring to that case. The executing module executes the plan. If the agent fails to execute a basic operation or requires the user's instruction, it carries out the plan by interacting with the user. The case storage module stores the new case, together with the user's evaluation score, in the case base. The experimental results show that an increase in the number of cases raises both the proposal rate and the accuracy rate. However, too many cases may cause a decline in the accuracy rate because of inconsistency in the user's evaluations.
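The retrieval step of the planning module described above might look roughly like the following sketch. Everything specific in it is an assumption made for illustration: the XML element names, the exact-match similarity measure, and the `fields`, `similarity`, and `retrieve` helpers are hypothetical, and the abstract does not say how the actual system compares a request with a case.

```python
import xml.etree.ElementTree as ET

def fields(xml_text):
    """Flatten an XML document into a {tag: text} mapping of its leaf elements."""
    root = ET.fromstring(xml_text)
    return {el.tag: (el.text or "").strip() for el in root.iter() if len(el) == 0}

def similarity(request, case):
    """Fraction of the request's fields that the case matches exactly (assumed measure)."""
    if not request:
        return 0.0
    hits = sum(1 for tag, text in request.items() if case.get(tag) == text)
    return hits / len(request)

def retrieve(request_xml, case_base):
    """Return (best_case, score) for the stored case most similar to the request."""
    req = fields(request_xml)
    scored = [(similarity(req, fields(c["request"])), c) for c in case_base]
    best = max(scored, key=lambda pair: pair[0])
    return best[1], best[0]

# Hypothetical case base: each case pairs a past request with the plan that served it.
case_base = [
    {"request": "<request><action>reserve</action><target>hotel</target></request>",
     "plan": ["search", "book"]},
    {"request": "<request><action>buy</action><target>ticket</target></request>",
     "plan": ["search", "pay"]},
]
new_request = "<request><action>reserve</action><target>train</target></request>"
case, score = retrieve(new_request, case_base)
```

The plan of the retrieved case would then be adapted and executed; `retrieve` assumes a non-empty case base, and the user's evaluation score is left out of the ranking for brevity.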
For genetic algorithms, it is important to maintain population diversity. Several genetic algorithms that can control the diversity have been proposed, but they use the distance between two individuals to do so. As a result, their performance degrades on ill-scaled functions. In this paper, we propose a new genetic algorithm, DIDC (a genetic algorithm with Distance Independent Diversity Control), which does not use distance to control the population diversity. To control the diversity, DIDC uses two GAs with different natures: one uses a crossover operator as its search operator, and the other uses a mutation operator. By applying DIDC to several benchmark problems, we show that DIDC performs well on high-dimensional, multimodal, non-separable, and ill-scaled problems. Finally, we show that the control parameter of DIDC has the same effect on the search as the number of generated children nc.
We develop a browsing support system that learns a user's interests and highlights keywords based on the user's browsing history. Monitoring the user's access to the Web enables us to detect "familiar words" for the user. We extract keywords on the current page that are relevant to the familiar words and highlight them. The relevance is measured by the bias of co-occurrence, called IRM (Interest Relevance Measure). Our system consists of three components: a proxy server that monitors access to the Web, a frequency server that stores the frequency of words in the accessed Web pages, and a keyword extraction module. We show the effectiveness of our system through experiments.
This paper introduces a support system for making presentation slides from a technical paper. The system provides functions that assign slides to each section and place objects on a slide. The inputs to the system are a technical paper as a TeX document, the number of slides the user wants to make, and keywords of the paper. First, the system converts the paper from a TeX document into an XML document. The XML document can include information about the paper such as ID numbers and term weights. Next, the system calculates the weights of terms in the document by the TF*IDF method. Based on the term weights, objects in the document such as sentences, figures, and tables are weighted. Using the weights of the objects and slide composition templates, the system decides how many slides are assigned to each section. If the user does not like the assignment, she/he can reassign slides to the sections using a presentation composition editor. Then, the system selects a layout for each slide considering the objects in the slide, and extracts the objects to be arranged on the slide. The user can rearrange the objects on the slide using a slide editor. Finally, the output of the system is generated as presentation slides in XHTML. From the experimental results, we concluded that our system is useful for making presentation slides.
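The term-weighting and object-weighting steps mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: each section is treated as one document for the IDF computation, the plain `tf * log(N / df)` form of TF*IDF is used, and an object's weight is taken to be the sum of its term weights. The abstract does not give these details, and the function names are hypothetical.

```python
import math
from collections import Counter

def tf_idf(sections):
    """sections: list of token lists. Returns one {term: weight} dict per section."""
    n = len(sections)
    df = Counter()
    for tokens in sections:
        df.update(set(tokens))  # document frequency: sections containing the term
    weights = []
    for tokens in sections:
        tf = Counter(tokens)    # term frequency within this section
        weights.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return weights

def object_weight(tokens, term_weights):
    """Assumed rule: an object (e.g. a sentence) weighs the sum of its term weights."""
    return sum(term_weights.get(t, 0.0) for t in tokens)

sections = [["agent", "case", "case"], ["agent", "slide"]]
weights = tf_idf(sections)
sentence_weight = object_weight(["case", "agent"], weights[0])
```

A term that appears in every section, such as "agent" here, receives weight zero under this form of IDF, so it contributes nothing to the weight of a sentence that contains it.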
We propose a computer-based method of generating synonyms for a stimulus word. The Vector Space Model, in which words in text data are arranged in a multi-dimensional space and the degree of similarity between two words is calculated from how close they are in that space, can be applied to this task. However, it is not easy to optimize the parameters of the method because there is no appropriate standard synonym database in which proper synonyms for a stimulus word are thoroughly collected. Therefore, we first built such a standard database through two steps of experiments with human subjects, and then optimized the parameters of the synonym generation method. As a result, we found that the Vector Space Model-based method using an electronic dictionary as its source generates better synonyms than the same method using a text corpus and than an ordinary method using a thesaurus.
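The similarity computation at the core of a Vector Space Model can be sketched as below. Cosine similarity is assumed here as the closeness measure; the abstract only says that similarity is derived from how close the words are in the space. The sparse word vectors, their feature names, and the `synonym_candidates` helper are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as {feature: value} dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def synonym_candidates(word, vectors, top_k=3):
    """Rank all other words by similarity to the stimulus word and keep the top_k."""
    target = vectors[word]
    scored = [(cosine(target, vec), other)
              for other, vec in vectors.items() if other != word]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # ties broken alphabetically
    return [other for _, other in scored[:top_k]]

# Toy vectors, e.g. built from words that appear in each entry's dictionary definition.
vectors = {
    "big": {"size": 1.0, "large": 1.0},
    "huge": {"size": 1.0, "large": 0.9},
    "red": {"color": 1.0},
}
candidates = synonym_candidates("big", vectors, top_k=1)
```

In a setup like the one the abstract describes, parameters such as the choice of source text and the number of candidates kept would be tuned against the human-built synonym database.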