IPSJ Online Transactions
Online ISSN : 1882-6660
ISSN-L : 1882-6660
Volume 7
  • Lei Ma, Cyrille Artho, Hiroyuki Sato
    Type: Regular Papers
    Subject area: Regular Paper
    2014 Volume 7 Pages 1-13
    Published: 2014
    Released: January 08, 2014
    JOURNALS FREE ACCESS
    With today's importance of distributed applications, their verification and analysis are still challenging. They involve large combinational states, interactive network communications between peers, and concurrency. Although there are some dynamic analysis tools for analyzing the runtime behavior of a single-process application, they do not provide methods to analyze distributed applications as a whole, where multiple processes run simultaneously. Centralization is a general solution which transforms multi-process applications into a single-process one that can be directly analyzed by existing tools. In this paper, we improve the accuracy of centralization. Moreover, we extend it as a general framework for analyzing distributed applications with multiple versions. First, we formalize the version conflict problem and present a simple solution, and further propose an optimized solution to resolving class version conflicts during centralization. Our techniques enable sharing common code whenever possible while keeping the version space of each component application separate. Centralization issues like startup semantics and static field transformation are improved and discussed. We implement and apply our centralization tool to some network benchmarks. Experiments, where existing tools are used on the centralized application, prove the usefulness of our automatic centralization tool, showing that centralization enables these tools to analyze distributed applications with multiple versions.
    Download PDF (841K)
  • Kazuyuki Hara, Kentaro Katahira
    Type: Regular Papers
    Subject area: Regular Paper
    2014 Volume 7 Pages 14-19
    Published: 2014
    Released: January 08, 2014
    JOURNALS FREE ACCESS
    In on-line gradient descent learning, the local property of the derivative term of the output function can slow convergence. Improving the derivative term, such as by using the natural gradient, has been proposed for speeding up the convergence. Besides this sophisticated method, we propose an algorithm that replaces the derivative term with a constant and show that this greatly increases convergence speed when the learning step size is less than 2.7, which is near the optimal learning step size. The proposed algorithm is inspired by linear perceptron learning and can avoid locality of the derivative term. We derive the closed deterministic differential equations by using a statistical mechanics method and show the validity of the theoretical results by comparing them with computer simulation solutions. In real problems, the optimum learning step size is not given in advance. Therefore, the learning step size must be small. The proposed method is useful in this case.
    Download PDF (540K)
  • Satoru Tokuda, Kenji Nagata, Masato Okada
    Type: Regular Papers
    Subject area: Regular Paper
    2014 Volume 7 Pages 20-26
    Published: 2014
    Released: January 08, 2014
    JOURNALS FREE ACCESS
    The radial basis function (RBF) network is a regression model that uses the sum of radial basis functions such as Gaussian functions. It has recently been widely applied to spectral deconvolution such as X-ray photoelectron spectroscopy data analysis, which enables us to estimate the electronic state of matter from the spectral peak positions. For models with a hierarchy such as the RBF network, Bayesian learning provides better generalization performance than the maximum likelihood estimation. In Bayesian learning, the learning coefficient is well-known as the coefficients of the leading terms for the asymptotic expansion of generalization error and stochastic complexity. However, these coefficients have not been clarified in most models. We propose here a novel method for calculating the learning coefficient by using the exchange Monte Carlo method. In addition, we calculated the learning coefficient in the RBF networks and verified the efficiency of the proposed method by comparing theoretical and experimental values.
    Download PDF (433K)
  • Ryota Miyata, Toru Aonishi, Jun Tsuzurugi, Koji Kurata
    Type: Regular Papers
    Subject area: Regular Paper
    2014 Volume 7 Pages 27-32
    Published: 2014
    Released: January 08, 2014
    JOURNALS FREE ACCESS
    Many associative memory models with synaptic decay, such as the forgetting model and the zero-order decay model, have been proposed and studied so far. The previous studies showed the relation between the storage capacity C and the synaptic decay coefficient α in each synaptic decay model. However, with the exception of a few studies, they did not compare the network retrieval performance between different synaptic decay models. We formulate the associative memory model with the β-th-order synaptic decay as an extension of the zero-order decay model. The parameter β denotes the synaptic decay order or the degree of the synaptic decay term, which enables us to compare the retrieval performance between different synaptic decay models. Using numerical simulations, we investigate the relation between the synaptic decay coefficient α and the storage capacity C of the network by varying the synaptic decay order β. The results show that the properties of the synaptic decay model are constant for a large decay order β. Moreover, we search for the minimum β to avoid overloading and the optimal β to maximize the network retrieval performance. The minimum integer value of β to avoid overloading is -1. The optimal integer value of β to maximize the network retrieval performance is 1, i.e., the degree of the forgetting model, and the suboptimal integer β is 0, i.e., that of the zero-order synaptic decay model.
    Download PDF (411K)
  • Yu Liu, Kento Emoto, Kiminori Matsuzaki, Zhenjiang Hu
    Type: Regular Papers
    Subject area: Regular Paper
    2014 Volume 7 Pages 33-42
    Published: 2014
    Released: January 23, 2014
    JOURNALS FREE ACCESS
    The MapReduce programming model attracts a lot of enthusiasm in both industry and academia, largely because it simplifies the implementation of many data-parallel applications. In spite of the simplicity of the programming model, many applications are hard to implement with MapReduce because of their inherent computational dependencies. In this paper we propose a new approach that uses the programming pattern accumulate over MapReduce to handle a large class of problems that cannot be simply divided into independent sub-computations. Using this accumulate pattern, many problems that have computational dependencies can be easily expressed, and the programs are then transformed into MapReduce programs executed on large clusters. Users without much knowledge of MapReduce can also easily write programs in a sequential manner and still obtain efficient and scalable MapReduce programs. We describe the programming interface of our accumulate framework and explain how to transform a user-specified accumulate computation into an efficient MapReduce program. Our experiments and evaluations illustrate the usefulness and efficiency of the framework.
    Download PDF (821K)
  • Masaki Kasuya, Kenji Kono
    Type: Regular Papers
    Subject area: Security
    2014 Volume 7 Pages 43-51
    Published: 2014
    Released: March 28, 2014
    JOURNALS FREE ACCESS
    Fake antivirus (AV) software, a kind of malware, pretends to be a legitimate AV product and frightens computer users by showing fake security alerts, as if their computers were infected with malware. In addition, fake AV urges users to purchase a “commercial” version of the fake AV. In this paper, we search for an indicator that captures behavioral differences in legitimate AV and fake AV. The key insight behind our approach is that legitimate AV behaves differently in clean and infected environments, whereas fake AV behaves similarly in both environments, because it does not analyze malware in the infected environments. We have investigated three potential indicators, file access pattern, CPU usage, and memory usage, and found that memory usage is an effective indicator to distinguish legitimate AV from fake AV. In an experiment, this indicator identifies all fake AV samples (39 out of 39) as fake and all legitimate AV products (8 out of 8) as legitimate. It is impractical for fake AV to evade this indicator because to do so would require it to detect malware infections, just as legitimate AV does.
    Download PDF (1153K)
  • Shimpei Yotsukura, Toshiaki Omori, Kenji Nagata, Masato Okada
    Type: Regular Papers
    Subject area: Regular Paper
    2014 Volume 7 Pages 52-58
    Published: 2014
    Released: March 31, 2014
    JOURNALS FREE ACCESS
    The spike-triggered average (STA) and phase response curve characterize the response properties of single neurons. A recent theoretical study proposed a method to estimate the phase response curve by means of linear regression with Fourier basis functions. In this study, we propose a method to estimate the STA by means of sparse linear regression with Fourier and polynomial basis functions. In the proposed method, we use sparse estimation with L1 regularization to extract substantial basis functions for the STA. We show using simulated data that the proposed method achieves more accurate estimation of the STA than the simple trial average used in the conventional method.
    Download PDF (558K)
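For readers unfamiliar with the baseline mentioned in the abstract above, the simple trial average can be sketched as follows. This is a minimal illustration and not the authors' code: the function name `spike_triggered_average`, the representation of spikes as sample indices, and the convention that the window covers the samples strictly before each spike are all assumptions made here. The sparse regression method proposed in the paper is not shown.

```python
def spike_triggered_average(stimulus, spike_times, window):
    """Average the `window` stimulus samples that precede each spike.

    stimulus    : list of floats, one value per time step (assumed layout)
    spike_times : list of integer time-step indices at which spikes occurred
    window      : number of samples before each spike to average over
    """
    # Keep only spikes that have a full window of stimulus history.
    segments = [stimulus[t - window:t] for t in spike_times
                if window <= t <= len(stimulus)]
    if not segments:
        raise ValueError("no spike has a full stimulus window before it")
    # Element-wise mean across the spike-aligned segments.
    return [sum(column) / len(segments) for column in zip(*segments)]


if __name__ == "__main__":
    stimulus = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    print(spike_triggered_average(stimulus, [2, 4], window=2))
```

With few spikes this average is noisy, which is the situation the abstract's regularized estimator is meant to address.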
  • Yutaka Matsuno
    Type: Regular Papers
    Subject area: Regular Paper
    2014 Volume 7 Pages 59-68
    Published: 2014
    Released: June 12, 2014
    JOURNALS FREE ACCESS
    Assurance cases are documented bodies of evidence that provide a valid and convincing argument that a system is adequately dependable in a given application and environment. Assurance cases are widely required by regulation for safety-critical systems in the EU. There have been several graphical notations for assurance cases. GSN (Goal Structuring Notation) and CAE (Claim, Argument, Evidence) are two such notations. However, these notations have not been defined in a formal way. This paper presents a formal definition of GSN and its pattern extensions. We take the framework of functional programming languages as the basis of our study. The implementation has been done on an Eclipse-based GSN editor. We report case studies on previous works about GSN and show the applicability of the design and implementation. This is a step toward developing an assurance case language.
    Download PDF (1721K)
  • Kosetsu Ikeda, Nobutaka Suzuki
    Type: Regular Papers
    Subject area: Regular Paper
    2014 Volume 7 Pages 69-81
    Published: 2014
    Released: July 02, 2014
    JOURNALS FREE ACCESS
    Suppose that we have a DTD and XML documents valid against the DTD, and consider writing an XPath query to the documents. Unfortunately, a user often does not understand the entire structure of the documents exactly, especially when the documents are very large and/or complex, or when the DTD has been updated without the user noticing. In such cases, the user tends to write an invalid XPath query. However, it is difficult for the user to correct the query by hand due to his/her lack of exact knowledge about the entire structure of the documents. In this paper, we propose an algorithm that finds, for an XPath query q, a DTD D, and a positive integer K, the top-K XPath queries most syntactically close to q among the XPath queries conforming to D, so that a user can select an appropriate query among the K queries. We also present some experimental studies.
    Download PDF (983K)
  • Satoshi Sugiyama, Yasuhiko Minamide
    Type: Regular Papers
    Subject area: Regular Paper
    2014 Volume 7 Pages 82-92
    Published: 2014
    Released: July 16, 2014
    JOURNALS FREE ACCESS
    Most implementations of regular expression matching in programming languages are based on backtracking. With this implementation strategy, matching may not be achieved in linear time with respect to the length of the input. In the worst case, it may take exponential time. In this paper, we propose a method of checking whether or not regular expression matching runs in linear time. We construct a top-down tree transducer with regular lookahead that translates the input string into a tree corresponding to the execution steps of matching based on backtracking. The regular expression matching then runs in linear time if the tree transducer is of linear size increase. To check this property of the tree transducer, we apply a result of Engelfriet and Maneth. We implemented the method in OCaml and conducted experiments that checked the time linearity of regular expressions appearing in several popular PHP programs. Our implementation showed that 47 of 393 regular expressions were not linear.
    Download PDF (292K)
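The exponential worst case mentioned in the abstract above can be observed with a toy backtracking matcher. The sketch below is illustrative only and is unrelated to the authors' OCaml implementation or their tree-transducer construction; the tuple-based pattern encoding, the `Matcher` class, and the helper `count_steps` are hypothetical names introduced here. It counts matcher invocations for the pattern `(a|a)*b` on inputs consisting only of `a`, where every way of splitting the input between the two alternatives is tried before the match fails.

```python
class Matcher:
    """Tiny backtracking matcher over tuple-encoded patterns (toy encoding)."""

    def __init__(self, text):
        self.text = text
        self.steps = 0  # number of matcher invocations, a proxy for run time

    def run(self, node, i, k):
        """Match `node` at position i, then call continuation k on the new position."""
        self.steps += 1
        kind = node[0]
        if kind == "char":
            return i < len(self.text) and self.text[i] == node[1] and k(i + 1)
        if kind == "seq":
            return self.run(node[1], i, lambda j: self.run(node[2], j, k))
        if kind == "alt":
            # Try the left branch first; backtrack into the right branch on failure.
            return self.run(node[1], i, k) or self.run(node[2], i, k)
        if kind == "star":
            # Greedy: try one more iteration (which must consume input), else stop.
            return (self.run(node[1], i, lambda j: j > i and self.run(node, j, k))
                    or k(i))
        raise ValueError("unknown pattern node: %r" % (kind,))


# (a|a)*b : the two identical alternatives make every prefix ambiguous.
PATTERN = ("seq", ("star", ("alt", ("char", "a"), ("char", "a"))), ("char", "b"))


def count_steps(text):
    """Return (matched, steps) for a whole-string match of PATTERN against text."""
    m = Matcher(text)
    matched = bool(m.run(PATTERN, 0, lambda j: j == len(text)))
    return matched, m.steps


if __name__ == "__main__":
    for n in (4, 8, 12):
        print(n, count_steps("a" * n))
```

In this toy matcher, each additional `a` in a non-matching input more than doubles the step count, which is the kind of non-linear behavior the paper's check is designed to detect statically.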
  • Shingo Okuno, Tasuku Hiraishi, Hiroshi Nakashima, Masahiro Yasugi, Jun ...
    Type: Regular Papers
    Subject area: Regular Paper
    2014 Volume 7 Pages 93-110
    Published: 2014
    Released: July 16, 2014
    JOURNALS FREE ACCESS
    This paper proposes a parallel algorithm to extract all connected subgraphs, each of which shares a common itemset whose size is not less than a given threshold, from a given graph in which each vertex is associated with an itemset. We also propose implementations of this algorithm using the task-parallel language Tascell. This kind of graph mining can be applied to the analysis of social or biological networks. We have already proposed an efficient sequential search algorithm called COPINE for this problem. COPINE reduces the search space of a dynamically growing tree structure by pruning its branches corresponding to the following subgraphs: those already visited, those having itemsets smaller than the threshold, and those having already-visited supergraphs with identical itemsets. For the third pruning, we use a table associating already-visited subgraphs and their itemsets. To avoid excess pruning in a parallel search where a unique set of subtrees (tasks) is assigned to each worker, we should put a certain restriction on a worker when it is referring to a table entry registered by another worker. We designed a parallel algorithm as an extension of COPINE by introducing this restriction. A problem of the implementation is how workers can efficiently share the table entries so that a worker can safely use as many entries registered by other workers as possible. We implemented two sharing methods: (1) a victim worker makes a copy of its own table and passes it to a thief worker when the victim spawns a task by dividing its task and assigns it to the thief, and (2) a single table controlled by locks is shared among workers. We evaluated these implementations using a real protein network. As a result, the single table implementation achieved a speedup of approximately a factor of four with 16 workers.
    Download PDF (1580K)
  • Takayuki Kawamura, Kiminori Matsuzaki
    Type: Regular Papers
    Subject area: Regular Paper
    2014 Volume 7 Pages 111-121
    Published: 2014
    Released: July 16, 2014
    JOURNALS FREE ACCESS
    Tree data such as XML trees have recently been getting larger and larger. Parallel and distributed processing is a promising way of dealing with big data, but we need to divide the data in the first step. Since computation over trees often requires relationships between parents and children and/or among siblings, we should pay attention to such relationships. There is a technique called the “m-bridge” for dividing trees. We can easily compute m-bridges for trees of any shape. However, division with the m-bridge technique is sometimes unsatisfactory for shallow XML trees. We propose a method of tree division for XML trees in this study, in which we apply the m-bridge technique to a one-to-one corresponding binary tree. We implement the tree division algorithm using the Simple API for XML (SAX) Parser. An important feature of our algorithm is that we transform and divide XML trees in the order that the SAX parser reads the trees. We carried out experiments and discuss the properties of the tree division algorithm we propose. In addition, we discuss how we can use the divided trees with query examples.
    Download PDF (483K)
  • Takashi Nakada, Kazuya Okamoto, Toshiya Komoda, Shinobu Miwa, Yohei Sa ...
    Type: Regular Papers
    Subject area: Embedded System
    2014 Volume 7 Pages 122-131
    Published: 2014
    Released: August 21, 2014
    JOURNALS FREE ACCESS
    Shifting to multi-core designs is a pervasive trend for overcoming the power wall, and it is a necessary move for embedded systems in our rapidly evolving information society. Meanwhile, the need to increase the battery life and reduce maintenance costs for such embedded systems is critical. Therefore, a wide variety of power reduction techniques have been proposed and realized, including Clock Gating, DVFS and Power Gating. To maximize the effectiveness of these techniques, task scheduling is key, but for multi-core systems it is very complicated due to the huge exploration space. This problem is a major obstacle for further power reduction. To cope with it, we propose a design method for embedded systems to minimize their energy consumption under performance constraints. This method is based on the clarification of properties of the above-mentioned low power techniques and their interactions. In more detail, we first establish energy models for these low power techniques and our target systems. We then search for the best configuration by constructing an optimization problem, especially for applications which have a longer deadline than the execution interval. Finally, we propose an approximate solution using dynamic programming with a lower computation complexity and compare it to a brute force explicit solution. We confirm with our evaluations that the proposed method successfully found a better configuration, which reduces the total energy consumption by 32% compared to the manually optimized configuration, which utilizes only one core.
    Download PDF (1450K)
  • Hua Vy Le Thanh, Xin Li
    Type: Regular Papers
    Subject area: Regular Paper
    2014 Volume 7 Pages 132-138
    Published: 2014
    Released: September 01, 2014
    JOURNALS FREE ACCESS
    Pushdown systems (PDSs) are well-understood as abstract models of recursive sequential programs, and weighted pushdown systems (WPDSs) are a general framework for solving certain meet-over-all-path problems in program analysis. Conditional WPDSs (CWPDSs) further extend WPDSs to enhance their expressiveness: each transition is guarded by a regular language over the stack that specifies conditions under which a transition rule can be applied. CWPDSs or their instances are shown to have wide applications in the analysis of object-oriented programs, access rights analysis, etc. Model checking CWPDSs was shown to be reducible to model checking WPDSs, and an offline algorithm was given that translates CWPDSs to WPDSs by synchronizing the underlying PDS and finite state automata accepting regular conditions. The translation, however, can cause an exponential blow-up of the system. This paper presents an on-the-fly model checking algorithm for CWPDSs that synchronizes the computing machineries on-demand while computing post-images of regular configurations. We developed an on-the-fly model checker for CWPDSs and applied it to models generated from the reachability analysis of the HTML5 parser specification. Our preliminary experiments show that the on-the-fly algorithm drastically outperforms the offline algorithm regarding both practical space and time efficiency.
    Download PDF (329K)
  • Katsuya Kawanami, Noriyuki Fujimoto
    Type: Regular Papers
    Subject area: Regular Paper
    2014 Volume 7 Pages 139-147
    Published: 2014
    Released: November 28, 2014
    JOURNALS FREE ACCESS
    The longest common subsequence (LCS) for two given strings has various applications, such as for the comparison of deoxyribonucleic acid (DNA). In this paper, we propose a graphics processing unit (GPU) algorithm to accelerate Hirschberg's LCS algorithm improved with Crochemore et al.'s bit-parallel algorithm. Crochemore et al.'s algorithm includes bitwise logical operators, which can be computed easily in parallel because they have bitwise parallelism. However, Crochemore et al.'s algorithm also includes an operator with less parallelism, i.e., an arithmetic sum. In this paper, we focus on how to implement these operators efficiently in parallel and experimentally show the following results. First, the proposed GPU algorithm with a 2.67GHz Intel Core i7 920 CPU and GeForce GTX 580 GPU performs a maximum of 12.81 times faster than the bit-parallel CPU algorithm using a single-core 2.67GHz Intel Xeon X5550 CPU. Subsequently, the proposed GPU algorithm executes a maximum of 4.56 times faster than the bit-parallel CPU algorithm using a four-core 2.67GHz Intel Xeon X5550 CPU. Furthermore, the proposed algorithm with GeForce 8800 GTX performs 10.9 to 18.1 times faster than Kloetzli et al.'s existing GPU algorithm with the same GPU.
    Download PDF (817K)
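The bit-parallel recurrence referred to in the abstract above mixes bitwise operators with one arithmetic sum. A minimal CPU-side sketch of the LCS length computation in the style of Crochemore et al. is given below, using Python's arbitrary-precision integers as bit vectors. It is not the GPU algorithm of the paper and omits Hirschberg's divide-and-conquer step for recovering the subsequence itself; the function name `lcs_length` is an assumption made for this illustration.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of a and b (bit-parallel sketch)."""
    m = len(a)
    if m == 0:
        return 0
    full = (1 << m) - 1  # m-bit mask, one bit per character of a

    # match[c] has bit i set exactly when a[i] == c.
    match = {}
    for i, ch in enumerate(a):
        match[ch] = match.get(ch, 0) | (1 << i)

    v = full  # all ones: no matches found yet
    for ch in b:
        mask = match.get(ch, 0)
        # The addition propagates carries across the word (the low-parallelism
        # operator named in the abstract); the AND/OR/NOT parts are bitwise.
        v = ((v + (v & mask)) | (v & ~mask)) & full

    # Each zero bit in v corresponds to one unit of LCS length.
    return m - bin(v).count("1")


if __name__ == "__main__":
    print(lcs_length("ABCBDAB", "BDCABA"))
```

The carry chain inside the addition is what makes a parallel implementation non-trivial once the bit vector spans many machine words, which is the implementation problem the abstract focuses on.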
  • Akimasa Morihata, Kiminori Matsuzaki
    Type: Regular Papers
    Subject area: Regular Paper
    2014 Volume 7 Pages 148-156
    Published: 2014
    Released: December 10, 2014
    JOURNALS FREE ACCESS
    Parallel tree contraction is a well-established method of parallel tree processing. There are efficient and useful algorithms for binary trees, including the Shunt contraction algorithm and one based on the m-bridge decomposition method. However, for trees of unbounded degree, there are few practical tree contraction algorithms. The standard approach is “binarization,” namely to translate the input tree to a full binary tree beforehand. To prevent the overhead introduced by binarization, we previously proposed the Rake-Shunt contraction algorithm (ICCS 2011), which is a generalization of the Shunt contraction algorithm to trees of unbounded degree. This paper further extends this result. The major contribution is to show that the Rake-Shunt contraction algorithm is a tree contraction algorithm that uses fewer types of primitive contraction operations if we assume the input tree has been binarized. This observation clarifies the connection between the Rake-Shunt contraction algorithm and those based on binarization. In particular, it enables us to translate a parallel program developed based on the Rake-Shunt contraction algorithm to one based on the m-bridge decomposition method. Thus, we can choose whether to use binarization according to the situation.
    Download PDF (375K)
  • Kenichi Kourai, Hisato Utsunomiya
    Type: Regular Papers
    Subject area: Virtualization
    2014 Volume 7 Pages 157-167
    Published: 2014
    Released: December 18, 2014
    JOURNALS FREE ACCESS
    Since Infrastructure-as-a-Service (IaaS) clouds contain many vulnerable virtual machines (VMs), intrusion detection systems (IDSes) should be run for all the VMs. IDS offloading is promising for this purpose because it allows IaaS providers to run IDSes outside of VMs without any cooperation of the users. However, offloaded IDSes cannot continue to monitor their target VM when the VM is migrated to another host. In this paper, we propose VMCoupler for enabling co-migration of offloaded IDSes and their target VM. Our approach is running offloaded IDSes in a special VM called a guard VM, which can monitor the internals of a target VM using VM introspection. VMCoupler can migrate a guard VM together with its target VM and restore the state of VM introspection at the destination. The migration processes of these two VMs are synchronized so that a target VM does not run without being monitored. We have confirmed that the overhead of monitoring and co-migration was small.
    Download PDF (1000K)
  • Akimasa Yoshida, Yuki Ochi, Nagatsugu Yamanouchi
    Type: Regular Papers
    Subject area: Compiler
    2014 Volume 7 Pages 168-178
    Published: 2014
    Released: December 18, 2014
    JOURNALS FREE ACCESS
    Multicore processors are widely used for various types of computers. In order to achieve high performance on such multicore systems, it is necessary to extract coarse grain task parallelism from a target program in addition to loop parallelism. For the development of parallel programs, Java or a Java-extension language has recently become an attractive choice, thanks to its performance improvements as well as its platform independence. Therefore, this paper proposes a parallel Java code generation scheme that realizes coarse grain task parallel processing with layer-unified execution control. In this parallel processing, coarse grain tasks of all layers are collectively managed through a dynamic scheduler. In addition, we have developed a prototype parallelizing compiler for Java programs with directives. In performance evaluations, the compiler-generated parallel Java code was confirmed to attain high performance. Concretely, we obtained speed-ups of 7.82 times for the Jacobi program, 7.38 times for the Turb3d program, 6.54 times for the Crypt program, and 6.15 times for the MolDyn program on eight cores of Xeon E5-2660.
    Download PDF (1703K)