Now that billions of people carry sensor-enabled mobile devices (e.g., smartphones), exploiting the powerful capabilities of such commercial mobile products has become a promising approach for large-scale environmental and human-behavioral sensing. This new paradigm of scalable context monitoring is known as opportunistic sensing, and has been successfully applied to a broad range of applications. In this paper, we briefly introduce the basic architecture and building blocks on which these emerging systems are based, and then provide a survey of recent progress in opportunistic sensing technology.
To provide event-driven services in IoT, scalable methods of topic-based pub/sub messaging are indispensable. Methods using structured overlay networks are promising candidates. However, existing methods have the problem of wasting network resources, because they lack adaptivity to “exhaust data,” which have low or no value most of the time. The problem contains two aspects. One is that each publisher node continues to forward data to a relay node even if there are no subscribers. The other is that excessively large multicast trees are constructed for low value data, which will be received by only a small number of subscribers. In this paper, we formulate the desirable design of overlay networks by defining a property called “strong relay-free” as an expansion of relay-free property. The property involves publishers and subscribers composing connected subgraphs to enable detecting the absence of subscribers and autonomously adjusting the tree size. We also propose a practical method satisfying the property by using Skip Graph, and evaluate it through simulation experiments. We confirmed that the proposed method can suspend publishing adaptively, and shorten the path length on multicast trees by more than 75% under an experimental condition with 100,000 nodes.
The Dalvik virtual machine (Dalvik VM) is an essential piece of software that runs applications on the Android operating system. Android application programs are commonly written in the Java language and compiled to Java bytecode. The Java bytecode is converted to Dalvik bytecode (a Dalvik Executable file), which is interpreted by the Dalvik VM on typical Android devices. The significant disadvantage of interpretation is that programs execute much more slowly than machine code running directly on the host CPU. However, there are many techniques to improve the performance of the Dalvik VM. A typical methodology is just-in-time compilation, which converts frequently executed sequences of interpreted instructions to host machine code. Other methodologies include dedicated bytecode processors and architectural extensions of existing processors. In this paper, we propose an alternative methodology, the “Fetch & Decode Hardware Extension,” to improve the performance of the Dalvik VM. The Fetch & Decode Hardware Extension is a specially designed hardware component that fetches and decodes Dalvik bytecode directly, while the core computations within the virtual registers are done by optimized Dalvik bytecode software handlers. The experimental results show speed improvements of up to 2.4x, 2.7x, and 1.8x on arithmetic instructions, loop and conditional instructions, and method invocation and return instructions, respectively. The approximate size of the proposed hardware extension is 0.03 mm² (equivalent to 10.56 Kgates), and it consumes only 0.23 mW of additional power. These results were obtained from logic synthesis using TSMC 90 nm technology at a 200 MHz clock frequency.
Many dynamic malware analysis systems based on hypervisors have been proposed. Although they support malware analysis effectively, many of them have a shortcoming that permits the malware to easily recognize the virtualized hardware and change its execution to prevent analysis. We contend that this drawback can be mitigated using a hypervisor that virtualizes the minimum number of hardware accesses. This paper proposes a hypervisor-based mechanism that can function as a building block for dynamic malware analysis systems. The mechanism provides the facility for checkpointing and restoring a guest OS. It is designed for a parapass-through hypervisor, that is, a hypervisor that runs directly on the hardware and does not execute a host OS or an administrative guest OS. The advantage of using a parapass-through hypervisor is that it provides a virtual machine whose hardware configuration and behavior are similar to those of the underlying physical machine, and hence, it can be stealthier than other hypervisors. We extend the parapass-through hypervisor BitVisor with the proposed mechanism, and demonstrate that the resulting system can successfully checkpoint and restore the states of Linux and Windows OSes. We confirm that hypervisor detectors running on the system cannot identify the virtualized hardware, and instead conclude that they are executing on a physical machine. We also confirm that the system imposes minimal overhead on the execution times of the benchmark programs.
Memcached has been widely accepted as a technology to improve the response speed of web servers by caching data on DRAMs in distributed servers. Because of its importance, the acceleration of memcached has been studied on various platforms. Among them, FPGA looks the most attractive platform to run memcached, and several research groups have tried to obtain a much higher performance than that of CPU out of it. The difficulty encountered there, however, is how to manage large-sized memory (gigabytes of DRAMs) from memcached hardware built in an FPGA. Some groups are trying to solve this problem by using an embedded CPU for memory allocation and another group is employing an SSD. Unlike other approaches that try to replace memcached itself on FPGAs, our approach augments the software memcached running on the host CPU by caching its data and some operations at the FPGA-equipped network interface card (NIC) mounted on the server. The locality of memcached data enables the FPGA NIC to have a fairly high hit rate with a smaller memory. In this paper, we describe the architecture of the proposed NIC cache, and evaluate the effectiveness with a standard key-value store (KVS) benchmarking tool. Our evaluation shows that our system is effective if the workload has temporal locality but does not handle workloads well without such a characteristic. We further propose methods to overcome this problem and evaluate them. As a result, we estimate that the latency improved by up to 3.5 times over software memcached running on a high performance CPU.
This new toolchain for accelerating applications on CPU-FPGA platforms, called Courier-FPGA, extracts runtime information from a running target binary and reconstructs the function call graph, including input-output data. Then, it synthesizes hardware modules for the FPGA and generates software functions for the CPU by using its Pipeline Generator. The Pipeline Generator also builds a pipeline control program by using Intel Threading Building Blocks (Intel TBB) to run both hardware modules and software functions in parallel. Finally, Courier-FPGA's Function Off-loader dynamically replaces and off-loads the original functions in the binary by using the built pipeline. Courier-FPGA performs the off-loading without user intervention, source code tweaks, or recompilation of the binary. In our case studies, Courier-FPGA was used to accelerate a histogram-of-gradients (HOG) feature detection program on the Zynq platform. A series of functions were off-loaded, and the program was sped up 3.98 times by using the built pipeline.
Reverse k-nearest neighbor (RkNN) queries on road network distances require long processing times because most conventional algorithms require a k-nearest neighbor (kNN) search on every visited node. This causes a large number of node expansions; therefore, the processing time is drastically increased when data points are sparsely distributed. In this paper, we propose a fast RkNN search algorithm that runs using a simple materialized path view (SMPV). In addition, we adopt an incremental Euclidean restriction strategy for fast kNN queries, the main function in RkNN queries. The SMPV used in our proposed algorithm only constructs an individual partitioned subgraph; therefore, the amount of data is drastically reduced compared to conventional materialized path views (MPVs). According to our experimental results using real road network data, our proposed method achieved a processing time that was 100 times faster than conventional approaches when data points are sparsely distributed on a road network.
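To make the query definition concrete, the following is a minimal brute-force sketch of an RkNN query on a road network, assuming the usual definition: a data point p belongs to the answer if fewer than k other data points are strictly closer to p than the query node. It runs one Dijkstra search per data point and is only a reference baseline, not the SMPV-based algorithm proposed in the paper. The graph layout, node names, and function names are hypothetical.

```python
import heapq

def shortest_distances(graph, source):
    """Dijkstra's algorithm; graph maps node -> list of (neighbor, weight)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def rknn_bruteforce(graph, points, query, k):
    """Return the data points that have `query` among their k nearest neighbors."""
    inf = float("inf")
    result = []
    for p in points:
        dist = shortest_distances(graph, p)
        d_query = dist.get(query, inf)
        if d_query == inf:
            continue  # query is unreachable from p
        # Count the other data points that are strictly closer to p than the query.
        closer = sum(1 for o in points if o != p and dist.get(o, inf) < d_query)
        if closer < k:
            result.append(p)
    return result

# Hypothetical road network: a path a - b - c - d - e with unit edge lengths.
road = {
    "a": [("b", 1.0)],
    "b": [("a", 1.0), ("c", 1.0)],
    "c": [("b", 1.0), ("d", 1.0)],
    "d": [("c", 1.0), ("e", 1.0)],
    "e": [("d", 1.0)],
}
print(rknn_bruteforce(road, ["a", "d", "e"], "b", 1))
```

The cost of this baseline grows with the number of data points times the cost of a full shortest-path search, which is the kind of overhead that materialized path views aim to avoid.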
It is often useful to compute a large number of matrix exponentials in computer graphics (CG). The exponential of a matrix is used for the smooth deformation of 2D or 3D meshed CG objects. Hence, we need to compute a large number of exponentials of 3 × 3 rotational matrices and 3 × 3 real symmetric matrices. For rotational matrices, Rodrigues' formula is known to compute their exponentials. We investigated the polynomial methods introduced by Moler and Van Loan to compute the exponential of a 3 × 3 real symmetric matrix, and we introduce an algorithm for the eigenvalues of 3 × 3 real symmetric matrices. We introduce a simple formula for the matrix exponential of a 3 × 3 real symmetric matrix using a formula introduced by Kaji et al. in 2013 and Viète’s formula. Since our matrix exponential algorithm does not use eigenvectors, we are able to reduce the computational cost using a fast eigenvalue computation algorithm. Then, we incorporated our implementation into a shape deforming tool developed by Kaji et al. As a result, we achieved a notable performance improvement. In fact, we show that our algorithm for matrix exponentials is about 76% faster than a standard algorithm for given 3 × 3 real symmetric matrices. For the deformation of a CG model, our algorithm was about 19% faster than a standard algorithm.
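The idea of an eigenvector-free exponential can be illustrated with a minimal sketch. It assumes two standard ingredients that are not taken from the paper itself: the trigonometric (Viète-style) closed form for the roots of the characteristic cubic of a 3 × 3 real symmetric matrix, and Sylvester's interpolation formula exp(A) = Σ_i exp(λ_i) Π_{j≠i} (A − λ_j I)/(λ_i − λ_j) over the distinct eigenvalues. The function names and the tolerance used to merge nearly equal eigenvalues are hypothetical choices, and the authors' actual formula may differ.

```python
import math

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

def matmul3(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def shifted(a, lam, scale):
    """Return (A - lam * I) * scale."""
    return [[(a[i][j] - (lam if i == j else 0.0)) * scale for j in range(3)] for i in range(3)]

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def sym3_eigenvalues(a):
    """Eigenvalues (descending) of a 3x3 real symmetric matrix, no eigenvectors."""
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    off = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * off
    if p2 == 0.0:
        return (q, q, q)  # A is a multiple of the identity
    p = math.sqrt(p2 / 6.0)
    r = det3(shifted(a, q, 1.0 / p)) / 2.0
    r = max(-1.0, min(1.0, r))  # guard against rounding outside [-1, 1]
    phi = math.acos(r) / 3.0
    e1 = q + 2.0 * p * math.cos(phi)
    e3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    e2 = 3.0 * q - e1 - e3  # the trace equals the sum of the eigenvalues
    return (e1, e2, e3)

def sym3_expm(a, rel_tol=1e-7):
    """exp(A) via Sylvester's formula over the distinct eigenvalues of A."""
    eig = sym3_eigenvalues(a)
    scale = max(1.0, max(abs(v) for v in eig))
    distinct = []
    for v in eig:  # merge nearly equal eigenvalues (assumed tolerance)
        if all(abs(v - d) > rel_tol * scale for d in distinct):
            distinct.append(v)
    result = [[0.0] * 3 for _ in range(3)]
    for i, li in enumerate(distinct):
        proj = IDENTITY
        for j, lj in enumerate(distinct):
            if j != i:
                proj = matmul3(proj, shifted(a, lj, 1.0 / (li - lj)))
        weight = math.exp(li)
        for row in range(3):
            for col in range(3):
                result[row][col] += weight * proj[row][col]
    return result

print(sym3_expm([[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 1.0]]))
```

Merging close eigenvalues keeps the interpolation weights from dividing by very small differences; a production implementation would need a more careful treatment of nearly repeated eigenvalues.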
A novel representation for nonlinear utility spaces is provided, by adopting a modular decomposition of the issues and the constraints. The idea is that constraint-based utility spaces are nonlinear with respect to issues, but linear with respect to the constraints. The result is a mapping from a utility space into an issue-constraint hypergraph. Exploring the utility space is therefore reduced to a message passing mechanism along the hyperedges by means of utility propagation. The optimal contracts are efficiently found using a variation of the Max-Sum algorithm. Particularly, we use a power-law heuristic that lowers the search cost when exploring the utility hypergraph. We experimentally evaluate the model using parameterized random nonlinear utility spaces, showing that it can handle a large family of complex utility spaces using several exploration strategies. The complexity of the generated utility spaces is evaluated using the information theoretic notion of entropy. The optimal search strategy allows a better scaling of the model for complex utility spaces.
Indexing plays an important role in storing and retrieving data in an Information Retrieval System (IRS). The inverted index is the most frequently used indexing structure in IRS. In order to reduce the size of the index and retrieve the data efficiently, compression schemes are used, because the retrieval of compressed data is faster than that of uncompressed data. High-speed compression schemes can improve the performance of IRS. In this paper, we have studied and analyzed various compression techniques for 32-bit integer sequences. The previously proposed compression schemes achieved either better compression rates or fast decoding, hence their decompression speed (disk access + decoding) might not be better. In this paper, we propose a new compression technique, called Optimal FastPFOR, based on FastPFOR. The proposed method uses a better integer representation and storage structure for compressing the inverted index to improve decompression performance. We used the TREC data collection in our experiments, and the results show that the proposed code achieves better compression and decompression than FastPFOR and other existing related compression techniques.
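As background for this family of codecs, here is a minimal sketch of the general patched frame-of-reference (PFOR) idea applied to an inverted-index posting list: document IDs are turned into gaps, most gaps are packed with a small fixed bit width, and the few values that do not fit are stored as exceptions. This is only an illustration of the general principle; it is not FastPFOR or the proposed Optimal FastPFOR, and the function names, the exception layout, and the 12.5% exception ratio are assumptions made for the example.

```python
from itertools import accumulate

def to_gaps(doc_ids):
    """Convert a sorted posting list into differences between neighbors."""
    return [doc_ids[0]] + [b - a for a, b in zip(doc_ids, doc_ids[1:])]

def from_gaps(gaps):
    return list(accumulate(gaps))

def choose_width(block, max_exception_ratio):
    """Smallest bit width b such that few enough values need more than b bits."""
    for b in range(1, 33):
        overflowing = sum(1 for v in block if v >> b)
        if overflowing <= max_exception_ratio * len(block):
            return b
    return 32

def pfor_encode(block, b):
    """Pack the low b bits of every value; keep high bits of large values apart."""
    mask = (1 << b) - 1
    packed = 0
    exceptions = []
    for i, v in enumerate(block):
        packed |= (v & mask) << (i * b)
        if v > mask:
            exceptions.append((i, v >> b))  # (position, high bits)
    return packed, exceptions

def pfor_decode(packed, exceptions, b, n):
    mask = (1 << b) - 1
    out = [(packed >> (i * b)) & mask for i in range(n)]
    for i, high in exceptions:
        out[i] |= high << b  # patch the values that overflowed
    return out

doc_ids = [3, 7, 8, 15, 16, 1000, 1003, 1004]
gaps = to_gaps(doc_ids)
width = choose_width(gaps, 0.125)
packed, exceptions = pfor_encode(gaps, width)
restored = from_gaps(pfor_decode(packed, exceptions, width, len(gaps)))
print(width, exceptions, restored == doc_ids)
```

In this example a single large gap would otherwise force a wide bit width on the whole block; storing it as an exception lets the remaining values use far fewer bits.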
In this paper, we propose a new approach based on text mining techniques for predicting student performance using LSA (latent semantic analysis) and k-means clustering methods. The present study uses free-style comments written by students after each lesson. Since these comments can reflect student learning attitudes, understanding of subjects, and difficulties of the lessons, they enable teachers to grasp the tendencies of student learning activities. To improve our basic approach using LSA and k-means, overlap and similarity measuring methods are proposed. We conducted experiments to validate our proposed methods. The experimental results yielded a model for predicting student academic performance that uses the comment data as predictor variables. Our proposed methods achieved an average prediction accuracy of 66.4% after applying the k-means clustering method, and 73.6% and 78.5% after adding the overlap method and the similarity measuring method, respectively.
The Hospitals/Residents problem is a many-to-one generalization of the well-known Stable Marriage problem. Its instance consists of a set of residents, a set of hospitals, each resident's preference list, each hospital's preference list, and each hospital's capacity (i.e., the number of available positions). It asks to find a stable matching between residents and hospitals. In this paper, we consider the problem of deciding, given residents' preference lists and a matching, whether there are hospitals' preference lists that make the given matching stable. We call this problem the Stable Hospital's Preference List problem (SHPL). It is easy to see that there always exists a solution if we allow arbitrary preference lists of hospitals. Considering more suitable situations, we pose a restricted version, called k-SHPL, in which there are only k kinds of preference lists of hospitals. We show that 1-SHPL is solvable in polynomial time, while k-SHPL is NP-complete for any k such that 2 ≤ k ≤ n^(1-ε), where n is the number of residents and ε is any positive constant. We also present four heuristic algorithms (first-fit algorithms) for 2-SHPL. We implement these algorithms and present a computational study using random instances.
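The stability condition underlying SHPL can be sketched as a checker, assuming the standard definition of a blocking pair: a resident r and a hospital h block a matching if r prefers h to its current assignment, h finds r acceptable, and h either has a free position or prefers r to its least preferred assigned resident. The sketch only verifies a candidate set of hospital lists; it does not search for such lists, which is the hard part studied in the paper. All names and the data layout are hypothetical.

```python
def is_stable(res_prefs, hosp_prefs, capacity, matching):
    """Check stability of `matching` (resident -> hospital or None).

    res_prefs / hosp_prefs map each agent to a list, most preferred first.
    The matching is assumed to be valid (capacities and acceptability respected).
    """
    assigned = {h: [] for h in hosp_prefs}
    for r, h in matching.items():
        if h is not None:
            assigned[h].append(r)
    rank = {h: {r: i for i, r in enumerate(prefs)} for h, prefs in hosp_prefs.items()}

    for r, prefs in res_prefs.items():
        current = matching.get(r)
        for h in prefs:
            if h == current:
                break  # every later hospital is less preferred than the current one
            if r not in rank[h]:
                continue  # h does not accept r
            if len(assigned[h]) < capacity[h]:
                return False  # h has a free position: (r, h) blocks
            if assigned[h]:
                worst = max(assigned[h], key=lambda x: rank[h][x])
                if rank[h][r] < rank[h][worst]:
                    return False  # h prefers r to its worst resident: (r, h) blocks
    return True

res_prefs = {"r1": ["h1", "h2"], "r2": ["h1", "h2"], "r3": ["h1", "h2"]}
capacity = {"h1": 2, "h2": 1}
matching = {"r1": "h1", "r2": "h1", "r3": "h2"}

lists_a = {"h1": ["r1", "r2", "r3"], "h2": ["r1", "r2", "r3"]}
lists_b = {"h1": ["r3", "r1", "r2"], "h2": ["r1", "r2", "r3"]}
print(is_stable(res_prefs, lists_a, capacity, matching))
print(is_stable(res_prefs, lists_b, capacity, matching))
```

With the first set of hospital lists the matching is stable, whereas with the second set r3 and h1 form a blocking pair, which shows how the choice of hospital lists alone decides whether a fixed matching is stable.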
We propose an indexing method for hierarchical graphs that uses their eigenvalues to efficiently detect those that are substructures or superstructures of a hierarchical graph given as a query. The index construction and the query processing are based on a relation among three interlacing sequences of eigenvalues of hierarchical graphs. We also propose a matrix representation for a hierarchical graph. Hierarchical graphs are decomposed to improve the filtering effect of the index and reduce the computational cost of both the index construction and the query processing. We evaluate the effectiveness of the proposed method by experiments.
WMNs (Wireless Mesh Networks) using IEEE 802.11 have been studied in depth as a means to extend the coverage area of the Internet. A typical approach to implementing this kind of WMN is to use dynamic metrics (e.g., ETX) over link-state routing protocols (e.g., OLSR). Although studies have demonstrated that the approach performs well, there is still room for improvement. In this paper, we first point out that dynamic metrics by nature cannot follow rapid changes in link quality, which prevents routing protocols from choosing the best forwarding paths at every moment. To compensate for this drawback of dynamic metrics, we propose a local switching mechanism of forwarding channels for multi-radio, multi-channel WMNs, which works in combination with dynamic metrics. Our evaluation showed that the proposed method improves throughput and stability of communications when it works with dynamic metrics.
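The lag of a dynamic metric can be illustrated with a minimal sketch of a windowed ETX estimator. It assumes the commonly used definition ETX = 1 / (d_f × d_r), where d_f and d_r are the forward and reverse probe delivery ratios measured over a sliding window. The class name, the window of 10 probes, and the probe pattern are hypothetical and are not taken from the paper.

```python
from collections import deque

class EtxEstimator:
    """Sliding-window ETX estimate from periodic probe outcomes."""

    def __init__(self, window=10):
        self.forward = deque(maxlen=window)
        self.reverse = deque(maxlen=window)

    def record(self, forward_ok, reverse_ok):
        self.forward.append(1 if forward_ok else 0)
        self.reverse.append(1 if reverse_ok else 0)

    def etx(self):
        if not self.forward:
            return float("inf")  # no measurements yet
        d_f = sum(self.forward) / len(self.forward)
        d_r = sum(self.reverse) / len(self.reverse)
        if d_f == 0 or d_r == 0:
            return float("inf")  # link looks unusable
        return 1.0 / (d_f * d_r)

link = EtxEstimator(window=10)
for _ in range(10):
    link.record(True, True)   # the link is perfect for ten probe intervals
healthy = link.etx()

for _ in range(2):
    link.record(False, False)  # the link suddenly fails completely
shortly_after_failure = link.etx()

for _ in range(8):
    link.record(False, False)
after_full_window = link.etx()

print(healthy, shortly_after_failure, after_full_window)
```

Two probe intervals after the link has failed, the estimate has only risen moderately because the window still contains eight successful probes; the metric reports the link as unusable only once the whole window has been replaced. This delay is the kind of gap that a faster local mechanism is meant to cover.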
To address the digital divide in developing countries, fixed wireless access (FWA) networks have the potential to quickly provide economical access over a wide area within a radius of tens of kilometers. The conventional synchronous variable-multiple collision avoidance (v-MCA) system, which is referred to as a non-precedence (NP) system, can be operated over a network of any size without the need for frame-length restrictions. However, it has a potential drawback of rapid throughput degradation due to the round-trip time, which is proportional to the network length and upload bandwidth. The advanced synchronous v-MCA system incorporating total precedence (TP) transmission of frames provides high throughput regardless of network length and upload bandwidth. In this paper, after showing the medium access control mechanisms of the NP and TP systems, their theoretical calculation models are discussed in detail. Then, their system performances are evaluated and reviewed by comparing theoretical and simulated results. The TP system provides the maximum achievable throughput regardless of network length and total upload bandwidth, while maintaining the low delay characteristics of a contention-based access scheme.