Neuromorphic vision algorithms are biologically inspired computational models of the primate visual pathway. They promise robustness, high accuracy, and high energy efficiency in advanced image processing applications. Despite these potential benefits, realizations of neuromorphic algorithms typically exhibit low performance even when executed on multi-core CPU and GPU platforms. This is due to the disparity between the computational modalities prominent in these algorithms and those most exploited in contemporary computer architectures. In essence, acceleration of neuromorphic algorithms requires adherence to specific computational and communication requirements. This paper discusses these requirements and proposes a framework for mapping neuromorphic vision applications onto a System-on-Chip (SoC). A neuromorphic object detection and recognition application on a multi-FPGA platform is presented, with performance and power-efficiency comparisons to CMP and GPU implementations.
As battery runtime and overheating problems for portable devices become unignorable, energy-aware LSI design is strongly required. Moreover, interconnection delays must be explicitly considered, because they exceed gate delays as semiconductor devices are downsized. We must therefore take account of energy efficiency and interconnection delays even in high-level synthesis. In this paper, we first propose a huddle-based distributed-register architecture (HDR architecture), an island-based distributed-register architecture for multi-cycle interconnect communications on which several energy-saving techniques can be developed. Next, we propose an energy-efficient high-level synthesis algorithm for HDR architectures that focuses on multiple supply voltages. Our algorithm is based on iterative improvement of scheduling/binding and floorplanning. In the iteration process, huddles, each composed of functional units, registers, a controller, and level converters, are naturally generated from floorplanning results. By assigning a high supply voltage to critical huddles and a low supply voltage to non-critical huddles, we finally obtain energy-efficient, floorplan-aware high-level synthesis. Experimental results show that our algorithm achieves 45% energy savings compared with conventional distributed-register architectures and conventional algorithms.
Inter-processor communication and synchronization are critical problems in embedded multiprocessors. To achieve high-speed communication and low-latency synchronization, most recent designs employ dedicated hardware engines to support these communication protocols individually, which is complex, inflexible, and error-prone. This paper therefore motivates the optimization of inter-processor communication and synchronization using application-specific instruction-set processor (ASIP) techniques. The proposed communication mechanism is based on a set of custom instructions coupled with a low-latency on-chip network, which provides efficient support for both data transfer and process synchronization. Using a state-of-the-art ASIP design methodology, we embed the communication functionalities into a base processor, giving the proposed mechanism ultra-low overhead. More importantly, industry-standard-compatible programming interfaces supporting both message-passing and shared-memory paradigms are exposed to end users to ease software porting. Experimental results show that the bandwidth of the proposed message-passing protocol reaches up to 703 Mbytes/s at 200 MHz, and the latency of the proposed synchronization protocol is reduced by more than 81% compared with the conventional approach. Moreover, as a case study, we show the effectiveness of the proposed communication mechanism in a real-life embedded application, WiMedia UWB MAC.
Reliability issues such as soft errors, process variations, and Negative Bias Temperature Instability (NBTI) become dominant on Field Programmable Gate Arrays (FPGAs) fabricated in nanometer processes. We focus on aging degradation due to NBTI, which causes threshold-voltage shifts in PMOS transistors, and characterize delay degradation in the routing structures of FPGAs. The rising and falling delays vary due to NBTI and depend heavily on circuit configurations. In an individual routing switch, the delay fluctuation due to NBTI can be minimized by transistor sizing; the falling delay does not change even after 10 years of degradation. In routing structures composed of routing switches and wires, the delay fluctuation depends on the wire length and can be minimized by optimizing the wire length. We also show that signal flipping can reduce the delay degradation of the routing resources from 11.3% to 2.76%.
A static analysis tool has been developed for finding common mistakes in programs that use the Java Native Interface (JNI). Violations of JNI-specific rules are not caught by C++ or other compilers, and this tool targets rules about references to Java objects, which are passed to native methods as local references. Local references become invalid when the native method returns. To keep them valid after the return, the programmer must convert them into global references. If they are not converted, the garbage collector may malfunction and may, for example, fail to mark referenced objects. The developed static analysis tool finds assignments of local references to locations other than local variables, such as global variables and structure fields. The tool was implemented as a plug-in for Clang, a compiler front-end for LLVM. Application of this tool to native Android code demonstrated its effectiveness.
Server memory sizes are continuously growing. To accelerate business data processing in Java-based enterprise systems, the use of large memory is required. One example of such use is an object cache for accelerating read access to a large volume of data. However, Java incorporates a garbage collection (GC) mechanism for reclaiming unused objects. A typical GC algorithm must find references from old objects to young objects in order to identify unused objects, which means that enlarging the heap memory increases the time spent finding references. We propose a GC time reduction algorithm for large object cache systems that eliminates the need to find references from a specific object cache region. The algorithm presumes that cached objects are treated as immutable objects that allow only READ and REMOVE operations. It also divides the object cache into two regions: a closed region, which contains only immutable objects, and an unclosed region, which contains mutable objects that have survived GC. When an unclosed region becomes full, it is changed to a closed region. When an immutable object is modified, it is copied to the unclosed region and the modification is applied to the copy. This restricts references from the object cache region to the young-object region and excludes changes to objects in the closed region. Experimental evaluation showed that the proposed algorithm can reduce GC time by 1/4 and improve throughput by 40% compared to traditional generational GC algorithms.
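The closed/unclosed region discipline described above can be illustrated with a small application-level model. This is our own sketch with hypothetical names; the actual mechanism operates inside the JVM's garbage collector, not in application code.

```python
# Toy model of the two-region object cache (hypothetical names; the
# real mechanism lives inside the JVM's GC, not in application code).
class RegionCache:
    def __init__(self, unclosed_capacity=2):
        self.closed = {}    # immutable objects only: READ/REMOVE allowed
        self.unclosed = {}  # mutable survivors; sealed when full
        self.capacity = unclosed_capacity

    def put(self, key, value):
        self.unclosed[key] = value
        if len(self.unclosed) >= self.capacity:
            # Filling the unclosed region changes it to closed: its
            # objects become immutable and need no reference scan.
            self.closed.update(self.unclosed)
            self.unclosed = {}

    def get(self, key):
        return self.unclosed.get(key, self.closed.get(key))

    def modify(self, key, value):
        if key in self.closed:
            # Copy-on-modify: never mutate a closed-region object.
            del self.closed[key]
        self.unclosed[key] = value
```

Because the closed region is never mutated, a collector need not scan it for references into the young-object region, which is the source of the GC-time reduction.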
Finding code clones in open source systems is important for efficient and safe reuse of existing open source software. In this paper, we propose a novel search model, open code clone search, to explore code clones in open source repositories on the Internet. Based on this search model, we have designed and implemented a prototype system named OpenCCFinder. The system takes a query code fragment as its input and returns code fragments that contain clones of the query, utilizing publicly available code search engines as external resources. Using OpenCCFinder, we have conducted several case studies on Java code; these case studies show the applicability of our system.
Debugging failing test cases, particularly the search for failure causes, is often a laborious and time-consuming activity. With the help of spectrum-based fault localization, developers are able to reduce the potentially large search space by detecting anomalies in tested program entities. However, such anomalies do not necessarily indicate defects, so developers still have to analyze numerous candidates one by one until they find the failure cause. This procedure is inefficient because it does not take into account how suspicious entities relate to each other, whether another developer is better qualified to debug this failure, or how the erroneous behavior came to be. We present test-driven fault navigation as an interconnected debugging guide that integrates spectrum-based anomalies and failure causes. By analyzing failure-reproducing test cases, we reveal suspicious system parts, the developers most qualified for addressing localized faults, and erroneous behavior in the execution history. The Paths tool suite realizes our approach: PathMap supports a breadth-first search for narrowing down failure causes and recommends developers for help; PathFinder is a lightweight back-in-time debugger that classifies failing test behavior, making it easy to follow infection chains back to defects. The evaluation of our approach illustrates the improvements for debugging test cases, the high accuracy of recommended developers, and the fast response times of our tool suite.
In object-oriented programs, access modifiers are used to control the accessibility of fields and methods from other objects. Choosing appropriate access modifiers is one of the key factors for easily maintainable programming. In this paper, we propose a novel analysis named Accessibility Excessiveness (AE) for each field and method in a Java program: the discrepancy between an access modifier declaration and the member's actual usage. We have developed an AE analyzer, ModiChecker, which analyzes each field and method of the input Java programs and reports any excessiveness. We have applied ModiChecker to various Java programs, including several OSS projects, and found that the tool is very useful for detecting fields and methods with excessive access modifiers.
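The core comparison, a declared modifier versus the minimal modifier that the member's actual usage requires, can be sketched as follows. This toy analysis over a hypothetical usage table is our illustration, not ModiChecker's implementation, and it simplifies Java's access rules (package-private and protected are not strictly comparable in general).

```python
from collections import namedtuple

# Access levels ordered from most to least restrictive (simplified).
LEVELS = ["private", "default", "protected", "public"]

Member = namedtuple("Member", "name package")

def minimal_modifier(declaring, accesses):
    """accesses: list of (accessing_class, package, is_subclass)."""
    need = "private"
    for cls, pkg, is_sub in accesses:
        if cls == declaring.name:
            required = "private"     # used only inside the declaring class
        elif pkg == declaring.package:
            required = "default"     # used from the same package
        elif is_sub:
            required = "protected"   # used from a subclass elsewhere
        else:
            required = "public"      # used from an unrelated class
        if LEVELS.index(required) > LEVELS.index(need):
            need = required
    return need

def is_excessive(declared, declaring, accesses):
    """AE: the declaration grants more access than usage requires."""
    return (LEVELS.index(declared)
            > LEVELS.index(minimal_modifier(declaring, accesses)))
```

For example, a `public` field accessed only from its own package would be reported as excessive, since `default` access suffices.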
This paper proposes a monitoring method for information retrieval from sources in large-scale networks that tries to achieve the maximum gain in user utility at the minimum source-observation cost. Generally, information accumulated in a network is updated continuously, and contents that have been downloaded by network users become obsolete as time passes. User utility, obtained when users successfully retrieve their target information, declines with the time elapsed since the retrieval request was placed or the content was renewed. Accordingly, the proposed monitoring method adjusts its observation intervals according to the information sources' update intervals, taking into account both the decrease in user utility and the monitoring cost incurred by observing sources. The usefulness of the proposal is confirmed by computer simulations, which show that it is especially effective when the cost of observation is neither heavily weighted nor neglected. In addition, a prototype monitoring system implementing the proposal was developed, and monitoring experiments were conducted in the real Internet environment. The results show that the proposed method remains applicable when conditions are close to those of the simulations.
Infrastructure as a Service (IaaS) provides virtual machines (VMs) to users, and its system administrators often manage the user VMs using a privileged VM called the management VM. However, the administrators are not always trustworthy from the users' point of view. If the administrators allow outside attackers to intrude into the management VM, the attackers can easily steal sensitive information from user VMs' memory. In this paper, we propose VMCrypt, which preserves the secrecy of VM memory using a trusted virtual machine monitor. VMCrypt provides a dual memory view: a normal view for a user VM and an encrypted view for the management VM. The encrypted view prevents sensitive information from leaking to the management VM. To support existing management software for para-virtualization, VMCrypt exceptionally provides the management VM with a normal view of only a few memory regions, which are automatically identified and maintained during the life cycle of a user VM. We have implemented VMCrypt in Xen, and our experimental results show that the downtime due to live migration was still less than one second.
We present a game semantics for an Algol-like language with shared-variable parallelism. In contrast to deterministic sequential programs, whose semantics can be characterized by observing termination behaviors, for parallel programs it is crucial to observe not only termination but also divergence, because of the nondeterministic scheduling of parallel processes. To give a more appropriate foundation for modeling parallelism, we base our development on Harmer's game semantics, which concerns not only may-convergence but also must-convergence for a nondeterministic programming language, EIA. The game semantics for the Algol-like parallel language is shown to be fully abstract, which indicates that the parallel command of our Algol-like language adds no power beyond the nondeterminism provided by EIA. We also sketch how the equivalence of two parallel programs can be reasoned about based on the game-semantic interpretation.
Graph clustering is a long-standing problem in data mining and machine learning. Traditional graph clustering aims to partition a graph into several densely connected components. However, with the proliferation of rich attribute information available for objects in real-world graphs, vertices are often associated with a number of attributes that describe their properties, giving rise to a new type of graph: the attributed graph. How to leverage both structural and attribute information thus becomes a new challenge for attributed graph clustering. In this paper, we survey state-of-the-art studies on clustering large attributed graphs. These studies propose different approaches to leveraging both structural and attribute information so that the resulting clusters have both cohesive intra-cluster structures and homogeneous attribute values.
We propose a length-preserving enciphering scheme that achieves PRP security and streamable decryption; no previous enciphering scheme satisfies both properties. Our scheme is suitable for secure communication over narrowband channels and on memory-constrained devices. Although length-preserving enciphering schemes satisfying SPRP security, which is stronger than PRP security, are known, SPRP security and streamability cannot be achieved at the same time: memory to store an entire plaintext/ciphertext is required. Hence, when decryption is performed on memory-constrained devices, PRP security is the strongest achievable notion of security.
Machine translation of patent documents is very important from a practical point of view. One of the key technologies for improving machine translation quality is the utilization of syntax. It is difficult to select the appropriate parser for English to Japanese patent machine translation because the effects of each parser on patent translation are not clear. This paper provides an empirical comparative evaluation of several state-of-the-art parsers for English, focusing on the effects on patent machine translation from English to Japanese. We add syntax to a method that constrains the reordering of noun phrases for phrase-based statistical machine translation. There are two methods for obtaining the noun phrases from input sentences: 1) an input sentence is directly parsed by a parser and 2) noun phrases from an input sentence are determined by a method using the parsing results of the context document that contains the input sentence. We measured how much each parser contributed to improving the translation quality for each of the two methods and how much a combination of parsers contributed to improving the translation quality for the second method. We conducted experiments using the NTCIR-8 patent translation task dataset. Most of the parsers improved translation quality. Combinations of parsers using the method based on context documents achieved the best translation quality.
Nearest neighbor search (NNS) among large-scale and high-dimensional vectors has played an important role in recent large-scale multimedia search applications. This paper proposes an optimized codebook construction algorithm for approximate NNS based on product quantization. The proposed algorithm iteratively optimizes both codebooks for product quantization and an assignment table that indicates the optimal codebook in product quantization. In experiments, the proposed method is shown to achieve better accuracy in approximate NNS than the conventional method with the same memory requirement and the same computational cost. Furthermore, use of a larger number of codebooks increases the accuracy of approximate NNS at the expense of a slight increase in the memory requirement.
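As background for the codebook optimization above, plain product quantization splits each vector into subvectors and quantizes each against its own small codebook; approximate distances are then computed between an exact query and the quantized database vectors. The sketch below uses fixed, hand-picked centroids rather than the trained and optimized codebooks of the proposed method.

```python
def pq_encode(vec, codebooks):
    """Encode a vector as one centroid index per subspace."""
    m = len(codebooks)            # number of subspaces
    d = len(vec) // m             # subvector dimension
    code = []
    for i, cb in enumerate(codebooks):
        sub = vec[i * d:(i + 1) * d]
        # Nearest centroid in this subspace (squared Euclidean).
        best = min(range(len(cb)),
                   key=lambda k: sum((a - b) ** 2
                                     for a, b in zip(sub, cb[k])))
        code.append(best)
    return code

def pq_distance(query, code, codebooks):
    """Asymmetric distance: exact query vs. quantized database vector."""
    m = len(codebooks)
    d = len(query) // m
    dist = 0.0
    for i, cb in enumerate(codebooks):
        sub = query[i * d:(i + 1) * d]
        cent = cb[code[i]]
        dist += sum((a - b) ** 2 for a, b in zip(sub, cent))
    return dist
```

Each database vector is stored only as its short code, which is the source of the memory savings; the accuracy then hinges on how well the codebooks fit the data, which is what the proposed iterative optimization addresses.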
For the detection of generic objects in image processing, histograms of oriented gradients (HOG) have been widely discussed in recent years. Classification systems using HOG show good results. However, the performance of the HOG descriptor is influenced by the size of the detected object. To overcome this problem, we introduce a form of hierarchy inspired by the convolutional network, a model of the visual processing system in the brain. The hierarchical HOG (H-HOG) integrates several scales of HOG descriptors in its architecture and represents the input image as a combination of more complex features rather than raw orientation gradients. We investigate the performance of H-HOG and compare it with conventional HOG. The results show better performance than conventional HOG; in particular, the representation dimension is much smaller than that of conventional HOG without any reduction in detection performance.
We introduce a word-based dependency parser for Japanese that can be trained from partially annotated corpora, allowing for effective use of available linguistic resources and reduction of the costs of preparing new training data. This is especially important for domain adaptation in a real-world situation. We use a pointwise approach where each edge in the dependency tree for a sentence is estimated independently. Experiments on Japanese dependency parsing show that this approach allows for rapid training and achieves accuracy comparable to state-of-the-art dependency parsers trained on fully annotated data.
We analyzed a light-field super-resolution problem in which a 3-D scene is reconstructed at a higher resolution using super-resolution (SR) reconstruction from a given set of low-resolution multi-view images. The arrangement of the multi-view cameras is important because it determines the quality of the reconstruction. To simplify the analysis, we considered a situation in which a plane is located at a certain depth and a texture on that plane is super-resolved. We formulated the SR reconstruction process in the frequency domain, where the camera arrangement can be expressed independently as a matrix in the image formation model. We then evaluated the condition number of the matrix to quantify the quality of the SR reconstruction. We clarified that when the cameras are arranged in a regular grid, there exist singular depths at which the SR reconstruction becomes ill-posed, and we determined that this singularity can be avoided if the arrangement is randomly perturbed.
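The role of the condition number can be shown in miniature: if, at some depth, two cameras effectively sample the scene identically (the degenerate case for a regular grid), the image-formation matrix loses rank and its condition number diverges, while a small perturbation restores well-posedness. The 2x2 matrices below are our toy stand-ins, not the paper's actual image-formation model.

```python
import math

def cond2x2(A):
    """Condition number of a 2x2 matrix via singular values."""
    (a, b), (c, d) = A
    # Gram matrix G = A^T A is symmetric 2x2; its eigenvalues are the
    # squared singular values of A.
    g11 = a * a + c * c
    g12 = a * b + c * d
    g22 = b * b + d * d
    tr, det = g11 + g22, g11 * g22 - g12 * g12
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    lam_max = (tr + disc) / 2
    lam_min = (tr - disc) / 2
    if lam_min <= 1e-15:
        return float("inf")   # rank-deficient: reconstruction ill-posed
    return math.sqrt(lam_max / lam_min)

# Degenerate sampling: two cameras observe identical mixtures
# (analogue of a singular depth for a regular grid).
singular = [[1.0, 1.0], [1.0, 1.0]]
# Slightly perturbed arrangement: rows become linearly independent.
perturbed = [[1.0, 1.0], [1.0, 1.1]]
```

A large but finite condition number still means the reconstruction amplifies noise; randomly perturbing the arrangement keeps the matrix away from the singular configurations.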
We present a method for synthesizing high-quality free-viewpoint images from a set of multi-view images. First, an accurate depth map is estimated from a given target viewpoint using modified semi-global stereo matching. Then, a high-resolution image from that viewpoint is obtained through super-resolution (SR) reconstruction. The depth estimation results from the first step are used for the second step in two ways. First, the depth values are used to associate pixels between the input images and the latent high-resolution image. Second, the pixel-wise reliabilities of the depth information are used for regularization to adaptively control the strength of the SR reconstruction. Extensive experimental results using real images show the effectiveness of our method.
In this paper we propose a novel method that performs 3D face reconstruction and unconstrained, non-contact gaze estimation, from multi-view video, for a moving person whose head pose can change freely. The main idea is to first reconstruct the 3D face with high accuracy using a symmetry prior. We then generate a super-resolution virtual frontal face video from the estimated 3D face geometry and the original multi-view video. Finally, a 3D eyeball model is introduced to estimate the three-dimensional gaze direction from the virtual frontal face video. Experiments with real data illustrate the effectiveness of our method.
One promising approach to reconstructing a 3D shape is a projector-camera system that projects a structured light pattern. A problem with this approach is the difficulty of obtaining texture simultaneously, because the projector's illumination interferes with the texture. The system proposed in this paper overcomes this issue by separating the light wavelengths used for texture and shape: the pattern is projected using infrared light, and the texture is captured using visible light. If the cameras for infrared and visible light are placed at different positions, misalignment arises between texture and shape, which degrades the quality of the textured 3D model. We therefore developed a multi-band camera that acquires both visible and infrared light from a single viewpoint. Moreover, to reconstruct a 3D shape using multiple wavelengths of light, i.e., multiple colors, we developed an infrared pattern projector that generates a multi-band grid pattern. We also propose a simple method to calibrate the system using the fixed grid pattern. Finally, we show textured 3D shapes captured by the experimental system.
Graph similarity search is to retrieve graphs that approximately contain a given query graph. It has many applications, e.g., detecting similar functions among chemical compounds. The problem is challenging, as even testing subgraph containment between two graphs is NP-complete. Hence, existing techniques adopt the filtering-and-verification framework, focusing on effective and efficient techniques to remove non-promising graphs. Nevertheless, existing filtering techniques may still be unable to effectively remove many low-quality candidates. To resolve this, in this paper we propose a novel indexing technique that indexes graphs according to their "distances" to features. We then develop lower- and upper-bounding techniques that exploit the index to (1) prune non-promising graphs and (2) include graphs whose similarities are guaranteed to exceed the given similarity threshold. Considering that the verification phase is not well studied and plays the dominant role in the whole process, we also devise efficient algorithms to verify candidates. A comprehensive experiment using real datasets demonstrates that our proposed methods significantly outperform existing methods.
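The lower/upper-bounding idea behind a distance-to-feature index can be illustrated with the triangle inequality in an arbitrary metric space. This is a generic sketch; the paper's bounds are specialized to graph distances and graph features.

```python
def classify(q_to_features, g_to_features, threshold):
    """Decide a candidate g using only precomputed distances to features.

    For any metric d and feature f, the triangle inequality gives
        |d(q, f) - d(g, f)|  <=  d(q, g)  <=  d(q, f) + d(g, f),
    so distances to features bound d(q, g) from both sides.
    """
    lower = max(abs(qf - gf)
                for qf, gf in zip(q_to_features, g_to_features))
    upper = min(qf + gf
                for qf, gf in zip(q_to_features, g_to_features))
    if lower > threshold:
        return "prune"    # d(q, g) surely exceeds the threshold
    if upper <= threshold:
        return "include"  # d(q, g) surely within the threshold
    return "verify"       # bounds inconclusive: run exact verification
```

Only the "verify" candidates reach the expensive (NP-complete in the graph setting) exact computation, which is why tightening both bounds pays off.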
Due to the proliferation of location-based information services, there is abundant urban information, which makes it difficult for us to keep up with the characteristics and dynamics of our living space. Meanwhile, crowd lifelogs shared on social network sites are attracting a great deal of attention as a novel source for local information drawn from the massive voices and lifelogs of crowds. In this regard, we can further look into urban images, that is, how a city is recognized in people's minds, through massive direct crowd experiences. In this work, we explore crowd-experienced local information on location-based social network sites to derive more understandable and useful urban images. In detail, we propose a method to generate a socio-cognitive map in which characteristic urban clusters are projected based on the cognitive distance between urban areas. Specifically, to measure cognitive distances between urban clusters and examine their influential strengths, we observe crowds' movements on Twitter. Finally, we show an experimental result of generating a socio-cognitive map illustrating crowd-sourced cognitive relations between urban clusters in the Kinki area of Japan.
Recent malware communicates with remote hosts on the Internet to receive C&C commands, update itself, and so on, and its behavior can vary depending on the behavior of those remote hosts. Thus, when analyzing such malware in a sandbox, it is important to focus not only on the behavior of the malware sample itself but also on that of the remote servers controlled by attackers. A simple solution is to observe the live sample in an Internet-connected sandbox for a long period of time. However, since we do not know when these servers will send meaningful responses, we must keep the sample executing in the sandbox, which is a costly operation. Moreover, leaving live malware in an Internet-connected sandbox increases the risk that its attacks spill out of the sandbox and cause secondary infections. In this paper, we propose a novel sandbox analysis method using a dummy client: an automatically generated lightweight script that interacts with the remote servers in place of the malware sample itself. In the proposed method, we first execute a malware sample in a sandbox connected to both the real Internet and an Internet Emulator. Second, we inspect the traffic observed in the sandbox and filter out high-risk communications. The remaining traffic data is then used by the dummy client to interact with the remote servers instead of the sample itself and to collect the servers' responses effectively. The collected server responses are fed back to the Internet Emulator in the sandbox to improve the observability of malware sandbox analysis. In experiments with malware samples captured in the wild, we observed a considerable number of changes in the responses from the remote servers obtained by our dummy client. In comparison with a simple Internet-connected sandbox, the proposed sandbox also improved the observability of malware sandbox analysis.
In Vehicular Ad Hoc Networks (VANETs), the vehicular scenario requires smart signaling, smart road maintenance, and other services. A brand-new security issue is that the semi-trusted Road Side Units (RSUs) may be compromised. In this paper, we propose an elliptic-curve ElGamal threshold-system-based key management scheme for safeguarding a VANET from compromised RSUs and their collusion with malicious vehicles. We analyze the packet-loss tolerance to demonstrate security performance, followed by a discussion of the threshold. After discussing feasibility in terms of privacy and processing time, we present an overhead analysis for two types of application scenarios: Emergency Braking Notification (EBN) and Decentralized Floating Car Data (DFCD). Our method improves security with low overhead in EBN and without increasing overhead in DFCD.
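The threshold property, i.e., any t shares suffice to reconstruct a key while fewer reveal nothing, is commonly realized with Shamir secret sharing over a finite field. The sketch below is a generic illustration of that underlying mechanism, not the paper's elliptic-curve ElGamal construction, with RSUs playing the role of shareholders.

```python
import random

P = 2 ** 61 - 1  # a Mersenne prime; the field for polynomial arithmetic

def share(secret, t, n):
    """Split secret into n shares; any t of them reconstruct it."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    def f(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    # Share for party x is the point (x, f(x)) on a degree-(t-1) polynomial.
    return [(x, f(x)) for x in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 over GF(P) recovers f(0) = secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret
```

With a threshold of t, an attacker must compromise at least t RSUs (or collude with enough malicious parties) before any key material leaks, which is the tolerance the scheme's threshold discussion quantifies.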
It is becoming more and more important to make use of personal or classified information while keeping it confidential. A promising tool for meeting this challenge is secure multi-party computation (MPC). However, one of the biggest problems with MPC is that it requires a vast amount of communication. We analyzed existing MPC protocols and found that the random-number bitwise-sharing protocol used by many of them is notably inefficient. By devising a representation of the truth values and using special-form prime numbers, we propose efficient random-number bitwise-sharing protocols, dubbed "Extended-Range I and II," which reduce the communication complexity to approximately 1/6th that of the best existing protocol. We further reduced the communication complexity to approximately 1/26th by reducing the abort probability, thereby making previously necessary backup computation unnecessary. Using our improved protocol, "Lightweight Extended-Range II," we reduced the communication complexities of equality testing, comparison, interval testing, and bit-decomposition, all of which use the random-number bitwise-sharing protocol, by approximately 91, 79, 67, and 23% (for 32-bit data), respectively. We also reduced the communication complexity of private exponentiation by about 70% (for 32-bit data and five parties).
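As background for why bitwise sharing is communication-heavy: in a typical linear secret-sharing MPC, a value is held as shares modulo a prime, and sharing a random number bitwise means running the sharing step once per bit position. The minimal additive-sharing sketch below is generic background, not the paper's Extended-Range protocols.

```python
import random

P = 101  # small prime modulus, for illustration only

def additive_share(value, n_parties):
    """Split value into n additive shares modulo P."""
    shares = [random.randrange(P) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

def bitwise_share(value, n_bits, n_parties):
    """Share each bit separately -- the per-bit pattern whose
    communication cost the paper's protocols reduce."""
    bits = [(value >> i) & 1 for i in range(n_bits)]  # LSB first
    return [additive_share(b, n_parties) for b in bits]

def open_shares(shares):
    """Reconstruct a value by summing its shares modulo P."""
    return sum(shares) % P
```

Operations such as comparison and bit-decomposition consume these shared bits, so shrinking the cost of producing them propagates directly to those higher-level protocols.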
Many of society's systems are dependent on information technology (IT), which means that securing the safety of IT systems is of the utmost importance. Furthermore, numerous stakeholders (managers, customers, employees, etc.) take part in the decision-making process on risk measures for these IT systems, which makes it necessary to have a means of communicating risk measures so that stakeholders can easily form a consensus when necessary. For this purpose, we have developed a Multiple Risk Communicator (MRC) to assist consensus formation within organizations and a Social-MRC system to support social consensus formation, and we have applied them to various problems. This paper describes the considerations IT system risk communication should take into account, describes the development of the necessary support systems, and reports the results of their application.
It is widely argued that today's largely reactive, "respond and patch" approach to securing cyber systems must yield to a new, more rigorous, more proactive methodology. Achieving this transformation is a difficult challenge. Building on insights into requirements for cyber science and on experience gained through 8 years of operation, the DETER project is addressing one facet of this problem: the development of transformative advances in methodology and facilities for experimental cybersecurity research and system evaluation. These advances in experiment design and research methodology are yielding progressive improvements not only in experiment scale, complexity, diversity, and repeatability, but also in the ability of researchers to leverage prior experimental efforts of others within the community. We describe in this paper the trajectory of the DETER project towards a new experimental science and a transformed facility for cybersecurity research, development, and evaluation.