One of the challenges that the design of parallel file systems has to address stems from the difficulty of handling the metadata-intensive I/O generated by parallel applications accessing a single large directory. We demonstrate a middleware design called SFS that lets existing parallel file systems provide a distributed and scalable directory service. SFS distributes directory entries over data servers instead of metadata servers to offer increased scalability and performance. Firstly, SFS exploits an adaptive directory partitioning based on extendible hashing to support concurrent and unsynchronized partition splitting. Secondly, SFS employs an optimization based on recursive split-ordering that speeds up the splitting process. Thirdly, SFS applies a write-optimized index structure to convert slow, small, random metadata updates into fast, large, sequential writes. Finally, SFS gracefully tolerates stale mapping at the clients while maintaining the correctness and consistency of the system. Our performance results on a cluster of 32 servers show that our implementation can deliver more than 250,000 file creations per second on average.
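As an illustration of the partitioning idea only, the following is a minimal single-process sketch of extendible hashing. It is not the SFS implementation: the class names, the `capacity` parameter, and the use of CRC32 as the hash are assumptions made for this example, and each in-memory bucket merely stands in for a directory partition that SFS would place on a data server.

```python
import zlib


def stable_hash(name):
    # CRC32 is an assumption; any stable hash of the file name would do.
    return zlib.crc32(name.encode("utf-8"))


class Bucket:
    """One partition: holds entries that share the low `depth` hash bits."""

    def __init__(self, depth):
        self.depth = depth
        self.entries = {}


class ExtendibleDirectory:
    def __init__(self, capacity=4):
        self.capacity = capacity      # max entries per partition (hypothetical)
        self.global_depth = 0         # number of hash bits used by the table
        self.table = [Bucket(0)]      # 2**global_depth slots, may share buckets

    def _index(self, name):
        return stable_hash(name) & ((1 << self.global_depth) - 1)

    def lookup(self, name):
        return self.table[self._index(name)].entries.get(name)

    def insert(self, name, value):
        while True:
            bucket = self.table[self._index(name)]
            if name in bucket.entries or len(bucket.entries) < self.capacity:
                bucket.entries[name] = value
                return
            self._split(bucket)       # only the full partition is split

    def _split(self, bucket):
        if bucket.depth >= 32:
            raise RuntimeError("too many entries share one 32-bit hash")
        if bucket.depth == self.global_depth:
            # Double the table; slot i and slot i + old_size share a bucket.
            self.table = self.table + self.table
            self.global_depth += 1
        bit = 1 << bucket.depth
        bucket.depth += 1
        sibling = Bucket(bucket.depth)
        # Move the entries whose newly examined hash bit is set.
        for name in [n for n in bucket.entries if stable_hash(n) & bit]:
            sibling.entries[name] = bucket.entries.pop(name)
        # Redirect the table slots that now belong to the sibling.
        for i in range(len(self.table)):
            if self.table[i] is bucket and i & bit:
                self.table[i] = sibling
```

A split touches only the overflowing partition and the table slots that point to it, which is the property that makes independent, unsynchronized splits plausible in a distributed setting.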
The inertia weight is the control parameter that tunes the balance between the exploration and exploitation movements in particle swarm optimization searches. Since the introduction of inertia weight, various strategies have been proposed for determining the appropriate inertia weight value. This paper presents a brief review of the various types of inertia weight strategies, which are classified and discussed in four categories: static, time varying, dynamic, and adaptive. Furthermore, a novel entropy-based gain regulator (EGR) is proposed to detect the evolutionary state of particle swarm optimization in terms of the distances from particles to the current global best, and then to apply proper inertia weights for the corresponding distinct states. Experimental results on five widely applied benchmark functions show that the EGR significantly improves the performance of particle swarm optimization.
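To make the role of the inertia weight concrete, here is a minimal sketch of particle swarm optimization with a time-varying (linearly decreasing) inertia weight, one of the four categories named above. It does not implement the proposed EGR, whose details are not given here; the function names and the default values (`w_start=0.9`, `w_end=0.4`, `c1=c2=2.0`) are assumptions chosen for illustration.

```python
import random


def linear_inertia(iteration, max_iterations, w_start=0.9, w_end=0.4):
    """Time-varying inertia weight: large early (explore), small late (exploit)."""
    frac = iteration / max(1, max_iterations - 1)
    return w_start - (w_start - w_end) * frac


def pso_minimize(objective, dim, bounds, n_particles=20,
                 max_iterations=100, c1=2.0, c2=2.0, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for t in range(max_iterations):
        w = linear_inertia(t, max_iterations)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia term + cognitive pull + social pull.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    return sum(v * v for v in x)
```

Calling `pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))` returns the best position found and its objective value. An adaptive strategy would replace `linear_inertia` with a rule driven by the swarm's observed state rather than by the iteration counter.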
Quantum computer simulators play an important role when we evaluate quantum algorithms. Quantum computation can be regarded as parallel computation in some sense, and thus, it is suitable to implement a simulator on hardware that can process a lot of operations in parallel. In this paper, we propose a hardware quantum computer simulator. The proposed simulator is based on the register reordering method that shifts and swaps registers containing probability amplitudes so that the probability amplitudes of target basis states can be quickly selected. This reduces the number of large multiplexers and improves clock frequency. We implement the simulator on an FPGA. Experiments show that the proposed simulator has scalability in terms of the number of quantum bits, and can simulate quantum algorithms faster than software simulators.
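The need to select probability amplitudes of target basis states can be seen in a plain software simulation. The sketch below, which is an illustration and not the proposed hardware design, applies a single-qubit gate to a state vector: every update pairs two amplitudes whose indices differ only in the target qubit's bit. The function name and the list-based state representation are assumptions of this example.

```python
import math


def apply_single_qubit_gate(state, gate, target):
    """Apply a 2x2 `gate` to qubit `target` of a state vector of 2**n amplitudes."""
    stride = 1 << target
    new_state = list(state)
    for i in range(len(state)):
        if (i & stride) == 0:
            j = i | stride            # partner index: same bits, target bit set
            a0, a1 = state[i], state[j]
            new_state[i] = gate[0][0] * a0 + gate[0][1] * a1
            new_state[j] = gate[1][0] * a0 + gate[1][1] * a1
    return new_state


S = 1 / math.sqrt(2)
HADAMARD = [[S, S], [S, -S]]
```

In software the partner index `j` is a cheap bit operation, but in hardware each such selection implies multiplexing among registers, which is the cost a register-reordering approach aims to reduce.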
Dynamic instruction window resizing (DIWR) is a scheme that effectively exploits both memory-level parallelism and instruction-level parallelism by configuring the instruction window size appropriately for exploiting each parallelism. Although a previous study has shown that the DIWR processor achieves a significant speedup, power consumption has not been explored. The power consumption is increased in DIWR because the instruction window resources are enlarged in memory-intensive phases. If the power consumption exceeds the power budget determined by certain requirements, the DIWR processor must save power and thus, the performance previously presented cannot be achieved. In this paper, we explore to what extent the DIWR processor can achieve improved performance for a given power budget, assuming that dynamic voltage and frequency scaling (DVFS) is introduced as a power saving technique. Evaluation results using the SPEC2006 benchmark programs show that the DIWR processor, even with a constrained power budget, achieves a speedup over the conventional processor over a wide range of given power budgets. At the most important power budget point, i.e., when the power a conventional processor consumes without any power constraint is supplied, DIWR achieves a 16% speedup.
Disassembly, as a principal reverse-engineering tool, is the process of recovering the equivalent assembly instructions of a program's machine code from its binary representation. However, when disassembling a firmware file, the disassembly process cannot be performed well if the image base is unknown. In this paper, we propose an innovative method to determine the image base of a firmware file that uses the ARM/Thumb instruction set. First, based on the characteristics of the function entry table (FET) for an ARM processor, an algorithm called FIND-FET is proposed to identify the function entry tables. Second, by using the most common instructions of function prologues and FETs, the FIND-BASE algorithm is proposed to determine the candidate image bases by counting the matched functions and then choosing the one with the most matched FETs as the final result. The algorithms are applied to firmware files collected from the Internet, and the results indicate that they can effectively find the image base for the majority of the example firmware files.
Virtualization is no longer an emerging research area since the virtual processor and memory operate as efficiently as the physical ones. However, I/O performance is still restricted by the virtualization overhead caused by the costly and complex I/O virtualization mechanism, in particular by massive exits occurring on the guest-host switch and redundant processing of the I/O stacks at both guest and host. A para-virtual device driver may reduce the number of exits to the hypervisor, whereas the network stacks in the guest OS are still duplicated. Previous work proposed a socket-outsourcing technique that bypasses the redundant guest network stack by delivering the network request directly to the host. However, even by bypassing the redundant network paths in the guest OS, the obtained performance was still below 60% of the native device, since notifications of completion still depended on the hypervisor. In this paper, we propose vCanal, a novel network virtualization framework, to improve the performance of network access in the virtual machine toward that of the native machine. Implementation of vCanal reached 96% of the native TCP throughput, increasing the UDP latency by only 4% compared to the native latency.
Data deduplication is a technology that eliminates redundant data to save storage space. Most previous studies on data deduplication target backup storage, where the deduplication ratio and throughput are important. However, data deduplication on primary storage has recently been receiving attention; in this case, I/O latency should be considered equally with the deduplication ratio. Unfortunately, data deduplication causes high sequential-read-latency problems. When a file is created, the file system allocates physically contiguous blocks to support low sequential-read latency. However, the data deduplication process rearranges the block mapping information to eliminate duplicate blocks. Because of this rearrangement, the physical sequentiality of blocks in a file is broken. This makes a sequential-read request slower because it operates like a random-read operation. In this paper, we propose a selective data deduplication scheme for primary storage systems. A selective scheme can achieve a high deduplication ratio and a low I/O latency by applying different data-chunking methods to the files, according to their file access characteristics. In the proposed system, file accesses are characterized by recent access time and the access frequency of each file. No chunking is applied to update-intensive files since they are meaningless in terms of data deduplication. For sequential-read-intensive files, we apply big chunking to preserve their sequentiality on the media. For random-read-intensive files, small chunking is used to increase the deduplication ratio. Experimental evaluation showed that the proposed method achieves a maximum of 86% of an ideal deduplication ratio and 97% of the sequential-read performance of a native file system.
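The selection rule described above can be sketched as a small policy function. This is a simplified illustration: it classifies a file by whichever access type dominates its counters, whereas the proposed system characterizes files by recent access time and access frequency. The `FileStats` fields, the `Chunking` names, and the tie-breaking order are all assumptions of this example.

```python
from dataclasses import dataclass
from enum import Enum


class Chunking(Enum):
    NONE = "none"     # update-intensive files: skip deduplication
    BIG = "big"       # sequential-read-intensive files: keep blocks contiguous
    SMALL = "small"   # random-read-intensive files: maximize deduplication


@dataclass
class FileStats:
    # Hypothetical per-file access counters over a recent time window.
    writes: int
    sequential_reads: int
    random_reads: int


def choose_chunking(stats):
    """Pick a chunking method from the dominant access type of the file."""
    dominant = max(stats.writes, stats.sequential_reads, stats.random_reads)
    if stats.writes == dominant:
        return Chunking.NONE
    if stats.sequential_reads == dominant:
        return Chunking.BIG
    return Chunking.SMALL
```

Ties are resolved in favor of the less aggressive option (no chunking, then big chunking), which is a design choice of this sketch rather than something stated in the abstract.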
In this paper, we propose a communication-efficient top-k continuous query processing method on distributed local nodes where data are horizontally partitioned. A designated coordinator server takes the role of issuing queries from users to local nodes and delivering the results to users. The final results are requested via a top-k subscription which lets local nodes know which data and updates need to be returned to users. Our proposed method makes use of the active previously posed queries to identify a small set of needed top-k subscriptions. In addition, with the pre-indexed nodes' skylines, the number of local nodes to be subscribed can be significantly reduced. As a result, only a small number of subscriptions are informed to a small number of local nodes resulting in lower communication overhead. Furthermore, according to dynamic data updates, we also propose a method that prevents nodes from reporting needless updates and also maintenance procedures to preserve the consistency. The results of experiments that measure the volume of transferred data show that our proposed method significantly outperforms the previously proposed methods.
Nowadays, with the development of online social networks (OSN), a mass of online social information has been generated in OSN, which has triggered research on social recommendation. Collaborative filtering, as one of the most popular techniques in social recommendation, faces several challenges, such as data sparsity, cold-start users and prediction quality. The motivation of our work is to deal with the above challenges by effectively combining collaborative filtering technology with social information. The trust relationship has been identified as a useful means of using social information to improve the quality of recommendation. In this paper, we propose a trust-based recommendation approach which uses GlobalTrust (GT) to represent the trust value among users as neighboring nodes. A matrix factorization based on singular value decomposition is used to get a trust network built on the GT value. The recommendation results are obtained through a modified random walk algorithm called GlobalTrustWalker. Through experiments on a real-world sparser dataset, we demonstrate that the proposed approach can better utilize users' social trust information and improve the recommendation accuracy on cold-start users.
Traffic is a key aspect of everyday life. Its study, as with other complex phenomena, has found in simulation a basic tool. However, the use of simulations faces important limitations. Building them requires considering different aspects of traffic (e.g. urbanism, car features, and individual drivers) with their specific theories, which must be integrated to provide a coherent model. There is also a variety of simulation platforms with different requirements. Many of these problems demand multi-disciplinary teams, where the different backgrounds can hinder the communication and validation of simulations. The Model-Driven Engineering (MDE) of simulations has been proposed in other fields to address these issues. Such approaches develop graphical Modelling Languages (MLs) that researchers use to model their problems, and then semi-automatically generate simulations from those models. Working in this way promotes communication, platform independence, incremental development, and reuse. This paper presents the first steps towards an MDE framework for traffic simulations. It introduces a tailored extensible ML for domain experts. The ML is focused on human actions, so it adopts an Agent-Based Modelling perspective. Regarding traffic aspects, it includes concepts commonly found in related literature following the Driver-Vehicle-Environment model. The language can also accommodate additional theories using its extension mechanisms. The approach is supported by an infrastructure developed using Eclipse MDE projects: the ML is specified with Ecore, and a model editor and a code generator are provided. A case study illustrates how to develop a simulation based on a driver's behaviour theory for a specific target platform using these elements.
The new generation of telemedicine systems enables healthcare service providers to monitor patients not only in the hospital but also when they are at home. In order to efficiently exploit these systems, human information collected from end devices must be sent to the medical center through reliable data transmission. In this paper, we propose an adaptive relay transmission scheme to improve the reliability of data transmission for wireless body area networks. In our proposal, nodes that have successfully decoded a packet from the source node are selected as candidate relays, and the candidate with the highest channel gain forwards the failed packet instead of the source node. The scheme addresses both the data collision problem and the inefficient relay selection in relay transmission. Our experimental results show that the proposed scheme provides better performance than previous works in terms of the packet delivery ratio and end-to-end delay.
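The relay selection rule can be sketched in a few lines. The function name, the dictionary layout, and the fallback to the source node when no relay has decoded the packet are assumptions of this example; how channel gains are measured and how collisions are avoided are outside its scope.

```python
def select_forwarder(source, relays):
    """Choose which node retransmits a failed packet.

    `relays` maps a node name to a (decoded, channel_gain) pair, where
    `decoded` says whether that node overheard and decoded the packet.
    """
    candidates = {name: gain
                  for name, (decoded, gain) in relays.items()
                  if decoded}
    if not candidates:
        return source                 # assumed fallback: source retransmits
    # Among successful decoders, the highest channel gain wins.
    return max(candidates, key=candidates.get)
```

For example, with relays `{"r1": (True, 0.4), "r2": (False, 0.9), "r3": (True, 0.7)}`, node `r2` is excluded because it did not decode the packet, and `r3` is chosen over `r1`.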
This paper presents a practical system that allows instructors to easily introduce 3D games utilizing smartphones in a classroom. The system consists of a PC server, a big screen and smartphone clients. The server provides 3D models, so no 3D authoring is needed when using this system. For an instructor, preparing slides of quiz questions with the correct answers is all that is required when designing 3D games. According to a quiz specified by an instructor, this system constructs a corresponding 3D game scene. The answers students provide on their smartphones are used to play this game. Everyone in the classroom can see this 3D game in real time on a big screen. The game illustrates how every student has reacted to a quiz. This system also introduces specialized queues for mobile interactions: a queue for commands from an instructor and a queue for data from students. The command queue has higher priority than the data queue, so that an instructor can control this system by sending commands with clicks on a smartphone. Previous studies have mostly provided specially designed teaching materials to instructors, often treating them as passive consultants. However, by using slides, already familiar to instructors, this system enables instructors to combine their own teaching materials with 3D games in the classroom. Moreover, 3D games are expected to further motivate students to actively participate in classroom activities. This system is evaluated in this paper.
Zero-shot learning refers to the object classification problem where no training samples are available for testing classes. For zero-shot learning, attribute transfer plays an important role in recognizing testing classes. One popular method is the indirect attribute prediction (IAP) model, which assumes that all attributes are independent and equally important for learning the zero-shot image classifier. However, a more practical assumption is that different attributes contribute unequally to the classifier learning. We therefore propose assigning different weights to the attributes based on the relevance probabilities between the attributes and the classes. We incorporate such weighted attributes into IAP and propose a relevance probability-based indirect attribute weighted prediction (RP-IAWP) model. Experiments on four popular attribute-based learning datasets show that, when compared with IAP and RFUA, the proposed RP-IAWP yields more accurate attribute prediction and zero-shot image classification.
In lossy image/video encoding, there is a compromise between the number of bits and the extent of distortion. Optimizing the allocation of bits to different sources, such as frames or blocks, can improve the encoding performance. In intra-frame encoding, due to the dependency among macro blocks (MBs) introduced by intra prediction, the optimization of bit allocation to the MBs usually has high complexity. So far, no practical optimal bit allocation methods for intra-frame encoding exist, and the commonly used method for intra-frame encoding is the fixed-QP method. We suggest that the QP selection inside an image or frame can be optimized by aiming at constant perceptual quality (CPQ). We propose an iteration-based bit allocation scheme for H.264/AVC intra-frame encoding, in which all the local areas in the frame (each defined in this paper as a group of MBs, or GOMB) are encoded to have approximately the same perceptual quality. The SSIM index is used to measure the perceptual quality of the GOMBs. The experimental results show that the encoding performance on intra-frames can be improved greatly by the proposed method compared with the fixed-QP method. Furthermore, we show that the optimization of the intra-frame can benefit the encoding of the whole sequence, since a better reference frame can improve the encoding of the subsequent frames. The proposed method has acceptable encoding complexity for offline applications.
Bag of Visual Words (BoVW) is an effective framework for image retrieval. Query expansion (QE) further boosts retrieval performance by refining a query with relevant visual words found from the geometric consistency check between the query image and highly ranked retrieved images obtained from the first round of retrieval. Since QE checks the pairwise consistency between query and highly ranked images, its performance may deteriorate when there are slight degradations in the query image. We propose Query Bootstrapping as a variant of QE to circumvent this problem by using the consistency of highly ranked images instead of pairwise consistency. In so doing, we regard frequently co-occurring visual words in highly ranked images as relevant visual words. Frequent itemset mining (FIM) is used to find such visual words efficiently. However, the FIM-based approach requires sensitive parameters to be fine-tuned, namely, support (min/max-support) and the number of top ranked images (top-k). Here, we propose an adaptive support algorithm that adaptively determines both the minimum support and maximum support by referring to the first round's retrieval list. Selecting relevant images by using a geometric consistency check further boosts retrieval performance by reducing outlier images from a mining process. An important parameter for the LO-RANSAC algorithm that is used for the geometric consistency check, namely, inlier threshold, is automatically determined by our algorithm. We further introduce tf-fi-idf on top of tf-idf in order to take into account the frequency of inliers (fi) in the retrieved images. We evaluated the performance of QB in terms of mean average precision (mAP) on three benchmark datasets and found that it gave significant performance boosts of 5.37%, 9.65%, and 8.52% over that of state-of-the-art QE on Oxford 5k, Oxford 105k, and Paris 6k, respectively.
This paper proposes an efficient video object segmentation approach that is tolerant to complex scene dynamics. Unlike existing approaches that rely on estimating object-like proposals on an intra-frame basis, the proposed approach employs a temporally consistent foreground hypothesis using nonlinear regression of saliency-guided proposals across a video sequence. For this purpose, we first generate salient foreground proposals at the superpixel level by leveraging a saliency signature in the discrete cosine transform domain. We propose to use a random-forest-based nonlinear regression scheme to learn both appearance and shape features from salient foreground regions in all frames of a sequence. The availability of such features helps rank all foreground proposals of a sequence, and we show that the regions with high ranking scores are well correlated with semantic foreground objects in dynamic scenes. Subsequently, we utilize a Markov Random Field to integrate both appearance and motion coherence of the top-ranked object proposals. A temporal nonlinear regressor for generating salient object support regions significantly improves the segmentation performance compared to using only per-frame objectness cues. Extensive experiments on challenging real-world video sequences are performed to validate the feasibility and superiority of the proposed approach for addressing dynamic scene segmentation.
The development of image acquisition technology and display technology provides the basis for the popularization of high-resolution images. On the other hand, the available bandwidth is not always sufficient to stream such high-resolution images. Down- and up-sampling, which reduce the data volume of images before transmission and restore them to high resolution afterwards, are a solution for the transmission of high-resolution images. In this paper, motivated by the observation that the high-frequency DCT components are sparse in the spatial domain, we propose a scheme that combines the Discrete Cosine Transform (DCT) and Compressed Sensing (CS) to achieve arbitrary-ratio down-sampling. Our proposed scheme makes use of two properties: first, the energy of an image concentrates on the low-frequency DCT components; second, the high-frequency DCT components are sparse in the spatial domain. The scheme is able to preserve most of the information and avoids estimating the high-frequency components entirely blindly. Experimental results show that the proposed down- and up-sampling scheme performs better than some state-of-the-art schemes in terms of peak signal-to-noise ratio (PSNR), structural similarity index measurement (SSIM) and processing time.
Recently, object-proposal methods have attracted increasing attention from researchers for their utility in avoiding an exhaustive sliding-window search in an image. Object-proposal methods are inspired by the idea that objects share common features. Many object-proposal methods exist, which are either segmentation-based or engineered from low-level features. Among those object-proposal methods, Edge Boxes, which is based on the number of contours that a bounding box wholly contains, has state-of-the-art performance. Since Edge Boxes sometimes fails to propose some obvious objects in some images, we propose an improved version of it, which we call Improved Edge Boxes, based on two observations. The first observation is that objects have a property, called object saliency, that can help us distinguish them from the background. Calculating object saliency in an appropriate way can help to retrieve some objects. The second observation is that objects ‘prefer’ to appear in the central part of images. For this reason, a bounding box that appears in the central part of the image is likely to contain an object. These two observations help us retrieve more objects and thereby improve recall. Our results show that, given just 5000 proposals, we achieve over 89% object recall, compared with 87% for Edge Boxes, at the challenging overlap threshold of 0.7. Further, we compare our approach to some state-of-the-art approaches to show that our approach is more accurate and faster than those approaches. Finally, some comparative pictures are shown to indicate intuitively that our approach can find more objects, and find them more accurately, than Edge Boxes.
The random forest regressor has recently been proposed as a local landmark estimator in the face alignment problem. It has been shown that the random forest regressor can achieve accurate, fast, and robust performance when coupled with a global face-shape regularizer. In this paper, we extend this approach and propose a new Local Forest Classification and Regression (LFCR) framework in order to handle face images with large yaw angles. Specifically, the LFCR has an additional classification step prior to the regression step. Our experimental results show that this additional classification step is useful in rejecting outliers prior to the regression step, thus improving the face alignment results. We also analyze each system component through detailed experiments. In addition to the selection of feature descriptors and several important tuning parameters of the random forest regressor, we examine different initialization and shape regularization processes. We compare our best outcomes to the state-of-the-art system and show that our method outperforms other parametric shape-fitting approaches.
Vanishing point estimation is an important issue for vision-based road detection, especially on unstructured roads. However, most of the existing methods suffer from long computation times. This paper focuses on improving the efficiency of vanishing point estimation by using a heuristic voting method based on particle swarm optimization (PSO). Experiments show that with our proposed method, the efficiency of vanishing point estimation is significantly improved with almost no loss in accuracy. Moreover, for image sequences, this method is further improved and achieves even better performance by making full use of inter-frame information to optimize the performance of PSO.
In everyday life, people use past events and their own knowledge in predicting probable unfolding of events. To obtain the necessary knowledge for such predictions, newspapers and the Internet provide a general source of information. Newspapers contain various expressions describing past events, but also current and future events, and opinions. In our research we focused on automatically obtaining sentences that make reference to the future. Such sentences can contain expressions that not only explicitly refer to future events, but could also refer to past or current events. For example, if people read a news article that states “In the near future, there will be an upward trend in the price of gasoline,” they may be likely to buy gasoline now. However, if the article says “The cost of gasoline has just risen 10 yen per liter,” people will not rush to buy gasoline, because they accept this as reality and may expect the cost to decrease in the future. In the following study we firstly investigate future reference sentences in newspapers and Web news. Next, we propose a method for automatic extraction of such sentences by using semantic role labels, without typical approaches (temporal expressions, etc.). In a series of experiments, we extract semantic role patterns from future reference sentences and examine the validity of the extracted patterns in classification of future reference sentences.
A job is called just-in-time if it is completed exactly on its due date. Under multi-slot conditions, each job has one due date per time slot and has to be completed just-in-time on one of its due dates. Moreover, each job has a certain weight per time slot. We would like to find a just-in-time schedule that maximizes the total weight under multi-slot conditions. In this paper, we prove that this problem is NP-hard.
Recently, Ku et al. proposed a sector-based graphical password scheme, RiS, with dynamically adjustable resistance to login-recording attacks. However, since most users are more familiar with textual passwords than graphical passwords, we propose a secure and efficient textual-graphical password scheme, T-RiS, which is a variant of RiS. The T-RiS user can efficiently complete the login process in an environment under low threat of login-recording attacks and securely complete the login process in an environment under high threat of login-recording attacks. T-RiS can be used in environments where the users are more familiar with passwords based on texts than passwords based on icons/images and the number of login sessions the adversary can record is usually less than five.
In this letter, we investigate how to set reserve prices so as to improve the primary user's revenue in a sealed-bid second-price auction when the secondary users' private value functions are not fully known. A Dirichlet process is used to predict the next highest bid based on historical data of the highest bids. Before the beginning of the next auction round, the primary user can obtain a reserve price by maximizing the additional expected reward. Simulation results show that the proposed scheme improves the primary user's average revenue compared with several counterparts.
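The effect of a reserve price on the seller's revenue can be illustrated with the standard second-price rule: the item is sold only if the highest bid meets the reserve, and the winner pays the larger of the second-highest bid and the reserve. In the sketch below, the empirical average over past rounds is a simple stand-in for the Dirichlet-process predictor, which is not reproduced here; both function names and the candidate-grid search are assumptions of this example.

```python
def second_price_revenue(bids, reserve):
    """Seller's revenue in one sealed-bid second-price round with a reserve."""
    if not bids:
        return 0.0
    ordered = sorted(bids, reverse=True)
    if ordered[0] < reserve:
        return 0.0                    # highest bid below reserve: no sale
    second = ordered[1] if len(ordered) > 1 else 0.0
    return float(max(second, reserve))


def best_reserve_from_history(history, candidates):
    """Pick the candidate reserve with the best average revenue on past rounds.

    This empirical average is an assumed stand-in for a learned predictor
    of the next highest bid.
    """
    def average_revenue(reserve):
        return sum(second_price_revenue(b, reserve) for b in history) / len(history)

    return max(candidates, key=average_revenue)
```

The example shows the trade-off the letter addresses: a higher reserve raises the payment whenever the second-highest bid is low, but it forfeits the sale in rounds where even the highest bid falls short.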
Compared to the traditional functional dependency (FD), the extended conditional functional dependency (CFD) has shown greater potential for detecting and repairing inconsistent data. CFDMiner is a widely used algorithm for mining constant CFDs. However, the search space of CFDMiner is large, and there is still room to improve its efficiency. In this paper, an efficient pruning strategy is proposed to optimize the algorithm by reducing the search space. Both theoretical analysis and experiments show that the optimized algorithm produces the same results as the original CFDMiner.
Traditional low-rank features lose the temporal information in an action sequence. To obtain the temporal information, we split an action video into multiple action subsequences and concatenate all the low-rank features of the subsequences according to their time order. Then we recognize actions by learning a novel dictionary model from the concatenated low-rank features. However, traditional dictionary learning models usually neglect the similarity among the coding coefficients and perform poorly on non-linearly separable data. To overcome these shortcomings, we present a novel similarity-constrained discriminative kernel dictionary learning method for action recognition. The effectiveness of the proposed method is verified on three benchmarks, and the experimental results are promising.
The Euler number is an important topological property in a binary image, and it can be computed by counting certain bit-quads in the binary image. This paper proposes a further improved bit-quad-based algorithm for computing the Euler number. By scanning image rows two by two and utilizing the information obtained while processing the previous pixels, the number of pixels to be checked for processing a bit-quad can be decreased from 2 to 1.5. Experimental results demonstrated that our proposed algorithm significantly outperforms conventional Euler number computing algorithms.
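For reference, the conventional bit-quad computation that such algorithms improve on can be sketched as follows. This is the classic formulation, not the proposed algorithm: it inspects all four pixels of every 2x2 window and does not reuse information from previously processed pixels. The function name and the list-of-lists image format are assumptions of this example.

```python
def euler_number(image, connectivity=8):
    """Euler number (components minus holes) of a binary image via bit-quads.

    `image` is a list of equal-length rows of 0/1 values.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    # Pad with a one-pixel background border so every foreground pixel
    # appears in four 2x2 windows.
    padded = ([[0] * (cols + 2)]
              + [[0] + [1 if v else 0 for v in row] + [0] for row in image]
              + [[0] * (cols + 2)])

    q1 = q3 = qd = 0
    for r in range(rows + 1):
        for c in range(cols + 1):
            a, b = padded[r][c], padded[r][c + 1]
            d, e = padded[r + 1][c], padded[r + 1][c + 1]
            total = a + b + d + e
            if total == 1:
                q1 += 1               # exactly one foreground pixel
            elif total == 3:
                q3 += 1               # exactly three foreground pixels
            elif total == 2 and a == e:
                qd += 1               # two foreground pixels on a diagonal

    if connectivity == 8:
        return (q1 - q3 - 2 * qd) // 4
    return (q1 - q3 + 2 * qd) // 4
```

The two connectivity cases differ only in the sign of the diagonal term: two diagonally touching pixels form one component under 8-connectivity but two components under 4-connectivity.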
Recently, notable improvements in the voice activity detection (VAD) problem have been achieved by adopting several machine learning techniques. Among them, the deep neural network (DNN), which learns the mapping between the noisy speech features and the corresponding voice activity status with its deep hidden structure, has been one of the most popular techniques. In this letter, we propose a novel approach that enhances the robustness of the DNN in mismatched noise conditions with a multi-task learning (MTL) framework. In the proposed algorithm, a feature enhancement task for speech features is jointly trained with the conventional VAD task. The experimental results show that the DNN with the proposed framework outperforms the conventional DNN-based VAD algorithm.
Detecting small infrared targets is a difficult but important task in highly cluttered coastal surveillance. This paper proposes a method called low-rank and sparse decomposition based frame difference to improve the detection performance of a surveillance system. First, the frame difference between adjacent frames is used to detect the candidate object regions of interest. Then we further exclude clutter by low-rank and sparse matrix recovery. Finally, the targets are extracted from the recovered target component by a local self-adaptive threshold. The experimental results show that the method effectively enhances the system's signal-to-clutter ratio gain and background suppression factor, and precisely extracts targets in highly cluttered coastal scenes.