An efficient hardware implementation for Simultaneous Localization and Mapping (SLAM) methods is of necessity for mobile autonomous robots with limited computational resources. In this paper, we propose a resource-efficient FPGA implementation for accelerating scan matching computations, which typically cause a major bottleneck in 2D LiDAR SLAM methods. Scan matching is a process of correcting a robot pose by aligning the latest LiDAR measurements with an occupancy grid map, which encodes the information about the surrounding environment. We exploit an inherent parallelism in the Rao-Blackwellized Particle Filter (RBPF) based algorithm to perform scan matching computations for multiple particles in parallel. In the proposed design, several techniques are employed to reduce the resource utilization and to achieve the maximum throughput. Experimental results using the benchmark datasets show that the scan matching is accelerated by 5.31-8.75× and the overall throughput is improved by 3.72-5.10× without seriously degrading the quality of the final outputs. Furthermore, our proposed IP core requires only 44% of the total resources available in the TUL Pynq-Z2 FPGA board, thus facilitating the realization of SLAM applications on indoor mobile robots.
Cascading Style Sheets (CSS) is a popular language for describing the styles of XML documents as well as HTML documents. To resolve conflicts among CSS rules, CSS has a mechanism called specificity. For a DTD D and a CSS code R, due to specificity R may contain “unsatisfiable” rules under D, e.g., rules that are not applied to any element of any document valid for D. In this paper, we consider the problem of detecting unsatisfiable CSS rules under DTDs. We focus on CSS fragments in which descendant, child, adjacent sibling, and general sibling combinators are allowed. We show that the problem is coNP-hard in most cases, even if only one of the four combinators is allowed and under very restricted DTDs. We also show that the problem is in coNP or PSPACE depending on restrictions on DTDs and CSS. Finally, we present four conditions under which the problem can be solved in polynomial time.
Loading test vectors and unloading test responses in shift mode during scan testing cause many scan flip-flops to switch simultaneously. The resulting shift switching activity around scan flip-flops can cause excessive local IR-drop that can change the states of some scan flip-flops, leading to test data corruption. A common approach solving this problem is partial-shift, in which multiple scan chains are formed and only one group of the scan chains is shifted at a time. However, previous methods based on this approach use random grouping, which may reduce global shift switching activity, but may not be optimized to reduce local shift switching activity, resulting in remaining high risk of test data corruption even when partial-shift is applied. This paper proposes novel algorithms (one optimal and one heuristic) to group scan chains, focusing on reducing local shift switching activity around scan flip-flops, thus reducing the risk of test data corruption. Experimental results on all large ITC'99 benchmark circuits demonstrate the effectiveness of the proposed optimal and heuristic algorithms as well as the scalability of the heuristic algorithm.
Automatic generation of textual stories from visual data representation, known as visual storytelling, is a recent advancement in the problem of images-to-text. Instead of using a single image as input, visual storytelling processes a sequential array of images into coherent sentences. A story contains non-visual concepts as well as descriptions of literal object(s). While previous approaches have applied external knowledge, our approach was to regard the non-visual concept as the semantic correlation between visual modality and textual modality. This paper, therefore, presents new features representation based on a canonical correlation analysis between two modalities. Attention mechanism are adopted as the underlying architecture of the image-to-text problem, rather than standard encoder-decoder models. Canonical Correlation Attention Mechanism (CAAM), the proposed end-to-end architecture, extracts time series correlation by maximizing the cross-modal correlation. Extensive experiments on VIST dataset ( http://visionandlanguage.net/VIST/dataset.html ) were conducted to demonstrate the effectiveness of the architecture in terms of automatic metrics, with additional experiments show the impact of modality fusion strategy.
Online learning is a method which updates the model gradually and can modify and strengthen the previous model, so that the updated model can adapt to the new data without having to relearn all the data. However, the accuracy of the current online multiclass learning algorithm still has room for improvement, and the ability to produce sparse models is often not strong. In this paper, we propose a new Multiclass Truncated Gradient Confidence-Weighted online learning algorithm (MTGCW), which combine the Truncated Gradient algorithm and the Confidence-weighted algorithm to achieve higher learning performance. The experimental results demonstrate that the accuracy of MTGCW algorithm is always better than the original CW algorithm and other baseline methods. Based on these results, we applied our algorithm for phishing website recognition and image classification, and unexpectedly obtained encouraging experimental results. Thus, we have reasons to believe that our classification algorithm is clever at handling unstructured data which can promote the cognitive ability of computers to a certain extent.
Laser speech detection uses a non-contact Laser Doppler Vibrometry (LDV)-based acoustic sensor to obtain speech signals by precisely measuring voice-generated surface vibrations. Over long distances, however, the detected signal is very weak and full of speckle noise. To enhance the quality and intelligibility of the detected signal, we designed a two-sided Linear Prediction Coding (LPC)-based locator and interpolator to detect and replace speckle noise. We first studied the characteristics of speckle noise in detected signals and developed a binary-state statistical model for speckle noise generation. A two-sided LPC-based locator was then designed to locate the polluted samples, composed of an inverse decorrelator, nonlinear filter and threshold estimator. This greatly improves the detectability of speckle noise and avoids false/missed detection by improving the noise-to-signal-ratio (NSR). Finally, samples from both sides of the speckle noise were used to estimate the parameters of the interpolator and to code samples for replacing the polluted samples. Real-world speckle noise removal experiments and simulation-based comparative experiments were conducted and the results show that the proposed method is better able to locate speckle noise in laser detected speech and highly effective at replacing it.
In this paper, we present a novel portrait impression estimation method using nine pairs of semantic impression words: bitter-majestic, clear-pure, elegant-mysterious, gorgeous-mature, modern-intellectual, natural-mild, sporty-agile, sweet-sunny, and vivid-dynamic. In the first part of the study, we analyzed the relationship between the facial features in deformed portraits and the nine semantic impression word pairs over a large dataset, which we collected by a crowdsourcing process. In the second part, we leveraged the knowledge from the results of the analysis to develop a ranking network trained on the collected data and designed to estimate the semantic impression associated with a portrait. Our network demonstrated superior performance in impression estimation compared with current state-of-the-art methods.
Content-based image retrieval has been a hot topic among computer vision researchers for a long time. There have been many advances over the years, one of the recent ones being deep metric learning, inspired by the success of deep neural networks in many machine learning tasks. The goal of metric learning is to extract good high-level features from image pixel data using neural networks. These features provide useful abstractions, which can enable algorithms to perform visual comparison between images with human-like accuracy. To learn these features, supervised information of image similarity or relative similarity is often used. One important issue in deep metric learning is how to define similarity for multi-label or multi-object scenes in images. Traditionally, pairwise similarity is defined based on the presence of a single common label between two images. However, this definition is very coarse and not suitable for multi-label or multi-object data. Another common mistake is to completely ignore the multiplicity of objects in images, hence ignoring the multi-object facet of certain types of datasets. In our work, we propose an approach for learning deep image representations based on the relative similarity of both multi-label and multi-object image data. We introduce an intuitive and effective similarity metric based on the Jaccard similarity coefficient, which is equivalent to the intersection over union of two label sets. Hence we treat similarity as a continuous, as opposed to discrete quantity. We incorporate this similarity metric into a triplet loss with an adaptive margin, and achieve good mean average precision on image retrieval tasks. We further show, using a recently proposed quantization method, that the resulting deep feature can be quantized whilst preserving similarity. We also show that our proposed similarity metric performs better for multi-object images than a previously proposed cosine similarity-based metric. Our proposed method outperforms several state-of-the-art methods on two benchmark datasets.
Based on TLS-ESPRIT algorithm, this paper proposes a weighted spatial smoothing DOA estimation algorithm to address the problem that the conventional TLS-ESPRIT algorithm will be disabled to estimate the direction of arrival (DOA) in the scenario of coherent sources. The proposed method divides the received signal array into several subarrays with special structural feature. Then, utilizing these subarrays, this paper constructs the new weighted covariance matrix to estimate the DOA based on TLS-ESPRIT. The auto-correlation and cross-correlation information of subarrays in the proposed algorithm is extracted sufficiently, improving the orthogonality between the signal subspace and the noise subspace so that the DOA of coherent sources could be estimated accurately. The simulations show that the proposed algorithm is superior to the conventional spatial smoothing algorithms under different signal to noise ratio (SNR) and snapshot numbers with coherent sources.
As NAND flash-based storage has been settled, a flash translation layer (FTL) has been in charge of mapping data addresses on NAND flash memory. Many FTLs implemented various mapping schemes, but the amount of mapping data depends on the mapping level. However, the FTL should contemplate mapping consistency irrespective of how much mapping data dwell in the storage. Furthermore, the recovery cost by the inconsistency needs to be considered for a faster storage reboot time. This letter proposes a novel method that enhances the consistency for a page-mapping level FTL running a legacy logging policy. Moreover, the recovery cost of page mappings also decreases. The novel method is to adopt a virtually-shrunk segment and deactivate page-mapping logs by assembling and storing the segments. This segment scheme already gave embedded NAND flash-based storage enhance its response time in our previous study. In addition to that improved result, this novel plan maximizes the page-mapping consistency, therefore improves the recovery cost compared with the legacy page-mapping FTL.
SPHINCS+ is a state-of-the-art post-quantum hash-based signature that is a candidate for the NIST post-quantum cryptography standard. For a target bit security, SPHINCS+ supports many different tradeoffs between the signature size and the signing speed. SPHINCS+ provides 6 parameter sets: 3 parameter sets for size optimization and 3 parameter sets for speed optimization. We propose new parameter sets with better performance. Specifically, SPHINCS+ implementations with our parameter sets are up to 26.5% faster with slightly shorter signature sizes.
Existing cyber deception technologies (e.g., operating system obfuscation) can effectively disturb attackers' network reconnaissance and hide fingerprint information of valuable cyber assets (e.g., containers). However, they exhibit ineffectiveness against skilled attackers. In this study, a proactive fingerprint deception method is proposed, termed as Continuously Anonymizing Containers' Fingerprints (CACF), which modifies the container's fingerprint in the cloud resource pool to satisfy the anonymization standard. As demonstrated by experimental results, the CACF can effectively increase the difficulty for attackers.
This paper proposes a change detection method for buildings based on convolutional neural networks. The proposed method detects building changes from pairs of optical aerial images and past map information concerning buildings. Using high-resolution image pair and past map information seamlessly, the proposed method can capture the building areas more precisely compared to a conventional method. Our experimental results show that the proposed method outperforms the conventional change detection method that uses optical aerial images to detect building changes.
Multi-objective evolutionary algorithms are widely used in many engineering optimization problems and artificial intelligence applications. Ant lion optimizer is an outstanding evolutionary method, but two issues need to be solved to extend it to the multi-objective optimization field, one is how to update the Pareto archive, and the other is how to choose elite and ant lions from archive. We develop a novel multi-objective variant of ant lion optimizer in this paper. A new measure combining Pareto dominance relation and distance information of individuals is put forward and used to tackle the first issue. The concept of time weight is developed to handle the second problem. Besides, mutation operation is adopted on solutions in middle part of archive to further improve its performance. Eleven functions, other four algorithms and four indicators are taken to evaluate the new method. The results show that proposed algorithm has better performance and lower time complexity.
Deep learning has shown outstanding performance in various fields, and it is increasingly deployed in privacy-critical domains. If sensitive data in the deep learning model are exposed, it can cause serious privacy threats. To protect individual privacy, we propose a novel activation function and stochastic gradient descent for applying differential privacy to deep learning. Through experiments, we show that the proposed method can effectively protect the privacy and the performance of proposed method is better than the previous approaches.
We propose a video magnification method for magnifying subtle color and motion changes under the presence of non-meaningful background motions. We use frequency variability to design a filter that passes only meaningful subtle changes and removes non-meaningful ones; our method obtains more impressive magnification results without artifacts than compared methods.
Source retrieval is the primary task of plagiarism detection. It searches the documents that may be the sources of plagiarism to a suspicious document. The state-of-the-art approaches usually rely on the classical information retrieval models, such as the probability model or vector space model, to get the plagiarism sources. However, the goal of source retrieval is to obtain the source documents that contain the plagiarism parts of the suspicious document, rather than to rank the documents relevant to the whole suspicious document. To model the “partial matching” between documents, this paper proposes a Partial Matching Convolution Neural Network (PMCNN) for source retrieval. In detail, PMCNN exploits a sequential convolution neural network to extract the plagiarism patterns of contiguous text segments. The experimental results on PAN 2013 and PAN 2014 plagiarism source retrieval corpus show that PMCNN boosts the performance of source retrieval significantly, outperforming other state-of-the-art document models.
This letter presents an efficient technique to reduce the computational complexity involved in training binary convolutional neural networks (BCNN). The BCNN training shall be conducted focusing on the optimization of the sign of each weight element rather than the exact value itself in convention; in which, the sign of an element is not likely to be flipped anymore after it has been updated to have such a large magnitude to be clipped out. The proposed technique does not update such elements that have been clipped out and eliminates the computations involved in their optimization accordingly. The complexity reduction by the proposed technique is as high as 25.52% in training the BCNN model for the CIFAR-10 classification task, while the accuracy is maintained without severe degradation.