Peg solitaire is a single-player board game. The goal of the game is to remove all but one peg from the game board. Peg solitaire on graphs generalizes the game to arbitrary graphs. A graph is called solvable if there exists some vertex s such that it is possible to remove all but one peg starting with s as the initial hole. In this paper, we prove that it is NP-complete to decide whether a graph is solvable.
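To make the definition of solvability concrete, the following is a minimal brute-force sketch, not the construction used in the paper. It assumes the usual jump rule for peg solitaire on graphs, which the abstract does not spell out: a peg at u jumps over an adjacent peg at v into an empty vertex w adjacent to v, and the peg at v is removed. The function name `is_solvable` and the adjacency-dictionary input are illustrative choices. The search is exponential in the number of vertices, so it is only practical for small graphs.

```python
from functools import lru_cache


def is_solvable(adj):
    """Return True if some initial hole s allows all but one peg to be removed.

    adj: dict mapping each vertex to an iterable of its neighbours
         (undirected graph, so each edge appears in both directions).
    Assumed move rule: a peg at u jumps over an adjacent peg at v into an
    empty vertex w adjacent to v (w != u); the peg at v is removed.
    """
    vertices = sorted(adj)
    index = {v: i for i, v in enumerate(vertices)}
    nbrs = [[index[w] for w in adj[v]] for v in vertices]
    n = len(vertices)

    @lru_cache(maxsize=None)
    def reducible(pegs):
        # pegs is a bitmask: bit i set means vertex i currently holds a peg.
        if bin(pegs).count("1") == 1:
            return True
        for u in range(n):
            if not (pegs >> u) & 1:
                continue
            for v in nbrs[u]:
                if not (pegs >> v) & 1:
                    continue
                for w in nbrs[v]:
                    if w != u and not (pegs >> w) & 1:
                        # Remove pegs at u and v, place a peg at w.
                        nxt = (pegs & ~(1 << u) & ~(1 << v)) | (1 << w)
                        if reducible(nxt):
                            return True
        return False

    full = (1 << n) - 1
    # Try every vertex s as the initial hole.
    return any(reducible(full & ~(1 << s)) for s in range(n))


# Small examples: two paths and the star with three leaves.
path3 = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
path4 = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
star3 = {"c": ["x", "y", "z"], "x": ["c"], "y": ["c"], "z": ["c"]}
```

Under the assumed rule, the two paths can be reduced to a single peg from a suitable starting hole, while the three-leaf star cannot.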
Deep convolutional neural network (CNN) inference accelerators on field-programmable gate array (FPGA) platforms have been widely adopted due to their low power consumption and high performance. In this paper, we make three contributions to improve performance and power efficiency. First, we use high bandwidth memory (HBM) to expand the bandwidth of data transmission between the off-chip memory and the accelerator. Second, we implement a fully pipelined design, consisting of pipelined inter-layer computation and a pipelined computation engine, to decrease idle time between layers. Third, we design a multi-core architecture with shared-dual buffers to reduce off-chip memory access and maximize throughput. We implemented the proposed accelerator on the Xilinx Alveo U280 platform directly in Verilog HDL, rather than with high-level synthesis as in previous works, and used the VGG-16 model to verify the system in our experiments. With a similar accelerator architecture, the experimental results demonstrate that the memory bandwidth of HBM is 13.2× higher than that of DDR4. In terms of throughput, our accelerator is 1.9×, 1.65×, and 11.9× better than an FPGA+HBM2-based accelerator, a GPGPU at a low batch size (4), and a CPU at a low batch size (4), respectively. In terms of power efficiency with the large-scale CNN model, our proposed system provides 1.4-1.7×, 1.7-12.6×, and 6.6-37.1× improvement over previous DDR+FPGA, DDR+GPGPU, and DDR+CPU based accelerators, respectively.
The increasing attention to the interpretability of machine learning models has led to the development of methods to explain the behavior of black-box models in a post-hoc manner. However, such post-hoc approaches generate a new explanation for every new input, and these explanations cannot be checked by humans in advance. A method that selects decision rules from a finite ruleset as explanations for neural networks has been proposed, but it cannot be used for other models. In this paper, we propose a model-agnostic explanation method to find a pre-verifiable finite ruleset from which a decision rule is selected to support every prediction made by a given black-box model. First, we define an explanation model that selects the rule, from a ruleset, that gives the closest prediction; this rule works as an alternative explanation or supportive evidence for the prediction of a black-box model. The ruleset should have high coverage to give close predictions for future inputs, but it should also be small enough to be checkable by humans in advance. However, minimizing the ruleset while keeping high coverage leads to a computationally hard combinatorial problem. To address this, we show that the problem can be reduced to a weighted MaxSAT problem composed only of Horn clauses, which can be efficiently solved with modern solvers. Experimental results showed that our method found small rulesets whose selected rules achieve higher accuracy on structured data than the existing method using rulesets of almost the same size. We also experimentally compared the proposed method with two purely rule-based models, CORELS and defragTrees. Furthermore, we examine rulesets constructed for real datasets and discuss the characteristics of the proposed method from different viewpoints, including interpretability, limitations, and possible use cases.
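The explanation model described above, which picks the rule whose prediction is closest to that of the black-box model, can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation: the `Rule` class, the `select_rule` function, the restriction to rules whose condition holds for the input, and the absolute-difference distance are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Rule:
    """A hypothetical decision rule: if condition(x) holds, predict `prediction`."""
    name: str
    condition: Callable[[dict], bool]
    prediction: float


def select_rule(rules: Sequence[Rule], x: dict, black_box_prediction: float) -> Optional[Rule]:
    """Pick, among the rules that apply to x, the one whose prediction is
    closest to the black-box prediction (assumed distance: absolute difference).
    Returns None when no rule in the ruleset covers x."""
    applicable = [r for r in rules if r.condition(x)]
    if not applicable:
        return None
    return min(applicable, key=lambda r: abs(r.prediction - black_box_prediction))


# A toy ruleset; feature names and thresholds are made up for illustration.
ruleset = [
    Rule("age>=30", lambda x: x["age"] >= 30, 1.0),
    Rule("income<20", lambda x: x["income"] < 20, 0.0),
    Rule("default", lambda x: True, 0.5),
]

chosen_high = select_rule(ruleset, {"age": 40, "income": 10}, 0.9)
chosen_low = select_rule(ruleset, {"age": 40, "income": 10}, 0.1)
```

Because the ruleset is finite and fixed before deployment, every rule that `select_rule` can ever return is available for human inspection in advance, which is the pre-verifiability property the abstract emphasizes.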
One key to implementing a smart city is letting the smart space know where people are and how many of them are present. Vision-based methods recognize people with high accuracy, but they raise concerns about potential privacy leakage and user nonacceptance. In addition, the system should remain functional in a limited environment during an emergency. We propose a real-time people counting and tracking system based on a millimeter wave (mmWave) radar as an alternative to optical solutions in a restaurant. The proposed method consists of four main procedures. First, a point cloud of obstacles is captured and generated using a low-cost, commercial off-the-shelf (COTS) mmWave radar. Next, individual points with similar properties are clustered. Then, the same people in sequential frames are associated by the tracking algorithm. Finally, the estimated people are counted, tracked, and shown in the next frame. The experimental results show that our proposed system achieved a median position error of 0.17 m and a counting accuracy of 83.5% for ten people in various scenarios in an actual restaurant environment. In addition, the real-time estimation and visualization of the number and positions of people show the potential to help prevent crowding during the COVID-19 pandemic and to analyze customer visitation patterns for efficient management and target marketing.
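The cluster, associate, and count steps of such a pipeline can be sketched as below. The abstract does not name the clustering or tracking algorithms, so this example substitutes simple stand-ins: distance-threshold (single-linkage) clustering of 2-D points and greedy nearest-neighbour association of cluster centroids to existing tracks. All function names, thresholds, and sample points are hypothetical.

```python
import math


def cluster_points(points, eps):
    """Group 2-D points connected by chains of neighbours within eps."""
    unvisited = set(range(len(points)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        stack, members = [seed], [seed]
        while stack:
            i = stack.pop()
            near = [j for j in unvisited if math.dist(points[i], points[j]) <= eps]
            for j in near:
                unvisited.remove(j)
            stack.extend(near)
            members.extend(near)
        clusters.append([points[i] for i in members])
    return clusters


def centroid(cluster):
    """Mean position of the points in one cluster."""
    xs, ys = zip(*cluster)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def associate(tracks, detections, max_dist):
    """Greedy nearest-neighbour association.

    tracks: dict mapping track id -> last known (x, y).
    Detections farther than max_dist from every track start new tracks;
    tracks with no matching detection are dropped in this sketch.
    """
    updated, free = {}, list(detections)
    for tid, pos in tracks.items():
        if not free:
            break
        best = min(free, key=lambda d: math.dist(pos, d))
        if math.dist(pos, best) <= max_dist:
            updated[tid] = best
            free.remove(best)
    next_id = max(tracks, default=-1) + 1
    for d in free:
        updated[next_id] = d
        next_id += 1
    return updated


# One frame of made-up radar points: two people about 7 m apart.
frame = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
detections = sorted(centroid(c) for c in cluster_points(frame, eps=0.5))
tracks = associate({0: (0.0, 0.0), 1: (5.0, 5.0)}, detections, max_dist=1.0)
people_count = len(tracks)
```

A production tracker would typically keep unmatched tracks alive for a few frames and smooth positions over time; those refinements are omitted here to keep the four steps visible.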
Many countries face population aging as their elderly populations grow. Nursing homes (NHs) are a common solution for long-term care of the elderly. This paper develops a simulator to model elder behavior in an NH; it considers public areas where elders interact and imitates their general, group, and special activities. Elders choose their activities according to their own preferences. The simulator also accounts for the movement of elders and abnormal events. Based on the simulator, two seeking methods are proposed for caregivers to search for lost elders efficiently, helping them quickly find elders who may be at risk of accidents.
We propose a framework for the integration of heterogeneous networks in human pose estimation (HPE) with the aim of balancing accuracy and computational complexity. Although many existing methods can improve the accuracy of HPE using multiple frames in videos, they also increase the computational complexity. The key difference here is that the proposed heterogeneous framework uses different networks for different types of frames, while existing methods use the same networks for all frames. In particular, we propose to divide the video frames into two types, key frames and non-key frames, and adopt three kinds of networks in our heterogeneous framework: slow networks, fast networks, and transfer networks. For key frames, a slow network is used that has high accuracy but high computational complexity. For non-key frames that follow a key frame, we propose to warp the heatmap of a slow network from a key frame via a transfer network and fuse it with the output of a fast network that has low accuracy but low computational complexity. Furthermore, when extending to long-term usage where a large number of non-key frames follow a key frame, the temporal correlation decreases. Therefore, when necessary, we use an additional transfer network that warps the heatmap from a neighboring non-key frame. The experimental results on the PoseTrack 2017 and PoseTrack 2018 datasets demonstrate that the proposed framework, FSPose, achieves a better balance between accuracy and computational complexity than the competing method. Our source code is available at https://github.com/Fenax79/fspose.
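The per-frame choice of networks described above can be sketched as a simple schedule. This is only an illustration under stated assumptions: key frames are taken to occur at a fixed interval, and the "when necessary" condition for the additional transfer network is modelled as a distance threshold from the last key frame. Neither assumption comes from the abstract, and `plan_frames`, `key_interval`, and `long_term_gap` are hypothetical names.

```python
def plan_frames(num_frames, key_interval, long_term_gap):
    """Return, for each frame index, the list of networks that would run.

    key_interval:  assumed fixed spacing between key frames.
    long_term_gap: assumed threshold; non-key frames farther than this from
                   the last key frame also warp a heatmap from a neighbouring
                   non-key frame.
    """
    plan = []
    for t in range(num_frames):
        offset = t % key_interval  # distance from the most recent key frame
        if offset == 0:
            # Key frame: accurate but expensive slow network only.
            plan.append((t, ["slow"]))
        else:
            # Non-key frame: cheap fast network fused with the key-frame
            # heatmap warped by a transfer network.
            nets = ["fast", "transfer_from_key"]
            if offset > long_term_gap:
                # Temporal correlation with the key frame has weakened.
                nets.append("transfer_from_neighbor")
            plan.append((t, nets))
    return plan


schedule = plan_frames(num_frames=5, key_interval=4, long_term_gap=2)
```

With these settings, frames 0 and 4 are key frames, frames 1 and 2 use the fast network plus the key-frame transfer, and frame 3 additionally uses the neighbor transfer.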
Neuromorphic computing with a spiking neural network (SNN) is expected to provide a complement or alternative to deep learning in the future. The challenge is to develop optimal SNN models, algorithms, and engineering technologies for real use cases. As a potential use case for neuromorphic computing, we investigated person monitoring and worker support with a video surveillance system, given its status as a proven deep neural network (DNN) use case. To increase the number of cameras in such a system in the future, we will need a scalable approach that embeds only a few neuromorphic devices in a camera. Specifically, this will require a shallow SNN model that can be implemented in a few neuromorphic devices while providing high recognition accuracy comparable to a DNN with the same configuration. A shallow SNN was built by converting ResNet, a proven DNN for image recognition, and a new configuration of the shallow SNN model was developed to improve its accuracy. The proposed shallow SNN model was evaluated with a few neuromorphic devices, and it achieved a recognition accuracy of more than 80% with about 1/130 of the energy consumption of a GPU running a DNN with the same configuration as the SNN.
Fully homomorphic encryption (FHE) enables secret computation: users can perform computations on FHE-encrypted data without decrypting it. Uploading unencrypted private data to a public cloud carries the risk of data leakage, which makes many users hesitant to use a public cloud. Uploading FHE-encrypted data avoids this risk while still providing the computing power of the public cloud. Because data size increases significantly when FHE is used, the data are in many cases stored on HDDs. One important data analysis application is Apriori data mining. In this application, two files are accessed alternately, which causes long-distance seeks on the HDD and low performance. In this paper, we propose a new striping layout with reserved write areas. This method intentionally fragments files and arranges their blocks to reduce the distance between the blocks of one file and those of the other. It also reserves areas for the intermediate files of FHE Apriori. The performance of the proposed method was evaluated based on the I/O processing of a large FHE Apriori workload, and the results showed that the proposed method could improve performance by up to approximately 28%.
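The intuition behind striping two alternately accessed files can be illustrated with a toy model. The sketch below is not the layout from the paper: it assumes a one-dimensional block address space, uses the difference in block addresses as a stand-in for seek distance, and places a fixed number of reserved blocks after each stripe pair. The function names, stripe size, and reservation size are all hypothetical.

```python
def contiguous_layout(blocks_a, blocks_b):
    """File A stored whole, followed by file B stored whole."""
    return [("A", i) for i in range(blocks_a)] + [("B", i) for i in range(blocks_b)]


def striped_layout(blocks_a, blocks_b, stripe, reserve):
    """Interleave `stripe` blocks of A with `stripe` blocks of B, then leave
    `reserve` blocks free for intermediate files (assumed placement)."""
    layout = []
    for start in range(0, max(blocks_a, blocks_b), stripe):
        layout += [("A", i) for i in range(start, min(start + stripe, blocks_a))]
        layout += [("B", i) for i in range(start, min(start + stripe, blocks_b))]
        layout += [("reserved", None)] * reserve
    return layout


def total_seek_distance(layout, accesses):
    """Sum of block-address jumps between consecutive accesses."""
    position = {blk: addr for addr, blk in enumerate(layout) if blk[0] != "reserved"}
    addrs = [position[a] for a in accesses]
    return sum(abs(b - a) for a, b in zip(addrs, addrs[1:]))


# Alternating access pattern: A0, B0, A1, B1, ...
accesses = [(name, i) for i in range(4) for name in ("A", "B")]
plain = total_seek_distance(contiguous_layout(4, 4), accesses)
striped = total_seek_distance(striped_layout(4, 4, stripe=2, reserve=1), accesses)
```

In this toy setting the striped layout yields a total distance of 12 blocks against 25 for the contiguous layout. Real HDD seek time is not linear in block distance, so the numbers only indicate the direction of the effect.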