The memory and storage system, including processor caches, main memory, and storage, is an important component of various computer systems. The memory hierarchy is becoming a fundamental performance and energy bottleneck, due to the widening gap between the increasing bandwidth and energy demands of modern applications and the limited performance and energy efficiency provided by traditional memory technologies. As a result, computer architects are facing significant challenges in developing high-performance, energy-efficient, and reliable memory hierarchies. New byte-addressable nonvolatile memories (NVMs) are emerging with unique properties that are likely to open doors to novel memory hierarchy designs to tackle the challenges. However, substantial advancements in redesigning the existing memory and storage organizations are needed to realize their full potential. This article reviews recent innovations in rearchitecting the memory and storage system with NVMs, producing high-performance, energy-efficient, and scalable computer designs.
This paper proposes an efficient performance estimation method for configurable multi-layer bus-based SoCs, which evaluates system performance at an early stage of the design process. The proposed method uses data-flow information obtained from system-level profiling, an architecture-independent loosely timed transaction-level simulation, and constructs a system-level execution dependency graph. Then, based on each architecture-level model, an architecture-level execution dependency graph is constructed and analyzed to estimate the performance of each architecture. In the analysis, the detailed behavior of shared buses and the multi-layer bus is determined from the analyzed dynamic bus contention and the features of the bus protocols. Experiments were conducted by modeling the multi-layer AHB and applying the method to estimate the performance of architectures executing a JPEG encoder application. The proposed method estimates SoC performance with less than 8% error compared with results from accurate RTL simulations.
A random-walk model is investigated and utilized to analyze the performance of a coding scheme that aims to extend the lifetime of flash memory. Flash memory is widely used in various products today, but the cells that constitute flash memory wear out as they undergo many operations. This issue can be mitigated by employing a clever coding scheme known as a flash code. The purpose of this study is to establish a well-defined random-walk model of a flash code known as the index-less indexed flash code (ILIFC) and to clarify the expected performance of ILIFC. A preliminary study by the author considered a simplified model of data operations, and the contribution of this study is to extend that model to a more general and practical one. Mathematical properties of the random-walk model are reconsidered, and useful properties are derived that help in analyzing the performance of ILIFC in both non-asymptotic and asymptotic scenarios.
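The cell-wear dynamics above can be illustrated with a toy simulation (a simplified illustrative model, not ILIFC itself: write slots are chosen uniformly at random, and the device's lifetime ends when a fully worn slot is selected again; the slot count `n_slots` and level limit `q` are hypothetical parameters):

```python
import random

def simulate_lifetime(n_slots, q, rng):
    """Count write operations until a slot already at its maximum level q
    is selected again, i.e., until the block can no longer absorb a write."""
    levels = [0] * n_slots  # current write level of each slot
    writes = 0
    while True:
        s = rng.randrange(n_slots)  # uniform slot choice (modeling assumption)
        if levels[s] == q:
            return writes  # worn-out slot hit: lifetime ends
        levels[s] += 1
        writes += 1
```

Averaging `simulate_lifetime` over many seeded runs gives an empirical estimate of the expected lifetime that a random-walk analysis would characterize exactly.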
Although code completion is indispensable for effective code editing in integrated development environments, existing code completion tools can be improved. A previous study noted that developers sometimes perform ineffective repetitions of the same code completion operations. Hence, this paper introduces the statement, “A more recently inserted code completion candidate should be given a higher rank among previously inserted items in the candidate list.” To confirm this statement, this paper examines the following three points. First, an experiment using operation histories is presented to reconfirm that developers more frequently repeat recent code completion operations. Second, a tool called RCC Candidate Sorter is presented. It alters the sorting algorithm of the default Eclipse code completion tool to prioritize candidates inserted by recent code completion operations. Finally, an experiment is conducted to evaluate the performance of the tool. The experimental results show that it outperforms an existing method.
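The statement above suggests a simple re-ranking rule. The following sketch shows one way such recency-prioritized sorting could work (a minimal illustration under assumed inputs; it does not reproduce RCC Candidate Sorter's actual Eclipse integration):

```python
def sort_candidates(candidates, insertion_history):
    """Rank completion candidates so that more recently inserted ones
    come first.

    candidates: candidate names in the completion engine's default order.
    insertion_history: previously inserted candidates, oldest first.
    """
    # Map each previously inserted candidate to its recency
    # (larger index = more recent insertion).
    recency = {name: i for i, name in enumerate(insertion_history)}
    # Python's sort is stable, so candidates never inserted before keep
    # their default relative order; recently inserted ones float to the top.
    return sorted(candidates, key=lambda c: -recency.get(c, -1))
```

For example, with default order `["append", "add", "addAll", "clear"]` and history `["clear", "addAll"]`, the most recently inserted `addAll` is ranked first.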
A coding pattern is a sequence of method calls and control structures that appears repeatedly in source code. In this paper, we extract the coding patterns of each version of ten Java programs and then explore the number of versions in which each coding pattern appears. This paper reports the characteristics of coding patterns over versions. While learning from coding patterns is expected to help developers perform appropriate modifications and enhancements to software, many coding patterns are unstable, similar to the findings of clone genealogy research.
In order to meet the increased computational requirements of today's consumer portable devices, heterogeneous multiprocessor system-on-chip (MPSoC) architectures have become widespread. These MPSoCs include not only multiple processors but also multiple dedicated hardware accelerators. Due to the increased complexity of MPSoCs, fast and accurate design space exploration (DSE) for the best system performance at an early stage of the design process is desired. A DSE solution should provide the best system-partitioning scheme for the best performance with efficient area utilization. In this paper we propose a design space exploration framework for heterogeneous MPSoCs based on the tightly-coupled thread (TCT) parallel programming model, which handles both system partition exploration and HW synthesis exploration. The proposed framework drastically reduces the exponential-size design space to a near-linear size by utilizing accurate HW timing models as indicators of system bottlenecks and by guiding the enumeration of HW version combinations. Experimental results show the accuracy of the proposed method, with an average estimation error of 1.38% for the HW timing of each thread and 2.80% for the system-level simulation, where the simulation speedup factor was on the order of 5,000 times. Currently the proposed framework partially depends on the high-level synthesis (HLS) tool eXCite, but other HLS tools can easily be integrated into the framework.
We propose a human lower-body pose estimation method for team sport videos, integrated with a tracking-by-detection technique. The proposed Label-Grid classifier uses the grid histogram feature of the tracked window from the tracker and estimates the position of a specific lower-body joint as the class label of a multi-class classifier, whose classes correspond to candidate joint positions on the grid. By learning various player poses and scales of Histogram-of-Oriented-Gradients features within one team sport, our method can estimate poses even when players appear motion-blurred in low-resolution images, without requiring motion-model regression or a part-based model, which are popular vision-based human pose estimation techniques. Moreover, our method can estimate poses with part occlusions and non-upright side poses, which are difficult for part-detector-based methods to estimate with only one model. Experimental results show the advantage of our method for side running poses and non-walking poses. The results also show the robustness of our method to a large variety of poses and scales in team sports videos.
Processing Ultra High Definition TV videos requires substantial memory and computation time. In this paper we consider a block-propagation background subtraction (BPBGS) method that spreads processing to neighboring blocks if part of an object is detected on the borders of the current block. This allows us to avoid processing unnecessary areas that do not contain any objects, saving memory and computation time. The results show that our method is particularly efficient on sequences where objects occupy a small portion of the scene, even when there is considerable background movement. At the same scale, our BPBGS runs much faster than state-of-the-art methods for similar detection quality.
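The propagation idea can be sketched as follows (an illustrative simplification assuming a static background model and a per-pixel threshold; the block size, seed-block choice, and border test are hypothetical, not the paper's exact design):

```python
from collections import deque

def bpbgs(frame, background, block=4, thresh=30):
    """Return a foreground mask, processing only blocks reached by propagation.

    frame, background: 2D lists of grayscale intensities of equal size.
    """
    h, w = len(frame), len(frame[0])
    mask = [[0] * w for _ in range(h)]
    bh, bw = h // block, w // block
    visited = set()
    # Seed with a sparse grid of blocks (every other block, an assumption).
    queue = deque((by, bx) for by in range(0, bh, 2) for bx in range(0, bw, 2))
    while queue:
        by, bx = queue.popleft()
        if (by, bx) in visited or not (0 <= by < bh and 0 <= bx < bw):
            continue
        visited.add((by, bx))
        touches_border = False
        for y in range(by * block, (by + 1) * block):
            for x in range(bx * block, (bx + 1) * block):
                if abs(frame[y][x] - background[y][x]) > thresh:
                    mask[y][x] = 1
                    # Foreground on the block boundary triggers propagation.
                    if y % block in (0, block - 1) or x % block in (0, block - 1):
                        touches_border = True
        if touches_border:
            # Part of an object touches this block's border: spread to neighbors.
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                queue.append((by + dy, bx + dx))
    return mask
```

Blocks never reached by the queue are skipped entirely, which is where the memory and time savings on sparsely populated scenes come from.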
Saliency maps, as visual attention computational models, can reveal novel regions within a scene (as in the human visual system), which can decrease the amount of data to be processed in task-specific computer vision applications. Most saliency computation models do not take advantage of prior spatial memory when giving priority to spatial or object-based features to obtain bottom-up or top-down saliency maps. In our previous experiments, we demonstrated that spatial memory, regardless of object features, can aid detection and tracking tasks with a mobile robot, using a 2D global environment memory of the robot and local Kinect data in 2D to compute a space-based saliency map. However, in complex scenes where 2D space-based saliency is not enough (e.g., a subject lying on a bed), 3D scene analysis is necessary to extract novelty within the scene using spatial memory. Therefore, in this work, to improve the detection of novelty in a known environment, we propose a space-based spatial saliency method with 3D local information, improving 2D space-based saliency with height as prior information about specific locations. Moreover, the algorithm can be integrated with other bottom-up or top-down saliency computational models to improve detection results. Experimental results demonstrate that high accuracy for novelty detection can be obtained, and that computational time can be reduced for existing state-of-the-art detection and tracking models with the proposed algorithm.
In the area of activity recognition with mobile sensors, many context-aware systems using accelerometers have been proposed. In particular, mobile phones and video game remotes using gesture recognition technologies enable easy and intuitive operations such as scrolling a browser and drawing objects. Gesture input has the advantage of richer expressive power than conventional interfaces, but it is difficult to share a gesture motion with other people in writing or verbally. Assuming that a commercial product using gestures is released, the developers create an instruction manual and tutorial expressing the gestures in text, figures, or videos. An end-user then reads the instructions, imagines the gesture, and performs it. In this paper, we evaluate how user gestures change according to the type of instruction. We obtained acceleration data for 10 kinds of gestures instructed through three types of media (texts, figures, and videos), totalling 44 patterns from 13 test subjects, for a total of 2,630 data samples. The evaluation showed that gestures were performed correctly in the order text → figure → video. Detailed instruction in text was equivalent to that in figures. However, some words describing gestures disrupted the users' gestures, since they could call multiple images to the user's mind.
The ubiquitous and ever-more-capable smartphones bring forth unprecedented performance in mobile computing. The pursuit of high-quality mobile applications and services may, however, compromise user privacy, which is a pivotal issue in mobile computing. In this article, we survey the state of the art in smartphone privacy, focusing on current issues, proposed methods, and existing systems. We discuss the characteristics of smartphone privacy in mobile computing and then investigate a number of related works and ongoing research efforts in detecting and mitigating privacy risks on the smartphone. Based on our findings, we point out future challenges for smartphone privacy in mobile computing.
Entity-centric search has become a challenging problem for many domains on the Web. In particular, the suitable contextualization of result documents poses challenges in selecting the most adequate indexing terms for later retrieval. This holds all the more when no generally recognized ontologies for the respective domain are available. In this paper, we show that cross-domain ontology terms are actually more useful for indexing than salient keywords taken from the documents. Moreover, learning typical contexts for groups of entities from collections indexed by strong cross-domain ontologies can considerably improve retrieval effectiveness. Our extensive experiments confirm these results on real-world document collections from the areas of chemistry and computer science. In fact, our evaluation in different document retrieval scenarios shows a substantial increase in retrieval precision, up to 87% using documents annotated with cross-domain ontology terms, compared to 53% for BM25 searches and 43% for documents annotated with Wikipedia categories.
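For reference, the BM25 baseline mentioned above scores a document against a query roughly as follows (a standard textbook formulation of Okapi BM25 over pre-tokenized inputs; the parameter values k1 and b are common defaults, not necessarily those used in the paper):

```python
import math

def bm25_score(query_terms, doc, corpus, k1=1.2, b=0.75):
    """Okapi BM25 score of one document for a query.

    query_terms: list of query tokens.
    doc: list of tokens for the document being scored.
    corpus: list of tokenized documents (used for IDF and average length).
    """
    avgdl = sum(len(d) for d in corpus) / len(corpus)
    n = len(corpus)
    score = 0.0
    for q in query_terms:
        df = sum(1 for d in corpus if q in d)          # document frequency
        idf = math.log((n - df + 0.5) / (df + 0.5) + 1)  # smoothed IDF
        f = doc.count(q)                               # term frequency in doc
        score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(doc) / avgdl))
    return score
```

Ontology-term indexing, by contrast, replaces or augments the raw tokens with controlled vocabulary terms before any such scoring takes place.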
The recent explosion in the amount of spatial data calls for specialized systems to handle big spatial data. In this paper, we survey and contrast the existing work that has been done in the area of big spatial data. We categorize the existing work in this area from three different angles, namely, approach, architecture, and components. (1) The approaches used to implement spatial query processing can be categorized as on-top, from-scratch, and built-in approaches. (2) The existing works follow different architectures based on the underlying system they extend, such as MapReduce, key-value stores, or parallel DBMSs. (3) We also categorize the existing work into four main components, namely, language, indexing, query processing, and visualization. We describe each component in detail and give examples of how it is implemented in existing work. At the end, we give case studies of real applications that make use of these systems to provide services for end users.
Twitter, as one of the most popular social network services, is now widely used to gauge public opinion. In this paper, tweets, along with reviews collected from review websites, are used to carry out sentiment analysis in order to determine the language-based and location-based effects on user evaluations of six global restaurants. The analysis is expanded to take 34 languages into account. Using a range of new and standard features, a series of classifiers are trained and applied in the later steps of sentiment analysis. Our experimental results show that location and language effects on user evaluations of restaurants do exist.
Recently, there has been increasing interest in search over time-dependent road networks, where the travel time on a road varies with time. In such a time-dependent network, the result of a k Nearest Neighbor (kNN) query, which searches for the k nearest neighbors (kNNs) from a specified location, depends on the query-issuing time. Therefore, existing approaches for static networks cannot be directly applied to kNN queries in time-dependent road networks. In this paper, we propose a kNN search method that achieves a small number of visited vertices and a short response time in time-dependent road networks. In our proposed method, an index structure is constructed in the preprocessing phase based on the minimum travel time on each road. In query processing, the network is expanded by the A* algorithm, referring to the minimum travel times in the index, until the kNNs are found. Experimental results show that our proposed method reduces the number of visited vertices and the response time compared with an existing method.
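The query-processing step can be sketched as follows (a minimal illustration of A*-style expansion over a time-dependent graph, where `h` stands in for an index-derived minimum-travel-time lower bound; the graph encoding and travel-time functions are assumptions, and `h = 0` reduces the search to time-dependent Dijkstra):

```python
import heapq

def knn_time_dependent(graph, source, targets, k, depart, h=lambda v: 0.0):
    """Find up to k targets with the earliest arrival times from source.

    graph: {u: [(v, travel_time_fn)]}, where travel_time_fn(t) gives the
        edge cost when entering the edge at time t.
    h: admissible lower bound on remaining travel time from a vertex
        (in the paper's setting this would come from the precomputed
        minimum-travel-time index).
    """
    best = {source: depart}
    heap = [(depart + h(source), depart, source)]  # (priority, arrival, vertex)
    found = []
    while heap and len(found) < k:
        _, t, u = heapq.heappop(heap)
        if t > best.get(u, float("inf")):
            continue  # stale heap entry
        if u in targets and u not in found:
            found.append(u)
        for v, tt in graph.get(u, ()):
            arrive = t + tt(t)  # time-dependent relaxation
            if arrive < best.get(v, float("inf")):
                best[v] = arrive
                heapq.heappush(heap, (arrive + h(v), arrive, v))
    return found
```

A tighter `h` prunes more of the network, which is the point of building the minimum-travel-time index in the preprocessing phase.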
In recent years, the scale of datacenters has grown due to the explosive increase in the amount of digital data. As a result, growing energy consumption is an important factor in the management cost of datacenters. Storing and processing such large volumes of data with database applications are core technologies in this Big Data era. However, storage accounts for a significant percentage of a datacenter's energy consumption. Therefore, we try to reduce the energy consumed by storage to lower the total cost of datacenters. The purpose of this study is to reduce the energy consumption of storage while minimizing the deterioration of application performance. Although many methods for storage energy saving have been discussed, it is difficult to control energy efficiently at the storage level alone, so we investigate a storage power-control mechanism at the middleware (database) layer. In this paper, we use TPC-H (a database benchmark) as an example data-processing application. We evaluate the proposed data placement control method for storage energy saving in database runtime processing, suited to a large-scale environment with many HDDs.
The current generation of computer hardware has brought several new challenges for the underlying software. The number of cores on a chip has grown exponentially, enabling an ever-increasing number of processes to execute in parallel. The efficient utilization of the full range of concurrent processing capabilities offered by such a multicore platform is critical to achieving good system performance. As the number of cores on a chip increases, the increasing processor-memory gap is the bottleneck for most data-intensive applications. We therefore propose a cache-efficient CARIC-DA framework for arranging the execution of concurrent database queries on multicore platforms. This achieves improved database management system (DBMS) performance by improving cache utilization for concurrent queries. Our middleware optimizes the performance of the private-cache levels by providing query-needs-aware dispatching for concurrent online transaction-processing queries to run on different processor cores. By considering both the operating system and the DBMS application, our proposal achieves higher cache utilization for various cache levels. In this paper, we demonstrate how the middleware of CARIC-DA manages a mixed workload, where complex queries with join operations cannot share data with other queries in caches. We describe strategies that enable the middleware to partition complex queries and dispatch concurrent queries to different processor cores. The performance of the extended CARIC-DA for the TPC-W benchmark is evaluated on modern Intel and AMD multicore platforms.
In this paper, we propose a method to extend XML element retrieval techniques to Web documents. XML element retrieval techniques return partial (sub-)documents as search results and are expected to be applicable to other structured documents, namely Web documents, besides XML documents. The challenge is that the physical document structure of Web documents is often disorganized, because Web documents are generated not for managing data but for rendering in a Web browser. As another feature, Web documents contain much content that is incomprehensible to human readers. To address the challenges caused by these features, we propose 1) a method for reconstructing document structures according to the logical structure of their contents and 2) a filter for removing unimportant content that does not convey useful information to users. Our experimental evaluations showed that the proposed method improved search accuracy compared with both a naive XML element retrieval approach and a document retrieval approach.
Nowadays, many universities provide groupware support for students to post and share their e-reports, and students can browse and vote on other students' reports in e-learning. Teachers then need to evaluate and grade all the students' reports, but this requires a great deal of time and effort for a fair evaluation. Therefore, we develop an automatic scoring system for e-reports based on student peer evaluation that considers the relationship between the voting and posting times of the e-reports, to promote the quality of the votes and prevent unfair voting. The system then provides a score ranking list of the reports based on a voting graph, built by analyzing which students voted for which reports; this serves as a grading tool that helps teachers obtain report scores efficiently. Moreover, the system also enables students to find the best reports easily. In this paper, we perform a student peer evaluation through groupware based on voting with a “Like” button in a course practice, and discuss an evaluation of our automatic scoring system's effectiveness compared to teachers' scoring.
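One way such time-aware vote weighting could look is sketched below (a hypothetical scoring rule in which a vote's weight decays with the gap between posting time and voting time, discouraging late reciprocal voting; the exponential decay and the `half_life` parameter are illustrative assumptions, not the system's actual formula):

```python
import math

def score_reports(votes, post_times, half_life=24.0):
    """Rank reports by total time-decayed vote weight, best first.

    votes: list of (voter, report, vote_time) tuples.
    post_times: {report: posting_time}, same time unit as vote_time.
    half_life: hours after posting at which a vote's weight halves.
    """
    scores = {r: 0.0 for r in post_times}
    for voter, report, t in votes:
        gap = max(0.0, t - post_times[report])
        # A vote cast long after posting contributes less weight.
        scores[report] += math.exp(-math.log(2) * gap / half_life)
    return sorted(scores, key=lambda r: -scores[r])
```

In a full system, the per-vote weights could also feed edge weights in the voting graph rather than a flat sum.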
The research introduced in this paper develops a semantic model whose objective is to analyze the geographical and emotion-based distribution of tweets at a large, country-wide scale. The approach extracts and categorizes tweets based on the semantic orientations of terms in a dictionary, and explores their spatial and temporal distribution. Tweets are classified into different emotional classes, qualified and valued using different interval distributions that favor the identification of significant trends, which are compared to some of the main properties of the underlying geographical space. The whole approach is applied to a large database of tweets in Japan and illustrated with experimental but real data that reveal some surprising and puzzling outcomes, which are discussed in the paper.
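The dictionary-based classification step might be sketched as follows (an illustrative simplification with an assumed `{term: emotion_class}` lexicon format and majority-vote assignment; the paper's actual orientation dictionary and valuation scheme are not reproduced):

```python
from collections import Counter

def classify_tweet(tokens, lexicon):
    """Assign a tweet the emotional class whose lexicon terms dominate it.

    tokens: list of words from the tweet.
    lexicon: {term: emotion_class} dictionary of semantic orientations.
    Returns the majority emotion class, or None if no term matches.
    """
    hits = Counter(lexicon[t] for t in tokens if t in lexicon)
    return hits.most_common(1)[0][0] if hits else None
```

Classified tweets can then be aggregated per region and time window to produce the spatial and temporal distributions the model analyzes.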
We treat an image restoration problem with a Poisson noise channel using a Bayesian framework. Poisson randomness can appear when observing low-contrast objects in the field of imaging. Such noisy observations are often hard to treat in theoretical analysis. In our formulation, we interpret the observation through the Poisson noise channel as a likelihood, and bound it with a Gaussian function using a latent-variable method. We then introduce a Gaussian Markov random field (GMRF) as the prior for the Bayesian approach and derive the posterior as a Gaussian distribution. The latent parameters in the likelihood and the hyperparameter in the GMRF prior can be treated as hidden parameters, so we propose an algorithm to infer them in the expectation-maximization (EM) framework using loopy belief propagation (LBP). We confirm the ability of our algorithm in computer simulations and compare it with the results of other image restoration frameworks.
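The Bayesian setup described above can be summarized schematically as follows (the paper's exact latent-variable bound is not reproduced; this only restates the standard Poisson likelihood, a pairwise GMRF prior with hyperparameter $\alpha$, and the resulting posterior):

```latex
p(y_i \mid x_i) = \frac{x_i^{\,y_i} e^{-x_i}}{y_i!}, \qquad
p(\mathbf{x}) \propto \exp\!\Big(-\frac{\alpha}{2} \sum_{(i,j)} (x_i - x_j)^2\Big), \qquad
p(\mathbf{x} \mid \mathbf{y}) \propto p(\mathbf{x}) \prod_i p(y_i \mid x_i).
```

Because the Poisson factors are non-Gaussian in $x_i$, each is bounded by a Gaussian function of $x_i$, making the approximate posterior Gaussian; LBP can then compute its marginals in the E-step, while the M-step updates $\alpha$ and the latent parameters.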
A major cause of traffic accidents is the driver's lack of awareness while driving. According to a recent report on accident investigations, non-adaptation to environmental changes is a major factor in many accidents. In the case of elderly drivers, many accidents have occurred as a result of failing to pause at intersections. It is well known that elderly drivers show wide individual differences in driving experience, driving skill, and recognition and judgment ability. Therefore, to reduce the occurrence of accidents involving elderly drivers, it is necessary to design a driving support system appropriate for each individual elderly driver. It is even more important to evaluate the effect of such a support system using objective methods as well as subjective evaluations. In this study, a driving assistance system consisting of various auditory-visual alerts to encourage pausing and stopping at intersections was constructed for elderly drivers, and its effect was evaluated. While elderly drivers approached an intersection, physiological signals, including an electrocardiogram, cerebral blood flow, and aortic pulse wave, were measured, and the changes after the alarms were analyzed. The results compared the effects of the five kinds of alerts on the physiological signals and suggested that it is possible to design optimal support and assistance methods for elderly drivers according to their different driving characteristics.