Homemade cooking plays a key role in a healthy and cost-efficient life. Unfortunately, preparing multiple dishes is generally time-consuming. In this paper, an algorithm is proposed to minimize the cooking time by scheduling the cooking steps of multiple dishes. The cooking procedure of a dish is divided into a sequence of six types of cooking steps to account for the constraints on cooks and cooking utensils in a kitchen. A cooking model is presented to optimize the cooking-step schedule and estimate the cooking time for a given starting order of dishes under various constraints on cooks and utensils. Then, a high-quality schedule is sought by repeatedly generating a new order and applying the model, using exhaustive search and simulated annealing. Our simulation results and cooking experiments confirm the effectiveness of our proposal.
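The order search described above can be sketched as follows. This is a minimal illustration and not the paper's cooking model: the dish data, the single-cook greedy `makespan` function, and all parameter values are hypothetical assumptions. Only the outer loop (generate a new starting order, apply the model, keep or reject the order) mirrors the abstract.

```python
import itertools
import math
import random

# Hypothetical toy data: each dish is a list of (minutes, needs_cook) steps.
DISHES = {
    "stew": [(5, True), (30, False)],
    "salad": [(10, True)],
    "rice": [(2, True), (20, False)],
}

def makespan(order, dishes=DISHES):
    """Toy model (an assumption): one cook, dishes claim the cook in order."""
    cook_free = 0
    finish = 0
    for name in order:
        t = 0
        for minutes, needs_cook in dishes[name]:
            # A step that needs the cook waits until the cook is free.
            start = max(t, cook_free) if needs_cook else t
            if needs_cook:
                cook_free = start + minutes
            t = start + minutes
        finish = max(finish, t)
    return finish

def exhaustive(dishes=DISHES):
    """Try every starting order; feasible only for a handful of dishes."""
    return min(itertools.permutations(dishes), key=lambda o: makespan(o, dishes))

def anneal(dishes=DISHES, iters=500, temp=10.0, cooling=0.99, seed=0):
    """Simulated annealing over starting orders using swap moves."""
    rng = random.Random(seed)
    order = list(dishes)
    cost = makespan(order, dishes)
    best, best_cost = order[:], cost
    for _ in range(iters):
        i, j = rng.sample(range(len(order)), 2)
        cand = order[:]
        cand[i], cand[j] = cand[j], cand[i]
        cand_cost = makespan(cand, dishes)
        # Always accept improvements; accept worse orders with a
        # probability that shrinks as the temperature drops.
        if cand_cost <= cost or rng.random() < math.exp((cost - cand_cost) / temp):
            order, cost = cand, cand_cost
            if cost < best_cost:
                best, best_cost = order[:], cost
        temp *= cooling
    return best, best_cost
```

Exhaustive search is exact but grows factorially with the number of dishes, which is why a heuristic such as simulated annealing becomes attractive for larger menus.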
We propose a simple but efficient encapsulation architecture. In this architecture, clients can better decode Extensible Markup Language (XML) based service information for TV contents by using a schema digest. Our experimental results show the superiority of the proposed architecture by comparing its compression ratios and decoding times with those of the existing architectures.
Much research has shown that exploiting social ties can improve location prediction performance. However, because the strength of social ties varies constantly with time, using the movement data of a user's close friends at different times could yield better predictive performance. A hybrid Markov location prediction algorithm based on dynamic social ties is presented. Time is divided by absolute time (week) to mine the long-term trend of changes in users' social ties, and the movements of each week are then projected onto workdays and weekends to find the changes of the social circle in different time slices. The segmented friends' movements are compared with the history of the user using our modified cross-sample entropy to discover the individuals who have relatively high similarity with the user in different time intervals. Finally, the user's historical movement data and the friends' movements at different times, weighted by similarity, are combined to build the hybrid Markov model. Experiments on a real location-based social network dataset show that the hybrid Markov location prediction algorithm improves predictive accuracy by 15% compared with location prediction algorithms that consider the global strength of social ties.
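The final combination step can be illustrated with a first-order Markov model whose transition counts are weighted by similarity. This is a simplified sketch under stated assumptions: the function names and the toy trajectories are hypothetical, the similarity weights are given rather than computed with cross-sample entropy, and the week and workday/weekend segmentation is omitted.

```python
from collections import defaultdict

def build_hybrid_markov(user_traj, weighted_friend_trajs):
    """Accumulate weighted transition counts.

    user_traj: list of location labels visited by the user, in order.
    weighted_friend_trajs: list of (trajectory, similarity_weight) pairs.
    The user's own transitions get weight 1.0 (an assumption).
    """
    counts = defaultdict(lambda: defaultdict(float))

    def add(traj, weight):
        for prev, nxt in zip(traj, traj[1:]):
            counts[prev][nxt] += weight

    add(user_traj, 1.0)
    for traj, weight in weighted_friend_trajs:
        add(traj, weight)
    return counts

def predict_next(counts, current):
    """Return the most heavily weighted next location, or None if unseen."""
    if current not in counts:
        return None
    successors = counts[current]
    return max(successors, key=successors.get)

# Hypothetical example: a similar friend tips the prediction toward "gym".
user = ["home", "work", "home", "gym"]
friends = [(["home", "gym", "home", "gym"], 0.8)]
model = build_hybrid_markov(user, friends)
```

With the user's data alone, "work" and "gym" are tied as successors of "home"; the friend's weighted transitions break the tie.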
Nowadays, many individuals and organizations tend to outsource their data to cloud storage to reduce the burden of data storage and maintenance. However, a cloud provider may be untrustworthy. The cloud thus raises numerous security challenges: data availability, data integrity, and data confidentiality. In this paper, we focus on data availability and data integrity because they are prerequisites for the existence of a cloud system. The approach of this paper is the network coding-based Proof of Retrievability (POR) scheme, which allows a client to check whether his/her data stored on the cloud servers are intact. Although many network coding-based PORs have been proposed, most of them still incur high costs in data check and data repair, and cannot prevent the small corruption attack, which is a common attack on POR schemes. This paper proposes a new network coding-based POR using the dispersal coding technique, named ND-POR (Network coding - Dispersal coding POR), to improve the efficiency of data check and data repair and to protect against the small corruption attack.
Distributed Mobility Management (DMM) defines Internet Protocol (IP) mobility that does not depend on centralized manipulation. DMM alleviates the non-optimal routing, single point of failure, and scalability problems that appear in centralized Mobility Management (MM). Because most DMM schemes are still in the proposal phase and no standard exists, the proposed schemes need to be investigated thoroughly to confirm their capabilities and thereby determine the best candidate practice for DMM. This paper examines five novel DMM proposals discussed in the Internet Engineering Task Force (IETF) using router-level Internet Service Provider (ISP) topologies of Sprint (USA), Tiscali (Europe), Telstra (AUS), and Exodus (USA), as user mobility within an ISP network is considered the most realistic and recurrent user movement today. The results reflect behavioral differences of the schemes depending on the network. ISPs closer to the Internet core with a high density of Points of Presence (PoPs), such as Sprint, show poorer outcomes when centralized anchors/controllers are employed, while Proxy Mobile IP (PMIP) based enhancements offer higher reliability. In contrast, smaller ISPs that reside farther away from the Internet core yield better performance with SDN-Based and Address Delegation schemes. Although the PMIP-Based DMM schemes perform better during handover, this advantage is diminished by higher latency in the data plane. In contrast, the Address Delegation and SDN-Based schemes incur excessive cost and latency in performing handover due to routing table updates, but perform better in the data plane, suggesting that a control/data plane split may best address optimal routing.
Risk-aware Data Replication (RDR), which replicates data at primary sites to nearby safe backup sites, has been proposed to mitigate service disruption in a disaster area even after a widespread disaster that damages a network and a primary site. RDR assigns a safe backup site to a primary site while considering the damage risk for both the primary site and the backup candidate site. To minimize the damage risk of all site pairs, the Integer Programming Problem (IPP), which is a mathematical optimization problem, is applied. A challenge for RDR is to choose safe backup sites within a short computation time even for a huge number of sites. In this paper, we propose a Discreet method for RDR to surmount this hurdle. The Discreet method first decides the backup sites of potentially unsafe primary sites and avoids assigning a very safe backup site to a very safe primary site. We evaluated the computation time for site pairing and the data availability in the cases of an earthquake and a tsunami using basic disaster simulations. We confirmed that the proposed method computes the site pairing more than 1000 times faster than the existing method when the number of sites is greater than 1000. We also confirmed that the data availability of the proposed method is almost equal to that of existing methods based on strict optimization. These results mean that the proposed method makes RDR more practical for a massive number of sites.
With recent developments in machine learning technology, the predictions made by systems incorporating machine learning can now have a significant impact on the lives and activities of individuals. In some cases, predictions made by machine learning can unexpectedly result in unfair treatment of individuals. For example, if the results are highly dependent on personal attributes, such as gender or ethnicity, hiring decisions might be discriminatory. This paper investigates the neutralization of a probabilistic model with respect to another probabilistic model, referred to as a viewpoint. We present a novel definition of neutrality for probabilistic models, η-neutrality, and introduce a systematic method that uses maximum likelihood estimation to enforce the neutrality of a prediction model. Our method can be applied to various machine learning algorithms, as demonstrated by η-neutral logistic regression and η-neutral linear regression.
We propose a one-step error detection and correction interface for a voice word processor. This correction interface performs analysis region detection, user intention understanding and error correction utterance recognition, all from a single user utterance input. We evaluate the performance of each component first, and then compare the effectiveness of our interface to two previous interfaces. Our evaluation demonstrates that each component is technically superior to the baselines and that our one-step error detection and correction method yields an error correction interface that is more convenient and natural than the two previous interfaces.
Studies on gaze analysis have revealed some of the relationships between viewers' gaze and their internal states (e.g., interests and intentions). However, understanding content browsing behavior in uncontrolled environments is still challenging because human gaze can be very complex; it is affected not only by viewers' states but also by the spatio-semantic structures of visual content. This study proposes a novel gaze analysis framework that introduces the content creators' point of view to understand the meaning of browsing behavior. Visual content such as web pages, digital articles, and catalogs is composed of structures intentionally designed by content creators, which we refer to as designed structure. This paper focuses on two design factors of designed structure: the spatial structure of content elements (content layout), and their relationships, such as “being in the same group”. The framework was evaluated in an experiment involving 12 participants, wherein each participant's state was estimated from their gaze behavior. The results of the experiment show that the use of designed structure improved the estimation accuracy of user states compared to baseline methods.
Non-verbal communication incorporating visual, audio, and contextual information is important for making sense of and navigating the social world. Individuals who have trouble with social situations often have difficulty recognizing these sorts of non-verbal social signals. In this article, we propose a training tool, NOCOA+ (Non-verbal COmmunication for Autism plus), that uses utterances in visual and audio modalities in non-verbal communication training. We describe the design of NOCOA+ and perform an experimental evaluation in which we examine its potential as a tool for computer-based training of non-verbal communication skills for people with social and communication difficulties. In a series of four experiments, we investigated 1) the effect of temporal context on the ability to recognize social signals in a testing context, 2) the effect of the modality of presentation of a social stimulus on the ability to recognize non-verbal information, 3) the correlation between autistic traits as measured by the autism spectrum quotient (AQ) and non-verbal behavior recognition skills as measured by NOCOA+, and 4) the effectiveness of computer-based training in improving social skills. We found that context information was helpful for recognizing non-verbal behaviors, and that the effect differed across modalities. The results also showed a significant relationship between the AQ communication and socialization scores and non-verbal communication skills, and that social skills were significantly improved through computer-based training.
The paper addresses a scheme of lightly supervised training of an acoustic model, which exploits a large amount of data with closed caption texts but not faithful transcripts. In the proposed scheme, a sequence of the closed caption text and that of the ASR hypothesis by the baseline system are aligned. Then, a set of dedicated classifiers is designed and trained to select the correct one among them or reject both. It is demonstrated that the classifiers can effectively filter the usable data for acoustic model training. The scheme realizes automatic training of the acoustic model with an increased amount of data. A significant improvement in the ASR accuracy is achieved from the baseline system and also in comparison with the conventional method of lightly supervised training based on simple matching.
This paper proposes an algorithm for exemplar-based image inpainting that produces the same result as Criminisi's original scheme but at a much smaller computational cost. The idea is to compute the mean and standard deviation of every patch in the image, and to use these values to decide whether or not to carry out a pixel-by-pixel comparison when searching for the best matching patch. Because of the missing pixels in the target patch, the same pixels in the candidate patch should be omitted when computing the distance between patches. Thus, we first compute the range of the mean and standard deviation of a candidate patch with missing pixels, using the mean and standard deviation of the entire patch. Then we use the range to determine whether the pixel comparison should be conducted. Measurements with well-known images in the inpainting literature show that the algorithm can save a significant amount of computation without risking degradation of image quality.
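The pruning idea can be illustrated with a standard lower bound: for two vectors of n known pixels, the sum of squared differences (SSD) is at least n[(μ_a - μ_b)² + (σ_a - σ_b)²]. A candidate whose bound already reaches the best SSD found so far cannot win, so its pixel-by-pixel comparison can be skipped without changing the result. The sketch below is a simplification and an assumption on our part: it computes the masked statistics directly for each candidate, whereas the paper derives a range from precomputed whole-patch statistics, which is where the actual savings come from. Function names and the toy patches are hypothetical.

```python
import math

def masked_stats(patch, mask):
    """Count, mean and (population) std over the known pixels only."""
    vals = [p for p, known in zip(patch, mask) if known]
    n = len(vals)
    mean = sum(vals) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in vals) / n)
    return n, mean, std

def masked_ssd(a, b, mask):
    """Sum of squared differences over the known pixels only."""
    return sum((x - y) ** 2 for x, y, known in zip(a, b, mask) if known)

def best_match(target, mask, candidates):
    """Return (index, ssd, number_of_full_comparisons)."""
    n, t_mean, t_std = masked_stats(target, mask)
    best_idx, best_ssd, full = None, math.inf, 0
    for idx, cand in enumerate(candidates):
        _, c_mean, c_std = masked_stats(cand, mask)
        bound = n * ((t_mean - c_mean) ** 2 + (t_std - c_std) ** 2)
        # Small tolerance guards against floating-point rounding in the bound.
        if bound - 1e-9 >= best_ssd:
            continue  # cannot be strictly better than the current best
        full += 1
        ssd = masked_ssd(target, cand, mask)
        if ssd < best_ssd:
            best_idx, best_ssd = idx, ssd
    return best_idx, best_ssd, full

# Hypothetical flattened patches; the third target pixel is missing.
target = [10, 12, 0, 11]
mask = [True, True, False, True]
candidates = [[200, 210, 5, 190], [10, 13, 99, 11], [50, 50, 50, 50], [11, 12, 0, 11]]
result = best_match(target, mask, candidates)
```

Because a candidate is skipped only when its lower bound rules it out, the selected patch is identical to the one a brute-force search would return.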
Using a flash/no-flash image pair, we propose a novel white-balancing technique that can effectively correct the color balance of a complex scene under multiple light sources. In the proposed method, by using multiple images of the same scene taken under different lighting conditions, we estimate the reflectance component of the scene and the multiple shading components of each image. The reflectance component is a specific object color which does not depend on scene illumination and the shading component is a shading effect caused by the illumination lights. Then, we achieve white balancing by appropriately correcting the estimated shading components. The proposed method achieves better performance than conventional methods, especially under colored illumination and mixed lighting conditions.
Object extraction and tracking in video is a basic technology for many applications, such as video surveillance and robot vision. Many moving object extraction and tracking methods have been proposed. However, they fail when the scenes include illumination changes or light reflection. To track a moving object robustly, we should consider not only the RGB values of the input images but also the shape information of the objects. If the objects' shapes do not change suddenly, the matching positions on the cost matrix of exclusive block matching lie nearly on a line. We propose a method for obtaining the correspondence of feature points by imposing a matching position constraint induced by this shape constancy. We demonstrate experimentally that the proposed method achieves robust tracking in various environments.
This paper presents an automatic method to track soccer players in soccer video recorded from a single camera, where pan-tilt-zoom operations may occur. The automatic object tracking is intended to support texture extraction in a free viewpoint video authoring application for soccer video. To ensure that the identity of the tracked object can be correctly obtained, background segmentation is performed, which automatically removes commercial billboards whenever they overlap with a soccer player. Next, object tracking is performed by an attribute matching algorithm for all objects in the temporal domain to find and maintain the correlation of the detected objects. The attribute matching process finds the best match between two objects in different frames according to their pre-determined attributes: position, size, dominant color, and motion information. Using these attributes, the experimental results show that the tracking process can handle occlusion problems, such as occlusion involving more than three objects and occluded objects with similar color and moving direction, and can correctly identify objects in the presence of camera movements.
The irreversible k-conversion set was introduced in connection with the mathematical modeling of the spread of diseases or opinions. We show that the problem of finding a minimum irreversible 2-conversion set can be solved in O(n^2 log^6 n) time for graphs with maximum degree at most 3 (subcubic graphs) by reducing it to the graphic matroid parity problem, where n is the number of vertices in a graph. This affirmatively settles an open question posed by Kyncl et al. (2014).
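For readers unfamiliar with the notion, the sketch below illustrates the usual definition of an irreversible k-conversion process: seed vertices start converted, an unconverted vertex becomes converted once at least k of its neighbors are converted, and converted vertices never revert. A seed set is a conversion set if the whole graph ends up converted. The brute-force minimum search is exponential and is included only to make the definition concrete; it is not the matroid-parity reduction of the paper. Function names and the example graph are hypothetical.

```python
from itertools import combinations

def converts_all(adj, seeds, k=2):
    """Check whether `seeds` is an irreversible k-conversion set of `adj`.

    adj maps each vertex to a list of its neighbors.
    """
    converted = set(seeds)
    changed = True
    while changed:
        changed = False
        for v, nbrs in adj.items():
            if v not in converted and sum(u in converted for u in nbrs) >= k:
                converted.add(v)
                changed = True
    return len(converted) == len(adj)

def minimum_conversion_set(adj, k=2):
    """Exponential brute force, usable only on tiny graphs."""
    vertices = list(adj)
    for size in range(len(vertices) + 1):
        for seeds in combinations(vertices, size):
            if converts_all(adj, seeds, k):
                return set(seeds)
    return None

# Hypothetical example: a 4-cycle, which has maximum degree 2 (subcubic).
cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
```

On the 4-cycle, a single seed never triggers any conversion, while two opposite seeds convert the remaining two vertices in one round.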
A 2-directional orthogonal ray graph is an intersection graph of rightward rays (half-lines) and downward rays in the plane. We show a dynamic programming algorithm that solves the weighted dominating set problem in O(n^3) time for 2-directional orthogonal ray graphs, where n is the number of vertices of a graph.
We propose an energy-efficient real-time scheduling algorithm based on T-L Plane abstraction. The algorithm is designed to exploit Dynamic Power Management and generates a new event called event-s to render longer idle intervals, which increases the chances of switching a processor to the sleep mode. We compare the proposed algorithm with previous work and show that it is effective for energy management.
These days, recognizing a user's personality is an important issue for supporting various personalized services. Besides conventional phone usage data such as call logs, SMS logs, and application usage, smartphones can capture the behavior of users by polling various embedded sensors such as GPS sensors. In this paper, we focus on how to predict user attitude based on GPS log data by applying location clustering techniques and extracting features from the location clusters. Through an evaluation with one month of GPS log data, it is observed that location-based features, such as the number of clusters and the coverage of clusters, are correlated with user attitude to some extent. In particular, when an SVM is used as a classifier for predicting the dichotomy of user attitudes in MBTI, an F-measure of over 90% is achieved.
In this Letter, a new iris recognition approach based on local Gabor orientation features is proposed. On the one hand, iris feature extraction using traditional Gabor filters can be time-consuming and can yield a high feature dimension. On the other hand, by observing iris images we find that the changes of the original iris texture in the angular and radial directions are more obvious than in other directions. These changes in the preprocessed iris images are mainly reflected in the vertical and horizontal directions. Therefore, local directional Gabor filters are constructed to extract the horizontal and vertical texture characteristics of the iris. First, the iris images are preprocessed by iris and eyelash location, iris segmentation, normalization, and zooming. After analyzing the variety of iris texture and 2D-Gabor filters, we construct the local directional Gabor filters to extract the local Gabor features of the iris. Then, the Gabor & Fisher features are obtained by Linear Discriminant Analysis (LDA). Finally, the nearest neighbor method is used to recognize the iris. Experimental results show that the proposed method achieves better iris recognition performance with a lower feature dimension and less calculation time.
Aperture synthesis technology represents an effective approach to millimeter-wave radiometers for high-resolution observations. However, the application of the synthetic aperture imaging radiometer (SAIR) is limited by its large number of antennas, receivers, and correlators, which may increase noise and cause image distortion. To solve these problems, this letter proposes a compressive regularization imaging algorithm, called CRIA, to reconstruct images accurately by combining the sparsity and the energy functional of the target space. With randomly selected visibility samples, CRIA simultaneously employs the l1 norm to reconstruct the target brightness temperature and the l2 norm to estimate its energy functional. Comparisons with other algorithms show that CRIA provides higher-quality target brightness temperature images at a lower data level.
Blur is one of the most common distortion types and greatly impacts image quality. Most existing no-reference (NR) image blur metrics produce scores without a fixed range, so it is hard to judge the extent of blur directly. This letter presents an NR perceptual blur metric using Saliency Guided Gradient Similarity (SGGS), which produces blur scores with a fixed range of (0,1). A blurred image is first reblurred using a Gaussian low-pass filter, producing a heavily blurred image. With this reblurred image as reference, a local blur map is generated by computing the gradient similarity. Finally, visual saliency is employed in the pooling to adapt to the characteristics of the human visual system (HVS). The proposed metric features a fixed range, fast computation, and better consistency with the HVS. Experiments demonstrate its advantages.
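The reblur-and-compare pipeline can be illustrated on a one-dimensional intensity profile. Everything specific in this sketch is an assumption rather than the letter's definition: the 1D simplification, the filter parameters, the stabilizing constant `c`, the common similarity form (2ab + c) / (a² + b² + c), and the uniform pooling weights that stand in for a saliency map. Under these assumptions, an input that is already blurred changes little when reblurred, so its gradients stay similar and the score is closer to 1.

```python
import math

def gaussian_blur(signal, sigma=1.0, radius=3):
    """Gaussian low-pass filter with edge values repeated at the borders."""
    offsets = range(-radius, radius + 1)
    kernel = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in offsets]
    total = sum(kernel)
    kernel = [w / total for w in kernel]
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in zip(offsets, kernel):
            j = min(max(i + k, 0), n - 1)  # clamp index to the valid range
            acc += w * signal[j]
        out.append(acc)
    return out

def gradient(signal):
    """Absolute forward differences."""
    return [abs(b - a) for a, b in zip(signal, signal[1:])]

def blur_score(signal, weights=None, c=0.01):
    """Weighted mean gradient similarity between input and reblurred input.

    `weights` is a hypothetical stand-in for saliency; uniform if omitted.
    """
    g_in = gradient(signal)
    g_re = gradient(gaussian_blur(signal))
    sim = [(2 * a * b + c) / (a * a + b * b + c) for a, b in zip(g_in, g_re)]
    if weights is None:
        weights = [1.0] * len(sim)
    return sum(w * s for w, s in zip(weights, sim)) / sum(weights)

# Hypothetical profiles: a sharp step edge and a pre-blurred copy of it.
sharp = [0.0] * 8 + [1.0] * 8
soft = gaussian_blur(sharp)
```

Since 2ab never exceeds a² + b², every local similarity lies in (0, 1], which is what bounds the pooled score.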
Collective motion stems from the coordinated behaviors among individuals of crowds, and has attracted growing interest from the physics and computer vision communities. Collectiveness is a metric of the degree to which the state of crowd motion is ordered or synchronized. In this letter, we present a scheme to measure collectiveness via link prediction. Toward this aim, we propose a similarity index called superposed random walk with restarts (SRWR) and construct a novel collectiveness descriptor using the SRWR index and the Laplacian spectrum of a network. Experiments show that our approach gives promising results in real-world crowd scenes, and performs better than the state-of-the-art methods.
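As background for the similarity index, the sketch below implements a plain random walk with restart and superposes the walker's visiting probabilities over several steps and in both directions. The exact SRWR formula is not stated in this abstract, so the superposition used here, the continuation probability `c`, the step count, and the example graph are all assumptions, and the Laplacian-spectrum part of the descriptor is omitted.

```python
def rwr_trajectory(adj, start, steps, c=0.85):
    """Distributions of a random walk with restart after 1..steps steps.

    adj maps each node to a non-empty list of neighbors. At each step the
    walker follows a random edge with probability c and otherwise jumps
    back to `start`.
    """
    pi = {v: 0.0 for v in adj}
    pi[start] = 1.0
    history = []
    for _ in range(steps):
        nxt = {v: 0.0 for v in adj}
        for u, nbrs in adj.items():
            share = c * pi[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        nxt[start] += 1.0 - c
        pi = nxt
        history.append(pi)
    return history

def superposed_similarity(adj, x, y, steps=3, c=0.85):
    """Assumed superposition: sum of both directions over all steps."""
    hx = rwr_trajectory(adj, x, steps, c)
    hy = rwr_trajectory(adj, y, steps, c)
    return sum(px[y] + py[x] for px, py in zip(hx, hy))

# Hypothetical graph: two triangles joined by the single edge 2-3.
graph = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}
```

Nodes in the same triangle score higher than nodes in different triangles, which is the kind of locality a link-prediction index is meant to capture.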