Products and services nowadays need personal information from consumers in order to personalize their offerings to best fit those consumers. At present, the online environment is the biggest source of consumers' personal information. However, online privacy has become a major concern of consumers. A personal information trading platform has been proposed as a medium for collecting consumers' personal information in exchange for a monetary incentive. This study proposes a new approach to requesting personal attributes that adapts to consumers' personal information disclosure behavior and aims to increase the disclosure of personal information without increasing the monetary incentive. To develop this adaptation method, we developed a method for valuating personal information without using currency. Probability and graph mining techniques were used to valuate personal attributes. Then, we displayed the relationships of personal attribute disclosure in a hierarchy and proposed a method for valuating personal information disclosure. The valuation method was used in the evaluations, which were compared with the personal information disclosure results obtained from consumers. The results showed that the new approach can significantly increase the disclosure of consumers' personal information.
In current electrical power systems, when a consumer uses an electric outlet, the outlet manager is forced to pay for the consumer's power usage. Even if a consumer uses an outlet managed by another person, the consumer should receive a single bill covering both indoor and outdoor charging information under their contract with the utility company. For this purpose, we define a security model for the Smart Grid and propose a Secure Payment Protocol for Charging Information over Smart grid, SPaCIS for short, as a protocol satisfying the model. Our model provides for the unlinkability of consumers as well as for the undeniability and unforgeability of billing information using digital signatures and identity federations. SPaCIS is also efficient in the sense that its time complexity is constant relative to a trivial use such as individual verification of each signature, unless a verification error happens. We furthermore evaluate the performance of SPaCIS via a cryptographic implementation, and simulate SPaCIS in a case where one thousand users generate thirty signatures. We then show that SPaCIS with ECDSA can be executed within 6.30 msec for signing and 21.04 msec for verification of signatures, and conclude that SPaCIS is fairly practical.
Technological development in communications and electronics has driven the growing expansion of the Internet of Things (IoT). IoT is expected to have a great impact on our society because smart devices in IoT are easily integrated into existing services. As a result, standardization of technologies to support the IoT is becoming more important to realize a smart society across different service domains. This paper presents a survey on the current state of the art of standards for IoT technologies and gives a brief introduction to related standards and recent research areas in IoT. Finally, it also proposes an idea for a future platform of scalable IoT systems. The proposed idea employs IP mobility technologies to realize interoperability among IoT devices in different networks.
As an innovation in driver assistance technology, this research aims to develop an “Autonomous Intelligent Driving System” to reduce the risk of accidents and enhance driving safety for elderly drivers in order to vitalize the current aged society. The proposed system focuses on two key technologies: a risk-predictive driving intelligence model and shared control between the driver and the assistance system. The first key technology is to embed an experienced driver model to compensate for the degraded recognition, decision-making, and operation performance of drivers. In the driver assistance system design, the experienced driver model contains a knowledge-based “risk-prediction mechanism” to avoid accidents in risky driving situations. For instance, when passing unsignalized intersections with poor visibility, it is known that experienced drivers predict the sudden appearance of crossing pedestrians or bicycles, slow down the vehicle when approaching such a poor-visibility area, and prepare to brake in order to avoid potential collisions. The second key technology is “shared control.” This research does not aim to develop a fully autonomous vehicle for elderly drivers, but aims to develop an advanced driver assistance system for preventing accidents in cases where intervention by braking or steering is needed, as well as for reducing driving workload. Therefore, to realize good cooperative characteristics between the driver and the system, the shared control concept is applied to optimize the assistance level for braking and steering maneuvers while minimizing interference with the human driver's driving maneuvers. A driving simulator and a test vehicle are used to verify the effectiveness of the proposed intelligent driving system.
This article discusses a novel method to strengthen the collaboration between Internet service providers (ISPs) and content delivery networks (CDNs). CDNs are becoming the primary data delivery method in information communication technology environments because information sharing via networks is becoming the driving force of the future Internet. Moreover, it is anticipated that network routers will be equipped with additional processing power and storage modules for providing efficient end-user services. Consequently, this article studies the effectiveness of introducing a Service-oriented Router (SoR) to strengthen the ISP-CDN collaboration to leverage DNS-based request redirection in CDNs. In contrast, the proposed method yields better performance in user redirection and network resource utilization, suggesting that using SoR may be a future business model which addresses adequate ISP-CDN collaboration.
Recently, cloud systems composed of heterogeneous hardware have been increasing in order to exploit advances in hardware performance. However, programming applications for heterogeneous hardware to achieve high performance requires considerable technical skill and is difficult for users. Therefore, to achieve high performance easily, this paper proposes a PaaS which analyzes application logic and automatically offloads computations to GPUs and FPGAs when users deploy applications to clouds.
There have been several studies on object detection and activity recognition on a table conducted thus far. Most of these studies use image processing with cameras or a specially configured table with electrodes and an RFID reader. In private homes, methods using cameras are not preferable since cameras might invade the privacy of inhabitants and give them the impression of being monitored. In addition, it is difficult to apply the specially configured system to off-the-shelf tables. In this work, we propose a system that recognizes activities conducted on a table and identifies which user conducted the activities with load cells only. The proposed system uses four load cells installed on the four corners of the table or under the four legs of the table. User privacy is protected because only the data on actions through the load cells is obtained. Load cells are easily installed on off-the-shelf tables with four legs and installing our system does not change the appearance of the table. The results of experiments using a table we manufactured revealed that the weight error was 38g, the position error was 6.8cm, the average recall of recognition for four activities was 0.96, and the average recalls of user identification were 0.65 for ten users and 0.89 for four users.
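The load-cell arrangement in the abstract above lends itself to a short worked example. The following is a minimal sketch, not the authors' implementation: it assumes a rectangular table with one load cell at each corner and estimates an object's weight and position from the load increases by moment balance. The function name `locate_load` and the corner naming are hypothetical.

```python
def locate_load(f_bl, f_br, f_tl, f_tr, width, depth):
    """Estimate the weight and position of an object placed on a table.

    f_bl, f_br, f_tl, f_tr are the load increases (same unit, e.g. grams)
    measured at the bottom-left (0, 0), bottom-right (width, 0),
    top-left (0, depth), and top-right (width, depth) corners.
    The coordinate layout is an assumption made for this sketch.
    """
    total = f_bl + f_br + f_tl + f_tr
    if total <= 0:
        raise ValueError("no positive load detected")
    # Moment balance: the share of load carried by the right-hand cells
    # gives x, and the share carried by the far (top) cells gives y.
    x = width * (f_br + f_tr) / total
    y = depth * (f_tl + f_tr) / total
    return total, x, y


if __name__ == "__main__":
    # An object in the exact center loads all four cells equally.
    print(locate_load(100, 100, 100, 100, width=120, depth=80))
```

In practice the readings would be noisy, so any real system would need filtering and calibration on top of this calculation.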
In this paper we consider the (legal) representative in governmental ICT services and propose a secure private mail box system in which a message sent to the pupil is re-encrypted by the proxy server. This process enables the representative to decrypt the message. We also give a formal description of the protocols and evaluate their security with the ProVerif model checking tool.
We describe a method for decentralized task/area partitioning for coordination in cleaning/sweeping domains, with learning to identify areas that easily become dirty. Ongoing advances in computer science and robotics have led to applications for covering large areas that require coordinated tasks by multiple control programs, including robots. Our study aims at coordination and cooperation among multiple agents, and we discuss it using the example of cleaning tasks performed by multiple agents with potentially different performances and capabilities. We developed a method for partitioning the target area on the basis of the agents' performances in order to improve the overall efficiency through their balanced collective efforts. Agents, i.e., software for controlling devices and robots, autonomously decide in a cooperative manner how the task/area is partitioned by taking into account the characteristics of the environment and the differences in agents' software capability and hardware performance. During this partitioning process, agents also learn the locations of obstacles and the probabilities of dirt accumulation, which express which areas easily become dirty. Experimental evaluation showed that even if the agents use different algorithms or have batteries with different capacities, resulting in different performances, and even if the environment is not uniform, for example with easy-to-dirty areas and obstacles in different locations, the proposed method can adaptively partition the task/area among the agents while learning the probabilities of dirt accumulation. Thus, agents using the proposed method can keep the area clean effectively and evenly.
Increasing the size of parallel corpora for less-resourced language pairs is essential for machine translation (MT). To address the shortage of parallel corpora between Chinese and Japanese, we propose a method to construct a quasi-parallel corpus by inflating a small amount of Chinese-Japanese corpus, so as to improve statistical machine translation (SMT) quality. We generate new sentences using analogical associations based on large amounts of monolingual data and a small amount of parallel data. We filter over-generated sentences using two filtering methods: one based on BLEU and the second one based on N-sequences. We add the obtained aligned quasi-parallel corpus to a small parallel Chinese-Japanese corpus and perform SMT experiments. We obtain significant improvements over a baseline system.
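As an illustration of the second filter mentioned in the abstract above, here is a minimal sketch of one plausible reading of N-sequence filtering: a generated sentence is kept only if every character sequence of length N in it is attested in a reference monolingual corpus. This reading, the function names, and the toy English strings are assumptions made for the example, not details taken from the paper.

```python
def char_ngrams(sentence, n):
    """Return the set of all character sequences of length n in a sentence."""
    return {sentence[i:i + n] for i in range(len(sentence) - n + 1)}


def build_reference(corpus, n):
    """Collect every character n-gram attested in the reference corpus."""
    reference = set()
    for sentence in corpus:
        reference |= char_ngrams(sentence, n)
    return reference


def filter_candidates(candidates, reference, n):
    """Keep candidates whose n-grams are all attested in the reference.

    Note: a candidate shorter than n has no n-grams and is therefore kept.
    """
    return [c for c in candidates if char_ngrams(c, n) <= reference]


if __name__ == "__main__":
    # Toy data for illustration only.
    reference = build_reference(["the cat sat", "a dog ran"], n=3)
    print(filter_candidates(["the cat sat", "the cat ran"], reference, n=3))
```

The second candidate is rejected because the sequence "t r" never occurs in the toy reference corpus, even though each of its words does.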
In eye-tracking-based reading behavior research, gaze sampling errors often negatively affect gaze-to-word mapping. In this paper, we propose a method for more accurate mapping by first taking adjacent horizontally progressive fixations as segments, and then classifying the segments into six classes using a random forest classifier. The segments are then reconstructed based on the classification, and are associated with a document line using a dynamic programming algorithm. The combination of segment-to-line mapping and transition classification achieved 87% mapping accuracy. We also witnessed a reduction of manual annotation time when the mapping was used as an annotation guiding tool.
Data stream management systems (DSMSs) are suitable for managing and processing continuous data at high input rates with low latency. For advanced driver assistance including autonomous driving, embedded systems use a variety of onboard sensor data with communications from outside the vehicle. Thus, the software developed for such systems must be able to handle large volumes of data and complex processing. We develop a platform that integrates and manages data in an automotive embedded system using a DSMS. However, because automotive data processing, which is distributed in in-vehicle networks of the embedded system, is time-critical and must be reliable to reduce sensor noise, it is difficult to identify conventional DSMSs that meet these requirements. To address these new challenges, we develop an automotive embedded DSMS (AEDSMS). This AEDSMS precompiles high-level queries into executable query plans when designing automotive systems that demand time-criticality. Data stream processing is distributed in in-vehicle networks appropriately, where real-time scheduling and sensor data fusion are also applied to meet deadlines and enhance the reliability of sensor data. The main contributions of this paper are as follows: (1) we establish a clear understanding of the challenges faced when introducing DSMSs into the automotive field; (2) we propose an AEDSMS to tackle these challenges; and (3) we evaluate the AEDSMS during run-time for advanced driver assistance.
This paper addresses the issues in the task of annotating geographical entities on microblogs and reports the preliminary results of our efforts to annotate Japanese microblog texts. Unlike prior work, we aim at annotating not only geographical location entities but also facility entities, such as stations, restaurants and schools. We discuss (i) how to build a gazetteer of geographical entities with a sufficiently broad coverage, (ii) what types of ambiguities need to be considered, (iii) why annotators tend to disagree, and (iv) what technical problems should be addressed to automate the task of annotating the geographical entities. All the annotation data and the annotation guidelines are publicly available for research purposes from our web site.
In recent years, virtual and augmented reality have begun to take advantage of the high speed capabilities of data streaming technologies and wireless networks. However, limitations like bandwidth and latency still prevent us from achieving high fidelity telepresence and collaborative virtual and augmented reality applications. Fortunately, both researchers and engineers are aware of these problems and have set out to design 5G networks to help us to move to the next generation of virtual interfaces. This paper reviews state of the art virtual and augmented reality communications technology and outlines current efforts to design an effective, ubiquitous 5G network to help to adapt to virtual application demands. We discuss application needs in domains like telepresence, education, healthcare, streaming media, and haptics, and provide guidelines and future directions for growth based on this new network infrastructure.
Telecommunication service has been growing and progressing from telephone to high reality communication systems that are based on evolution of network and media technologies. Recognizing virtual reality (VR) as a communication tool, we provide a review of communication services and the directions they are moving in, as well as related VR technologies. The Immersive Telepresence System “Kirari!” is also introduced as the latest development example for a new telecommunication service.
The research and development (R&D) and the standardization of the 5th Generation (5G) mobile networking technologies are proceeding at a rapid pace all around the world. In this paper, we introduce the emerging concept of network slicing that is considered one of the most significant technology challenges for 5G mobile networking infrastructure, summarize our preliminary research efforts to enable end-to-end network slicing for 5G mobile networking, and finally discuss application use cases that should drive the designs of the infrastructure of network slicing.
This paper first investigates how a network operates when multiple receivers download content simultaneously in content-centric networking (CCN) when the receivers' downloading speeds differ. The results indicate that the performance of the download completion time of a faster user degrades excessively due to a decrease in the cache-hit rate in the router. Based on the investigation, this paper proposes a novel in-network caching method for simultaneous download from multiple receivers in CCNs. The proposed method keeps cached data packets in a router until slower receivers download the data, in order to prevent slower users from directly downloading data from the content provider. We conduct computer simulations and confirm the effectiveness of the proposed method. We show that the proposed method can improve the download completion time performance in the situation where multiple receivers download content at different speeds in CCN.
Wireless Mesh Networks (WMNs) over CSMA MAC (especially IEEE 802.11) are an attractive solution to widen the coverage area of the Internet in unlicensed frequency bands. Although such CSMA-based WMNs have been deeply investigated for a long time, they still suffer from heavy interference due to hidden terminals. In this paper, we improve the performance of CSMA-based WMNs by introducing a distributed scheduling scheme that exchanges transmission-queue length information in real time among neighbor nodes. In our scheduling scheme, the node that has the longest queue among the nodes within its 2-hop distance is allowed to transmit frames. The proposed scheduling scheme can be regarded as a distributed design of so-called ‘Max-weight’ scheduling. By combining CSMA with the queue-length based scheduling, we significantly reduce collisions due to hidden terminals and improve the performance with a small overhead of queue-length fields in MAC frames.
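The queue-length rule in the abstract above can be illustrated with a small sketch. It assumes each node already knows the queue lengths of all nodes within two hops (in the paper this information is carried in MAC frames) and lets a node transmit only if its queue is the longest in that neighborhood. The adjacency-dictionary representation, the function names, and the tie-break by node identifier are assumptions made for this example, not details from the paper.

```python
def two_hop_neighbors(adj, node):
    """Return all nodes reachable from `node` in one or two hops."""
    one_hop = set(adj[node])
    reachable = set(one_hop)
    for neighbor in one_hop:
        reachable.update(adj[neighbor])
    reachable.discard(node)
    return reachable


def may_transmit(adj, queue_len, node):
    """Decide whether `node` holds the longest queue within two hops.

    Ties are broken by node identifier, which is an assumption of this
    sketch. A node with an empty queue never transmits.
    """
    if queue_len[node] == 0:
        return False
    rivals = two_hop_neighbors(adj, node)
    return all((queue_len[node], node) > (queue_len[r], r) for r in rivals)


if __name__ == "__main__":
    # Chain topology A - B - C - D with hypothetical queue lengths.
    adj = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
    queue_len = {"A": 3, "B": 1, "C": 5, "D": 2}
    print([n for n in adj if may_transmit(adj, queue_len, n)])
```

In this chain only C is selected, so A, which cannot hear C directly, also stays silent; this is the hidden-terminal case the rule is meant to cover.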
Fine-grained network traffic monitoring is important for efficient network management in software-defined networking (SDN). The current SDN architecture, i.e., OpenFlow, relies on counters in the flow entries of forwarding tables for such monitoring tasks. This is neither efficient nor flexible, since the packet-header fields that users want to monitor are not always the same as, and do not always overlap with, the OpenFlow match fields, which are designed primarily for forwarding. This inflexibility may result in unnecessary flow entries being added to switches for monitoring and in controller-switch communication overhead for monitoring, which may cause the communication channel to become a bottleneck, especially when the network includes a large number of switches. We propose SDN-Mon, an SDN-based monitoring framework that decouples monitoring from existing forwarding tables and allows more fine-grained and flexible monitoring to serve a variety of network-management applications. SDN-Mon allows the controller to define arbitrary sets of monitoring match fields based on the requirements of controller applications to flexibly monitor traffic. In SDN-Mon, some monitoring processes are selectively delegated to SDN switches to leverage the computing processor of the switch and avoid unnecessary overhead in the controller-switch communication for monitoring. We implemented SDN-Mon and evaluated its performance on the Lagopus switch, a high-performance software switch.
To improve the communication performance in IEEE 802.11-based wireless mesh networks (WMNs), several dynamic metrics have been proposed. However, all of them have a severe risk of generating temporary routing loops, which may cause severe congestion and disruption of communications. Although the routing loop is an essential problem that degrades network performance, no fundamental solution has been provided so far for wireless multihop networks. In this paper, we propose a mechanism called Loop-free Metric Range (LMR) to make existing dynamic metrics loop-free by restricting the range within which metric values may change. LMR is applicable to a major part of existing metrics including ETX, ETT, MIC, etc. without any message overhead. We first provide theoretical results that show that LMR guarantees loop-freedom if no message loss takes place. We next show that LMR is also effective in practical scenarios where message loss may take place: through simulation and real-world evaluation, we show that LMR works effectively as a limiter on dynamic metrics to reduce routing loops and to improve network performance.
With the dramatic increase in Internet of Things (IoT) related messaging volume, message queue systems are highly required for both interoperability among devices, as well as for control message traffic between devices and heterogeneous back-end systems (BES). When connected BES issue several dequeue requests to the message queue and no message is available, the frequency of missed-dequeues increases, which causes a degradation of the maximum throughput. Therefore, we propose the retry dequeue-request scheduling (RDS) method that decreases the number of dequeue requests from the BES by delaying the replies to the BES when missed-dequeues occur. Simulation and experimental evaluations show that the throughput of the RDS method achieves 180% of that of the conventional dequeue method.
This paper proposes a novel data compression method for artificial vision systems and its low-energy implementation in order to reduce energy consumption in a wireless communication subsystem. Artificial vision systems are one method for realizing a visual prosthesis by controlling stimuli to visual nerves, and they consist of an inner stimulating unit and an outer image processing unit. The outer unit transmits information regarding stimulation to the inner unit via wireless communication, which accounts for a large portion of the total energy consumption. Reducing traffic in wireless communication is important to prevent damage caused by extra heat dissipation of the inner unit, which results from excess energy consumption. The proposed compression method achieves a higher compression ratio than conventional compression methods by taking advantage of analyses of stimulus position data, which is dominant in the traffic. The proposed method is implemented as an application-domain specific instruction-set processor to achieve both configurability of stimulation control and compression efficiency. The evaluation results show that the proposed implementation reduces energy consumption by about 87% and 62% in the compression and decompression processes, respectively. These results indicate that the proposed method can be expected to dramatically reduce energy consumption in a wireless communication receiver.
This paper presents a security analysis of the Local Interconnect Network (LIN) that is used in assembly units such as seats, steering wheels, and doors in vehicles. Recently, the number of security threats to in-vehicle networks such as the Controller Area Network has increased. In contrast, there have been no reports that evaluate the security of LIN in detail. The security analysis of LIN is important because it is used in units related to seats, steering wheels, etc. and it is at risk for an attack. In this paper, we present the first evaluation on the security of LIN. We present case studies of attacks that use the characteristics of a commonly-used error handling mechanism. In the attacks, the attacker intentionally stops communication using the error handling mechanism and sends a false response in place of a valid one. We experimentally show the feasibility of the attacks using a vehicle microcontroller. Furthermore, we present countermeasures against the attacks. The results of this study show that there is vulnerability to attack when the error handling mechanism is simply designed. We believe that this study will contribute to improvements in security of in-vehicle communications.
A significant number of logs are generated in dynamic malware analysis. Consequently, a method for effectively compressing these logs is required to reduce the amount of memory and storage consumed to store them. In this study, we evaluated the efficacy of grammar compression methods in compressing call traces in malware analysis logs. We hypothesized that grammar compression can be useful in compressing call traces because its algorithm can naturally express the dynamic control flows of program execution. We measured the compression ratio of three grammar compression methods (SEQUITUR, Re-Pair, and Byte Pair Encoding (BPE)) and three well-known compressors (gzip, bzip2, and xz). In experiments in which API call sequences collected from thousands of Windows malware samples were compressed, the Re-Pair grammar compression method was found to outperform both gzip and bzip2.
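To make the idea of grammar compression on call traces concrete, the following is a simplified sketch of greedy pair replacement, the core step shared by Re-Pair and BPE: the most frequent adjacent pair of symbols is repeatedly replaced by a new rule symbol. It is an illustration under stated assumptions, not the implementation evaluated in the study. The trace is a made-up example, and the rule-symbol format `<R0>`, `<R1>`, ... is hypothetical and assumed not to collide with real call names.

```python
from collections import Counter


def pair_compress(trace, min_count=2):
    """Greedily replace the most frequent adjacent pair with a new symbol.

    Returns the compressed sequence and the grammar rules, where each rule
    maps a generated symbol to the pair it stands for.
    """
    seq = list(trace)
    rules = {}
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < min_count:
            break
        symbol = f"<R{len(rules)}>"
        rules[symbol] = pair
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(symbol)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(seq, rules):
    """Losslessly rebuild the original trace from the sequence and rules."""
    out = []
    stack = list(reversed(seq))
    while stack:
        symbol = stack.pop()
        if symbol in rules:
            left, right = rules[symbol]
            stack.append(right)
            stack.append(left)
        else:
            out.append(symbol)
    return out


if __name__ == "__main__":
    # Hypothetical trace: a loop that opens, reads, and closes a file.
    trace = ["CreateFile", "ReadFile", "CloseHandle"] * 4
    seq, rules = pair_compress(trace)
    print(len(trace), "->", len(seq), "symbols with", len(rules), "rules")
```

Repeated loop bodies collapse into a few rules, which is the intuition behind the hypothesis that grammar compression suits call traces.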
Recently, Delay Tolerant Networks (DTNs) have been intensively researched to overcome unstable communication due to the intermittent link connection in wireless communications. In wireless DTNs, to enable continuous connectivity, data are exchanged through intermediate nodes in the path toward the destination node by store-and-forward approach. However, since the participating nodes in the network are not fully trusted, a secure data exchange mechanism in the DTNs would be strongly desirable. In this paper, we propose a secure data exchange system in the wireless DTNs using Attribute-Based Encryption (ABE) to provide two properties: (i) content data can be accessed by only authorized nodes that are dynamically defined by a policy on the attributes while keeping its integrity from alteration during transmission, and (ii) routing messages are encrypted and authenticated such that only the attribute-based authorized nodes can exchange the routing messages, where multi-hop routing messages are encrypted and authenticated by the ABE. Our experimental results show the practicality of our system.
Relaxed memory consistency models specify effects of executions of statements among threads, which may or may not be reordered. Such reorderings may cross loop iterations. To the best of our knowledge, however, there exists no concurrent program logic which explicitly handles the reorderings across loop iterations. This paper provides a concurrent program logic for relaxed memory consistency models that can represent, for example, total store ordering, partial store ordering, relaxed memory ordering, and acquire and release consistency. There are two novel aspects to our approach. First, we translate a concurrent program into a family of directed acyclic graphs with finite nodes and transitive edges, called program graphs, according to a memory consistency model that we adopt. These represent dependencies among statements, which capture reorderings of not only statements but also the visibility of their effects. Second, we introduce auxiliary variables that temporarily buffer the effects of write operations on shared memory, and explicitly describe the reflections of the buffered effects to shared memory. Specifically, we define a small-step operational semantics for the program graphs with the introduced auxiliary variables, and then define a logic that is sound and relatively complete with respect to the semantics.
This paper proposes a parallel implementation of graph mining that extracts all connected subgraphs with common itemsets, of which the size is not less than a given threshold, from a graph and from itemsets associated with vertices of the graph, in distributed memory environments using the task-parallel language Tascell. With regard to this problem, we have already proposed parallelization of a backtrack search algorithm named COPINE and its implementation in shared memory environments. In this implementation, all workers share a single table, which is controlled by locks, that contains the knowledge acquired during the search to obviate the need for unnecessary searching. This sharing method is not practical in distributed memory environments because it would lead to a drastic increase in the cost of internode communications. Therefore, we implemented a sharing method in which each computing node has a table and sends its updates to the other nodes at regular time intervals. In addition to this, the high task creation cost for COPINE is problematic and thus the conventional work-stealing strategy in Tascell, which aims to minimize the number of internode work-steals, significantly degrades the performance since it increases the number of intranode work-steals for small tasks. We solved this problem by promoting workers to enable them to request tasks from external nodes. We also employed a work-stealing strategy based on estimation of the sizes of tasks created by victim workers. This approach enabled us to achieve good speedup performance with up to 8 nodes × 16 workers.
In this paper, the author proposes an Energy-on-Demand (EoD) system based on combinatorial optimization of appliance power consumptions, and describes its implementation and evaluation. EoD is a novel power network architecture of demand-side power management, whose objective is to intelligently manage power flows among power generations under the limitation of available power resource. In an EoD system, when total power consumption exceeds the limit of power resource, a power allocation manager deployed in the system decides the optimal power allocation to all the appliances based on their importance and power consumptions, and controls the amount of power supplied to the appliances in a way that causes minimum undesired effect to quality-of-life of users. Therefore, one of the most crucial factors in an EoD system is the strategy for deciding the optimal power allocation. From a mathematical viewpoint, the power allocation management in an EoD system can be considered as an optimization problem of appliance operation modes. In the developed system, power allocation is based on the multiple-choice knapsack problem (MCKP), a kind of combinatorial optimization problem. The system measures power consumption of appliances, computes the optimal power allocation based on an algorithm for the MCKP, and realizes computed power allocation by controlling IR-controllable appliances and mechanical relays. Through experiments, the developed system is confirmed to work properly as an EoD system by observing system behaviors when the total power consumption exceeds the upper limit of the available power resource.
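The multiple-choice knapsack formulation in the abstract above can be sketched as follows. Each appliance offers several operation modes, each with a power draw and a utility value, and exactly one mode per appliance is chosen so that total utility is maximized without exceeding the power limit. This is a textbook dynamic program over integer watts, shown as an illustration rather than the algorithm used in the developed system. The function name, the utility values, and the example modes are all hypothetical.

```python
def allocate_power(appliances, limit):
    """Solve a small multiple-choice knapsack problem by dynamic programming.

    appliances: list of mode lists; each mode is (power_watts, utility),
                with power_watts a non-negative integer.
    limit:      available power in watts (integer).
    Returns (selected mode index per appliance, power used, total utility),
    or None if no combination fits within the limit.
    """
    neg_inf = float("-inf")
    best = [neg_inf] * (limit + 1)  # best[w] = max utility using exactly w watts
    best[0] = 0.0
    choices = []  # choices[k][w] = (mode index, watts used before appliance k)
    for modes in appliances:
        new_best = [neg_inf] * (limit + 1)
        pick = [None] * (limit + 1)
        for w in range(limit + 1):
            if best[w] == neg_inf:
                continue
            for m, (power, utility) in enumerate(modes):
                nw = w + power
                if nw <= limit and best[w] + utility > new_best[nw]:
                    new_best[nw] = best[w] + utility
                    pick[nw] = (m, w)
        best = new_best
        choices.append(pick)
    used = max(range(limit + 1), key=lambda i: best[i])
    if best[used] == neg_inf:
        return None
    total_utility = best[used]
    # Walk back through the stored choices to recover one mode per appliance.
    selected, w = [], used
    for pick in reversed(choices):
        m, w = pick[w]
        selected.append(m)
    selected.reverse()
    return selected, used, total_utility


if __name__ == "__main__":
    # Hypothetical modes: (watts, utility); the first mode of each is "off".
    appliances = [
        [(0, 0), (50, 3), (100, 5)],  # e.g. a lamp with two brightness levels
        [(0, 0), (60, 4)],            # e.g. a fan
        [(0, 0), (40, 2), (80, 6)],   # e.g. a heater with two settings
    ]
    print(allocate_power(appliances, limit=150))
```

The running time grows with the power limit times the total number of modes, so coarser power units would keep the table small for larger limits.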
The FESTIVAL EU-Japan collaborative project aims at federating existing Smart ICT testbeds of different natures to provide a platform for developing and testing emergent Smart ICT services. The federation of testbeds covering heterogeneous domains has been a great challenge, and FESTIVAL provides uniform access to different resources, such as Open Data resources, IoT devices, IT resources and Living Labs. In this paper, the design and implementation of the current FESTIVAL platform are introduced, together with approaches to federate and interoperate existing resources. Integration of all the components, including the existing testbeds, will also be described to finalize and validate the federation.
WEP has serious vulnerabilities that enable various key recovery attacks. Although a more secure protocol such as WPA2 is recommended, according to surveys by IPA and Keymans NET, WEP is still widely used because of a lack of knowledge about wireless LAN security. In addition, replacing wireless LAN equipment in large-scale facilities is costly. Such facilities need a secure method that can be used on their existing equipment by updating the WEP firmware. In 2011, Morii, one of the authors, et al. showed IVs that prevent the Klein attack, the PTW attack, and the TeAM-OK attack. However, they did not present how to obtain such IVs or evaluate their security. This paper presents a secure method for WEP and shows how to use it about as fast as WEP. We show an IV which prevents previous key recovery attacks from being mounted. Moreover, we show how to use our IV efficiently in the operation of WEP. Our method requires about 1.1 times the processing time of WEP for encryption. As a result, our method can prevent previous key recovery attacks and realize communication about as fast as WEP.
Open classes are frequently used in programming languages such as Ruby and Smalltalk to add or change methods of a class that is defined in the same component or in a different one. They are typically used for bug fixing, multi-dimensional separation of concerns, or to modularly add new operations to an existing class. However, they suffer from modularity issues if globally visible: Other components using the same classes are then affected by their modifications. This work presents Extension Classes, a hierarchical approach for dynamically scoping such modifications in Ruby, built on top of ideas from Context-oriented Programming (COP). Our mechanism organizes modifications in classes and allows programmers to define their scope according to a class nesting hierarchy and based on whether programmers regard an affected class as a black box or not. Moreover, Extension Classes support modularizing modifications as mixins, such that they can be reused in other components.
Haskell is a functional language featuring lazy evaluation and referential transparency. On the one hand, referential transparency is useful for parallel computing because the results do not depend on the evaluation order; on the other hand, parallel computing requires an evaluation order that is different from that of lazy evaluation. There are some parallel programming libraries for Haskell, such as Repa (regular parallel arrays) and Accelerate. However, little research has been conducted on their evaluation with real applications, and the usefulness of these libraries remains unclear. In this study, we evaluated the usefulness of parallel programming libraries for Haskell with an application that applies a super-resolution technique to fMRI images. We developed a CPU-based parallel program with Repa and a GPU-based parallel program with Accelerate and compared their performance. We obtained reasonable speedups for the program with Repa, but not for the program with Accelerate. We also investigated Accelerate's performance issues using an implementation in C and CUDA and the log from the Accelerate program. In this paper, we report our findings through a case study, focusing on the advantages and difficulties in parallel program development with Haskell.
Recently, progress has been made in IoT technologies, and applications in the maintenance area are expected. However, IoT maintenance applications are not yet widespread in Japan because of one-off sensing and analysis solutions for each case, the high cost of collecting sensing data, and insufficient maintenance automation. This paper proposes a maintenance platform that analyzes sound data at the edge, analyzes only anomalous data in the cloud, and orders maintenance automatically.
Nearby event data, such as those for exhibitions and sales promotions, may help users spend their free time more efficiently. However, most event data are hidden in millions of webpages, and finding such data is very time-consuming for a user. To address this issue, we use web mining to extract event data from webpages. In this paper, we propose and discuss the implementation of Event.Locky, a system for extracting event data from webpages in a user-defined area and displaying them to a user in a spatial-temporal structure. Furthermore, we design two core algorithms for event data extraction in Event.Locky: webpage-data-record extraction and event-record classification. The former converts a semi-structured HTML document into processable structured data. The latter filters out non-event data from the extracted data records using machine learning. We trained and evaluated Event.Locky with an actual dataset composed of 96 restaurants and shops at Nagoya train station. As a result, our event-classification algorithm achieved an F1 score of 91.61%, an increase of 3.07% over current event-classification algorithms. The combination of our event-classification algorithm and our data-record-extraction algorithm achieved an F1 score of 83.96% for extracting event records from webpages, an increase of 1.6% over the current algorithm. Finally, we discuss the feasibility of Event.Locky in an actual online environment through the implementation of a demonstration application.
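The F1 scores reported in the abstract above combine precision and recall into one number. As a minimal sketch of how such a score is computed from classification counts, the function below uses the standard definition; the counts in the example are invented for illustration and are not taken from the Nagoya dataset.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, computed from raw counts.

    tp: records correctly classified as events
    fp: non-event records wrongly classified as events
    fn: event records that the classifier missed
    """
    if tp == 0:
        return 0.0  # no correct detections; also avoids division by zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, chosen only to show the calculation.
score = f1_score(tp=90, fp=10, fn=10)
```

Because F1 is a harmonic mean, it stays low unless both precision and recall are high, which makes it a reasonable single figure for a filter that must neither drop events nor let non-event records through.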
We propose a method for gesture recognition that utilizes active acoustic sensing, which transmits acoustic signals to a target and recognizes the target's state by analyzing the response. In this study, the user wore a contact speaker that transmitted ultrasonic sweep signals to the user's body and a contact microphone that detected the ultrasound propagated through the body. The propagation characteristics of the ultrasound changed depending on the user's movements. We utilized these changes to recognize the user's gestures. One important novel feature of our method is that the user's gestures can be acquired not only from physical movement but also from the user's internal state, such as muscle activity, since ultrasound is transmitted through both the inside and the surface of the user's body. Moreover, because we utilize ultrasound, our method is not adversely affected by audible-range sounds generated by the environment and body movements. We implemented a device that uses active acoustic sensing to effectively transmit/detect the ultrasound to/from the body and investigated the performance of the proposed method in 21 contexts with 10 subjects. The evaluation results confirmed that the precision and recall were 93.1% and 91.6%, respectively, when we used 10% of the data as training data and the rest as testing data within the same data set. When we used one data set for training and another data set from the same day for testing, the precision and recall were 51.6% and 51.3%, respectively.
Wikification is the task of connecting mentions in texts to entities in a large-scale knowledge base, Wikipedia. In this paper, we present a pipeline system for Japanese Wikification that consists of two components, namely candidate generation and candidate ranking. We investigate several techniques for each component, using a recently developed Japanese Wikification corpus. For candidate generation, we find that a name dictionary built from Wikipedia anchor texts is more effective than other methods based on the similarity of surface forms. For candidate ranking, we verify that a set of features used in English Wikification is effective in Japanese Wikification as well. In addition, by using a corpus that links mentions to Japanese Wikipedia entries instead of English Wikipedia entries, we are able to acquire rich contextual information from Japanese Wikipedia articles, which leads to improvements in Japanese mention disambiguation. We exploit this advantage by exploring several embedding models that encode the context information of Wikipedia entities. The experimental results demonstrate that they improve candidate ranking. We also report the effect of each feature in detail. In sum, our system achieves 81.60% accuracy, significantly outperforming the previous work.
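Candidate generation with an anchor-text name dictionary, as mentioned in the abstract above, can be sketched as follows. The sketch assumes the dictionary is built from (anchor text, linked entity) pairs and that candidates are ordered by how often the anchor text links to each entity. The function names and the toy pairs are hypothetical, and this frequency ordering is only a common baseline, not the ranking component evaluated in the paper.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def build_name_dictionary(anchors: Iterable[Tuple[str, str]]) -> Dict[str, Counter]:
    """Map each anchor text to a count of the entities it links to."""
    dictionary: Dict[str, Counter] = defaultdict(Counter)
    for surface, entity in anchors:
        dictionary[surface][entity] += 1
    return dictionary


def generate_candidates(
    mention: str, dictionary: Dict[str, Counter], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Return up to top_k candidate entities with the link probability P(entity | mention)."""
    counts = dictionary.get(mention)  # .get() does not create a new key
    if not counts:
        return []  # the mention never appears as an anchor text
    total = sum(counts.values())
    return [(entity, n / total) for entity, n in counts.most_common(top_k)]


# Toy (anchor text, linked entity) pairs, invented for illustration.
anchors = [
    ("東京", "東京都"),
    ("東京", "東京都"),
    ("東京", "東京都"),
    ("東京", "東京駅"),
    ("京大", "京都大学"),
]
name_dict = build_name_dictionary(anchors)
candidates = generate_candidates("東京", name_dict)
```

A dictionary of this kind can return an entity whose title shares no characters with the mention, as with the abbreviation in the last toy pair, which is one plausible reason it can outperform surface-form similarity.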
In this paper, we propose a novel way to encourage visitors to share their experiences and interests in exhibition spaces. Visitors have experiences in an exhibition and may become aware of the meanings of the exhibits and/or the relationships among them. We believe that sharing the experiences of visitors will enhance the exhibition experience for subsequent visitors because shared experiences may include fascinating topics. To acquire the experiences of the visitors, we used “PhotoChat,” which is an in-house photo communication software. PhotoChat allows users to communicate with others by taking photographs and adding annotations to each photograph. It also records the locations for coordination between the photographs and the statistical information contained in the annotations. Since PhotoChat is designed for realtime communication, in this study, we introduce a robot that inhabits the exhibition space. The robot is always connected to PhotoChat and acquires all data on PhotoChat. The robot is thus able to learn what visitors communicate to others on PhotoChat and to share it with subsequent visitors. The robot can also use bodily actions to express instructions to the visitors. We developed a system that integrates PhotoChat into a robot. We also implemented robot behavior (i.e., bodily actions and motions) that includes recommendations for photographs taken by others. That is, the robot communicates with humans using both PhotoChat and its body. We held workshops to collect data and manually classified the data into three content categories. We then performed experiments using the developed system to distribute the classified content. The results showed that the robot's physical behaviors encouraged conversations between the visitors based on the provided topics.
Nonverbal information plays an important role in conveying the feelings and/or interests of people in conversations. Since Bibliobattle, a book-review game, has features well suited to investigating nonverbal information in conversational settings, we conducted a series of experiments in Bibliobattle settings. In Bibliobattle, each speaker presents his/her own recommended book to listeners as a bibliobattler in 5 minutes. At the end of all presentations, everyone votes for the champion book. We analyzed a series of Bibliobattle experiments through video investigation. In the analysis, we focused on the listeners' nonverbal information, in particular nods, laughs, and posture changes. Our results showed that nonverbal actions co-occur among the audience in Bibliobattles. The frequency of co-occurrence of positive nonverbal information was assumed to reflect the excitement of the presentation. However, interestingly, the results showed that this frequency does not affect the result of voting for the champion book in Bibliobattle. We discuss the cause of the results in the paper.
Fast similarity searches that use high-dimensional feature vectors for vast amounts of multimedia data have recently become increasingly important. However, ordinary similarity searches are slow because they require a large number of floating-point operations, proportional to the number of records. Many recent studies propose to speed up similarity searches by converting feature vectors to bit vectors. Such similarity searches are regarded as approximations of the similarity searches over the original data. However, some of those approximations are not theoretically guaranteed, since no direct approximate relations between the Euclidean and Hamming distances are given. We propose a novel hashing method that utilizes inverse-stereographic projection and gives a direct approximate relation between the Euclidean and Hamming distances in a closed-form expression. Although some studies have discussed the relationship between the two distances, to the best of our knowledge, our hashing method is the first one to give a direct approximate relation between them. We also propose the parameter values that are needed for our proposed method. Furthermore, we show through experiments that the proposed method gives a more accurate approximation than the existing random projection-based and Hamming distance-based methods for many datasets.
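To make the idea of relating Euclidean and Hamming distances through a sphere more concrete, the sketch below combines two standard ingredients: the inverse stereographic projection, which maps a point of R^d onto the unit sphere in R^(d+1), and sign random projection, for which the expected fraction of differing bits equals the angle between two vectors divided by pi. This pairing is an assumption made for illustration; the abstract does not specify how the proposed method generates bits, nor its closed-form expression or parameter values.

```python
import math
import random
from typing import List, Sequence


def inverse_stereographic(x: Sequence[float]) -> List[float]:
    """Map a point of R^d onto the unit sphere in R^(d+1)."""
    sq = sum(v * v for v in x)
    return [2.0 * v / (sq + 1.0) for v in x] + [(sq - 1.0) / (sq + 1.0)]


def make_hyperplanes(dim: int, bits: int, seed: int = 0) -> List[List[float]]:
    """Draw `bits` random Gaussian normal vectors (sign random projection)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(bits)]


def to_bits(x: Sequence[float], planes: List[List[float]]) -> List[int]:
    """Project x onto the sphere, then keep one sign bit per hyperplane."""
    p = inverse_stereographic(x)
    return [1 if sum(a * b for a, b in zip(plane, p)) >= 0.0 else 0 for plane in planes]


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two bit vectors differ."""
    return sum(u != v for u, v in zip(a, b))


# Two hypothetical feature vectors.
x, y = [0.2, -0.4, 0.1], [0.5, 0.3, -0.2]
planes = make_hyperplanes(dim=len(x) + 1, bits=2048)

# True angle between the projected points on the sphere.
px, py = inverse_stereographic(x), inverse_stereographic(y)
cos_angle = sum(a * b for a, b in zip(px, py))
angle = math.acos(max(-1.0, min(1.0, cos_angle)))

# Angle estimated from the Hamming distance between the bit vectors.
estimated_angle = math.pi * hamming(to_bits(x, planes), to_bits(y, planes)) / len(planes)
```

In this construction the chord between the two projected points has length 2·|x − y| / sqrt((1 + |x|²)(1 + |y|²)), a standard identity for this projection, so the angle estimated from the bit vectors can be converted back into a Euclidean distance when the norms of the two vectors are known.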
We present a matching method for 3D CAD assembly models consisting of multiple components. Our method discriminates not only the global shapes of the models and the numbers and kinds of their components, but also the geometric layouts of the components. In order to identify the components constituting an assembly model, different numerical values, such as positive integers, are initially assigned to them. The same value is assigned to components of the same kind in an assembly model. However, the values initially assigned to the components vary across assembly models, as often happens in practical applications. We represent an assembly model as a set of feature quantities which are computed using projections of each of the components from various angles. The similarity between two assembly models is computed from the similarities between their feature quantities. In order to make the projections reflect the layout of the components in the whole assembly structure, we propose a way of reassigning numerical values to the components. This reassignment also makes the feature quantities of assembly models independent of the values initially assigned to their components. Using 3D CAD assembly models with different layouts of components, we show the effectiveness of the proposed method experimentally.
Users' visiting patterns to POIs (Points-Of-Interest) vary with the users' familiarity with the areas they visit. For instance, users visit tourist sites in unfamiliar cities rather than in their familiar home city. Previous studies have shown that familiarity can improve POI recommendation performance. However, such studies have focused on the differences between home and other cities, and not on those among small urban neighborhoods in the same city, where user activities frequently occur. Applying these studies directly to such areas is difficult because simple distance-based familiarity measures, or visit-pattern differences represented by topics (groups of POIs that share common functions, such as Arts or French restaurants), are too coarse to capture the differences observed among different areas. In urban neighborhoods in the same city, user visit-pattern differences originate at a more precise POI level. In order to extend the previously proposed familiarity-aware POI recommendation so that it can be adopted in different areas of the same city, we propose a method that employs visit-frequency-based familiarity and visit-pattern differentiation at the precise POI level. In experiments on real LBSN data consisting of over 800,000 check-ins for three cities (NYC, LA, and Tokyo), our proposed method outperforms state-of-the-art methods by 0.05 to 0.06 in the Recall@20 metric.
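The Recall@20 metric used in the abstract above measures how many of a user's held-out visited POIs appear among the top 20 recommendations. A minimal sketch of the computation follows, assuming each recommendation list contains no duplicates and that scores are averaged over users; the POI identifiers are hypothetical, and the averaging scheme is an assumption because the abstract does not state it.

```python
from typing import List, Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 20) -> float:
    """Fraction of the held-out POIs that appear among the top-k recommendations."""
    if not relevant:
        return 0.0  # nothing to retrieve for this user
    hits = sum(1 for poi in ranked[:k] if poi in relevant)
    return hits / len(relevant)


def mean_recall_at_k(rankings: List[Sequence[str]], truths: List[Set[str]], k: int = 20) -> float:
    """Average Recall@k over users (one ranking and one ground-truth set per user)."""
    scores = [recall_at_k(r, t, k) for r, t in zip(rankings, truths)]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical POI identifiers for two users; k=3 keeps the example small.
rankings = [["cafe", "museum", "park", "bar"], ["gym", "bakery", "cinema", "park"]]
truths = [{"museum", "bar"}, {"park", "library"}]
score = mean_recall_at_k(rankings, truths, k=3)
```

On this scale, a gain of 0.05 corresponds to retrieving, on average, five more of every hundred held-out POIs within the top 20.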
We address the problem of extracting functionally similar regions in urban streets, regarding such regions as spatial networks. For this purpose, building on our previous algorithm, the FCE method, which extracts functional clusters for each network, we propose a new method that efficiently deals with several large-scale networks by accelerating the previous algorithm using lazy evaluation and pivot pruning techniques. We then present new techniques for simultaneously comparing the extracted functional clusters of several networks, as well as an effective way of visualizing these clusters that exploits the fact that the maximum degree of the nodes in spatial networks is restricted to a relatively small number. In our experiments using urban streets extracted from the OpenStreetMap data of four cities worldwide, we show that our proposed method achieves reasonably high acceleration. We then show, through a series of visualization results, that the functional clusters it extracts are useful for understanding the properties of areas, and we empirically confirm that our results are substantially different from those obtained by representative centrality measures. These region characteristics will play important roles in developing and planning city promotion and travel tours, as well as in understanding and improving the usage of urban streets.