Information and Media Technologies

Hardware and Devices

Test and Design-for-Testability Solutions for 3D Integrated Circuits

Krishnendu Chakrabarty, Mukesh Agrawal, Sergej Deutsch, Brandon Noia, ...

2014, Volume 9, Issue 4, pp. 386-403
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.386

Journal, free access

Despite the promise and benefits offered by 3D integration, testing remains a major obstacle that hinders its widespread adoption. Test techniques and design-for-testability (DfT) solutions for 3D ICs are now being studied in the research community, and experts in industry have identified a number of hard problems related to the lack of probe access for wafers, test access in stacked dies, yield enhancement, and new defects arising from unique processing steps. We describe a number of testing and DfT challenges, and present some of the solutions being advocated for these challenges. Techniques highlighted in this paper include: (i) pre-bond testing of TSVs and die logic, including probing and non-invasive test using DfT; (ii) post-bond testing and DfT innovations related to the optimization of die wrappers, test scheduling, and access to dies and inter-die interconnects; (iii) interconnect testing in interposer-based 2.5D ICs; (iv) fault diagnosis and TSV repair; (v) cost modeling and test-flow selection.

A Sophisticated Routing Algorithm in 3D NoC with Fixed TSVs for Low Energy and Latency

Xin Jiang, Lian Zeng, Takahiro Watanabe

2014, Volume 9, Issue 4, pp. 404-412
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.404

Journal, free access

With rapid progress in integrated circuit technologies, three-dimensional networks-on-chip (3D NoCs) have become a promising solution for achieving low latency and low power. Because the number of TSVs available in a 3D NoC is constrained, designing a routing algorithm that works well with fewer TSVs is critical for improving network performance. In this work, we design a novel fully adaptive routing algorithm for 3D NoCs. The algorithm consists of two parts: a vertical node assignment for inter-layer routing, which is a TSV selection scheme for NoC architectures with a limited number of TSVs, and a 2D fully adaptive routing algorithm for intra-layer routing, which jointly considers routing distance, network traffic conditions, and diversity of path selection. Simulation results show that the proposed routing algorithm achieves lower latency and energy consumption than traditional routing algorithms.

Forwarding Unit Generation for Loop Pipelining in High-level Synthesis

Shingo Kusakabe, Kenshu Seto

2014, Volume 9, Issue 4, pp. 413-418
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.413

Journal, free access

In loop pipelining for high-level synthesis, reducing initiation intervals (IIs) is very important. Existing loop pipelining techniques, however, pessimistically assume that dependences whose occurrence can be determined only at runtime always occur, resulting in increased IIs. To address this issue, recent work reduces the II by a source code transformation that introduces runtime dependence analysis and stalls the pipeline when the dependences actually occur. Unfortunately, that approach suffers from increased execution cycles when the dependences occur frequently, because the pipeline then stalls frequently. In this paper, we propose a technique to reduce IIs in which, for such dependences of read-after-write (RAW) type, data written to memories are also written to registers. In our technique, registers, which are faster than memories, are accessed when the RAW dependences occur. Compared to the state-of-the-art technique, the proposed technique reduced execution cycles by 34% with a 15% gate count increase on average over three examples, so it is effective for synthesizing high-speed circuits with loop pipelining.

Design Aid of Multi-core Embedded Systems with Energy Model

Takashi Nakada, Kazuya Okamoto, Toshiya Komoda, Shinobu Miwa, Yohei Sa ...

2014, Volume 9, Issue 4, pp. 419-428
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.419

Journal, free access

Shifting to multi-core designs is a pervasive trend for overcoming the power wall, and it is a necessary move for embedded systems in our rapidly evolving information society. Meanwhile, the need to increase battery life and reduce maintenance costs for such embedded systems is critical. Therefore, a wide variety of power reduction techniques have been proposed and realized, including clock gating, DVFS, and power gating. Task scheduling is key to maximizing the effectiveness of these techniques, but for multi-core systems it is very complicated because of the huge exploration space. This problem is a major obstacle to further power reduction. To cope with it, we propose a design method for embedded systems that minimizes their energy consumption under performance constraints. The method is based on clarifying the properties of the above-mentioned low-power techniques and their interactions. In more detail, we first establish energy models for these low-power techniques and our target systems. We then search for the best configuration by formulating an optimization problem, especially for applications whose deadline is longer than the execution interval. Finally, we propose an approximate solution using dynamic programming with lower computational complexity and compare it to a brute-force explicit solution. Our evaluations confirm that the proposed method found a configuration that reduces total energy consumption by 32% compared to a manually optimized configuration that uses only one core.


Computing

A Survey on Large Scale Corpora and Emotion Corpora

Michal Ptaszynski, Rafal Rzepka, Satoshi Oyama, Masahito Kurihara, Ken ...

2014, Volume 9, Issue 4, pp. 429-445
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.429

Journal, free access

In this paper we present a survey on natural language corpora, with particular focus on corpora of large scale and those applicable to sentiment analysis. Natural language corpora are crucial for training various Software Engineering applications, from part-of-speech taggers and dependency parsers to dialog systems or sentiment analysis software. We compare several natural language corpora created for different languages, analyze their distinctive features and the amount of additional annotations provided by the developers of those corpora.

A Delay-variation-aware High-level Synthesis Algorithm for RDR Architectures

Yuta Hagio, Masao Yanagisawa, Nozomu Togawa

2014, Volume 9, Issue 4, pp. 446-455
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.446

Journal, free access

As device feature size drops, interconnection delays often exceed gate delays. We have to incorporate interconnection delays even in high-level synthesis. Using RDR architectures is one of the effective solutions to this problem. At the same time, process and delay variation also becomes a serious problem which may result in several timing errors. How to deal with this problem is another key issue in high-level synthesis. In this paper, we propose a delay-variation-aware high-level synthesis algorithm for RDR architectures. We first obtain a non-delayed scheduling/binding result and, based on it, we also obtain a delayed scheduling/binding result. By adding several extra functional units to vacant RDR islands, we can have a delayed scheduling/binding result so that its latency is not much increased compared with the non-delayed one. After that, we similarize the two scheduling/binding results by repeatedly modifying their results. We can finally realize non-delayed and delayed scheduling/binding results simultaneously on RDR architecture with almost no area/performance overheads and we can select either one of them depending on post-silicon delay variation. Experimental results show that our algorithm successfully reduces delayed scheduling/binding latency by up to 42.9% compared with the conventional approach.

Reinforcing Random Testing of Arithmetic Optimization of C Compilers by Scaling up Size and Number of Expressions

Eriko Nagai, Atsushi Hashimoto, Nagisa Ishiura

2014, Volume 9, Issue 4, pp. 456-465
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.456

Journal, free access

This paper presents an enhanced method of testing the validity of arithmetic optimization in C compilers using randomly generated programs. Its bug detection capability is improved over an existing method by 1) generating longer arithmetic expressions and 2) accommodating multiple expressions in test programs. Undefined behavior in long expressions is successfully eliminated by modifying problematic subexpressions during computation of the expected values of the expressions. A new method for including floating-point operations in compiler random testing is also proposed. Furthermore, an efficient method for minimizing error-inducing test programs is presented, which utilizes binary search. Experimental results show that a random test system based on our method has higher bug detection capability than existing methods; it detected more bugs than the previous method in earlier versions of GCC and revealed new bugs in the latest versions of GCC and LLVM.
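
The idea of shrinking an error-inducing test program by binary search can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `minimize_by_bisection`, the representation of a test program as a list of expression strings, and the `fails` predicate (which in practice would compile and run the candidate program) are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence


def minimize_by_bisection(
    items: Sequence[str],
    fails: Callable[[Sequence[str]], bool],
) -> List[str]:
    """Shrink a failing list by repeatedly keeping whichever half still fails.

    `items` is a hypothetical list of expressions from a failing test program;
    `fails` is a hypothetical oracle that returns True if the given subset
    still reproduces the miscompilation.
    """
    current = list(items)
    while len(current) > 1:
        mid = len(current) // 2
        left, right = current[:mid], current[mid:]
        if fails(left):
            current = left
        elif fails(right):
            current = right
        else:
            # The failure needs items from both halves; stop shrinking here.
            break
    return current


# Hypothetical usage: only the third expression triggers the (simulated) bug.
program = ["a = 1 + 2", "b = 3 * 4", "c = x / y", "d = 5 - 6"]
culprit = minimize_by_bisection(program, lambda subset: "c = x / y" in subset)
```

Each round halves the candidate list, so a single culprit among n expressions is isolated in about log2(n) oracle rounds rather than n.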

An Area Efficient Regular Expression Matching Engine Using Partial Reconfiguration for Quick Pattern Updating

Yoichi Wakaba, Shin'ichi Wakabayashi, Shinobu Nagayama, Masato Inagi

2014, Volume 9, Issue 4, pp. 466-474
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.466

Journal, free access

This paper proposes a method that uses partial reconfiguration to realize a compact regular expression matching engine whose pattern can be updated quickly. In the proposed method, a set of partial circuits, each of which handles a different class of regular expressions, is provided in advance. When a regular expression pattern is given, a compact matching engine dedicated to the pattern is implemented on an FPGA by combining the partial circuits according to the given pattern using partial reconfiguration. The method can update a pattern quickly, since it does not require redesigning the circuit. Experimental results show that the proposed method reduces circuit size by 60% compared with the previous method without significantly increasing the pattern updating time.

An On-The-Fly Algorithm for Conditional Weighted Pushdown Systems

Hua Vy Le Thanh, Xin Li

2014, Volume 9, Issue 4, pp. 475-481
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.475

Journal, free access

Pushdown systems (PDSs) are well understood as abstract models of recursive sequential programs, and weighted pushdown systems (WPDSs) are a general framework for solving certain meet-over-all-paths problems in program analysis. Conditional WPDSs (CWPDSs) extend the expressiveness of WPDSs: each transition is guarded by a regular language over the stack that specifies the conditions under which the transition rule can be applied. CWPDSs and their instances have been shown to have wide applications in the analysis of object-oriented programs, access rights analysis, etc. Model checking CWPDSs was shown to be reducible to model checking WPDSs, and an offline algorithm was given that translates CWPDSs to WPDSs by synchronizing the underlying PDS and the finite state automata accepting the regular conditions. The translation, however, can cause an exponential blow-up of the system. This paper presents an on-the-fly model checking algorithm for CWPDSs that synchronizes the computing machineries on demand while computing post-images of regular configurations. We developed an on-the-fly model checker for CWPDSs and applied it to models generated from the reachability analysis of the HTML5 parser specification. Our preliminary experiments show that the on-the-fly algorithm drastically outperforms the offline algorithm in both practical space and time efficiency.

A Method for Reliable P2P Video Streaming using Variable Bitrate Video Formats

Marat Zhanikeev

2014, Volume 9, Issue 4, pp. 482-493
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.482

Journal, free access

P2P streaming today uses constant bitrate video, primarily because it is easier to cut such video into substreams and deliver it via multiple peers. The same method suffers from low reliability of end-to-end throughput, which can cause playback freezes. This paper proposes a variable-bitrate method for P2P streaming which solves this problem. The proposed method outperforms traditional P2P streaming by a large margin and provides a highly resilient streaming platform.

Signal Processing Algorithm Development for Mass++ (Ver. 2): Platform Software for Mass Spectrometry

Shin-ichi Utsunomiya, Yuichiro Fujita, Satoshi Tanaka, Shigeki Kajihar ...

2014, Volume 9, Issue 4, pp. 494-499
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.494

Journal, free access

Mass++ is free platform software for mass spectrometry, mainly developed for biological science, with which users can construct their own functions or workflows for use as plug-ins. In this paper, we present an algorithm development example using Mass++ that performs a new baseline subtraction method. A signal processing technique previously developed to correct the atmospheric substances in infrared spectroscopy was converted to adjust to the mass spectrum baseline estimation, and a new method called Bottom Line Tracing (BLT) was constructed. BLT can estimate a suitable baseline for a mass spectrum with rapid changes in its waveform with easy parameter tuning. We confirm that it is beneficial to utilize techniques or knowledge acquired in another field to obtain a better solution for a problem, and that the practical barriers to algorithm development and distribution will be considerably reduced by platform software like Mass++.


Media (processing) and Interaction

Video Completion via Spatio-temporally Consistent Motion Inpainting

Menandro Roxas, Takaaki Shiratori, Katsushi Ikeuchi

2014, Volume 9, Issue 4, pp. 500-504
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.500

Journal, free access

Given an image sequence with corrupted pixels, usually large holes that span several frames, we propose to complete the missing parts using an iterative optimization approach that minimizes an optical flow functional and propagates the color information simultaneously. Within one iteration of the optical flow estimation, we use the solved motion field to propagate the color and then feed the newly inpainted color back into the brightness constraint of the optical flow functional. We then introduce a spatially dependent blending factor, called the mask function, to control the effect of the newly propagated color. We also add a trajectory constraint by solving the forward and backward flow simultaneously using three frames. Finally, we minimize the functional using the alternating direction method of multipliers.

Multidimensional Matching of Tactile Sensations of Materials and Vibrotactile Spectra

Yoichiro Matsuura, Shogo Okamoto, Hikaru Nagano, Yoji Yamada

2014, Volume 9, Issue 4, pp. 505-516
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.505

Journal, free access

Specifying the relationship between the sensations perceived from material surfaces and the tactile stimuli presented to the human finger pad is often difficult in tactile texture studies. Both human texture perception and the physical stimuli presented to the skin are expressed as multidimensional information spaces. We developed a computational technique for matching these texture and physical stimulus spaces based on multivariate analysis approaches. The texture space is established via a semantic differential method. The physical space is based on vibrotactile spectrum information, one of the most commonly used principles for the analysis and artificial presentation of textures. The bases of the physical space were determined to ensure that the material allocations for the two spaces were similar, and we obtained well-matched spaces for 18 material samples. These successfully matched spaces will provide an analytic tool for material textures, and will help users of vibrotactile texture displays design virtual materials using adjectives or the names of materials.

Using WFSTs for Efficient EM Learning of Probabilistic CFGs and Their Extensions

Yoshitaka Kameya, Takashi Mori, Taisuke Sato

2014, Volume 9, Issue 4, pp. 517-556
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.517

Journal, free access

Probabilistic context-free grammars (PCFGs) are a widely known class of probabilistic language models. The Inside-Outside (I-O) algorithm is well known as an efficient EM algorithm tailored for PCFGs. Although the algorithm requires inexpensive linguistic resources, there remains a problem in its efficiency. This paper presents an efficient method for training PCFG parameters in which the parser is separated from the EM algorithm, assuming that the underlying CFG is given. A new EM algorithm exploits the compactness of well-formed substring tables (WFSTs) generated by the parser. Our proposal is general in that the input grammar need not take Chomsky normal form (CNF) while it is equivalent to the I-O algorithm in the CNF case. In addition, we propose a polynomial-time EM algorithm for CFGs with context-sensitive probabilities, and report experimental results with the ATR dialogue corpus and a hand-crafted Japanese grammar.
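
For readers unfamiliar with the Inside-Outside algorithm, the inside pass for a grammar in Chomsky normal form can be sketched as follows. This shows only the textbook inside recursion, not the WFST-based method of the paper; the function name `inside_probability` and the dictionary encoding of rules are assumptions chosen for illustration.

```python
from collections import defaultdict


def inside_probability(words, lexical, binary, start="S"):
    """Return P(words) under a CNF PCFG using the inside recursion.

    lexical: {(A, word): prob} for rules A -> word
    binary:  {(A, B, C): prob} for rules A -> B C
    (Both encodings are illustrative assumptions.)
    """
    n = len(words)
    # beta[(i, j)][A] = probability that nonterminal A derives words[i:j]
    beta = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(words):
        for (a, word), p in lexical.items():
            if word == w:
                beta[(i, i + 1)][a] += p
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # split point between the two children
                for (a, b, c), p in binary.items():
                    left = beta[(i, k)].get(b, 0.0)
                    right = beta[(k, j)].get(c, 0.0)
                    if left and right:
                        beta[(i, j)][a] += p * left * right
    return beta[(0, n)].get(start, 0.0)


# Toy grammar: S -> S S (0.4), S -> "a" (0.6). The string "a a a" has two parses.
toy_lexical = {("S", "a"): 0.6}
toy_binary = {("S", "S", "S"): 0.4}
p_aaa = inside_probability(["a", "a", "a"], toy_lexical, toy_binary)
```

In the toy grammar each of the two parses has probability 0.4 × 0.4 × 0.6³, and the inside pass sums them without enumerating the trees.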

Construction of Practical Japanese Parsing System Based on Lexical Functional Grammar

Hiroshi Masuichi, Tomoko Ohkuma

2014, Volume 9, Issue 4, pp. 557-575
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.557

Journal, free access

This paper describes a Japanese parsing system with a linguistically fine-grained grammar based on Lexical-Functional Grammar (LFG). The system is the first Japanese LFG parser with over 97% coverage of real-world text. We evaluated the accuracy of the system by comparing it with standard Japanese dependency parsers. The LFG parser shows roughly equivalent performance in dependency accuracy with standard parsers. It also provides reasonably accurate results of case detection.

Gradual Fertilization of Case Frames

Daisuke Kawahara, Sadao Kurohashi

2014, Volume 9, Issue 4, pp. 576-603
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.576

Journal, free access

This article proposes an automatic method of gradually constructing case frames. First, a large raw corpus is parsed, and base case frames are constructed from reliable predicate-argument examples in the parsing results. Second, case analysis based on the base case frames is applied to the large corpus, and the case frames are upgraded by incorporating newly acquired information. Case frames are gradually fertilized in this way. We constructed case frames from 26 years of newspaper articles consisting of approximately 26 million sentences. The case frames are evaluated manually as well as through syntactic and case analyses. The results demonstrate the effectiveness of the constructed case frames.

Improvements of Katz K Mixture Model

Yinghui Xu, Kyoji Umemura

2014, Volume 9, Issue 4, pp. 604-629
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.604

Journal, free access

A simpler distribution that fits empirical word distribution about as well as a negative binomial is the Katz K mixture. In the K mixture model, the basic assumption is that the conditional probabilities of repeats for a given word are determined by a constant decay factor that is independent of the number of occurrences which have taken place. However, the probabilities of the repeat occurrences are generally lower than the constant decay factor for content-bearing words when few occurrences have taken place. To address this deficiency of the K mixture model, we conducted an in-depth exploration of the characteristics of the conditional probabilities of repetitions, decay factors, and their influences on modeling term distributions. Based on the results of this study, it appears that both ends of the distribution can be used to fit models. That is, not only can document frequencies be used when the instances of a word are few, but also tail probabilities (the accumulation of document frequencies). Both document frequencies for few instances of a word and tail probabilities for many instances are often relatively easy to estimate empirically. Therefore, we propose an effective approach for improving the K mixture model, where the decay factor is the combination of two possible decay factors interpolated by a function depending on the number of instances of a word in a document. Results show that the proposed model yields statistically significantly better frequency estimates, especially for a word with two instances in a document. In addition, the advantages of this approach become more evident in two cases: modeling the term distribution of frequently used content-bearing words, and modeling the term distribution of a corpus with a wide range of document lengths.
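
As background, the baseline K mixture with its constant decay factor can be sketched as below. The parameterization (λ = cf/N, β = (cf − df)/df, α = λ/β) follows the common textbook presentation and is an assumption here, not taken from this abstract; the function name `k_mixture_pmf` is illustrative. The paper's interpolated decay factor is not reproduced.

```python
def k_mixture_pmf(k, cf, df, n_docs):
    """P(k): probability that a word occurs exactly k times in a document.

    cf: collection frequency (total occurrences), df: document frequency,
    n_docs: number of documents. Requires cf > df > 0.
    Textbook parameterization, assumed for illustration.
    """
    if not (cf > df > 0):
        raise ValueError("the K mixture needs cf > df > 0")
    lam = cf / n_docs        # mean occurrences per document
    beta = (cf - df) / df    # extra occurrences per document containing the word
    alpha = lam / beta
    geometric = (alpha / (beta + 1.0)) * (beta / (beta + 1.0)) ** k
    # The point mass (1 - alpha) applies only at k = 0.
    return (1.0 - alpha) + geometric if k == 0 else geometric


# Hypothetical counts: 300 occurrences spread over 200 of 1000 documents.
p0 = k_mixture_pmf(0, cf=300, df=200, n_docs=1000)
p1 = k_mixture_pmf(1, cf=300, df=200, n_docs=1000)
p2 = k_mixture_pmf(2, cf=300, df=200, n_docs=1000)
decay = p2 / p1  # constant ratio beta / (beta + 1) for every k >= 1
```

With these counts P(0) equals 1 − df/N, and the ratio P(k+1)/P(k) stays fixed at β/(β+1) for all k ≥ 1, which is the constant-decay assumption the abstract criticizes.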

Preference Dependency Grammar and its Packed Shared Data Structure “Dependency Forest”

Hideki Hirakawa

2014, Volume 9, Issue 4, pp. 630-694
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.630

Journal, free access

Preference dependency grammar (PDG) is a framework for integrating morphological, syntactic, and semantic analyses. PDG provides packed shared data structures that can efficiently encompass all possible interpretations at each level of sentence analyses with preference scores. Using the structure, PDG can calculate a globally optimized interpretation for the target sentence. This paper first gives an overview of the PDG framework by describing the base model of PDG, which is a sentence analysis model, called a “multi-level packed shared data connection model.” Then this paper describes packed shared data structures, e.g., headed parse forests and dependency forests, adopted in PDG. Finally, the completeness and soundness of the mapping between the parse forest and the dependency forest are revealed.

A Fully-Lexicalized Probabilistic Model for Japanese Syntactic and Case Structure Analysis

Daisuke Kawahara, Sadao Kurohashi

2014, Volume 9, Issue 4, pp. 695-711
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.695

Journal, free access

We present an integrated probabilistic model for Japanese syntactic and case structure analysis. Syntactic and case structures are simultaneously analyzed on the basis of wide-coverage case frames that are constructed from a huge raw corpus in an unsupervised manner. This model selects the syntactic and case structures that have the highest generative probability. We evaluate both syntactic structure and case structure. In particular, the experimental results for syntactic analysis on web sentences show that the proposed model significantly outperforms the known syntactic analyzers.

Construction of a Domain Dictionary for Fundamental Vocabulary and its Application to Automatic Blog Categorization Using Dynamically Estimated Domains of Unknown Words

Chikara Hashimoto, Sadao Kurohashi

2014, Volume 9, Issue 4, pp. 712-735
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.712

Journal, free access

The semantic relations between words are essential for natural language understanding. Toward deeper natural language understanding, we semi-automatically constructed a domain dictionary that represents the domain relations between fundamental Japanese words. Our method does not require a document collection. As a task-based evaluation of the domain dictionary, we categorized blogs by assigning a domain to each word in a blog article and categorizing the article as the most dominant domain. In doing so, we dynamically estimated the domains of unknown words (i.e., those not listed in the domain dictionary), and our blog categorization achieved an accuracy of 94.0% (564/600). Moreover, the domain estimation technique for unknown words achieved an accuracy of 76.6% (383/500).

Generalization of Semantic Roles in Automatic Semantic Role Labeling

Yuichiroh Matsubayashi, Naoaki Okazaki, Jun'ichi Tsujii

2014, Volume 9, Issue 4, pp. 736-770
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.736

Journal, free access

Numerous studies have applied machine-learning approaches to semantic role labeling with the availability of corpora such as FrameNet and PropBank. These corpora define frame-specific semantic roles for each frame, which are problematic for a machine-learning approach because the corpus contains a number of infrequent roles that hinder efficient learning. This paper focuses on the generalization problem of semantic roles in a semantic role labeling task. We compare existing generalization criteria with our novel criteria, and clarify the characteristics of each criterion. We also show that using multiple generalization criteria in a single model improves the performance of a semantic role classification. In experiments on FrameNet, we achieved 19.16% error reduction in terms of total accuracy, and 7.42% in macro-averaged F1. On PropBank, we reduced 24.07% of errors in total accuracy, and 26.39% of errors in the evaluation for unseen verbs.

Study on Constants of Natural Language Texts

Daisuke Kimura, Kumiko Tanaka-Ishii

2014, Volume 9, Issue 4, pp. 771-789
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.771

Journal, free access

This paper considers different measures that might become constants for any length of a given natural language text. Such measures indicate a potential for studying the complexity of natural language but have previously only been studied using relatively small English texts. In this study, we consider measures for texts in languages other than English, and for large-scale texts. Among the candidate measures, we consider Yule's K, Orlov's Z, and Golcher's VM, each of whose convergence has been previously argued empirically. Furthermore, we introduce entropy H, and a measure, r, related to the scale-free property of language. Our experiments show that both K and VM are convergent for texts in various languages, whereas the other measures are not.
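
Of the measures named above, Yule's K is simple enough to sketch directly. The code below uses the standard definition K = 10⁴ · (Σ m² V(m, N) − N) / N², where N is the number of tokens and V(m, N) is the number of word types occurring exactly m times; the function name `yules_k` and whitespace tokenization are assumptions for illustration, and the paper's own experimental setup is not reproduced.

```python
from collections import Counter


def yules_k(tokens):
    """Yule's K for a token sequence, using the standard 10^4 scaling."""
    n = len(tokens)
    if n == 0:
        raise ValueError("Yule's K is undefined for an empty text")
    word_counts = Counter(tokens)             # word type -> occurrences
    spectrum = Counter(word_counts.values())  # m -> V(m, N)
    s2 = sum(m * m * v for m, v in spectrum.items())
    return 1e4 * (s2 - n) / (n * n)


# Hypothetical toy text: "a" occurs twice, "b" and "c" once each.
k_toy = yules_k("a a b c".split())
```

A text in which every token is distinct gives K = 0, and K grows as the vocabulary becomes more repetitive.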

Splitting Katakana Noun Compounds by Paraphrasing and Back-transliteration

Nobuhiro Kaji, Masaru Kitsuregawa

2014, Volume 9, Issue 4, pp. 790-813
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.790

Journal, free access

Word boundaries within noun compounds in a number of languages, including Japanese, are not marked by white spaces. Thus, it is beneficial for various NLP applications to split such noun compounds. In the case of Japanese, noun compounds composed of katakana words are particularly difficult to split because katakana words are highly productive and are often out of vocabulary. Therefore, we propose using paraphrasing and back-transliteration of katakana noun compounds to split them. Experiments in which paraphrases and back-transliterations from unlabeled textual data were extracted and used to construct splitting models improved splitting accuracy with statistical significance.

Relevance Feedback using Surface and Latent Information in Texts

Jun Harashima, Sadao Kurohashi

2014, Volume 9, Issue 4, pp. 814-833
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.814

Journal, free access

Most relevance feedback methods re-rank search results using only the information of surface words in texts. We present a method that uses not only the information of surface words but also that of latent words that are inferred from texts. We infer the latent word distribution in each document in the search results using latent Dirichlet allocation (LDA). When feedback is given, we also infer the latent word distribution in the feedback using LDA. We calculate the similarities between the user feedback and each document in the search results using both the surface and latent word distributions and re-rank the search results on the basis of the similarities. Evaluation results show that when user feedback consisting of two documents (3,589 words) is given, the proposed method improves the initial search results by 27.6% in precision at 10 (P@10). Additionally, the results show that the proposed method can perform well even when only a small amount of user feedback is available. For example, an improvement of 5.3% in P@10 was achieved when the user feedback consisted of only 57 words.
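
The evaluation metric P@10 mentioned above is the fraction of the top ten ranked documents that are relevant. A minimal sketch follows; the function name `precision_at_k` and the document identifiers are hypothetical, and the re-ranking method itself is not reproduced.

```python
def precision_at_k(ranked_ids, relevant_ids, k=10):
    """Fraction of the top-k ranked documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = ranked_ids[:k]
    hits = sum(1 for doc_id in top if doc_id in relevant_ids)
    # Dividing by k (not len(top)) penalizes rankings shorter than k.
    return hits / k


# Hypothetical ranking of ten documents, three of which are relevant.
ranking = [f"d{i}" for i in range(1, 11)]
relevant = {"d1", "d3", "d5", "d20"}
p_at_10 = precision_at_k(ranking, relevant)
```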

Particle Error Correction from Small Error Data for Japanese Learners

Kenji Imamura, Kuniko Saito, Kugatsu Sadamitsu, Hitoshi Nishikawa

2014, Volume 9, Issue 4, pp. 834-856
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.834

Journal, free access

This paper shows how to correct the grammatical errors of Japanese particles made by Japanese learners. Our method is based on discriminative sequence conversion, which converts one sequence of words into another and corrects particle errors by substitution, insertion, or deletion. However, it is difficult to collect large learners' corpora. We solve this problem with a discriminative learning framework that uses the following two methods. First, language model probabilities obtained from large, raw text corpora are combined with n-gram binary features obtained from learners' corpora. This method is applied to measure the accuracy of Japanese sentences. Second, automatically generated pseudo-error sentences are added to learners' corpora to enrich the corpora directly. Furthermore, we apply domain adaptation, in which the pseudo-error sentences (the source domain) are adapted to the real error sentences (the target domain). Experiments show that the recall rate is improved using both language model probabilities and n-gram binary features. Stable improvement is achieved using pseudo-error sentences with domain adaptation.

A Generative Dependency N-gram Language Model: Unsupervised Parameter Estimation and Application

Chenchen Ding, Mikio Yamamoto

2014, Volume 9, Issue 4, pp. 857-885
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.857

Journal, free access

We design a language model based on a generative dependency structure for sentences. The parameter of the model is the probability of a dependency N-gram, which is composed of lexical words with four types of extra tag used to model the dependency relation and valence. We further propose an unsupervised expectation-maximization algorithm for parameter estimation, in which all possible dependency structures of a sentence are considered. As the algorithm is language-independent, it can be used on a raw corpus from any language, without any part-of-speech annotation, tree-bank or trained parser. We conducted experiments using four languages, i.e., English, German, Spanish and Japanese, to illustrate the applicability and the properties of the proposed approach. We further apply the proposed approach to a Chinese microblog data set to extract and investigate Internet-based, non-standard lexical dependency features of user-generated content.

Discovering Seismic Interactions after the 2011 Tohoku Earthquake by Co-occurring Cluster Mining

Ken-ichi Fukui, Daiki Inaba, Masayuki Numao

2014, Volume 9, Issue 4, pp. 886-895
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.886

Journal, free access

In this study, we extract earthquake co-occurrence patterns for investigating mechanical interactions in the affected areas. To extract seismic patterns, both co-occurrence among seismic events in the event sequence and distances between the hypocenters to find hot spots must be considered. Most previous studies, however, have considered only one of these aspects. In contrast, we utilized co-occurring cluster mining to extract seismic patterns by considering both co-occurrence in a sequence and distance between hypocenters. Then, we acquired affected areas and relationships between the co-occurrence patterns and focal mechanisms from the 2011-2012 hypocenter catalog. Some results were consistent with the seismological literature. The results include highly affected areas that may indicate asperity, and a change of focal mechanisms before and after the Tohoku Earthquake.

Mobile Camera Localization Using Aerial-view Images

Hisatoshi Toriya, Itaru Kitahara, Yuichi Ohta

2014, Volume 9, Issue 4, pp. 896-904
Published: 2014
Released: 2014/12/15

DOI: https://doi.org/10.11185/imt.9.896

Journal, free access

This paper proposes a method to estimate a mobile camera's position and orientation by referring to the corresponding points between aerial-view images from a GIS database and mobile camera images. The mobile camera images are taken from the user's viewpoint, and the aerial-view images include the same region. To increase the correspondence accuracy, we generate a virtual top-view image that virtually captures the target region overhead of the user by using the intrinsic parameters of the mobile camera and the inertia (gravity) information. We find corresponding points between the virtual top-view and aerial-view images and estimate a homography matrix that transforms the virtual top-view image into aerial-view image. Finally, the mobile camera's position and orientation are estimated by analyzing the matrix. In some cases, however, it is difficult to obtain a sufficient number of correct corresponding points to estimate the correct homography matrix by capturing only a single virtual top-view image. We solve this problem by stitching virtual top-view images to represent a larger ground region. We experimentally implemented our method on a tablet PC and evaluated its effectiveness.
