Different XML formats are widely used for data exchange and processing, and it is often necessary to convert between them in both directions. Standard XML transformation languages, such as XSLT or XQuery, are unsatisfactory for this purpose since they require writing a separate transformation for each direction. Existing bidirectional transformation languages aim to fill this gap by allowing programmers to write a single program that denotes both transformations. However, they often 1) induce a more cumbersome programming style than their traditional unidirectional relatives in order to establish the link between source and target formats, and 2) offer limited configurability, making implicit assumptions about how modifications to both formats should be translated that may not be easy to predict. This paper proposes a bidirectional XML update language called BiFluX (BIdirectional FunctionaL Updates for XML), inspired by the Flux XML update language. Our language adopts a novel bidirectional programming-by-update paradigm, in which a program succinctly and precisely describes how to update a source document with a target document in an intuitive way, such that there is a unique “inverse” source query for each update program. BiFluX extends Flux with bidirectional actions that describe the connection between source and target formats. We introduce a core BiFluX language and translate it into a formally verified bidirectional update language, BiGUL, to guarantee that every BiFluX program is well-behaved.
Development teams benefit from version control systems, which manage shared access to code repositories and persist entire project histories for analysis or recovery. Such systems are most effective when developers commit coherent and complete change sets. These best practices, however, are difficult to follow because multiple activities often interleave without notice and existing tools impede unraveling changes before committing them. We propose an interactive, graphical tool, called Thresher, that employs adaptable scripts to support developers in grouping and committing changes, especially for fine-granular change tracking, where numerous changes are logged even in short programming sessions. We implemented our tool in Squeak/Smalltalk and derived a foundation of scripts from five refactoring sessions. We evaluated those scripts' precision and recall, which indicate reduced manual effort because developers can focus on project-specific adjustments. With such an interactive approach, they can easily intervene to accurately reconstruct activities and thus follow best practices.
The steering law is a robust model of the relationship between movement time and task difficulty. Recently, a corrected model was proposed to account for the difference in steering time between narrowing and widening tunnels. However, that work only conducted a user study with straight paths. This paper presents an investigation of steering performance in narrowing and widening circular tunnels to determine whether the corrected model is adequate or limited. The results show that the steering law achieves a good fit (R² > .98) without the correction, thereby indicating the limited benefit of employing the corrected model.
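For reference, the steering law discussed above is conventionally written as follows; the symbols are the standard Accot-Zhai formulation and are not taken from this abstract.

```latex
% Steering law: movement time MT as a function of the index of
% difficulty ID of steering through a tunnel C of width W(s)
MT = a + b \cdot ID, \qquad ID = \int_{C} \frac{ds}{W(s)}
% For a circular tunnel of radius r and constant width W this reduces to
ID = \frac{2 \pi r}{W}
```

Here a and b are empirically fitted constants; a narrowing or widening tunnel makes W vary along the path, which is what the corrected model was intended to capture.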
Generative word alignment models, such as the IBM Models, are restricted to one-to-many alignment and cannot explicitly represent many-to-many relationships in bilingual texts. The problem is partially solved either by introducing heuristics or by agreement constraints that force two directional word alignments to agree with each other. However, such constraints cannot take into account the grammatical differences between language pairs. In particular, function words are not trivial to align for grammatically different language pairs, such as Japanese and English. In this paper, we focus on the posterior regularization framework (Ganchev, Graca, Gillenwater, and Taskar 2010), which can force two directional word alignment models to agree with each other during training, and propose new constraints that take into account the difference between function words and content words. We discriminate between function words and content words using word frequency, in the same way as Setiawan, Kan, and Li (2007). Experimental results show that our proposed constraints achieve better alignment quality, measured by AER and F-measure, on the French-English Hansard task and the Japanese-English Kyoto free translation task (KFTT). In translation evaluations, we achieved statistically significant gains in BLEU scores on the Japanese-English NTCIR10 task and the Spanish-English WMT06 task.
This paper proposes a method for searching cooking recipes by a procedure such as “a tomato is fried.” Most methods for cooking recipe search treat recipe text as Bag-of-Words (BoW), which misdetects recipes such as “fry an onion deeply and serve it with a tomato cube” (in which the tomato is not heated). Our method automatically converts a procedural text into a flow-graph in advance using a dependency parsing technique. In the flow-graph, the sequence of actions performed on an ingredient is easily extracted by tracing the path from the node corresponding to the ingredient to the root node corresponding to the last action. We evaluated our method against a task-adapted BoW model as a baseline: the proposed method achieved a precision of 68.8%, while the baseline achieved 61.5%.
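The path-tracing idea can be sketched as follows. The node names and the parent-pointer representation of the flow-graph are illustrative assumptions for this sketch, not the paper's actual data structures.

```python
# Sketch of tracing an ingredient's action sequence in a recipe
# flow-graph. Each node maps to its parent, i.e. the next step that
# consumes it; the root ("serve") is the recipe's final action.
# The recipe below is hypothetical: the tomato is diced but never fried.
parents = {
    "tomato": "dice",   # ingredient -> action applied to it
    "dice": "serve",
    "onion": "fry",
    "fry": "serve",
    "serve": None,      # root: last action in the recipe
}

def action_sequence(ingredient):
    """Trace from an ingredient node to the root, collecting actions."""
    actions = []
    node = parents.get(ingredient)
    while node is not None:
        actions.append(node)
        node = parents[node]
    return actions

print(action_sequence("tomato"))  # ['dice', 'serve'] -- no "fry"
print(action_sequence("onion"))   # ['fry', 'serve']
```

Because the tomato's path never passes through the "fry" node, a query like “a tomato is fried” correctly rejects this recipe, whereas a BoW match on {tomato, fry} would accept it.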
Recognition of sarcasm in microblogging is important in a range of NLP applications, such as opinion mining. However, this is a challenging task, as the real meaning of a sarcastic sentence is the opposite of the literal meaning. Furthermore, microblogging messages are short and usually written in a free style that may include misspellings, grammatical errors, and complex sentence structures. This paper proposes a novel method for identifying sarcasm in tweets. It combines two supervised classifiers, a Support Vector Machine (SVM) using N-gram features and an SVM using our proposed features. Our features represent the intensity and contradictions of sentiment in a tweet, derived by sentiment analysis. The sentiment contradiction feature also considers coherence among multiple sentences in the tweet, which is automatically identified by our proposed method using unsupervised clustering and an adaptive genetic algorithm. Furthermore, a method for identifying the concepts of unknown sentiment words is used to compensate for gaps in the sentiment lexicon. Our method also considers punctuation and the special symbols that are frequently used in Twitter messaging. Experiments using two datasets demonstrated that our proposed system outperformed baseline systems on one dataset, while producing comparable results on the other. Accuracies of 82% and 76% were achieved in sarcasm identification on the two datasets.
Schemas of XML documents may be updated for various reasons. When a schema is updated, XSLT stylesheets are also affected by the update. To maintain the consistency of XSLT stylesheets with updated schemas, we have to detect the XSLT rules affected by schema updates in order to determine whether those rules need to be updated accordingly. However, detecting such XSLT rules manually is a difficult and time-consuming task, since users do not always fully understand the dependencies between XSLT stylesheets and DTDs. In this paper, we consider three classes of tree transducers as subsets of XSLT and investigate whether XSLT rules affected by DTD updates can be detected efficiently for these classes.
The surge of social media use, such as Twitter, introduces new opportunities for understanding and gauging public mood across different cultures. However, the diversity of expression in social media presents a considerable challenge to this task of opinion mining, given the limited accuracy of sentiment classification and a lack of intercultural comparisons. Previous Twitter sentiment corpora have only global polarities attached to them, which prevents deeper investigation of the mechanisms underlying the expression of feelings in social media, especially the role and influence of rhetorical phenomena. To this end, we construct an annotated corpus for multilingual Twitter sentiment understanding that encompasses three languages (English, Japanese, and Chinese) and four international topics (iPhone 6, Windows 8, Vladimir Putin, and Scottish Independence); our corpus incorporates 5,422 tweets. Further, we propose a novel annotation scheme that embodies the idea of separating emotional signals from rhetorical context and that, in addition to global polarity, identifies rhetorical devices, emotional signals, degree modifiers, and subtopics. Next, to address the low inter-annotator agreement of previous corpora, we propose a pivot dataset comparison method that effectively improves the agreement rate. With its manually annotated rich information, our corpus can serve as a valuable resource for the development and evaluation of automated sentiment classification, intercultural comparison, rhetoric detection, and so on. Finally, based on our observations and analysis of the corpus, we present three key conclusions. First, languages differ in terms of emotional signals and rhetorical devices, and the idea that cultures hold different opinions regarding the same objects is reconfirmed. Second, each rhetorical device maintains its own characteristics, influences global polarity in its own way, and has an inherent structure that helps to model the sentiment it represents. Third, the models of the expression of feelings in different languages are rather similar, suggesting the possibility of unifying multilingual opinion mining at the sentiment level.
Ideally, tree-to-tree machine translation (MT) that utilizes syntactic parse trees on both source and target sides could preserve non-local structure, and thus generate fluent and accurate translations. In practice, however, high-quality parsers for both source and target languages are difficult to obtain; moreover, even with high-quality parsers on both sides, the trees can still be non-isomorphic because of differences in annotation criteria between the two languages. The lack of isomorphism between the parse trees makes it difficult to extract translation rules, which severely limits the performance of tree-to-tree MT. In this article, we present an approach that projects dependency parse trees from the language side with a high-quality parser to the side with a low-quality parser, to improve the isomorphism of the parse trees. We first project a subset of the dependencies with high confidence to make a partial parse tree, and then complement the remaining dependencies with partial parsing constrained by the already projected dependencies. Experiments conducted on the Japanese-Chinese and English-Chinese language pairs show that our proposed method significantly improves performance on both language pairs.
This paper focuses on a classification problem for volatile time series. Among the most popular approaches for time series classification are dynamic time warping and feature-based machine learning architectures. In many previous studies, these algorithms have performed satisfactorily on various datasets. However, most of these methods are not suitable for chaotic time series, because superficial changes in measured values are not essential for such series. In general, time series datasets include both chaotic and non-chaotic series; thus, it is necessary to extract more essential features of a time series. In this paper, we propose a new approach for volatile time series classification. Our approach generates a novel feature by extracting the structure of the attractor using topological data analysis, representing the transition rules of the time series. As this feature captures an essential property of the system underlying the time series, our approach is effective for both chaotic and non-chaotic types. We applied a learning architecture inspired by convolutional neural networks to this feature and found that the proposed approach improves performance on a human activity recognition problem by 18.5% compared with conventional approaches.
In this paper, we address the problem of extending various thesauri using a single method. Thesauri should be extended when unregistered terms are identified. Various thesauri are available, each constructed according to a unique design principle. We formalise the extension of one thesaurus as a single classification problem in machine learning, with the goal of solving the different classification problems uniformly. Applying existing classification methods to each thesaurus separately is time consuming, particularly if many thesauri must be extended. Thus, we propose a method to reduce the time required to extend multiple thesauri. In the proposed method, we first generate, without using the thesauri, clusters of terms that are candidates for synonym sets, based on formal concept analysis over the syntactic information of terms in a corpus. Reliable syntactic parsers are readily available; thus, syntactic information can be obtained for many more terms than semantic information. With syntactic information, for each thesaurus and for all unregistered terms, we can quickly search the candidate clusters for a correct synonym set, enabling fast classification. Experimental results demonstrate that the proposed method is faster than existing methods while classification accuracy remains comparable.
Nonlocal dependencies represent syntactic phenomena such as wh-movement, A-movement in passives, topicalization, raising, control, and right node raising. They play an important role in semantic interpretation. This paper proposes a left-corner parser that identifies nonlocal dependencies. Our parser integrates nonlocal dependency identification into a transition-based system. We adopt a left-corner strategy in order to exploit the syntactic relation of c-command, which plays an important role in nonlocal dependency identification. To utilize the global features captured by nonlocal dependencies, our parser uses a structured perceptron. In experimental evaluations, our parser achieved a good balance between constituent parsing and nonlocal dependency identification.
Language modeling is a fundamental research problem with wide application in many NLP tasks. To estimate the probabilities of natural language sentences, most research on language modeling uses n-gram based approaches to factor sentence probabilities. However, the assumption underlying n-gram models is not robust enough to cope with the data sparseness problem, which affects the final performance of language models. In this paper, we propose a generalized hierarchical word sequence framework in which different word association scores can be adopted to rearrange word sequences in a totally unsupervised fashion. Unlike n-gram models, which factor sentence probability from left to right, our model factors it using a more flexible strategy. For evaluation, we compare our rearranged word sequences to normal n-gram word sequences. Both intrinsic and extrinsic experiments verify that our language model achieves better performance, showing that our method can be considered a better alternative to n-gram language models.
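For context, the left-to-right factorization that n-gram models apply, and that the framework above generalizes, is the standard one; the notation below is the usual convention, not taken from this abstract.

```latex
% Standard left-to-right n-gram factorization of a sentence probability:
% each word is conditioned only on its n-1 predecessors
P(w_1, \dots, w_m) \approx \prod_{i=1}^{m} P(w_i \mid w_{i-n+1}, \dots, w_{i-1})
```

The data-sparseness problem arises because many of the conditioning histories $w_{i-n+1}, \dots, w_{i-1}$ never occur in training data; rearranging the sequence changes which histories must be estimated.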
Learner English often contains grammatical errors with structural characteristics such as omissions, insertions, substitutions, and word order errors. These errors are not covered by existing context-free grammar (CFG) rules. Therefore, it is not at all straightforward how to annotate learner English with phrase structures. Because of this limitation, there has been almost no work on phrase structure annotation for learner corpora despite its importance and usefulness. To address this issue, we propose a phrase structure annotation scheme for learner English that consists of five principles. We apply the annotation scheme to two different learner corpora and show (i) its effectiveness at consistently annotating learner English with phrase structure (i.e., high inter-annotator agreement); (ii) the structural characteristics (CFG rules) of learner English obtained from the annotated corpora; and (iii) phrase structure parsing performance on learner English for the first time. We also release the annotation guidelines, the annotated data, and the parser model to the public.
Selection mechanism gestures are used in Natural User Interfaces (NUI) to designate elements in a user interface. They usually involve simple gestures with limited interactions. Being able to use more than one gesture simultaneously increases the vocabulary of interactions. In this paper, we present MultiX Click, a new algorithm to detect midair multi-click gestures. Our approach allows the detection of multiple midair finger clicks using a depth sensor. To show the potential of our algorithm, we implemented a midair multi-click keyboard and a midair piano that use simultaneous multi-clicks. In the midair multi-click keyboard, we mixed single and multiple clicks with the ability to retrieve the location of a click, and as a result we were able to increase the gesture vocabulary. This paper explains in detail the algorithm we use to detect multi-clicks. We also describe preliminary experiments for evaluating it.
Many models synthesize various types of complex networks with communities. However, network generation models that can produce high-modularity networks are rare. In this paper, we propose a high-modularity network generation model based on layer aggregation of a multilayer network. People belong to many communities in society, such as family, school, hobby groups, and business organizations; each such group can be regarded as a community in a single layer of a multilayer network. However, measuring the relationships within each community individually is difficult. A network observable on social network services (SNSs) combines all of these communities; that is, a social network is generated from a multilayer network. A network synthesized by our model has both a community structure and high modularity. We applied the proposed model to generate a number of networks and compared them with real-world networks. Not only did the model successfully reproduce real-world networks, but we also found that we can infer how real-world networks are generated from the model's parameters.
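For reference, the modularity that “high-modularity” refers to is standardly Newman's Q; the notation below is the usual convention, not taken from this abstract.

```latex
% Newman's modularity: A_{ij} is the adjacency matrix, k_i the degree
% of node i, m the number of edges, and c_i the community of node i
Q = \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)
```

Q compares the fraction of edges falling within communities against the expectation under a degree-preserving random graph; values approaching 1 indicate strong community structure.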
Information and Media Technologies (IMT) ceased publication after Vol. 12. For papers hereafter published by the societies participating in IMT (The Association for Natural Language Processing, Human Interface Society, The Institute of Image Information and Television Engineers, Information Processing Society of Japan, The Japanese Society for Artificial Intelligence, Japan Society for Software Science and Technology, and The Database Society of Japan), please check their respective websites. Papers previously published in IMT will remain available at J-STAGE as before.