IEICE Transactions on Information and Systems
Online ISSN : 1745-1361
Print ISSN : 0916-8532
Volume E91.D, Issue 11
Special Section on Knowledge, Information and Creativity Support System
  • Thanaruk THEERAMUNKONG
    2008 Volume E91.D Issue 11 Pages 2543-2544
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    Download PDF (54K)
  • Izumi SUZUKI, Yoshiki MIKAMI, Ario OHSATO
    Article type: PAPER
    Subject area: Knowledge Acquisition
    2008 Volume E91.D Issue 11 Pages 2545-2551
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    A technique that acquires documents in the same category as a given short text is introduced. Regarding the given text as a training document, the system marks up the most similar document, or sufficiently similar documents, from the document domain (or the entire Web). The system then adds the marked documents to the training set, learns from the expanded set, and repeats this process until no more documents are marked. Imposing a monotone increasing property on the similarity as the system learns enables it to 1) detect the point at which no more documents remain to be marked and 2) decide the threshold value that the classifier uses. In addition, under the condition that normalization is limited to dividing term weights by a p-norm of the weights, the linear classifier in which training documents are indexed in a binary manner is the only instance that satisfies the monotone increasing property (a sketch of this similarity form follows this entry). The feasibility of the proposed technique was confirmed through an examination of binary similarity using English and German documents randomly selected from the Web.
    Download PDF (192K)
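The abstract above singles out a linear classifier over binary-indexed training documents with p-norm weight normalization as the unique form satisfying the monotone increasing property. The following is a minimal sketch of that similarity form only, assuming bag-of-terms inputs; it is an illustration, not the authors' implementation.

```python
def pnorm_similarity(doc_terms, train_terms, p=2.0):
    """Similarity of a document to a binary-indexed training document,
    with the training weights divided by their p-norm."""
    weights = {t: 1.0 for t in train_terms}               # binary indexing
    norm = sum(w ** p for w in weights.values()) ** (1.0 / p)
    return sum(weights.get(t, 0.0) for t in set(doc_terms)) / norm

# The score grows monotonically as more training terms are shared.
train = {"knowledge", "acquisition", "web"}
print(pnorm_similarity(["web", "page"], train))               # ~0.577
print(pnorm_similarity(["web", "knowledge", "page"], train))  # ~1.155
```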
  • Chihiro ONO, Yasuhiro TAKISHIMA, Yoichi MOTOMURA, Hideki ASOH, Yasuhid ...
    Article type: PAPER
    Subject area: Knowledge Acquisition
    2008 Volume E91.D Issue 11 Pages 2552-2559
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    This paper proposes a novel approach to constructing statistical preference models for context-aware personalized applications such as recommender systems. In constructing context-aware statistical preference models, one of the most important but difficult problems is acquiring a large amount of training data in various contexts/situations. In particular, some situations require a heavy workload to set up or to gather subjects capable of answering inquiries under those situations. Because of this difficulty, the usual practice is either to simply collect a small amount of data in a real situation, or to collect a large amount of data in a supposed situation, i.e., a situation in which the subject pretends to be in the specific situation when answering inquiries. However, both approaches have problems. With the former, the performance of the constructed preference model is likely to be poor because the amount of data is small. With the latter, the data acquired in the supposed situation may differ from that acquired in the real situation. Nevertheless, this difference has not been taken seriously in existing research. In this paper we propose methods of obtaining a better preference model by integrating a small amount of real-situation data with a large amount of supposed-situation data. The methods are evaluated using data on food preferences. The experimental results show that the precision of the preference model can be improved significantly.
    Download PDF (998K)
  • Akemi TERA, Kiyoaki SHIRAI, Takaya YUIZONO, Kozo SUGIYAMA
    Article type: PAPER
    Subject area: Knowledge Acquisition
    2008 Volume E91.D Issue 11 Pages 2560-2567
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    In order to investigate the reading processes of Japanese language learners, we conducted an experiment recording eye movements during Japanese text reading with an eye-tracking system. We previously showed that Japanese native speakers frequently use “forward and backward jumping eye movements” [13],[14]. In this paper, we further analyzed the same eye-tracking data. Our goal is to examine whether Japanese learners fix their eyes at boundaries of linguistic units such as words, phrases, or clauses when they start or end “backward jumping”. We consider conventional linguistic boundaries as well as boundaries defined empirically from the entropy of an N-gram model (sketched after this entry). Another goal is to examine the relation between the entropy of the N-gram model and the depth of the syntactic structures of sentences. Our analysis shows that (1) Japanese learners often fix their eyes at linguistic boundaries, and (2) the average entropy is greatest at the fifth depth of syntactic structure.
    Download PDF (1579K)
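The boundary definition above relies on the entropy of an N-gram model. As a rough illustration, the sketch below estimates the entropy of the next-character distribution after a given context from raw counts; the toy corpus, the context, and the character-level granularity are illustrative assumptions, not the paper's setup.

```python
import math
from collections import Counter

def ngram_entropy(corpus, context, n=3):
    """Entropy (bits) of the next-character distribution following
    `context`, estimated by simple N-gram counting over `corpus`."""
    follow = Counter()
    for text in corpus:
        for i in range(len(text) - n + 1):
            if text[i:i + n - 1] == context:
                follow[text[i + n - 1]] += 1
    total = sum(follow.values())
    return -sum(c / total * math.log2(c / total)
                for c in follow.values()) if total else 0.0

# High entropy after a context means many possible continuations,
# i.e., a candidate boundary between linguistic units.
corpus = ["the cat sat", "the cap fit", "the car hit"]
print(ngram_entropy(corpus, "ca"))   # ~1.58 bits: t, p, r equally likely
```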
  • Syed Khairuzzaman TANBEER, Chowdhury Farhan AHMED, Byeong-Soo JEONG, Y ...
    Article type: PAPER
    Subject area: Knowledge Discovery and Data Mining
    2008 Volume E91.D Issue 11 Pages 2568-2577
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    The frequency of a pattern may not be a sufficient criterion for identifying meaningful patterns in a database. The temporal regularity of a pattern can be another key criterion for assessing its importance in several applications. A pattern is said to be regular if it appears at a regular, user-defined interval in the database (a sketch of this measure follows this entry). Even though there have been some efforts to discover periodic patterns in time-series and sequential data, none of the existing studies provide an appropriate method for discovering patterns that occur regularly in a transactional database. Therefore, in this paper, we introduce a novel concept of mining regular patterns from transactional databases. We also devise an efficient tree-based data structure, called a Regular Pattern tree (RP-tree for short), that captures the database contents in a highly compact manner and enables a pattern-growth-based mining technique to generate the complete set of regular patterns in a database for a user-defined regularity threshold. Our performance study shows that mining regular patterns with an RP-tree is time- and memory-efficient, as well as highly scalable.
    Download PDF (582K)
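A common formalization of the regularity described above, assumed here for illustration, is the maximum period between consecutive occurrences of a pattern, counting the gaps at both database boundaries; the pattern is regular if this value stays within the user-defined threshold. The RP-tree structure itself is not reproduced in this sketch.

```python
def regularity(tid_list, db_size):
    """Maximum period between consecutive occurrences of a pattern,
    with the gaps at both database boundaries included."""
    periods, prev = [], 0
    for tid in sorted(tid_list):
        periods.append(tid - prev)
        prev = tid
    periods.append(db_size - prev)        # gap after the last occurrence
    return max(periods)

# A pattern occurring in transactions 3, 6, 9, 12 of a 14-transaction
# database has regularity 3, so it is regular for any threshold >= 3.
print(regularity([3, 6, 9, 12], db_size=14))   # -> 3
```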
  • Chowdhury Farhan AHMED, Syed Khairuzzaman TANBEER, Byeong-Soo JEONG, Y ...
    Article type: PAPER
    Subject area: Knowledge Discovery and Data Mining
    2008 Volume E91.D Issue 11 Pages 2578-2588
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    Even though weighted frequent pattern (WFP) mining is more effective than traditional frequent pattern mining because it can consider the different semantic significances (weights) of items, existing WFP algorithms assume that each item has a fixed weight. In real-world scenarios, however, the weight (price or significance) of an item can vary with time. Reflecting such changes in item weight is necessary in several mining applications, such as retail market data analysis and web click-stream analysis. In this paper, we introduce the concept of a dynamic weight for each item, and propose an algorithm, DWFPM (dynamic weighted frequent pattern mining), that makes use of this concept (a sketch of a dynamically weighted support measure follows this entry). Our algorithm can handle situations where the weight (price or significance) of an item varies dynamically. It exploits a pattern-growth mining technique to avoid level-wise candidate-set generation and testing. Furthermore, it requires only one database scan, so it is suitable for stream data mining. An extensive performance analysis shows that our algorithm is efficient and scalable for WFP mining using dynamic weights.
    Download PDF (747K)
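As a toy illustration of the dynamic-weight idea, the sketch below scores a pattern by the mean weight of its items in each supporting transaction, with weights allowed to change per transaction. This is one plausible formalization assumed for illustration; DWFPM's exact measure may differ.

```python
def dynamic_weighted_support(pattern, transactions):
    """Weighted support of `pattern` when item weights vary over time.
    `transactions` is a list of (items, weights) pairs, where `weights`
    maps each item to its weight at that transaction's time."""
    pattern = set(pattern)
    score = 0.0
    for items, weights in transactions:
        if pattern <= set(items):
            score += sum(weights[i] for i in pattern) / len(pattern)
    return score

db = [
    ({"a", "b"}, {"a": 1.0, "b": 0.8}),
    ({"a", "b", "c"}, {"a": 1.2, "b": 0.7, "c": 0.5}),  # prices changed
]
print(dynamic_weighted_support(["a", "b"], db))   # 0.9 + 0.95 = 1.85
```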
  • Hitohiro SHIOZAKI, Koji EGUCHI, Takenao OHKAWA
    Article type: PAPER
    Subject area: Knowledge Discovery and Data Mining
    2008 Volume E91.D Issue 11 Pages 2589-2598
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    Conveying information about who, what, when, and where is a primary purpose of some genres of documents, typically news articles. Statistical models that capture dependencies between named entities and topics can therefore play an important role. Although relationships between who and where should be mentioned in such documents, no statistical topic model explicitly addresses the textual interactions between a who-entity and a where-entity when handling such information. This paper presents a statistical model that directly captures the dependencies between an arbitrary number of word types, such as who-entities, where-entities, and topics, mentioned in each document. We show that this multitype topic model performs better at making predictions on entity networks, in which each vertex represents an entity and each edge weight represents how closely the pair of entities at its endpoints is related, through experiments on predicting who-entities and the links between them. We also demonstrate the scale-free property of the weighted networks of entities extracted from written mentions.
    Download PDF (517K)
  • Kazuo MISUE
    Article type: PAPER
    Subject area: Knowledge Discovery and Data Mining
    2008 Volume E91.D Issue 11 Pages 2599-2606
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    Because network diagrams drawn with a spring embedder are not easy to read, this paper proposes “anchored maps”, in which some nodes are fixed as anchors. The readability of network diagrams is discussed, anchored maps are proposed, and a method for drawing anchored maps is explained (a layout sketch follows this entry). The method uses indices to decide the order of the anchors, because this order markedly affects the readability of the diagrams. Examples demonstrating the effectiveness of anchored maps are also given.
    Download PDF (512K)
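A minimal sketch of the anchored-map idea: anchors are pinned to a circle and only free nodes move, here under a simplified attractive force (each free node steps toward the centroid of its neighbors). Repulsive forces and the paper's anchor-ordering indices are omitted.

```python
import math, random

def anchored_layout(edges, anchors, free_nodes, iters=200, step=0.2):
    """Spring-embedder-style layout with anchor nodes fixed on a circle."""
    pos = {a: (math.cos(2 * math.pi * i / len(anchors)),
               math.sin(2 * math.pi * i / len(anchors)))
           for i, a in enumerate(anchors)}
    rnd = random.Random(0)
    pos.update({v: (rnd.uniform(-.3, .3), rnd.uniform(-.3, .3))
                for v in free_nodes})
    nbrs = {v: [b if a == v else a for a, b in edges if v in (a, b)]
            for v in free_nodes}
    for _ in range(iters):
        for v in free_nodes:                        # anchors never move
            if nbrs[v]:
                cx = sum(pos[u][0] for u in nbrs[v]) / len(nbrs[v])
                cy = sum(pos[u][1] for u in nbrs[v]) / len(nbrs[v])
                pos[v] = (pos[v][0] + step * (cx - pos[v][0]),
                          pos[v][1] + step * (cy - pos[v][1]))
    return pos

print(anchored_layout([("x", "a1"), ("x", "a2")], ["a1", "a2", "a3"], ["x"]))
```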
  • Chui Young YOON
    Article type: PAPER
    Subject area: Knowledge Representation
    2008 Volume E91.D Issue 11 Pages 2607-2615
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    We describe an evaluation system, consisting of an evaluation model and an interpretation model, for assessing and interpreting an end-user's overall computing capability. It includes four evaluation factors and eighteen items, complex indicators, an evaluation process, and a method. The model construct was verified by factor analysis and reliability analysis through a pilot test. We confirmed the applicability of the developed system by using the model to evaluate end-users in a computing business environment and presenting the results. This system contributes to developing a practical means of evaluating an end-user's computing capability, and hence to improving the computing capability of end-users.
    Download PDF (4215K)
  • Rachanee UNGRANGSI, Chutiporn ANUTARIYA, Vilas WUWONGSE
    Article type: PAPER
    Subject area: Knowledge Representation
    2008 Volume E91.D Issue 11 Pages 2616-2625
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    To respond to user queries in a timely manner at run-time, next-generation Semantic Web applications demand a robust mechanism to dynamically select one or more existing ontologies available on the Web and, if needed, combine them automatically. Although existing ontology retrieval systems return a lengthy list of resultant ontologies, they can neither identify which ones completely meet the query requirements nor determine a minimal set of ontologies that jointly satisfy the requirements when no single ontology can. Therefore, this paper presents an ontology retrieval system, named combiSQORE, which can return single or combinative ontologies that completely satisfy a submitted query whenever the available ontology database is adequate to answer it (the joint-coverage idea is sketched after this entry). In addition, the proposed system ranks the returned results based on their semantic similarity to the given query and their modification (integration) costs. The experimental results show that the combiSQORE system yields practical combinative ontologies and useful rankings.
    Download PDF (942K)
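The combinative-retrieval idea resembles set cover: pick a small set of ontologies whose concepts jointly cover the query. The greedy approximation below is an illustrative reduction, not combiSQORE's actual algorithm, which also weighs semantic similarity and modification costs in its ranking.

```python
def combinative_selection(query_terms, ontologies):
    """Greedily choose ontologies until the query terms are covered."""
    uncovered, chosen = set(query_terms), []
    while uncovered:
        best = max(ontologies, key=lambda o: len(uncovered & o["concepts"]))
        gain = uncovered & best["concepts"]
        if not gain:
            break            # the ontology database cannot answer the query
        chosen.append(best["name"])
        uncovered -= gain
    return chosen, uncovered

onts = [{"name": "travel.owl", "concepts": {"Hotel", "City"}},
        {"name": "food.owl", "concepts": {"Restaurant", "Cuisine"}}]
print(combinative_selection({"Hotel", "Restaurant"}, onts))
# (['travel.owl', 'food.owl'], set()) -- no single ontology suffices
```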
  • Shinae SHIN, Dongwon JEONG, Doo-Kwon BAIK
    Article type: PAPER
    Subject area: Knowledge Representation
    2008 Volume E91.D Issue 11 Pages 2626-2637
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    We propose an enhanced method for translating Topic Maps to RDF/RDF Schema in order to realize the Semantic Web. A critical issue for the Semantic Web is to efficiently and precisely describe Web information resources, i.e., Web metadata. Two representative standards, Topic Maps and RDF, have been used for Web metadata, and RDF-based standardization and implementation of the Semantic Web have been actively pursued. Since the Semantic Web must accept and understand Web information resources represented with other methods as well, Topic Maps-to-RDF translation has become an issue. Even though many Topic Maps-to-RDF translation methods have been devised, they still have several problems (e.g., semantic loss and complex expression). Our translation method provides an improved solution to these problems. It shows lower semantic loss than previous methods because it extracts both explicit and implicit semantics, and it reduces the encoding complexity of the resulting RDF. In addition, in terms of reversibility, the proposed method regenerates all Topic Maps constructs of the original source when the translation is reversed.
    Download PDF (3997K)
  • Heeryon CHO, Toru ISHIDA, Satoshi OYAMA, Rieko INABA, Toshiyuki TAKASA ...
    Article type: PAPER
    Subject area: Knowledge Applications and Intelligent User Interfaces
    2008 Volume E91.D Issue 11 Pages 2638-2646
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    Since the participants at both ends of the communication channel must share a common pictogram interpretation to communicate, the pictogram selection task must consider both participants' interpretations. Pictogram interpretation, however, can be ambiguous. To assist the selection of pictograms more likely to be interpreted as intended, we propose a categorical semantic relevance measure, which calculates how relevant a pictogram is to a given interpretation in terms of a given category (a simplified sketch follows this entry). The proposed measure defines similarity measurement and the probability of interpretation words using pictogram interpretations and frequencies gathered from a web survey. Moreover, the measure is applied to categorized pictogram interpretations to enhance pictogram retrieval performance. The five pictogram categories used for categorizing interpretations are defined based on the five first-level classifications in the Concept Dictionary of the EDR Electronic Dictionary. Retrieval performance was compared among non-categorized interpretations, categorized interpretations, and categorized and weighted interpretations using the semantic relevance measure, and the categorized semantic relevance approaches showed more stable performance than the non-categorized approach.
    Download PDF (688K)
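A stripped-down sketch of a categorical relevance score, assuming survey data in the form of (interpretation word, category, frequency) per pictogram: the score is the probability mass of the query word among same-category interpretations. The paper's measure additionally involves similarity between interpretation words, which is omitted here.

```python
def semantic_relevance(word, category, interpretations):
    """Relevance of a pictogram to `word` within `category`, from
    surveyed interpretation frequencies."""
    in_cat = {w: f for w, (c, f) in interpretations.items() if c == category}
    total = sum(in_cat.values())
    return in_cat.get(word, 0) / total if total else 0.0

# Hypothetical survey result for one pictogram.
pictogram = {"happy": ("emotion", 40), "joy": ("emotion", 10),
             "smile": ("action", 15)}
print(semantic_relevance("happy", "emotion", pictogram))   # 0.8
```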
  • Yu SUZUKI, Kazuo MISUE, Jiro TANAKA
    Article type: PAPER
    Subject area: Knowledge Applications and Intelligent User Interfaces
    2008 Volume E91.D Issue 11 Pages 2647-2654
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    A system that employs a stylus as an input device is suitable for creative activities like writing and painting. However, such a system does not always provide the user with a GUI that is easy to operate with the stylus. In addition, usability is diminished when the stylus is not integrated into the system in a way that takes the features of a pen into consideration. The purpose of our research is to improve the usability of systems that use a stylus as an input device. We propose shortcut actions, interaction techniques for stylus operation that are controlled through hand motions made in the air. We developed the Context Sensitive Stylus, a device consisting of an accelerometer and a conventional stylus, to implement the shortcut actions. We also developed application programs to which we applied the shortcut actions, e.g., a drawing tool and a scroll-supporting tool. Results from our evaluation indicate that users can concentrate better on their work when using the shortcut actions than when using conventional menu operations.
    Download PDF (898K)
Regular Section
  • Vasutan TUNBUNHENG, Hideharu AMANO
    Article type: PAPER
    Subject area: VLSI Systems
    2008 Volume E91.D Issue 11 Pages 2655-2665
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    For developing a design environment for various Dynamically Reconfigurable Processor Arrays (DRPAs), the Graph with Configuration Information (GCI) is proposed to represent the configurable resources in the target dynamically reconfigurable architecture. Functional units, constant units, registers, and routing resources can be represented in the graph, along with their configuration information. Hardware restrictions are also encoded in the graph by limiting the possible configurations at a node controlled by another node. A prototype compiler called Black-Diamond, based on the GCI, is now available for three different DRPAs. It translates a data-flow graph from a C-like front-end description, performs placement and routing using the GCI, and generates configuration data for each element of the DRPA. Evaluation results on simple applications show that Black-Diamond can generate reasonable designs for all three architectures. Other target architectures can be treated easily by expressing their architectural properties in a GCI.
    Download PDF (918K)
  • Chin-Feng TSAI, Huan-Sheng WANG, King-Chu HUNG, Shih-Chang HSIA
    Article type: PAPER
    Subject area: VLSI Systems
    2008 Volume E91.D Issue 11 Pages 2666-2674
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    Wavelet-based features, with their simplicity and high efficacy, have been used in many pattern recognition (PR) applications. These features are usually generated from the wavelet coefficients of coarse levels (i.e., high octaves) in the discrete periodized wavelet transform (DPWT). In this paper, a new 1-D non-recursive DPWT (NRDPWT) is presented for real-time high-octave decomposition. The new 1-D NRDPWT, referred to as the 1-D RRO-NRDPWT, can overcome the word-length-growth (WLG) effect based on two strategies: resisting error propagation and applying a reversible round-off linear transformation (RROLT) theorem. A finite-precision performance analysis is also performed to study word-length suppression efficiency and feature efficacy in breast lesion classification on ultrasonic images. For the realization of high-octave decomposition, a segment accumulation algorithm (SAA) is also presented. The SAA is a new folding technique that can dramatically reduce multipliers and adders without increasing latency.
    Download PDF (931K)
  • Sungjoon JUNG, Tag Gon KIM
    Article type: PAPER
    Subject area: Computer Systems
    2008 Volume E91.D Issue 11 Pages 2675-2684
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    Reconfigurable architectures are among the most promising solutions for satisfying both performance and flexibility requirements. However, the reconfiguration overhead of such architectures makes them inappropriate for repetitive reconfiguration. In this paper, we introduce a configuration sharing technique that reduces the reconfiguration overhead between similar applications using static partial reconfiguration. In contrast to traditional resource sharing, which configures multiple temporal partitions simultaneously and employs time multiplexing, the proposed configuration sharing reconfigures a device incrementally as the application changes and requires a back-end adaptation to reuse configurations between applications. Adopting a data-flow intermediate representation, our compiler framework extends a min-cut placer and a negotiation-based router to handle configuration sharing. The results show that the framework reduces configuration time by 20% on average at the expense of a 1.9% increase in computation time.
    Download PDF (341K)
  • Chien-Tsun CHEN, Yu Chin CHENG, Chin-Yun HSIEH
    Article type: PAPER
    Subject area: Fundamentals of Software and Theory of Programs
    2008 Volume E91.D Issue 11 Pages 2685-2692
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    Design by Contract (DBC), which originated in the Eiffel programming language, is generally accepted as a practical method for building reliable software. Currently, however, few languages have built-in support for it. In recent years, several methods have been proposed to support DBC in Java. We compare eleven DBC tools for Java by analyzing their impact on the developer's programming activities, characterized by seven quality attributes identified in this paper. We show that each of the existing tools fails to achieve some of these attributes. This motivated us to develop ezContract, an open-source DBC tool for Java that achieves all seven quality attributes and integrates smoothly with the working environment. Notably, standard Java language is used, and advanced IDE features that work for standard Java programs, such as incremental compilation, automatic refactoring, and code assist, also work for contract-enabled programs. (A language-neutral sketch of the pre/postcondition idea follows this entry.)
    Download PDF (253K)
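The tools compared in the paper are Java-specific; as a language-neutral illustration of the underlying DBC idea only, here is a minimal precondition/postcondition decorator in Python. It is not modeled on ezContract's API.

```python
import functools

def contract(pre=None, post=None):
    """Minimal Design-by-Contract helper: check a precondition on the
    arguments and a postcondition on the result."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            if pre is not None:
                assert pre(*args, **kwargs), f"precondition of {fn.__name__}"
            result = fn(*args, **kwargs)
            if post is not None:
                assert post(result), f"postcondition of {fn.__name__}"
            return result
        return inner
    return wrap

@contract(pre=lambda x: x >= 0, post=lambda r: r >= 0)
def sqrt_floor(x):
    return int(x ** 0.5)

print(sqrt_floor(10))    # 3; sqrt_floor(-1) would fail the precondition
```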
  • Keiichiro OURA, Heiga ZEN, Yoshihiko NANKAKU, Akinobu LEE, Keiichi TOK ...
    Article type: PAPER
    Subject area: Speech and Hearing
    2008 Volume E91.D Issue 11 Pages 2693-2700
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    In a hidden Markov model (HMM), state duration probabilities decrease exponentially with time, which fails to adequately represent the temporal structure of speech. One solution to this problem is to integrate state duration probability distributions explicitly into the HMM; the resulting form is known as a hidden semi-Markov model (HSMM). However, although a number of attempts to use HSMMs in speech recognition systems have been reported, they are not consistent, because various approximations were used in both training and decoding. By avoiding these approximations, using a generalized forward-backward algorithm, a context-dependent duration modeling technique, and weighted finite-state transducers (WFSTs), we construct a fully consistent HSMM-based speech recognition system (the duration-model contrast is sketched after this entry). In a speaker-dependent continuous speech recognition experiment, our system achieved about 9.1% relative error reduction over the corresponding HMM-based system.
    Download PDF (531K)
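The contrast the abstract draws can be made concrete: an HMM's self-loop yields a geometric duration distribution, while an HSMM attaches an explicit distribution to each state. A toy numeric comparison follows (the Gaussian shape is an illustrative choice, left unnormalized).

```python
import math

def hmm_duration(d, self_loop=0.8):
    """Implicit HMM state duration: geometric, decaying exponentially."""
    return (1.0 - self_loop) * self_loop ** (d - 1)

def hsmm_duration(d, mean=5.0, var=2.0):
    """Explicit HSMM-style duration model (unnormalized Gaussian shape)."""
    return math.exp(-((d - mean) ** 2) / (2.0 * var))

for d in (1, 3, 5, 8):
    print(d, round(hmm_duration(d), 3), round(hsmm_duration(d), 3))
# The geometric model is always maximal at d=1; the explicit model can
# peak at a state's typical duration instead.
```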
  • Akihiro MORI, Seiichi UCHIDA
    Article type: PAPER
    Subject area: Image Processing and Video Processing
    2008 Volume E91.D Issue 11 Pages 2701-2708
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    This paper introduces a fast image mosaicing technique that requires neither costly search in the image domain (e.g., pixel-to-pixel correspondence search) nor iterative optimization (e.g., gradient-based or random optimization) of the geometric transformation parameters. The proposed technique operates in two steps, both of which rely on histograms for high computational efficiency (a translation-only sketch follows this entry). In the first step, a histogram of pixel feature values is used to detect pairs of pixels with the same rare feature values as candidate corresponding pixel pairs. In the second step, a histogram of transformation parameter values is used to determine the most reliable transformation parameter value. Experimental results showed that the proposed technique provides reasonable mosaicing results in most cases at very feasible computational cost.
    Download PDF (1494K)
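A reduction of the two-step histogram scheme to pure translation, assuming integer gray levels; the real method involves richer pixel features and full geometric transformation parameters.

```python
from collections import Counter

def mosaic_translation(img_a, img_b):
    """Step 1: rare pixel values (histogram count 1) give candidate
    corresponding pairs. Step 2: each pair votes for a translation
    (dx, dy); the vote-histogram mode is the estimated parameter."""
    hist = Counter(v for row in img_a for v in row)
    rare = {v for v, c in hist.items() if c == 1}
    loc_a = {v: (x, y) for y, row in enumerate(img_a)
             for x, v in enumerate(row) if v in rare}
    votes = Counter()
    for y, row in enumerate(img_b):
        for x, v in enumerate(row):
            if v in loc_a:
                ax, ay = loc_a[v]
                votes[(x - ax, y - ay)] += 1
    return votes.most_common(1)[0][0] if votes else None

a = [[10, 11, 12], [13, 99, 14], [15, 16, 17]]
b = [[99, 14, 0], [16, 17, 0], [0, 0, 0]]    # `a` shifted by (-1, -1)
print(mosaic_translation(a, b))              # -> (-1, -1)
```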
  • Takeshi YOSHITOME, Ken NAKAMURA, Jiro NAGANUMA, Yoshiyuki YASHIMA
    Article type: PAPER
    Subject area: Image Processing and Video Processing
    2008 Volume E91.D Issue 11 Pages 2709-2717
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    We propose a flexible video CODEC system for super-high-resolution videos, such as 4k × 2k-pixel video. It uses a spatially parallel encoding approach and scales to the resolution of the video to be encoded. A video shift-and-padding function prevents image-quality degradation when systems with different numbers of active lines are connected. The switchable cascade multiplexing function enables various super-high resolutions to be encoded and allows super-high-resolution video streams to be recorded and played back on a conventional PC. A two-stage encoding method using the complexity of each divided image equalizes encoding quality among the multiple divided videos. System Time Clock (STC) sharing is also implemented to absorb the disparity in stream arrival times between channels. Together, these functions enable highly efficient, high-quality encoding of super-high-resolution video.
    Download PDF (976K)
  • Shun MATSUI, Kota AOKI, Hiroshi NAGAHASHI
    Article type: PAPER
    Subject area: Computer Graphics
    2008 Volume E91.D Issue 11 Pages 2718-2726
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    In 3D computer graphics, mesh parameterization is a key technique for digital geometry processing tasks such as morphing, shape blending, texture mapping, and re-meshing. Most previous approaches use an identical primitive domain to parameterize a mesh model. More recent work on mesh parameterization has reported more flexible and attractive methods that create direct mappings between two meshes. These mappings are called “cross-parameterization” and typically preserve semantic feature correspondences between the target meshes. This paper proposes a novel approach for parameterizing a mesh directly onto another one. The main idea of our method is to combine competitive learning and least-squares mesh techniques. It is enough to specify a few semantic feature correspondences between the target meshes, even if they differ in shape or pose.
    Download PDF (1444K)
  • Yong Hun PARK, Kyoung Soo BOK, Jae Soo YOO
    Article type: LETTER
    Subject area: Database
    2008 Volume E91.D Issue 11 Pages 2727-2730
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    In this paper, we propose a continuous range query processing method over moving objects. To efficiently process continuous range queries, we design a main-memory-based query index that uses less storage and significantly reduces query processing time. We show through performance evaluation that the proposed method outperforms existing methods.
    Download PDF (330K)
  • Hyunho KANG, Koutarou YAMAGUCHI, Brian KURKOSKI, Kazuhiko YAMAGUCHI, K ...
    Article type: LETTER
    Subject area: Application Information Security
    2008 Volume E91.D Issue 11 Pages 2731-2734
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    For the digital watermarking patchwork algorithm originally proposed by Bender et al., this paper proposes two improvements applicable to audio watermarking (the basic patchwork scheme is sketched after this entry). First, the watermark embedding strength is psychoacoustically adapted using the Bark frequency scale. Second, whereas previous approaches leave the samples that do not correspond to the data untouched, here these samples are modified to reduce the probability of misdetection, a method called full index embedding. In simulations, the combination of the two proposed methods shows higher resistance to a variety of attacks than prior algorithms.
    Download PDF (160K)
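For orientation, the sketch below is the classic Bender et al. patchwork scheme the letter builds on: a keyed choice of two sample sets, one raised and one lowered by delta. The letter's contributions (Bark-scale adaptation and full index embedding) are not shown.

```python
import random

def patchwork_embed(samples, key, delta=1.0, n=100):
    """Raise set A and lower set B (both chosen by the key) by delta."""
    rng = random.Random(key)
    idx = rng.sample(range(len(samples)), 2 * n)
    marked = list(samples)
    for i in idx[:n]:
        marked[i] += delta
    for i in idx[n:]:
        marked[i] -= delta
    return marked

def patchwork_detect(samples, key, n=100):
    """mean(A) - mean(B): near 2*delta if the mark is present, else ~0."""
    rng = random.Random(key)
    idx = rng.sample(range(len(samples)), 2 * n)
    a = sum(samples[i] for i in idx[:n]) / n
    b = sum(samples[i] for i in idx[n:]) / n
    return a - b

marked = patchwork_embed([0.0] * 1000, key="secret")
print(patchwork_detect(marked, key="secret"))        # 2.0
print(patchwork_detect([0.0] * 1000, key="secret"))  # 0.0
```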
  • Takeshi SAITOH, Ryosuke KONISHI
    Article type: LETTER
    Subject area: Pattern Recognition
    2008 Volume E91.D Issue 11 Pages 2735-2738
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    This paper describes a method of recognizing Japanese single sounds for application to lip reading. Related research has investigated only five or ten sounds. In this paper, experiments were conducted on 45 Japanese single sounds, classified into a five-vowel category, a ten-consonant category, and a 45-sound category. Using the trajectory feature, we obtained recognition rates of 94.7%, 30.9%, and 30.0%, respectively.
    Download PDF (199K)
  • Xiang ZHANG, Ping LU, Hongbin SUO, Qingwei ZHAO, Yonghong YAN
    Article type: LETTER
    Subject area: Speech and Hearing
    2008 Volume E91.D Issue 11 Pages 2739-2741
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    In this letter, a recently proposed clustering algorithm named affinity propagation is introduced for the task of speaker clustering. This novel algorithm exhibits fast execution speed and finds clusters with low error. However, experiments show that the speaker purity of affinity propagation is not satisfactory. We therefore propose a hybrid approach that combines affinity propagation with agglomerative hierarchical clustering to improve clustering performance (one possible combination is sketched after this entry). Experiments show that, compared with traditional agglomerative hierarchical clustering, the hybrid method achieves better performance on the test corpora.
    Download PDF (78K)
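One way to realize such a hybrid, sketched here with scikit-learn on stand-in feature vectors (the letter's exact procedure and features may differ): over-cluster with affinity propagation, then merge the resulting exemplars agglomeratively down to the target speaker count.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation, AgglomerativeClustering

def hybrid_cluster(X, n_speakers):
    """Affinity propagation first, then agglomerative merging of the
    AP exemplars down to `n_speakers` clusters."""
    ap = AffinityPropagation(damping=0.9, random_state=0).fit(X)
    if len(ap.cluster_centers_) <= n_speakers:
        return ap.labels_
    merge = AgglomerativeClustering(n_clusters=n_speakers)
    merged = merge.fit_predict(ap.cluster_centers_)
    return np.array([merged[c] for c in ap.labels_])

X = np.random.RandomState(0).randn(60, 8)    # stand-in speaker features
print(hybrid_cluster(X, n_speakers=3))
```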
  • Der-Chang TSENG, Jung-Hui CHIU
    Article type: LETTER
    Subject area: Speech and Hearing
    2008 Volume E91.D Issue 11 Pages 2742-2745
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    Since an FFT-based speech encryption system retains considerable residual intelligibility, such as talk spurts and the original intonation, in the encrypted speech, eavesdroppers can deduce the information content from the encrypted speech. In this letter, we propose a new technique based on the combination of an orthogonal frequency division multiplexing (OFDM) scheme and an appropriate QAM mapping method to remove the residual intelligibility from the encrypted speech by permuting several frequency components (the conventional FFT-based permutation baseline is sketched after this entry). In addition, the proposed OFDM-based speech encryption system needs only two FFT operations instead of the four required by the FFT-based system. Simulation results demonstrate the effectiveness of the proposed technique.
    Download PDF (453K)
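For context, a minimal version of the conventional FFT-based scrambler that the letter improves on: permute the positive-frequency bins with a keyed permutation (DC and Nyquist stay put so the frame remains a valid real signal), and invert the permutation to descramble. This is the baseline scheme, not the proposed OFDM/QAM system.

```python
import numpy as np

def _keyed_perm(n, key):
    """Keyed permutation of the interior rfft bins only."""
    p = np.arange(n)
    p[1:-1] = 1 + np.random.RandomState(key).permutation(n - 2)
    return p

def fft_scramble(frame, key):
    spec = np.fft.rfft(frame)
    return np.fft.irfft(spec[_keyed_perm(len(spec), key)], n=len(frame))

def fft_descramble(frame, key):
    spec = np.fft.rfft(frame)
    out = np.empty_like(spec)
    out[_keyed_perm(len(spec), key)] = spec
    return np.fft.irfft(out, n=len(frame))

x = np.sin(2 * np.pi * np.arange(256) / 16)        # toy "speech" frame
print(np.allclose(fft_descramble(fft_scramble(x, 42), 42), x))   # True
```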
  • Suk-Bong KWON, HoiRin KIM
    Article type: LETTER
    Subject area: Speech and Hearing
    2008 Volume E91.D Issue 11 Pages 2746-2750
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    This paper proposes word voiceprint models for verifying the recognition results obtained from a speech recognition system. Word voiceprint models carry word-dependent information based on the distributions of phone-level log-likelihood ratios and durations, so a more reliable confidence score for a recognized word can be obtained from its word voiceprint models, which represent the utterance-verification characteristics of that word. Additionally, to obtain a log-likelihood-ratio-based word voiceprint score, this paper proposes a new log-scale normalization function that uses the distribution of the phone-level log-likelihood ratio, instead of the sigmoid function widely used for phone-level log-likelihood ratios. This function emphasizes mis-recognized phones in a word, and this word-specific information helps achieve a more discriminative score against out-of-vocabulary words. The proposed method requires additional memory, but it achieves a relative reduction in equal error rate of 16.9% compared to the baseline system using simple phone log-likelihood ratios.
    Download PDF (182K)
  • Soon Hak KWON, Hye Cheun JEONG, Suk Tae SEO, In Keun LEE, Chang Sik SO ...
    Article type: LETTER
    Subject area: Image Recognition, Computer Vision
    2008 Volume E91.D Issue 11 Pages 2751-2753
    Published: November 01, 2008
    Released on J-STAGE: November 28, 2008
    JOURNAL FREE ACCESS
    The thresholding results for gray-level images depend greatly on the thresholding method applied. To address this, this letter proposes a histogram-equalization-based thresholding algorithm that makes the thresholding results insensitive to the particular thresholding method applied (a sketch follows this entry). Experimental results are presented to demonstrate the effectiveness of the proposed algorithm.
    Download PDF (383K)
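A minimal sketch of the idea, using OpenCV and Otsu's method as the example thresholding method (the letter's algorithm details are not reproduced): equalize the histogram first so that the subsequent threshold choice matters less.

```python
import cv2
import numpy as np

def equalized_threshold(gray):
    """Histogram-equalize an 8-bit gray image, then threshold it."""
    eq = cv2.equalizeHist(gray)
    t, binary = cv2.threshold(eq, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return t, binary

rng = np.random.RandomState(0)
gray = np.clip(rng.normal(120, 20, (64, 64)), 0, 255).astype(np.uint8)
t, binary = equalized_threshold(gray)
print("chosen threshold:", t)
```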