Data Science Journal

Contents of Volume 7, 2008

Editorial

Editor's Note: SCIENTIFIC "AGENDA" OF DATA SCIENCE

Shuichi Iwata

2008 Volume 7 Pages 54-56
Published: 2008
Released on J-STAGE: May 27, 2008

DOI: https://doi.org/10.2481/dsj.7.54

JOURNAL FREE ACCESS

Papers

Report from the first Workshop on Extremely Large Databases

J Becla, K-T Lim

2008 Volume 7 Pages 1-13
Published: 2008
Released on J-STAGE: February 29, 2008

DOI: https://doi.org/10.2481/dsj.7.1

JOURNAL FREE ACCESS

Industrial and scientific datasets have been growing enormously in size and complexity in recent years. The largest transactional databases and data warehouses can no longer be hosted cost-effectively in off-the-shelf commercial database management systems. There are other forums for discussing databases and data warehouses, but they typically deal with problems occurring at smaller scales and do not always focus on practical solutions or influencing DBMS vendors. Given the relatively small (but highly influential and growing) number of users with these databases and the relatively small number of opportunities to exchange practical information related to DBMSes at extremely large scale, a workshop on extremely large databases was organized. This paper is the final report of the discussions and activities at the workshop.

Decoding Patent Information Using Patent Maps

Chen-Yuan Liu, James Chingyu Yang

2008 Volume 7 Pages 14-22
Published: 2008
Released on J-STAGE: February 29, 2008

DOI: https://doi.org/10.2481/dsj.7.14

JOURNAL FREE ACCESS

Patent information is a derivative product from the legal patent system. This information, which includes patent applications, patent descriptions, patent gazettes, patent abstracts, and patent data, is prepared in exact compliance with the regulations and specifications of the patent acts. Patent information, different from other published circulating information, is legally well protected. For convenience, this study classifies patent information into bibliographic and numeric data to create a patent map.

Minimax Estimation of the Parameter of the Rayleigh Distribution under Quadratic Loss Function

Sanku Dey

2008 Volume 7 Pages 23-30
Published: 2008
Released on J-STAGE: February 29, 2008

DOI: https://doi.org/10.2481/dsj.7.23

JOURNAL FREE ACCESS

This paper is concerned with finding the minimax estimator of the parameter θ of the Rayleigh distribution under a quadratic loss function by applying the theorem of Lehmann (1950).

Applying an Enhanced Technology Acceptance Model to Knowledge Management in Agricultural Extension Services

Olusegun Folorunso, Shawn Oluwafemi Ogunseye

2008 Volume 7 Pages 31-45
Published: 2008
Released on J-STAGE: April 03, 2008

DOI: https://doi.org/10.2481/dsj.7.31

JOURNAL FREE ACCESS

This research investigates the applicability of Davis's Technology Acceptance Model (TAM) to agriculturists' acceptance of AGROWIT, a knowledge management system (KMS) developed by the authors. Although the authors used previous TAM user acceptance research as a basis for investigating user acceptance of AGROWIT, the model had to be extended; the constructs added from the Triandis model increased the predictive results of the TAM, but only slightly. Relationships among the primary TAM constructs used are in substantive agreement with those characteristic of previous TAM research. Significant positive relationships between perceived usefulness, ease of use, and system usage were consistent with previous TAM research. The observed mediating role of perceived usefulness in the relationship between ease of use and usage was also in consonance with earlier findings. The findings are significant because they suggest that the considerable body of previous TAM-related information technology research may be usefully applied to the knowledge management domain to promote further investigation of factors affecting the acceptance and usage of knowledge management information systems such as AGROWIT by farmers, extension workers, and agriculture researchers.

A Method for Content-Based Searching of 3D Model Databases

Jiale Wang, Hongming Cai, Yuanjun He

2008 Volume 7 Pages 46-53
Published: 2008
Released on J-STAGE: April 14, 2008

DOI: https://doi.org/10.2481/dsj.7.46

JOURNAL FREE ACCESS

With the development of computer graphics and digitalizing technologies, 3D model databases are becoming ubiquitous. This paper presents a method for content-based searching for similar 3D models in databases. To assess the similarity between 3D models, shape feature information of models must be extracted and compared. We propose a new 3D shape feature extraction algorithm. Experimental results show that the proposed method achieves good retrieval performance with short computation time.

Nest and Unnest Operators in Nested Relations

Georgia Garani

2008 Volume 7 Pages 57-64
Published: 2008
Released on J-STAGE: May 27, 2008

DOI: https://doi.org/10.2481/dsj.7.57

JOURNAL FREE ACCESS

By distinguishing nested attributes as Decomposable and Non-Decomposable, it is proved that for all nested relations, unnesting and then renesting on the same attribute yields the original relation subject only to the elimination of duplicate data. Therefore, the statement that was popular in nested relations research: "Unnesting and then nesting on the same attribute of a nested relation does not always yield the original relation" is reconsidered.
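
As a rough illustration of the round trip this abstract refers to, the sketch below uses the classical textbook definitions of nest and unnest. It does not model the paper's Decomposable/Non-Decomposable distinction, which the abstract does not spell out. The function names, the dictionary-based row representation, and the sample relation are assumptions made for this example only.

```python
from collections import defaultdict


def unnest(relation, attr):
    """Flatten a set-valued attribute: emit one row per element of row[attr]."""
    flat = []
    for row in relation:
        for value in row[attr]:
            new_row = dict(row)
            new_row[attr] = value
            flat.append(new_row)
    return flat


def nest(relation, attr):
    """Group rows that agree on every other attribute and collect
    the values of attr into a single set per group."""
    groups = defaultdict(set)
    for row in relation:
        # The grouping key is every attribute except the one being nested.
        key = tuple(sorted((k, v) for k, v in row.items() if k != attr))
        groups[key].add(row[attr])
    return [{**dict(key), attr: frozenset(values)} for key, values in groups.items()]


# Hypothetical nested relation: each department holds a set of employees.
departments = [
    {"dept": "A", "emp": frozenset({"x", "y"})},
    {"dept": "B", "emp": frozenset({"z"})},
]

flat = unnest(departments, "emp")
round_trip = nest(flat, "emp")
```

For this sample relation the round trip returns the original rows. Because `nest` collects values into sets, repeated values produced by unnesting are absorbed during renesting.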

Bayes Estimator of Generalized-Exponential Parameters under Linex Loss Function Using Lindley's Approximation

Rahul Singh, Sanjay Kumar Singh, Umesh Singh, Gyan Prakash Singh

2008 Volume 7 Pages 65-75
Published: 2008
Released on J-STAGE: May 30, 2008

DOI: https://doi.org/10.2481/dsj.7.65

JOURNAL FREE ACCESS

In this paper, we obtain the Bayes estimators of the Generalized-Exponential scale and shape parameters using Lindley's approximation (L-approximation) under asymmetric loss functions. The risks of the proposed estimators are compared with those of the corresponding MLEs, based on simulated samples from the Generalized-Exponential distribution.

Discovering Unordered Rule Sets for Mixed Variables Using an Ant-Miner Algorithm

C. Nalini, P. Balasubramanie

2008 Volume 7 Pages 76-87
Published: 2008
Released on J-STAGE: May 27, 2008

DOI: https://doi.org/10.2481/dsj.7.76

JOURNAL FREE ACCESS

This work proposes a data mining algorithm called Unordered Rule Sets using a continuous Ant-Miner algorithm. The goal of this work is to extract classification rules from data. Swarm intelligence (SI) is a technique whereby rules may be discovered through the study of collective behavior in decentralized, self-organized systems, such as ants. The Ant-Miner algorithm, first proposed by Parpinelli and his colleagues (2002), applies an ant colony optimization (ACO) heuristic to the classification task of data mining to discover an ordered list of classification rules. Ant-Miner is a rule-induction algorithm that uses SI techniques to form rules. Ant-Miner uses a discretization process to deal with continuous attributes in the data. Discretization transforms numeric attributes into nominal attributes. Discretization may suffer from a loss of information, as the real relationship underlying individual values of a numeric attribute is unknown. The objective of this work is to apply ACO heuristic techniques to discover unordered rule sets for mixed variables in a data set. The proposed algorithm handles both nominal and continuous attributes using multimodal functions. It has the advantage of discovering more modular rules, i.e., rules that can be interpreted independently from other rules - unlike the rules in an ordered list, where the interpretation of a rule requires knowledge of the previous rules in the list. The results provide evidence that the accuracy of the Unordered Rule Set Continuous Ant-Miner algorithm is competitive with other Ant-Miner versions and generates simpler rule sets.

Report from the SciDB Workshop

Jacek Becla, Kian-Tat Lim

2008 Volume 7 Pages 88-95
Published: 2008
Released on J-STAGE: November 07, 2008

DOI: https://doi.org/10.2481/dsj.7.88

JOURNAL FREE ACCESS

A mini-workshop with representatives from the data-driven science and database research communities was organized in response to suggestions at the first XLDB Workshop. The goal was to develop common requirements and primitives for a next-generation database management system that scientists would use, including those from high-energy physics, astronomy, biology, geoscience and fusion, in order to stimulate research and advance technology. These requirements were thought by the database researchers to be novel and unlikely to be fully met by current commercial vendors. The two groups accordingly decided to explore building a new open source DBMS. This paper is the final report of the discussions and activities at the workshop.

Visualizing Concurrency Control Algorithms for Real-Time Database Systems

Olusegun Folorunso, H.O.D. Longe, Adio T. Akinwale

2008 Volume 7 Pages 96-105
Published: 2008
Released on J-STAGE: November 07, 2008

DOI: https://doi.org/10.2481/dsj.7.96

JOURNAL FREE ACCESS

This paper describes an approach to visualizing concurrency control (CC) algorithms for real-time database systems (RTDBS). This approach is based on the principle of software visualization, which has been applied in related fields. The Model-View-Controller (MVC) architecture is used to alleviate the black box syndrome associated with the study of CC algorithm behaviour in RTDBS. We propose an "exploratory" visualization tool that assists the RTDBS designer in understanding the actual behaviour of the chosen concurrency control algorithms and in evaluating their performance. We demonstrate the feasibility of our approach using an optimistic concurrency control model as our case study. The developed tool substantiates earlier simulation-based performance studies by exposing, when visualized dynamically, spikes at some points that are not observed in the usual static graphs. Ultimately, this tool helps resolve the problem of contradictory assumptions of CC in RTDBS.

Bayes Estimators of Exponential Parameters from a Censored Sample Using a Guessed Estimate

G.P. Singh, S.K. Singh, Umesh Singh, S.K. Upadhyay

2008 Volume 7 Pages 106-114
Published: 2008
Released on J-STAGE: November 07, 2008

DOI: https://doi.org/10.2481/dsj.7.106

JOURNAL FREE ACCESS

This paper provides the Bayes estimators of the failure rate and reliability function for a one-parameter exponential distribution by utilizing a point guess estimate of the parameter. For deriving the Bayes estimators, the prior distributions are chosen such that they are centered at the known prior values of the parameters. The validity of the proposed estimators is examined with respect to the corresponding maximum likelihood estimators (MLE) and Thompson's shrinkage estimator on the basis of Monte Carlo simulations of 1000 samples.

Seminar Cum Meeting Report: CODATA Task Group for Exchangeable Material Data Representation to Support Research and Education

T. Ashino, L. Bartolo

2008 Volume 7 Pages 115-124
Published: 2008
Released on J-STAGE: November 07, 2008

DOI: https://doi.org/10.2481/dsj.7.115

JOURNAL FREE ACCESS

On March 4-5, 2008, the CODATA Task Group for Exchangeable Material Data Representation to Support Research and Education held a two-day seminar cum meeting at the National Physical Laboratory (NPL), New Delhi, India, with NPL materials researchers and task group members representing material activities and databases from seven countries: the Czech Republic, France, and the Netherlands (European Union), India, Korea, Japan, and the United States. The NPL seminar included presentations about the researchers' work. The Task Group meeting included presentations about current data-related activities of the members. Joint discussions between NPL researchers and CODATA task group members began an exchange of viewpoints among materials data producers, users, and database developers. The seminar cum meeting included plans to continue and expand Task Group activities at the 2008 CODATA 21st Meeting in Kyiv, Ukraine.

On Shrinkage Estimation for the Scale Parameter of Weibull Distribution

Gyan Prakash, D. C. Singh, S. K. Sinha

2008 Volume 7 Pages 125-136
Published: 2008
Released on J-STAGE: January 08, 2009

DOI: https://doi.org/10.2481/dsj.7.125

JOURNAL FREE ACCESS

In the present article, some shrinkage testimators for the scale parameter of a two-parameter Weibull life testing model are suggested under the LINEX loss function, assuming the shape parameter is known. The proposed testimators are compared with the improved estimator.

A Framework for Managing Access of Large-Scale Distributed Resources in a Collaborative Platform

Su Chen, Tiejian Luo, Wei Liu, Jinliang Song, Feng Gao

2008 Volume 7 Pages 137-147
Published: 2008
Released on J-STAGE: January 08, 2009

DOI: https://doi.org/10.2481/dsj.7.137

JOURNAL FREE ACCESS

In an e-Science environment, large-scale distributed resources in autonomous domains are aggregated by unified collaborative platforms to support scientific research across organizational boundaries. To enhance the scalability of access management, an integrated approach is needed for decentralizing the task from resource owners to administrators on the platform. We propose an extensible access management framework that meets this requirement by supporting an administrative delegation policy. This feature allows administrators on the platform to make new policies based on the original policies made by resource owners. The framework also includes an access protocol that merges SAML and XACML. It defines how distributed parties interact to make decentralized authorization decisions.

A Note on Bayesian Estimation of the Traffic Intensity in M/M/1 Queue and Queue Characteristics under Quadratic Loss Function

Sanku Dey

2008 Volume 7 Pages 148-154
Published: 2008
Released on J-STAGE: January 08, 2009

DOI: https://doi.org/10.2481/dsj.7.148

JOURNAL FREE ACCESS

Bayes estimators of the traffic intensity r and various queue characteristics in an M/M/1 queue are derived under different priors for r and the quadratic error loss function (QELF). Finally, a numerical example is given to illustrate the results.

Some Aspects of the Analysis of Ecological Safety of the Industrial Technologies in the Ukraine

Z. Runovska, G. Chasnik

2008 Volume 7 Pages 155-166
Published: 2008
Released on J-STAGE: January 08, 2009

DOI: https://doi.org/10.2481/dsj.7.155

JOURNAL FREE ACCESS

Some aspects of financial tools for countering climate change under flexible Kyoto mechanisms are studied. Within industry sectors and production processes, data from the National GG Cadastre (period 1998-2005) on energy consumption and GG emissions are processed by means of an information-analytical system constructed on the Microstrategy platform. Analysis of the rating of the industrial sectors relative to saved emission allowances enables investment flows to be distributed toward the development of innovative technologies according to the estimated contribution of each industrial sector to the country's total emission allowances.

Filtration of Spin Wave Signals at Transmission of Data Through a Ferromagnetic Medium

Yu.I. Gorobets, S.A. Reshetnyak, T.A. Khomenko

2008 Volume 7 Pages 167-170
Published: 2008
Released on J-STAGE: January 08, 2009

DOI: https://doi.org/10.2481/dsj.7.167

JOURNAL FREE ACCESS

In this paper, we calculate the dependencies of spin wave reflection intensity on frequency and external magnetic field for a ferrogarnet structure in an exchange mode, for which the influence of the magnetostatic part of the energy is neglected as compared with the exchange part. A ferrogarnet structure is chosen because it has a very small damping parameter and provides high-quality transmission of data.

Study and Application of Grey Entropy Weight Decision Making in Risk Management

DeYu Kong, Lu Liu, Rui Miaoi, Lu Yin

2008 Volume 7 Pages 171-178
Published: November 04, 2008
Released on J-STAGE: March 05, 2009

DOI: https://doi.org/10.2481/dsj.7.171

JOURNAL FREE ACCESS

In traditional risk evaluation, the weight of a risk index is given in advance, so it lacks objectivity. Using weights and properties generated by entropy concepts, including the idea of information entropy, the comprehensive weight, which can be combined with entropy weight, is calculated. A grey evaluation model of a project risk evaluation index based on comprehensive entropy weight is built. Further, we present empirical research on a real project, which indicates that this approach is easy to calculate, assigns weights scientifically, and evaluates risk accurately.

Approaches in Using MatML As a Common Language for Materials Data Exchange

T Ojala, H-H Over

2008 Volume 7 Pages 179-195
Published: November 04, 2008
Released on J-STAGE: March 05, 2009

DOI: https://doi.org/10.2481/dsj.7.179

JOURNAL FREE ACCESS

Utilization of XML techniques is seen as a necessary step towards more powerful ways of incorporating semantics into data exchange between heterogeneous systems. In this paper, various techniques are studied and tried, such as XSL transformations (XSLT) and ways of extending the contents of XML Schemas, the final aim being to create an understanding of the possibilities and a roadmap that could lead to useful real-world applications. Based on a materials database, an XML Schema is specified that defines the structure of an XML document capable of representing quite complex materials test data together with mandatory metadata. Several approaches are discussed, and some of them are implemented in prototypes, to study how to comply with and use MatML in order to support sharing of experimentally measured materials data.

Report from the 2nd Workshop on Extremely Large Databases

Jacek Becla, Kian-Tat Lim

2008 Volume 7 Pages 196-208
Published: November 04, 2008
Released on J-STAGE: March 05, 2009

DOI: https://doi.org/10.2481/dsj.7.196

JOURNAL FREE ACCESS

The complexity and sophistication of large scale analytics in science and industry have advanced dramatically in recent years. Analysts are struggling to use complex techniques such as time series analysis and classification algorithms because their familiar, powerful tools are not scalable and cannot effectively use scalable database systems. The 2nd Extremely Large Databases (XLDB) workshop was organized to understand these issues, examine their implications, and brainstorm possible solutions. The design of a new open source science database, SciDB, which emerged from the first workshop in this series, was also debated. This paper is the final report of the discussions and activities at this workshop.

Errata

A Simple Approach for Data Mining in Delphi

Hewen Tang, Yongsheng Cao

2008 Volume 7 Pages E1
Published: 2008
Released on J-STAGE: April 21, 2008

DOI: https://doi.org/10.2481/dsj.7.E1

JOURNAL FREE ACCESS

The correct affiliations for the authors of this paper are given above.

A Multimeasurand ISO GUM Supplement is Urgent

V. V. Ezhela

2008 Volume 7 Pages E2
Published: 2008
Released on J-STAGE: June 09, 2008

DOI: https://doi.org/10.2481/dsj.7.E2

JOURNAL FREE ACCESS

Corrections made by the author after publication are listed in the pdf below.

Download PDF (187K)