Data Science Journal
Online ISSN: 1683-1470
Current issue
Showing 1-33 of the 33 articles from the selected issue
Papers
  • Prabha Dhandayudam, Ilango Krishnamurthi
    2014 Volume 13 Pages 1-11
    Published: 2014
    Released: 2014/04/09
    [Advance publication] Released: 2014/04/03
    JOURNAL FREE ACCESS
    Customer segmentation is a process that divides a business's total customers into groups according to the diversity of their purchasing behavior and characteristics. The data mining clustering technique can be used to accomplish this customer segmentation. This technique clusters the customers in such a way that the customers within one group behave similarly to each other and differently from the customers in other groups. Customer-related data are categorical in nature. However, clustering algorithms for categorical data are few and are unable to handle uncertainty. Rough set theory (RST) is a mathematical approach that handles uncertainty and is capable of discovering knowledge from a database. This paper proposes a new clustering technique called MADO (Minimum Average Dissimilarity between Objects) for categorical data based on elements of RST. The proposed algorithm is compared with other RST-based clustering algorithms, such as MMR (Min-Min Roughness), MMeR (Min Mean Roughness), SDR (Standard Deviation Roughness), SSDR (Standard deviation of Standard Deviation Roughness), and MADE (Maximal Attributes DEpendency). The results show that for the real customer data considered, the MADO algorithm achieves clusters with higher cohesion, lower coupling, and lower computational complexity than the above-mentioned algorithms. The proposed algorithm has also been tested on a synthetic data set to show that it is also suitable for high-dimensional data.
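    A minimal sketch of the dissimilarity computation at the heart of such categorical clustering: average pairwise simple-matching dissimilarity, with a greedy binary split that minimizes it. The function names, toy customer records, and splitting heuristic are illustrative assumptions, not the published MADO procedure.
```python
from itertools import combinations

def dissimilarity(x, y):
    """Simple matching distance: fraction of attributes on which two objects differ."""
    return sum(a != b for a, b in zip(x, y)) / len(x)

def avg_dissimilarity(cluster):
    """Average pairwise dissimilarity among the objects of one cluster."""
    if len(cluster) < 2:
        return 0.0
    pairs = list(combinations(cluster, 2))
    return sum(dissimilarity(x, y) for x, y in pairs) / len(pairs)

def best_binary_split(cluster):
    """Greedy step: split on the attribute value whose two resulting clusters
    have the lowest mean of average within-cluster dissimilarities."""
    best = None
    for attr in range(len(cluster[0])):
        for value in {obj[attr] for obj in cluster}:
            left = [o for o in cluster if o[attr] == value]
            right = [o for o in cluster if o[attr] != value]
            if not left or not right:
                continue
            score = (avg_dissimilarity(left) + avg_dissimilarity(right)) / 2
            if best is None or score < best[0]:
                best = (score, left, right)
    return best

# Toy customer records: (gender, preferred category, payment method).
customers = [("F", "grocery", "card"), ("F", "grocery", "cash"),
             ("M", "electronics", "card"), ("M", "electronics", "card")]
score, left, right = best_binary_split(customers)
print(f"split score {score:.2f}:", left, "|", right)
```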
  • D Sasikala, K Premalatha
    2014 Volume 13 Pages 12-25
    Published: 2014
    Released: 2014/04/21
    [Advance publication] Released: 2014/04/03
    JOURNAL FREE ACCESS
    In recent times, the mining of association rules from XML databases has received attention because of its wide applicability and flexibility. Many mining methods have been proposed. Because of the inherent flexibility of the structures and the semantics of the documents, however, these methods are challenging to use. In order to accomplish the mining, an XML document must first be converted into a relational dataset, and an index table with node encoding is created to extract transactions and interesting items. In this paper, we propose a new method to mine association rules from XML documents using a new type of node encoding scheme that employs a Unique Identifier (UID) to extract the important items. The node scheme modified with UID encoding speeds up the mining process. A significance measure is used to identify the important rules found in the XML database. Finally, the mining procedure calculates the confidence that the identified rules are indeed meaningful. Experiments are conducted using XML databases available in the XML data repository. The results illustrate that the proposed method is efficient in terms of computation time and memory usage.
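    As an illustration of the general pipeline the paper describes, the sketch below flattens an XML document into transactions with Python's standard library and mines simple two-item rules by support and confidence. The toy document and thresholds are invented; the paper's UID node encoding and index table are not reproduced here.
```python
import xml.etree.ElementTree as ET
from itertools import combinations

# A toy XML "database" of purchase transactions (invented for illustration).
xml_doc = """<orders>
  <order><item>milk</item><item>bread</item></order>
  <order><item>milk</item><item>eggs</item></order>
  <order><item>milk</item><item>bread</item><item>eggs</item></order>
</orders>"""

# Step 1: flatten each <order> element into one transaction of item names.
root = ET.fromstring(xml_doc)
transactions = [{item.text for item in order} for order in root]

def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

# Step 2: report two-item rules that clear minimum support and confidence.
items = sorted(set().union(*transactions))
for a, b in combinations(items, 2):
    s = support({a, b})
    for ante, cons in ((a, b), (b, a)):
        conf = s / support({ante})
        if s >= 0.5 and conf >= 0.6:
            print(f"{ante} -> {cons}  support={s:.2f}  confidence={conf:.2f}")
```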
  • Xin Cheng, Changjun Hu, Yang Li
    2014 Volume 13 Pages 26-44
    Published: 2014
    Released: 2014/04/27
    [Advance publication] Released: 2014/04/24
    JOURNAL FREE ACCESS
    A Materials Engineering Application (MEA) has been presented as a solution for the problems of materials design, solutions simulation, production and processing, and service evaluation. Large amounts of data are generated in the MEA distributed and heterogeneous environment. As the demand for intelligent engineering information applications increases, the challenge is to organize these complex data effectively and provide timely and accurate on-demand services. In this paper, based on the supporting environment of Open Cloud Services Architecture (OCSA) and Virtual DataSpace (VDS), a new semantic-driven knowledge representation model for MEA information is proposed. Faced with MEA's constantly changing user requirements, this model elaborates the semantic representation of data, services, and their relationships to support the construction of a domain knowledge ontology. Then, based on the ontology modeling in VDS, the semantic representations of association mapping, rule-based reasoning, and evolution tracking are analyzed to support MEA knowledge acquisition. Finally, an application example of knowledge representation in the field of materials engineering is given to illustrate the proposed model, and some experimental comparisons are discussed for evaluating and verifying the effectiveness of this method.
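    A minimal sketch of the general idea of representing domain data, services, and their relationships as ontology triples, using the rdflib library. The mea namespace, class and property names, and SPARQL query are invented for illustration and are not the paper's VDS ontology.
```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF

# Hypothetical namespace; the paper's actual VDS ontology is not reproduced here.
MEA = Namespace("http://example.org/mea#")
g = Graph()
g.bind("mea", MEA)

# Describe a material, a simulation service, and the relationship between them.
g.add((MEA.Steel42, RDF.type, MEA.Material))
g.add((MEA.Steel42, MEA.tensileStrengthMPa, Literal(510)))
g.add((MEA.HeatSim, RDF.type, MEA.SimulationService))
g.add((MEA.HeatSim, MEA.acceptsInput, MEA.Steel42))

# A SPARQL query then supports simple rule-like knowledge acquisition:
# find every service that accepts some material as input.
q = "SELECT ?svc WHERE { ?svc mea:acceptsInput ?m . ?m a mea:Material . }"
for row in g.query(q, initNs={"mea": MEA}):
    print(row.svc)
```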
  • Taro Ubukawa, Alex de Sherbinin, Harlan Onsrud, Andy Nelson, Karen Pay ...
    2014 Volume 13 Pages 45-66
    Published: 2014
    Released: 2014/05/29
    [Advance publication] Released: 2014/05/15
    JOURNAL FREE ACCESS
    There is a clear need for a public domain data set of road networks with high spatial accuracy and global coverage for a range of applications. The Global Roads Open Access Data Set (gROADS), version 1, is a first step in that direction. gROADS relies on data from a wide range of sources and was developed using a range of methods. Traditionally, map development was highly centralized and controlled by government agencies because of the high cost of the required expertise and technology. In the past decade, however, high resolution satellite imagery and global positioning system (GPS) technologies have come into wide use, and there has been significant innovation in web services, such that a number of new methods to develop geospatial information have emerged, including automated and semi-automated road extraction from satellite/aerial imagery and crowdsourcing. In this paper we review the data sources, methods, and pros and cons of a range of road data development methods: heads-up digitizing, automated/semi-automated extraction from remote sensing imagery, GPS technology, crowdsourcing, and compiling existing data sets. We also consider the implications of each method for the production of open data.
  • A Düsterhus, A Hense
    2014 Volume 13 Pages 67-78
    Published: 2014
    Released: 2014/06/09
    [Advance publication] Released: 2014/06/05
    JOURNAL FREE ACCESS
    A peer review scheme comparable to that used in traditional scientific journals is a major element missing in bringing publications of raw data up to standards equivalent to those of traditional publications. This paper introduces a quality evaluation process designed to analyse the technical quality as well as the content of a dataset. The process is based on quality tests, the results of which are evaluated with the help of expert knowledge. As a result, the quality is summarized by a single value. Further, the paper includes an application and a critical discussion of the potential for success, the possible introduction of the process into data centres, and the practical implications of the scheme.
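    A minimal sketch of the kind of aggregation such a process might perform, assuming each quality test yields a score in [0, 1] and an expert assigns weights. The tests, weights, and scale below are invented, not the authors' scheme.
```python
# Each test returns a pass fraction in [0, 1]; expert-assigned weights express
# how much each aspect should count toward the final single-value estimate.
test_results = {"file_integrity": 1.00, "metadata_completeness": 0.80,
                "physical_plausibility": 0.95, "documentation": 0.60}
expert_weights = {"file_integrity": 3, "metadata_completeness": 2,
                  "physical_plausibility": 4, "documentation": 1}

total_weight = sum(expert_weights.values())
quality = sum(test_results[t] * w for t, w in expert_weights.items()) / total_weight
print(f"Dataset quality estimate: {quality:.2f}")  # single value in [0, 1]
```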
  • Ge Peng, Jean-Raymond Bidlot, H Paul Freitag, Carl J Schreck III
    2014 Volume 13 Pages 79-87
    Published: 2014
    Released: 2014/08/12
    [Advance publication] Released: 2014/07/29
    JOURNAL FREE ACCESS
    This article documents a systematic bias in surface wind directions between the TAO buoy measurements at 0°, 170°W and the ECMWF analysis and forecasts. This bias was of the order of 10° and persisted from November 2008 to January 2010, which was consistent with a post-recovery calibration drift in the anemometer vane. Unfortunately, the calibration drift was too time-variant to be used to correct the data, so the quality flag for this deployment was adjusted to reflect low data quality. The primary purpose of this paper is to inform users in the modelling and remote-sensing community about this systematic, persistent wind directional bias, which will allow users to make an educated decision on using the data and be aware of its potential impact on their downstream product quality. The uncovering of this bias and its source demonstrates the importance of continuous scientific oversight and effective communication between users and data providers in stewarding scientific data. It also suggests that the buoy data quality control procedures of the TAO and ECMWF systems need to be improved so that systematic wind direction biases such as the one described here can be detected in the future.
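    Detecting a directional bias of this kind is essentially a circular-statistics exercise on paired buoy-minus-model direction differences. The sketch below, with invented sample values, shows one common way to compute such a mean bias; it is not the procedure used by the authors.
```python
import numpy as np

# Hypothetical paired wind directions (degrees): buoy observations vs. model analysis.
buoy_dir = np.array([182.0, 190.0, 175.0, 200.0, 188.0])
model_dir = np.array([172.0, 179.0, 166.0, 189.0, 178.0])

# Wrap differences into (-180, 180] before averaging, then take the circular mean
# via the mean sine and cosine so that angles near the wrap point are handled.
diff = (buoy_dir - model_dir + 180.0) % 360.0 - 180.0
rad = np.deg2rad(diff)
mean_bias = np.rad2deg(np.arctan2(np.sin(rad).mean(), np.cos(rad).mean()))
print(f"Mean directional bias: {mean_bias:.1f} deg")  # ~ +10 deg for this sample
```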
  • Costantino Thanos
    2014 Volume 13 Pages 88-105
    Published: 2014
    Released: 2014/09/14
    [Advance publication] Released: 2014/09/11
    JOURNAL FREE ACCESS
    Modern science is increasingly data-intensive, multidisciplinary, and network-centric. There is an emerging consensus among the members of the academic research community that the practices of this new science paradigm should be congruent with “open science”. This entails that the bonanza of research data and the widely available algorithms, data tools, and data services produced by the members of the research community must be discoverable, understandable, and usable, which requires overcoming all kinds of heterogeneity and logical inconsistency. The main concept for coping with the many dimensions of heterogeneity and logical inconsistency is mediation. Mediation is achieved by mediators or brokers. These are software modules that exploit encoded knowledge about certain datasets, data services, and user needs in order to implement an intermediary service. A mediating environment is an environment that provides a core set of intermediary services. Mediation should be a distinct functionality of future research data infrastructures. This paper surveys the different levels of interoperability, i.e., exchangeability, compatibility, and usability, their properties and relationships, mediation concepts, functions, and intermediary services. The current interoperability landscape is also illustrated. Finally, the paper advocates the need for mediating environments to be supported by future research data infrastructures and envisions that one of the most important features of future research data infrastructures will be mediation software.
  • Singh Bharat, O P Vyas
    2014 Volume 13 Pages 106-118
    Published: 2014
    Released: 2014/11/14
    [Advance publication] Released: 2014/11/06
    JOURNAL FREE ACCESS
    Selecting an optimal feature subset from high-dimensional data with a very large number of features is an NP-complete problem. Because conventional optimization techniques are unable to tackle large-scale feature selection problems, meta-heuristic algorithms are widely used. In this paper, we propose a particle swarm optimization technique that utilizes regression techniques for feature selection. We then use the selected features to classify the data. Classification accuracy is used as the criterion to evaluate classifier performance, and classification is accomplished through the use of k-nearest neighbour (KNN) and Bayesian techniques. Various high-dimensional data sets are used to evaluate the usefulness of the proposed approach. Results show that our approach gives better results when compared with other conventional feature selection algorithms.
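    A minimal sketch of a binary PSO wrapper for feature selection with a KNN fitness function, using scikit-learn. The swarm parameters, dataset, and sigmoid transfer function are simplifying assumptions; the paper's combination with regression techniques is not reproduced.
```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)
n_particles, n_features, n_iters = 10, X.shape[1], 15

def fitness(mask):
    """Cross-validated KNN accuracy on the selected feature subset."""
    if not mask.any():
        return 0.0
    knn = KNeighborsClassifier(n_neighbors=5)
    return cross_val_score(knn, X[:, mask], y, cv=3).mean()

pos = rng.random((n_particles, n_features)) < 0.5      # binary positions (bool)
vel = rng.normal(0.0, 1.0, (n_particles, n_features))  # real-valued velocities
pbest = pos.copy()
pbest_fit = np.array([fitness(p) for p in pos])
gbest = pbest[pbest_fit.argmax()].copy()

for _ in range(n_iters):
    r1, r2 = rng.random(vel.shape), rng.random(vel.shape)
    vel = (0.7 * vel
           + 1.5 * r1 * (pbest.astype(float) - pos)
           + 1.5 * r2 * (gbest.astype(float) - pos))
    pos = rng.random(vel.shape) < 1.0 / (1.0 + np.exp(-vel))  # sigmoid transfer
    fit = np.array([fitness(p) for p in pos])
    improved = fit > pbest_fit
    pbest[improved] = pos[improved]
    pbest_fit[improved] = fit[improved]
    gbest = pbest[pbest_fit.argmax()].copy()

print(f"selected {int(gbest.sum())} of {n_features} features, "
      f"CV accuracy {pbest_fit.max():.3f}")
```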
  • Nidhi Kushwaha, O P Vyas
    2014 Volume 13 Pages 119-126
    Published: 2014
    Released: 2014/11/14
    [Advance publication] Released: 2014/11/06
    JOURNAL FREE ACCESS
    The Semantic Web (Web 3.0) has been proposed as an efficient way to access the increasingly large amounts of data on the internet. The Linked Open Data Cloud project is at present the major effort to implement the concepts of the Semantic Web, addressing the problems of inhomogeneity and large data volumes. RKBExplorer is one of many repositories implementing Open Data and contains considerable bibliographic information. This paper discusses bibliographic data, an important part of cloud data. Effective searching of bibliographic datasets can be a challenge because many of the papers residing in these databases do not have sufficient or comprehensive keyword information. In these cases, a search engine based on RKBExplorer can retrieve papers only by author names and paper titles, not by keywords. In this paper we attempt to address this problem by using the data mining algorithm Association Rule Mining (ARM) to develop keywords based on features retrieved from Resource Description Framework (RDF) data within a bibliographic citation. We demonstrate the applicability of this method for predicting missing keywords for bibliographic entries in several typical databases. A sketch of the keyword-prediction idea follows the footnote below.
    −−−−−
    ¹ Paper presented at 1st International Symposium on Big Data and Cloud Computing Challenges (ISBCC-2014) March 27-28, 2014. Organized by VIT University, Chennai, India. Sponsored by BRNS.
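    The keyword-prediction idea can be sketched as mining co-occurrence rules over bibliographic features and suggesting high-confidence consequents as missing keywords. The records and thresholds below are invented, not drawn from RKBExplorer.
```python
from collections import Counter
from itertools import combinations

# Hypothetical bibliographic records: title words already extracted as features.
records = [{"clustering", "categorical", "rough-sets"},
           {"clustering", "categorical", "k-modes"},
           {"clustering", "rough-sets", "uncertainty"},
           {"xml", "association-rules", "mining"}]

min_support, min_confidence = 0.25, 0.6
item_counts = Counter(i for r in records for i in r)
pair_counts = Counter(p for r in records for p in combinations(sorted(r), 2))

# Rules "feature -> keyword": if a new entry mentions the antecedent feature,
# the consequent is suggested as a missing keyword.
for (a, b), n in pair_counts.items():
    support = n / len(records)
    for ante, cons in ((a, b), (b, a)):
        confidence = n / item_counts[ante]
        if support >= min_support and confidence >= min_confidence:
            print(f"{ante} -> {cons} (support={support:.2f}, conf={confidence:.2f})")
```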
  • S P Syed Ibrahim, K R Chandran, C J Kabila Kanthasamy
    2014 Volume 13 Pages 127-137
    Published: 2014
    Released: 2014/11/27
    [Advance publication] Released: 2014/11/06
    JOURNAL FREE ACCESS
    The associative classification method integrates association rule mining and classification. Constructing an efficient classifier with a small set of high-quality rules is a highly important but challenging task. The lazy learning associative classification method removes the need to construct a classifier in advance but suffers from high computation costs. This paper proposes a Compact Highest Subset Confidence-Based Associative Classification scheme that generates compact subsets based on information gain and classifies new samples without constructing classifiers. Experimental results show that the proposed system outperforms both the traditional and the existing lazy learning associative classification methods. A sketch of the lazy approach follows the footnote below.
    −−−−−
    ¹ Paper presented at 1st International Symposium on Big Data and Cloud Computing Challenges (ISBCC-2014) March 27-28, 2014. Organized by VIT University, Chennai, India. Sponsored by BRNS.
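    A minimal sketch of the lazy (classifier-free) flavour of associative classification: rules are formed at prediction time from training records matching the test sample's attribute values, and the highest-confidence rule decides the class. The data are toy values, and the information-gain-based subset generation of the proposed scheme is not reproduced.
```python
from collections import Counter

# Hypothetical categorical training data: (attributes, class label).
train = [({"outlook": "sunny", "wind": "weak"}, "play"),
         ({"outlook": "sunny", "wind": "strong"}, "stay"),
         ({"outlook": "rain",  "wind": "weak"}, "play"),
         ({"outlook": "rain",  "wind": "strong"}, "stay")]

def predict(sample):
    """Lazy associative classification: score each class by the confidence of
    rules built only from attribute values the test sample actually has."""
    best_label, best_conf = None, -1.0
    for attr, value in sample.items():
        matching = [label for attrs, label in train if attrs.get(attr) == value]
        if not matching:
            continue
        label, count = Counter(matching).most_common(1)[0]
        conf = count / len(matching)
        if conf > best_conf:
            best_label, best_conf = label, conf
    return best_label

print(predict({"outlook": "sunny", "wind": "weak"}))  # 'play' (wind=weak rule, conf 1.0)
```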
  • Sugam Sharma, Udoyara S Tim, Johnny Wong, Shashi Gadia, Subhash Sharma
    2014 Volume 13 Pages 138-157
    Published: 2014
    Released: 2014/12/04
    [Advance publication] Released: 2014/11/24
    JOURNAL FREE ACCESS
    Today, science is passing through an era of transformation in which the inundation of data, dubbed the data deluge, is influencing the decision-making process. Science is increasingly driven by data and is being termed data science. In this internet age, the volume of data has grown to petabytes, and this large, complex, structured or unstructured, and heterogeneous data in the form of “Big Data” has gained significant attention. The rapid pace of data growth through various disparate sources, especially social media such as Facebook, has seriously challenged the data analytic capabilities of traditional relational databases. The velocity with which the amount of data is expanding gives rise to a complete paradigm shift in how new-age data is processed. Confidence in the data engineering of the existing data processing systems is gradually fading, whereas the capabilities of the new techniques for capturing, storing, visualizing, and analyzing data are evolving. In this review paper, we discuss some of the modern Big Data models that are leading contributors in the NoSQL era and claim to address Big Data challenges in reliable and efficient ways. We also take the potential of Big Data into consideration and try to reshape the original operational-oriented definition of “Big Science” (Furner, 2003) into a new data-driven definition, rephrasing it as “The science that deals with Big Data is Big Science.”
  • Hua Qin, Lynne Davis, Matthew Mayernik, Patricia Romero Lankao, John D ...
    2014 Volume 13 Pages 158-171
    Published: 2014
    Released: 2014/12/04
    [Advance publication] Released: 2014/11/26
    JOURNAL FREE ACCESS
    Meta-analyses are studies that bring together data or results from multiple independent studies to produce new and over-arching findings. Current data curation systems only partially support meta-analytic research. Some important meta-analytic tasks, such as the selection of relevant studies for review and the integration of research datasets or findings, are not well supported in current data curation systems. To design tools and services that more fully support meta-analyses, we need a better understanding of meta-analytic research. This includes an understanding of both the practices of researchers who perform the analyses and the characteristics of the individual studies that are brought together. In this study, we make an initial contribution to filling this gap by developing a conceptual framework linking meta-analyses with data paths represented in published articles selected for the analysis. The framework focuses on key variables that represent primary/secondary datasets or derived socio-ecological data, contexts of use, and the data transformations that are applied. We introduce the notion of using variables and their relevant information (e.g., metadata and variable relationships) as a type of currency to facilitate synthesis of findings across individual studies and leverage larger bodies of relevant source data produced in small science research. Handling variables in this manner provides an equalizing factor between data from otherwise disparate data-producing communities. We conclude with implications for exploring data integration and synthesis issues as well as system development.
  • A Aparicio-González, J L López-Jurado, R Balbín, J C Alonso, B Amengua ...
    2015 Volume 13 Pages 172-191
    Published: 2015
    Released: 2015/01/27
    [Advance publication] Released: 2015/01/10
    JOURNAL FREE ACCESS
    IBAMar is a regional database that brings together all the physical and biochemical data provided by multiparametric probes and water sample analyses taken during the cruises managed by the Balearic Oceanographic Center of the Instituto Español de Oceanografía (COB-IEO) during the last four decades. Initially, it integrated data from hydrographic profiles obtained from CTDs (conductivity, temperature, depth) equipped with several sensors, but it has recently been extended to incorporate data obtained with hydrocasts using oceanographic Niskin or Nansen bottles. The result is an extensive regional resource database that includes physical hydrographic data such as temperature (T), salinity (S), dissolved oxygen (DO), fluorescence, and turbidity, as well as biochemical data, specifically dissolved inorganic nutrients (phosphate, nitrate, nitrite, and silicate) and chlorophyll-a. Different technologies and methodologies were used by independent teams during the four decades of data sampling. However, in the IBAMar database, data have been reprocessed using the same protocols, and a standard quality control (QC) methodology has been applied to each variable. The result is a homogeneous and quality-controlled dataset. The IBAMar database at standard levels is freely available for exploration and download from http://www.ba.ieo.es/ibamar/.
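    Per-variable QC of the kind mentioned here typically includes range checks along the following lines; the thresholds and flag values in this sketch are illustrative assumptions, not IBAMar's actual limits.
```python
# Illustrative per-variable plausible ranges (not IBAMar's actual QC limits).
QC_RANGES = {"temperature_C": (-2.0, 35.0), "salinity_PSU": (2.0, 41.0),
             "oxygen_ml_l": (0.0, 9.0)}

def qc_flag(variable, value):
    """Return a simple quality flag: 1 = good, 4 = bad, 9 = missing."""
    if value is None:
        return 9
    low, high = QC_RANGES[variable]
    return 1 if low <= value <= high else 4

profile = [("temperature_C", 13.2), ("salinity_PSU", 38.5),
           ("temperature_C", 57.0), ("oxygen_ml_l", None)]
for var, val in profile:
    print(var, val, "-> flag", qc_flag(var, val))
```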
  • N Moles
    2015 Volume 13 Pages 192-202
    Published: 2015
    Released: 2015/01/27
    [Advance publication] Released: 2015/01/19
    JOURNAL FREE ACCESS
    With the growing importance of data to the scholarly record and the critical role journals play in facilitating data sharing, the complex landscape of scholarly journal data publication policies has become an obstacle for research. This paper outlines Data-PE, a framework for evaluating these policies. It takes the form of a conceptual foundation, comprising twelve criteria for evaluation, operationalized through an evaluation tool. Its objective is to function as a flexible means for a variety of stakeholders to appraise individual policies. Examples of the use of the framework are provided and means for the validation of the tool are discussed.
  • James Campbell
    2015 Volume 13 Pages 203-230
    Published: 2015
    Released: 2015/01/27
    [Advance publication] Released: 2015/01/19
    JOURNAL FREE ACCESS
    Making scientific data openly accessible and available for re-use is desirable to encourage validation of research results and/or economic development. Understanding what users may, or may not, do with data in online data repositories is key to maximizing the benefits of scientific data re-use. Many online repositories that allow access to scientific data indicate that data is “open,” yet specific usage conditions reviewed on 40 “open” sites suggest that there is no agreed upon understanding of what “open” means with respect to data. This inconsistency can be an impediment to data re-use by researchers and the public.
  • Ge Peng, Jeffrey L Privette, Edward J Kearns, Nancy A Ritchey, Steve A ...
    2015 Volume 13 Pages 231-253
    Published: 2015
    Released: 2015/02/02
    [Advance publication] Released: 2015/01/27
    JOURNAL FREE ACCESS
    This paper presents a stewardship maturity assessment model in the form of a matrix for digital environmental datasets. Nine key components are identified based on requirements imposed on digital environmental data and information that are cared for and disseminated by U.S. Federal agencies, drawing on U.S. law (i.e., the Information Quality Act of 2001), agencies’ guidance, expert bodies’ recommendations, and user needs. These components are: preservability, accessibility, usability, production sustainability, data quality assurance, data quality control/monitoring, data quality assessment, transparency/traceability, and data integrity. A five-level progressive maturity scale is then defined for each component, associated with measurable practices applied to individual datasets and representing the Ad Hoc, Minimal, Intermediate, Advanced, and Optimal stages. The rationale for each key component and its maturity levels is described. This maturity model, which leverages community best practices and standards, provides a unified framework for assessing scientific data stewardship. It can be used to create a stewardship maturity scoreboard for datasets and a roadmap for scientific data stewardship improvement, or to provide data quality and usability information to users, stakeholders, and decision makers.
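    The matrix lends itself to a simple scoreboard. In this sketch the nine component names come from the abstract, while the 1-5 level ratings are invented example scores, not a real assessment.
```python
# The nine key components from the paper; the ratings below are invented
# example scores (1 = Ad Hoc ... 5 = Optimal), not a real assessment.
LEVELS = {1: "Ad Hoc", 2: "Minimal", 3: "Intermediate", 4: "Advanced", 5: "Optimal"}
scores = {"preservability": 4, "accessibility": 5, "usability": 3,
          "production sustainability": 3, "data quality assurance": 4,
          "data quality control/monitoring": 2, "data quality assessment": 3,
          "transparency/traceability": 4, "data integrity": 5}

for component, level in scores.items():
    print(f"{component:32s} level {level} ({LEVELS[level]})")
print(f"{'overall (mean of components)':32s} {sum(scores.values()) / len(scores):.1f}")
```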
  • Wang Xuezhi, Zhao Jianghua, Zhou Yuanchun, Li Jianhui
    2014 Volume 13 Pages 254-264
    Published: 2014
    Released: 2015/03/23
    [Advance publication] Released: 2014/11/06
    JOURNAL FREE ACCESS
    The rapid growth in the volume of remote sensing data and its increasing computational requirements pose major challenges for researchers, as traditional systems cannot adequately satisfy the demand for service. Cloud computing has the advantages of high scalability and reliability, which can provide firm technical support. This paper proposes a highly scalable geospatial cloud platform named the Geospatial Data Cloud, which is built on cloud computing. The architecture of the platform is first introduced, and then two subsystems, the cloud-based data management platform and the cloud-based data processing platform, are described.
    −−−−−
    This paper was presented at the First Scientific Data Conference on Scientific Research, Big Data, and Data Science, organized by CODATA-China and held in Beijing on 24-25 February, 2014.
Special Issue
Highlights of the 2013 International Forum on 'Polar Data Activities in Global Data Systems'.