Environment Climate Data Sweden (ECDS) is a new Swedish research infrastructure furthering the reuse of scientific data in the domains of environment and climate. ECDS consists of a technical infrastructure and a service organization supporting the management, exchange, and reuse of scientific data. The technical components of ECDS include a portal and an underlying data catalogue with information on datasets. The datasets are described using a metadata profile compliant with international standards. The datasets accessible through ECDS can be hosted by universities, institutes, or research groups, or at Swestore, the new Swedish federated data storage facility of the Swedish National Infrastructure for Computing (SNIC).
Several scientific communities relying on e-science infrastructures need persistent identifiers for data and contextual information. In this article, we present a framework for persistent identification that fundamentally supports context information. It is formulated as a set of low-level requirements and abstract data type descriptions, flexible enough to accommodate context information while remaining compatible with existing definitions and infrastructures. The abstract data type definitions we draw from the requirements and exemplary use cases can act as an evaluation tool for existing implementations or as a blueprint for future persistent identification infrastructures. A prototypical implementation based on the Handle System is briefly introduced. We also lay the groundwork for establishing a graph of persistent entities that can act as a base layer for more sophisticated information schemas to preserve context information.
Petascale data management and analysis remain among the main unresolved challenges in today's computing. The 6th Extremely Large Databases workshop was convened alongside the XLDB conference to discuss the challenges in the health care, biology, and natural resources communities. The role of cloud computing, the dominance of file-based solutions in science applications, in-situ and predictive analysis, and commercial software use in academic environments were also discussed in depth. This paper summarizes the discussions of this workshop.
In this paper we consider Bayesian estimators for the unknown parameters of the Gumbel type-II distribution. The Bayesian estimators cannot be obtained in closed form. Approximate Bayesian estimators are computed using Lindley’s approximation under different loss functions. The approximate Bayes estimates obtained under the assumption of non-informative priors are compared with their maximum likelihood counterparts using Monte Carlo simulation. A real data set is analyzed for illustrative purposes.
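The maximum likelihood baseline that the Bayes estimates are compared against can be sketched concretely. Assuming the common parameterisation F(x) = exp(-b x^(-a)) (the abstract does not state the paper's exact parameterisation), the sketch below concentrates b out of the log-likelihood and maximizes the resulting profile over a by ternary search; all function names are illustrative:

```python
import math
import random

def simulate_gumbel2(n, a, b, seed=0):
    """Draw n samples via the inverse CDF of F(x) = exp(-b * x**-a)."""
    rng = random.Random(seed)
    return [(b / -math.log(rng.random())) ** (1.0 / a) for _ in range(n)]

def profile_loglik(a, xs):
    """Profile log-likelihood in a; b is concentrated out as n / sum(x**-a)."""
    n = len(xs)
    s_inv = sum(x ** -a for x in xs)
    s_log = sum(math.log(x) for x in xs)
    b_hat = n / s_inv
    # log f(x) = log a + log b - (a+1) log x - b x**-a, summed over the sample
    return n * math.log(a) + n * math.log(b_hat) - (a + 1) * s_log - n, b_hat

def mle_gumbel2(xs, lo=0.05, hi=20.0, iters=200):
    """Maximize the (unimodal) profile log-likelihood over a by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if profile_loglik(m1, xs)[0] < profile_loglik(m2, xs)[0]:
            lo = m1
        else:
            hi = m2
    a_hat = (lo + hi) / 2
    return a_hat, profile_loglik(a_hat, xs)[1]

xs = simulate_gumbel2(2000, a=2.0, b=1.5, seed=42)
a_hat, b_hat = mle_gumbel2(xs)
```

Lindley's approximation would add posterior-correction terms built on these same likelihood derivatives; it is omitted here.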
A GIS for ocean data applications named "Ocean Data and Information Systems (ODIS)" was designed and developed. The system is based on the University of Minnesota MapServer, an open source platform for publishing spatial data and interactive mapping applications to the web, with MySQL as the backend database server. This paper discusses details of the storage and organization of oceanographic data, methods employed for visualization of parameter plots, and mapping of the data. ODIS is conceived as an end-to-end system comprising acquisition of data from a variety of heterogeneous ocean platforms, processing, integration, quality control, and web-based dissemination to users for operational and research activities. ODIS provides efficient data management together with mapping and visualization functions for oceanographic data.
The value of data in society is increasing rapidly. Organisations that work with data should have standard practices in place to ensure the successful curation of data. The World Data System (WDS) consists of a number of data centres responsible for curating research data sets for the scientific community. The WDS has no formal data curation framework or model in place to act as a guideline for member data centres. The objective of this research was to develop a framework for the curation of data in the WDS. A multiple-case study was conducted: interviews were used to gather qualitative data, and analysis of these data led to the development of the framework. The proposed framework is largely based on the Open Archival Information System (OAIS) functional model and caters for the curation of both analogue and digital data.
New high-throughput scientific instruments, telescopes, satellites, accelerators, supercomputers, sensor networks, and simulations are generating massive amounts of data. To exploit these huge volumes of data, a new type of e-infrastructure, the Global Research Data Infrastructure (GRDI), must be developed to harness the accumulating data and knowledge produced by research communities. This paper identifies the main challenges faced by future GRDIs, defines a conceptual framework for GRDIs based on the ecosystem metaphor, describes a core set of functionality that GRDIs must provide, and gives a set of recommendations for building future GRDIs.
The aim of this paper is to discuss the organizational architecture and standard system for sharing research data at the national level. The Data Sharing Network of Earth System Science (DSNESS) is one of the nine pilot projects of the Scientific Data Sharing Project in China and has become a long-term operational research data-sharing platform in the National Science and Technology Infrastructure (NSTI) of China. First, a data sharing union mechanism was designed with the core principle that “data come from research and will be reused in research”. Second, a data sharing organizational architecture was constructed, consisting of three sections: data resource architecture, data management architecture, and data services architecture. Based on this architecture, a physical data sharing network was constructed that includes one general center and 15 distributed sub-centers. Third, a series of data sharing standards and specifications was designed and implemented in the DSNESS. The reference model of the DSNESS standard system includes three levels of standards: directive standards, general standards, and application standards. In total, 21 high-level standards and specifications were developed and implemented in the DSNESS. Several core standards and specifications, such as the extensible metadata standard and the data quality control specifications, are analyzed in detail. Finally, the effect of the data services is summarized in three aspects: dataset services, standard and specification services, and international cooperation services. This research shows that the organizational architecture and standard system constitute a very important soft environment for research data sharing. The practices of the DSNESS will provide useful experience for multi-disciplinary data sharing in Earth science and will help to close the data gap between the data-rich and the data-poor at the national level.
We document the geographical and temporal distributions of oceanographic vertical profile observations made during World War II (1939-1945) that are included in the "World Ocean Database" (WOD). The WOD is a product of the NOAA National Oceanographic Data Center, USA, and its co-located ICSU World Data Center for Oceanography. The WOD is the largest collection of ocean profile data available internationally without restriction. All data shown in this paper are available online without restriction and at no cost. The WOD is built upon the international exchange of oceanographic data, with contributions received from many countries. Most of the data shown in this paper, and the WOD in which these data reside in a uniform format, were gathered under the auspices of the International Oceanographic Data and Information Exchange (IODE) committee of the Intergovernmental Oceanographic Commission (IOC) of UNESCO and the ICSU (International Council for Science) World Data Center system, which is now part of the ICSU World Data System. The WOD contains 112,714 ocean station data casts and 45,003 mechanical bathythermograph profiles for 1939-1945.
Among the key services that institutional data management infrastructures must provide are provenance and lineage tracking and the ability to associate data with contextual information needed for understanding and use. These functionalities are critical for addressing a number of key issues faced by data collectors and users, including trust in data, results traceability, data transparency, and data citation support. In this paper, we describe the support for these services within the Data Conservancy Service (DCS) software. The DCS provenance, context, and lineage services cross the four layers in the DCS data curation stack model: storage, archiving, preservation, and curation.
Data sharing has gained importance in scientific communities because scientific associations and funding organizations require long-term preservation and dissemination of data. To support psychology researchers in data archiving and data sharing, the Leibniz Institute for Psychology Information developed an archiving facility for psychological research data in Germany: PsychData. In this paper we report on the different types of data requests that were sent to researchers with the aim of building up a sustainable data archive. The resulting response rates were rather low, although comparable to those published by other authors. Possible reasons for the reluctance of researchers to submit data are discussed.
Materials failure refers to faults that occur in materials or components during service. To avoid the recurrence of similar failures, materials failure analysis is carried out to investigate the reasons for a failure and to propose improved strategies. The whole procedure requires substantial domain knowledge and also produces valuable new knowledge. However, information about a materials failure analysis is usually retained by the domain expert, and sharing it is technically difficult. This can seriously reduce the efficiency and accuracy of failure analysis. To solve this problem, this paper adopts ontology, a technology from the Semantic Web, as a tool for knowledge representation and sharing, and describes the construction of an ontology covering failure analysis, application areas, materials, and failure cases. Ontology-represented information is machine-understandable and can be easily shared through the Internet. At the same time, intelligent retrieval of failure cases, advanced statistics, and even automatic reasoning can be accomplished on the basis of ontology-represented knowledge. This can promote the sharing of knowledge about materials service safety and improve the efficiency of failure analysis. A case from the nuclear power plant domain is presented to show the details and benefits of this method.
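To illustrate the idea of ontology-represented, machine-shareable failure knowledge, here is a minimal sketch using plain subject-predicate-object triples. A real system would use RDF/OWL; the class and property names below (FailureCase, hasMaterial, and so on) are hypothetical, not those of the paper's ontology:

```python
# Failure-case knowledge as subject-predicate-object triples.
# All identifiers here are illustrative, not the paper's actual ontology.
triples = [
    ("case_001", "rdf:type", "FailureCase"),
    ("case_001", "hasMaterial", "304_stainless_steel"),
    ("case_001", "hasFailureMode", "stress_corrosion_cracking"),
    ("case_001", "occursIn", "nuclear_power_plant"),
    ("case_002", "rdf:type", "FailureCase"),
    ("case_002", "hasMaterial", "carbon_steel"),
    ("case_002", "hasFailureMode", "fatigue"),
]

def query(store, predicate, obj):
    """Return all subjects linked to obj through predicate -- the kind of
    retrieval an ontology store answers for failure-case search."""
    return [s for s, p, o in store if p == predicate and o == obj]

# Intelligent retrieval: find comparable cases by failure mode.
scc_cases = query(triples, "hasFailureMode", "stress_corrosion_cracking")
```

Because the representation is uniform triples, the same store supports statistics (counting cases per material) and rule-based reasoning without format conversion.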
Persistent Identifiers (PIDs) have lately received a lot of attention from scientific infrastructure projects and communities that aim to employ them for management of massive amounts of research data and metadata objects. Such usage scenarios, however, require additional facilities to enable automated data management with PIDs. In this article, we present a conceptual framework that is based on the idea of using common abstract data types (ADTs) in combination with PIDs. This provides a well-defined interface layer that abstracts from both underlying PID systems and higher-level applications. Our practical implementation is based on the Handle System, yet the fundamental concept of PID-based ADTs is transferable to other infrastructures, and it is well suited to achieve interoperability between them.
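The notion of a PID-based abstract data type can be sketched as a record of typed key-value entries behind a resolvable identifier, in the spirit of Handle System records. This is a hedged illustration, not the article's actual interface; the class, method, and entry-type names (as well as the example handle and URL) are assumptions:

```python
from dataclasses import dataclass, field

@dataclass
class PIDRecord:
    """Sketch of a PID as an abstract data type: a resolvable identifier
    carrying typed key-value entries, abstracting over Handle-style records.
    Entry types used below (URL, CHECKSUM) are illustrative."""
    pid: str
    entries: dict = field(default_factory=dict)  # entry type -> value

    def set_entry(self, entry_type, value):
        self.entries[entry_type] = value

    def get_entry(self, entry_type):
        return self.entries.get(entry_type)

class PIDRegistry:
    """In-memory stand-in for a PID resolution service; a real system
    would delegate to the Handle System or a comparable infrastructure."""
    def __init__(self):
        self._records = {}

    def register(self, record):
        self._records[record.pid] = record

    def resolve(self, pid):
        return self._records.get(pid)

registry = PIDRegistry()
rec = PIDRecord("21.T1/obj-1")  # hypothetical handle
rec.set_entry("URL", "https://example.org/data/obj-1")
rec.set_entry("CHECKSUM", "sha256:0000")
registry.register(rec)
resolved = registry.resolve("21.T1/obj-1")
```

The point of the ADT layer is exactly this separation: higher-level applications program against `register`/`resolve`/`get_entry`, while the backing PID system can be swapped out.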
Examining the scientific process in relation to endangered data, data reuse, and data sharing is crucial to facilitating scientific workflows. Deterioration, format obsolescence, and insufficient metadata for discovery are significant problems leading to the loss of scientific data. The research presented in this paper considers these potentially lost data. Four one-hour focus groups and a demographic survey were conducted with 14 scientists to learn about their attitudes toward endangered data, data sharing, and data reuse, and their opinions of the DARI inventory. The results indicate that unavailability, lack of context, accessibility issues, and potential endangerment are key concerns for scientists.
The International Council for Science (ICSU) vision explicitly recognises the value of data and information to science and particularly emphasises the urgent requirement for universal and equitable access to high-quality scientific data and information. A universal public domain for scientific data and information would be transformative for both science and society. Over the last several years, two ad-hoc ICSU committees, the Strategic Committee on Information and Data (SCID) and the Strategic Coordinating Committee on Information and Data (SCCID), produced key reports making 5 and 14 recommendations, respectively, aimed at improving universal and equitable access to data and information for science and at providing direction for key international scientific bodies, such as the Committee on Data for Science and Technology (CODATA) and the World Data System, whose formation was ratified by ICSU in 2008. This contribution outlines the framing context for both committees based on the changed world scene for scientific data conduct in the 21st century. We include details on the relevant recommendations and their important consequences for the worldwide community of data providers and consumers, leading to conclusions and avenues for advancement that must be carried to the many thousands of data scientists worldwide.
The Global Observing Systems Information Center (GOSIC), initiated in 1997 at the request of the Global Climate Observing System (GCOS) Steering Committee, responds to a need identified by the global climate observing community for easier and more effective access to observational climate data and information. GOSIC manages an online portal providing an entry point for users of climate-related global observing systems data and information systems and also helps serve the needs of the World Data Center for Meteorology, Asheville. GOSIC continues to evolve and expand its responsibilities, and this paper, presented at the 1st ICSU World Data System Conference in Kyoto, Japan, in September 2011, updates a similar paper (Diamond & Lief, 2009). The considerable updates made to the GOSIC portal since 2009 are discussed in the paper.
The establishment of the Russian-Ukrainian WDS Segment and its current status, main priorities, and research activities are described. One of the high-priority tasks for Segment members is the development of a common information space: a transition from legacy systems and individual services to a common, globally interoperable, distributed data system that incorporates emerging technologies and new scientific data activities. The new system will build on the potential and added value offered by advanced interconnections between data management and data processing components for disciplinary and multidisciplinary applications. Accordingly, the principles of the architectural organization of intelligent data processing systems are discussed in this paper.
Geomagnetic indices are basic data in solar-terrestrial physics and in operational space weather activities. The International Service of Geomagnetic Indices (ISGI) is in charge of the derivation and dissemination of the geomagnetic indices acknowledged by the International Association of Geomagnetism and Aeronomy (IAGA, an IUGG association). Institutes that are not part of ISGI began early in the Internet age to circulate preliminary values of geomagnetic indices online. In the absence of quality stamping, this resulted in a very confusing situation. The ISGI label was found to be the simplest and safest way to ensure quality stamping of circulated geomagnetic indices.
International attention to scientific data continues to grow. Opportunities emerge to re-visit long-standing approaches to managing data and to critically examine new capabilities. We describe the cognitive importance of metaphor. We describe several metaphors for managing, sharing, and stewarding data and examine their strengths and weaknesses. We particularly question the applicability of a “publication” approach to making data broadly available. Our preliminary conclusions are that no one metaphor satisfies enough key data system attributes and that multiple metaphors need to co-exist in support of a healthy data ecosystem. We close with proposed research questions and a call for continued discussion.
The British Geological Survey has operated a World Data Centre for Geomagnetism since 1966. Geomagnetic time-series data from around 280 observatories worldwide at a number of time resolutions are held along with various magnetic survey, model, and activity index data. The operation of this data centre provides a valuable resource for the geomagnetic research community.
The operation of the WDC and details of the range of data held are presented. The quality control procedures applied to incoming data are described, as is collaborative work with other data centres to distribute data and improve the overall consistency of data held worldwide. The development of standards for metadata associated with datasets is demonstrated, and current efforts to digitally preserve the BGS analogue holdings of magnetograms and observatory yearbooks are described.
In this report we introduce the development of the WDC for Geophysics, Beijing, including our activities in the electronic Geophysical Year (eGY) and in the transition period from WDC to WDS, and we present our future plans. We have engaged in the development of geophysical informatics and related data science. We have begun visualizing geomagnetic field data in a GIS system. Our database has been expanded from geomagnetic data to solid-Earth geophysics data, including geothermal data, gravity data, and records of aurora sightings in ancient China. We have also joined the study of the history of the development of geophysics in China organized by the Chinese Geophysical Society (CGS).
The Centre de Données astronomiques de Strasbourg (CDS), created in 1972, has been a pioneer in the dissemination of digital scientific data. Ensuring sustainability over several decades has been a major issue because science and technology evolve continuously and the data flow increases endlessly. The paper briefly describes CDS activities, major services, and the R&D strategy used to take advantage of new technologies. The next frontiers for CDS are the new Web 2.0/3.0 paradigm and, at a more general level, global interoperability of astronomical online resources in the Virtual Observatory framework.
The integration of Taiwan's biodiversity databases started in 2001, the same year that Taiwan joined GBIF as an associate participant. Taiwan hence embarked on a decade of integrating biodiversity data. With the support of the NSC and the COA, the databases and websites of TaiBIF, TaiBNET (TaiCOL), TaiBOL, and TaiEOL have been established separately, collaborating with GBIF, COL, BOL, and EOL, respectively. A cross-agency committee was established in Academia Sinica in 2008 to formulate policies on data collection and integration as well as the mechanism for making data available to the public. Any commissioned project will hereafter be asked to include these policy requirements in its contract. TaiBIF has gained recognition in Taiwan and abroad for its efforts over the past several years, and it can offer its experience and insights for others to reference or replicate.
ISRIC - World Soil Information has a mandate to serve the international community as custodian of global soil information and to increase awareness and understanding of the role of soils in major global issues. To adapt to the current demand for soil information, ISRIC is updating its enterprise data management system, including procedures for registering acquired data, such as lineage, versioning, quality assessment, and control. Data can be submitted, queried, and analysed using a growing range of web-based services - ultimately aiming at full and open exchange of data, metadata, and products - through the ICSU-accredited World Data Centre for Soils.
Increasing demand within the geomagnetism community for high quality real-time or near-real-time observatory data means there is a requirement for data producers to have a robust and scalable data processing infrastructure capable of delivering geomagnetic data products over the Internet in a variety of formats. We describe a new software system, developed at BGS, which will allow access to our geomagnetic data products both within our organisation's intranet and over the Internet. We demonstrate how the system is designed to afford easy access to the data by a wide range of software clients and allow rapid development of software utilizing our observatory data.
The International VLBI Service for Geodesy and Astrometry (IVS) is a globally operating service that coordinates and performs Very Long Baseline Interferometry (VLBI) activities through its constituent components. The VLBI activities are associated with the creation, provision, dissemination, and archiving of relevant VLBI data and products. The data and products are stored in dedicated IVS components called ‘Data Centers.’ The three Primary Data Centers provide identical data holdings. We give a brief overview of the organizational structure of the IVS and describe the general data flow among the various IVS components from preparing observational plans to creating the final products.
The World Data Centre for Geomagnetism, Mumbai has functioned as a division of the Indian Institute of Geomagnetism, Navi Mumbai since its full-fledged activities commenced in 1991 in coordination with the International Council of Scientific Unions (ICSU) Panel on World Data Centres. The centre is responsible for compiling final hourly absolute values from nine of the Indian magnetic observatories and depositing these data with the World Data Centres. We have taken full advantage of technological advances to upgrade our data preservation and conservation policy at various levels. In recent years, the centre has prioritized activities related to digital preservation to ensure digital archiving of magnetic data from traditional media and digital conservation of very old handwritten and printed data volumes and magnetograms. In view of the scientific importance of data from the Colaba-Alibag Magnetic Observatory, old magnetograms and data volumes are being converted to digital images for long-term preservation. In the digital preservation process, the creation of metadata has become an important component for storing information related to old and current scientific records for future use. The centre also hosts a database-driven website to make datasets available online to the global scientific community.
The Japan Oceanographic Data Center has been submitting oceanographic data to the World Data Center for Oceanography through the framework of the International Oceanographic Data and Information Exchange Committee sponsored by the UNESCO/IOC. In the World Ocean Database 2009, which is the compiled database of the WDC for Oceanography, the Japanese contribution has reached about 16% of the total. Japan is one of the main data suppliers for the WDC for Oceanography. JODC would like to continue to contribute to the World Data System with the WDC.
In this paper, we introduce the data and information activities of the International Center for Space Weather Science and Education (ICSWSE), Kyushu University, Japan. The principal data source is the MAGDAS (MAGnetic Data Acquisition System) project, a global network of geomagnetic observations operated through collaborations between ICSWSE and institutions in many countries. We operate 66 stations, including more than 30 stations distributed along the 210° magnetic meridian and more than 10 stations along the magnetic equator. We have established a semi-automatic data acquisition system via the Internet. Provisional data plots and geomagnetic indices derived from the project are available to the scientific community.
The Russian World Data Center for Solar-Terrestrial Physics and the World Data Center for Solid Earth Physics have been collecting, analyzing, archiving, and disseminating data and information on a wide range of geophysical disciplines since the International Geophysical Year of 1957-1958. The centers provide users with free and convenient access to their large and permanently increasing volumes of data. The Russian WDCs participate in national and international scientific programs and projects, such as INTERMAGNET, InterMARGINS, and the International Polar Year. Since 2008, five Russian WDCs and one Ukrainian WDC have been associated in a regional segment of the World Data Centers.
Ptplot is a set of two-dimensional signal-plotting components written in Java, with features such as embeddability in applets or applications, automatic or manual tick marks, logarithmic axes, infinite zooming, and much more. The World Data Centre of IPS applies Ptplot as a multi-function online data plotting tool by converting various text-format data files into Ptplot-recognizable XML files with the AWK language. At present, Ptplot allows eight archived solar-terrestrial science data sets to be easily plotted, viewed, and downloaded from the IPS web site.
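The conversion step can be pictured with a short sketch. The original pipeline used AWK and Ptplot's XML format; the Python version below emits a generic XML layout with illustrative element names, not the actual Ptplot (PlotML) schema:

```python
# Sketch of the text-to-XML conversion step. The element names (<plot>,
# <dataset>, <point>) are generic placeholders, not Ptplot's real schema.
import xml.etree.ElementTree as ET

def columns_to_xml(text, xlabel="time", ylabel="value"):
    """Convert two whitespace-separated columns into a simple XML dataset."""
    plot = ET.Element("plot", {"x": xlabel, "y": ylabel})
    dataset = ET.SubElement(plot, "dataset")
    for line in text.strip().splitlines():
        x, y = line.split()[:2]  # keep the first two columns as strings
        ET.SubElement(dataset, "point", {"x": x, "y": y})
    return ET.tostring(plot, encoding="unicode")

raw = """2020.0 101.3
2020.1 99.8
2020.2 102.5"""
xml_out = columns_to_xml(raw)
```

The AWK originals would do the same line-by-line field splitting; the value of the XML target is that one plotting component can then render any of the eight archived data sets.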
This paper summarizes our effort towards managing the multi-disciplinary disaster-related data from the Great East Japan Earthquake, which occurred on March 11, 2011, off the coast of northeast Japan. This earthquake caused the largest tsunami in the recorded history of Japan, killed many people along the coast, and caused a nuclear disaster in Fukushima that continues to affect a large area of Japan. Just after the earthquake, we started crisis-response data management activities to provide useful information for supporting disaster response and recovery. This paper introduces the various types of datasets we produced, from the viewpoint of data management processes, and draws lessons from our post-disaster activities.
We describe our knowledge-based service architecture for multi-risk environmental decision support, capable of handling geo-distributed heterogeneous real-time data sources. Data sources include tide gauges, buoys, seismic sensors, satellites, earthquake alerts, Web 2.0 feeds for crowd-sourcing 'unconventional' measurements, and simulations of tsunami wave propagation. Our system-of-systems multi-bus architecture provides a scalable and high-performance messaging backbone. We address semantic interoperability between heterogeneous datasets by using a self-describing 'plug-in' data source approach. As crises develop, we can steer the processing server and adapt data fusion and mining algorithm configurations in real time.
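The self-describing 'plug-in' idea can be sketched as follows: each source publishes a machine-readable description of its own fields, so the fusion layer can ingest new source types without hard-coded schemas. This is a minimal illustration with hypothetical names and values, not the system's actual interface:

```python
# Sketch of a self-describing 'plug-in' data source. The class, field,
# and station names are illustrative, not the deployed system's.
class TideGaugeSource:
    def describe(self):
        """Machine-readable self-description of this source's schema."""
        return {
            "source": "tide_gauge",
            "fields": {"station": "str", "time": "iso8601", "level_m": "float"},
        }

    def read(self):
        """Return the latest records (hypothetical sample data)."""
        return [{"station": "TG-01", "time": "2011-03-11T14:50:00Z", "level_m": 1.82}]

def fuse(sources):
    """Normalize records from heterogeneous sources using their own
    self-descriptions, so no per-source schema is hard-coded here."""
    records = []
    for src in sources:
        desc = src.describe()
        for rec in src.read():
            records.append({"source": desc["source"], **rec})
    return records

fused = fuse([TideGaugeSource()])
```

A buoy or seismic plug-in would implement the same two methods with a different `describe()`, which is what lets the fusion layer remain source-agnostic.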
In this paper, a new approach to the detection of anomalies in geophysical records based on fuzzy mathematics is presented. The theory of discrete mathematical analysis and the collection of time-series processing algorithms constructed on its basis represent the results of this research direction. These algorithms arise from fuzzy modeling of the logic of an interpreter who visually recognizes anomalies in records, and they allow the analysis of large data sets that cannot be processed manually. The efficiency of these algorithms is demonstrated in several important geophysical applications. Plans for an extension of the Russian INTERMAGNET segment are presented.
The World Data System (WDS) requires that WDS data centers have significant data holdings and sustainable mechanisms for data source integration and sharing. Research data are an important scientific data resource but are difficult to archive and share. To develop a long-term data integration and sharing mechanism, a new approach to archiving research data derived from science research projects has been developed in China. In 2008, the host agency of the World Data Center for Renewable Resources and Environment, authorized by the Ministry of Science and Technology of China, began to implement the first pilot experiment in research data archiving. The data archiving process comprises four phases: data plan development, data archiving preparation, data submission, and data sharing and management. To make data archiving operate more smoothly, a data archiving environment was established, including a uniform core metadata standard, data archiving specifications, a smart metadata registration tool, and a web-based data management and sharing platform. During the last three years, research data from 49 projects have been collected by the sharing center. The datasets are about 2.26 TB in total size and have attracted over 100 users.
Diverse data accumulated by many science projects make up the most significant legacy of the International Polar Year (IPY 2007-2008). The Polar Data Center (PDC) of the National Institute of Polar Research (NIPR) has a responsibility to manage these data for Japan as a National Antarctic Data Center (NADC) and as the World Data Center (WDC) for Aurora. During the IPY, a significant number of multidisciplinary metadata records were compiled from IPY endorsed projects with Japanese activity. A tight collaboration was established between the Global Change Master Directory (GCMD), the Polar Information Commons (PIC), and the newly established World Data System (WDS).
In this paper we discuss environmental changes along the coastline of Nigeria, especially in the region around Lagos, based on provisional multi-disciplinary analyses of meteorological and maritime observations. This study has revealed that recent environmental change in the Nigerian coastal region has been much more apparent than in earlier years (1989-2007). Various kinds of ocean debris, transported mainly by coastal wind, are severely affecting the marine and coastal environment. Because the current ocean monitoring system has been found to be hampered by ocean debris, establishing a new system to obtain reliable observational data for monitoring and preserving the environment of the coastal region is an urgent task.
We present a new concept of analysis using visualization of large quantities of simulation data. The time development of 3D objects with high temporal resolution provides opportunities for scientific discovery. We visualize large quantities of simulation data using the visualization application 'Virtual Aurora', based on AVS (Advanced Visual Systems), and parallel distributed processing on the 'Space Weather Cloud' at NICT, based on Gfarm technology. We introduce two results of high-temporal-resolution visualization: the magnetic flux rope generation process and dayside reconnection, using a system of magnetic field line tracing.
It is often said that the fourth methodology of scientific research is "informatics". The first methodology is the theoretical approach, the second is observation and/or experiment, and the third is computer simulation. Informatics is a new methodology for data-intensive science, a concept based on the fact that most scientific data are digitized and the amount of data is huge. The facilities that support informatics are cloud systems. Herein we propose a cloud system designed especially for science. The basic concepts, design, resources, implementation, and applications of the NICT science cloud are discussed.
The Space Physics Archive Search and Extract (SPASE) project is an international collaboration among Heliophysics (solar and space physics) groups concerned with data acquisition and archiving. The SPASE group has simplified the search for data by developing the SPASE Data Model, a common method for describing datasets in the archives. The data model is an XML-based schema and is now in operational use. Its use is expanding, but other groups could still benefit from adopting SPASE. We discuss the present state of SPASE usage and how we foresee its future development.
A method for prediction and simulation based on a cell-based Geographic Information System (GIS) operating as a Cellular Automaton (CA) is proposed, together with the required data systems, in particular the use of a metasearch engine in a unified way. We confirm that the proposed cell-based GIS as CA makes flexible use of the attribute information attached to each cell, in concert with its location information, and works for simulating and predicting the spread of disasters.
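To make the cell-based CA idea concrete, the toy sketch below spreads a "fire" across a grid of cells with per-cell attributes; the three states, the 4-neighbour ignition rule, and one-step burnout are hypothetical stand-ins for the paper's actual disaster-spreading model.

```python
# Toy cellular automaton for disaster spreading on a grid of GIS-like cells.
# States and the spread rule are illustrative assumptions, not the paper's model.
EMPTY, FUEL, BURNING = 0, 1, 2

def step(grid):
    """One CA update: a FUEL cell ignites if any 4-neighbour is BURNING;
    a BURNING cell burns out to EMPTY after one step."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]  # update synchronously on a copy
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == FUEL:
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == BURNING:
                        new[r][c] = BURNING
                        break
            elif grid[r][c] == BURNING:
                new[r][c] = EMPTY
    return new

# A single ignition at the centre of a 3x3 fuel grid.
grid = [
    [FUEL, FUEL, FUEL],
    [FUEL, BURNING, FUEL],
    [FUEL, FUEL, FUEL],
]
grid = step(grid)
```

In a real cell-based GIS, each cell would carry richer attributes (land cover, slope, moisture) driving a probabilistic rather than deterministic transition rule.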
This paper focuses on using data mining techniques to efficiently and accurately discover the habitats and stopovers of migratory birds. We use three methods: (1) a density-based clustering method, which detects stopovers by clustering the birds' location points by density; (2) a location-history parser method, which detects areas where migratory birds lingered beyond a set time period by applying time and distance thresholds; and (3) a time-parameterized line-segment clustering method, which clusters directed line segments to identify shared segments of the migratory pathways of different birds and thereby discover their habitats and stopovers. Finally, we applied the three methods to migration data of the bar-headed goose in the Qinghai Lake area, verified their effectiveness, and, by comparison, identified the scope and context in which each method is best used.
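A minimal sketch of the first idea, density-based stopover detection: cluster GPS fixes and flag dense groups as candidate stopovers. The greedy single-linkage grouping below and the `eps`/`min_pts` values are illustrative stand-ins for a full DBSCAN implementation.

```python
# Hedged sketch of density-based stopover detection on 2D GPS fixes.
# The clustering here is a simple reachability expansion, not full DBSCAN.
from math import hypot

def cluster_points(points, eps=1.0, min_pts=3):
    """Group points whose chain of pairwise distances stays within eps;
    return only groups with at least min_pts points (candidate stopovers)."""
    unvisited = list(points)
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster = [seed]
        frontier = [seed]
        while frontier:
            p = frontier.pop()
            near = [q for q in unvisited if hypot(p[0] - q[0], p[1] - q[1]) <= eps]
            for q in near:
                unvisited.remove(q)
                cluster.append(q)
                frontier.append(q)
        if len(cluster) >= min_pts:
            clusters.append(cluster)
    return clusters

# Toy track: a tight knot of fixes (a stopover) plus isolated transit fixes.
fixes = [(0, 0), (0.3, 0.2), (0.1, 0.5), (0.4, 0.4), (5, 5), (9, 1)]
stopovers = cluster_points(fixes, eps=1.0, min_pts=3)
```

Real tracking data would use geographic distance (e.g. haversine) and add the dwell-time threshold of the second method before declaring a cluster a stopover.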
Environmental monitoring in ecological and hydrological watershed-scale research is an important and promising area of application for wireless sensor networks. This paper presents the system design of an IPv6 wireless sensor network (IPv6WSN) in the Heihe River watershed in Gansu province, China, to assist ecological and hydrological scientists in collecting field data in an extremely harsh environment. To address the challenging problems they face, this paper focuses on the key technologies adopted in our project, in particular metadata modeling for the IPv6WSN. The system design introduced here provides a solid foundation for the effective use of a self-developed IPv6 wireless sensor network by ecological and hydrological scientists.
An outline of a planned system for the global space-weather monitoring network of NICT (National Institute of Information and Communications Technology) is given. This system can manage data collection far more easily than our current system through the installation of autonomous recovery, periodic state monitoring, and dynamic warning procedures. According to a preliminary experiment using a network simulator, the new system will work under constrained network conditions, e.g., a 160 ms delay, a 10% packet loss rate, and a 500 kbps bandwidth.
An overview of the Inter-university Upper atmosphere Global Observation NETwork (IUGONET) project is presented, with a brief description of the products to be developed. This is a Japanese inter-university research program to build a metadata database for ground-based observations of the upper atmosphere. The project also develops software to analyze the observational data provided by the various universities and institutes. These products will greatly help researchers efficiently find, obtain, and utilize data dispersed across those institutions, and are expected to contribute significantly to the promotion of interdisciplinary research, leading to a more comprehensive understanding of the upper atmosphere.
A substantial amount of data is collected through surveys conducted in Africa by national statistics offices, international donor organisations, research institutions, and the private sector. Data management at African national statistics offices is hampered by limited resources. An option for data curation in African countries is the establishment of dedicated institutions for data preservation and dissemination, such as survey data archives and research data centres. DataFirst, at the University of Cape Town, has established an African data service and is helping to improve African data curation practices by providing data, promoting free curation tools, and undertaking data management training in African countries.
Digital data and service centers, such as those envisaged by the ICSU World Data System (WDS), are subject to a wide-ranging collection of requirements and constraints. Many of these requirements are traditionally difficult to assess and to measure objectively and consistently. As a solution to this problem, an approach based on a maturity model is proposed. This adds significant value not only with respect to objective assessment but also in evaluating overlapping and competing criteria, planning continuous improvement, and progressing towards formal evaluation by accreditation authorities.
The World Data Center for Geophysics in Boulder, Colorado is hosted by the National Geophysical Data Center (NGDC). NGDC's vision is to be the world's leading provider of geophysical and environmental data, information, and products. NGDC's mission is to provide long-term scientific data stewardship for geophysical data, ensuring quality, integrity, and accessibility. Faced with ever-expanding data volumes and types of data, NGDC is developing more innovative techniques for science data stewardship based in part on data mining and fuzzy logic. Use of these techniques will allow NGDC to more effectively provide data stewardship for its own scientific data archives and perhaps the broader World Data System.
An international data archive is critical for understanding the climate system dynamics of the cryosphere. Currently, no such system exists, though data collection and integration efforts are ongoing. The Cryosphere Data Archive Partnership (CrDAP) is developing an open system for storing cryospheric observation data and metadata. First-stage data handling in CrDAP focused on integrating point observational and photographic data. The metadata structure of CrDAP was extended based on ISO 19115, the geographic information metadata standard of the International Organization for Standardization (ISO).
Since 1997, the Global Geodynamics Project (GGP) stations have used a text-based data format. The main drawback of this type of encoding is the lack of data-integrity guarantees during data-flow processing; as a result, metadata and even data must be checked by human operators. In this paper, we propose a new format for representing GGP data, based on the eXtensible Markup Language (XML).
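The integrity benefit of an XML encoding can be sketched with the Python standard library: a record is built as structured elements, and any malformed document fails to parse rather than silently passing through. The element and attribute names below are hypothetical, not the schema actually proposed in the paper.

```python
# Illustrative XML encoding of a gravimeter sample; tag and attribute names
# are invented for this sketch, not the proposed GGP schema.
import xml.etree.ElementTree as ET

def make_record(station, timestamp, gravity_nm_s2, pressure_hpa):
    """Build one observation record as an XML element tree."""
    rec = ET.Element("record", attrib={"station": station})
    ET.SubElement(rec, "time").text = timestamp
    ET.SubElement(rec, "gravity", attrib={"unit": "nm/s^2"}).text = f"{gravity_nm_s2:.3f}"
    ET.SubElement(rec, "pressure", attrib={"unit": "hPa"}).text = f"{pressure_hpa:.2f}"
    return rec

rec = make_record("MB", "1997-07-01T00:00:00Z", 12345.678, 1013.25)
xml_text = ET.tostring(rec, encoding="unicode")

# Round-trip: a well-formed record parses back; a truncated one raises
# xml.etree.ElementTree.ParseError, so corruption is caught mechanically.
parsed = ET.fromstring(xml_text)
```

In practice an XML Schema (XSD) would additionally validate units, value ranges, and required elements, replacing the manual checks the abstract describes.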