We describe the system architecture and data template design for the Animal Diversity Web (http://www.animaldiversity.org), an online natural history resource serving three audiences: 1) the scientific community, 2) educators and learners, and 3) the general public. Our architecture supports highly scalable, flexible resource building by combining relational and object-oriented databases. Content resources are managed separately from identifiers that relate and display them. Websites targeting different audiences from the same database handle large volumes of traffic. Content contribution and legacy data are robust to changes in data models. XML and OWL versions of our data template set the stage for making ADW data accessible to other systems.
The Digital Object Identifier (DOI) is a system for identifying content objects in the digital environment. DOIs are names assigned to any entity for use on Internet digital networks. Scientific data sets may be identified by DOIs, and several efforts are now underway in this area. This paper outlines the underlying architecture of the DOI system, and two such efforts which are applying DOIs to content objects of scientific data.
This paper summarizes the vital importance to the scientific community of rescuing historic scientific data, presently in various informal, non-digital formats, from likely oblivion and making them accessible digitally for trend analyses. It proposes models whereby historic biodiversity and astronomical data can be recovered as Canadian initiatives, in the hope of stimulating further discussion of such simple yet essential rescue missions in the context of Canadian scientific research.
Today many countries have adopted strategies for developing an information-oriented society and data infrastructure. Although varying in their details and means of realization, all these policies have the same aim - to build a global information society. In Russia this crucial role belongs to the Electronic (Digital) Earth initiative, which integrates geoinformation technologies in the Earth Knowledge Base (EKB). It is designed to promote economic, social and scientific progress. The article presents an analysis of this problem.
Defining a "designated user community" for a data collection is essential to good scientific data stewardship. It enables data managers to determine what information is necessary to ensure the usability of the data now and into the future. It helps managers present and enable access to the data and may determine the format of the data. However, defining a community is difficult, and it is impossible to predict how the use of a data collection may change over time. This creates a series of data management problems for data stewards that may be mitigated by a set of best practices.
The scientific method encourages sharing data with other researchers to independently verify conclusions. Currently, technical barriers impede such public scrutiny. A strategy for offering scientific data for public analysis is described. With this strategy, effectively no requirements of software installation (other than a web browser) or data manipulation are imposed on other researchers to prepare for perusing the scientific data. A prototype showcasing this strategy is described.
"From Paper to Virtual Map" is an innovative technology for creating 3D (three-dimensional) maps. The technology is proposed as a very cheap and easy way to create 3D maps; a powerful graphics station is not necessary for this purpose. This is very important for countries such as Bulgaria, where it is not easy to obtain expensive computer equipment.
This technology, proposed by the author, was developed from a novel application: a 3D cartographic symbol system. The 3D city maps created consist of 3D geometry, topographic information, photo-realistic texturing and 3D symbols, which contain quantitative and qualitative information about the objects. Animated presentations are also available according to users' needs.
When several transactions execute concurrently in a database, the isolation property may no longer be preserved, so the system must control the interaction among the concurrent transactions. This paper presents a new locking model for concurrency control in object-oriented database systems. The model is motivated by a desire to provide high concurrency and low locking overheads in accessing objects. The proposed model consists of a rich set of lock modes, a hash table, lock-based signatures and B+ trees. Performance study results show that the proposed model performs well for all possible operations on objects.
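The core rule of any such locking model, that a lock request is granted only when its mode is compatible with every lock already held on the object, can be sketched as follows. The mode set shown (IS, IX, S, X) is the standard multi-granularity set, used here purely for illustration; the paper's richer mode set, hash table and lock-based signature structures are not reproduced.

```python
# Hypothetical sketch of lock-mode compatibility checking, assuming the
# standard multi-granularity modes: IS (intention shared), IX (intention
# exclusive), S (shared), X (exclusive).

COMPATIBLE = {
    ("IS", "IS"): True,  ("IS", "IX"): True,  ("IS", "S"): True,  ("IS", "X"): False,
    ("IX", "IS"): True,  ("IX", "IX"): True,  ("IX", "S"): False, ("IX", "X"): False,
    ("S",  "IS"): True,  ("S",  "IX"): False, ("S",  "S"): True,  ("S",  "X"): False,
    ("X",  "IS"): False, ("X",  "IX"): False, ("X",  "S"): False, ("X",  "X"): False,
}

def can_grant(requested, held_modes):
    """Grant a lock request only if it is compatible with every held lock."""
    return all(COMPATIBLE[(requested, held)] for held in held_modes)
```

A richer mode set, as the paper proposes, simply extends this compatibility table; the grant test itself stays the same.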
A confluence of technologies is leading towards revolutionary new interactions between robust data sets, state-of-the-art models and simulations, high-data-rate sensors, and high-performance computing. Data and data systems are central to these new developments in various forms of eScience or grid systems. Space science missions are developing multi-spacecraft, distributed, communications- and computation-intensive, adaptive mission architectures that will further add to the data avalanche. Fortunately, Knowledge Discovery in Databases (KDD) tools are rapidly expanding to meet the need for more efficient information extraction and knowledge generation in this data-intensive environment. Concurrently, scientific data management is being augmented by content-based metadata and semantic services. Archiving, eScience and KDD all require a solid foundation in interoperability and systems architecture. These concepts are illustrated through examples of space science data preservation, archiving, and access, including application of the ISO-standard Open Archival Information System (OAIS) architecture.
In this paper the development of a new Internet information system for analyzing and classifying melanocytic data is briefly described. The system also has some teaching functions and improves the analysis of datasets by calculating values of the TDS (Total Dermatoscopy Score) parameter (Braun-Falco, Stolz, Bilek, Merkle, & Landthaler, 1990; Hippe, Bajcar, Blajdo, Grzymala-Busse, Grzymala-Busse, & Knap, et al., 2003). Calculations are based on two methods: the classical ABCD formula (Braun-Falco et al., 1990) and the optimized ABCD formula (Alvarez, Bajcar, Brown, Grzymala-Busse, & Hippe, 2003). A third method of classification is devoted to quasi-optimal decision trees (Quinlan, 1993). The developed Internet-based tool enables users to make an early, non-invasive diagnosis of melanocytic lesions. This is possible using a built-in set of instructions that animates the diagnosis of the four basic lesion types: benign nevus, blue nevus, suspicious nevus and malignant melanoma. The system is available at: http://www.wsiz.rzeszow.pl/ksesi.
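The classical ABCD formula mentioned above computes TDS as a weighted sum of four dermatoscopic criteria. The sketch below uses the weights and diagnostic thresholds as published for the classical rule (Braun-Falco/Stolz et al.); the function names are illustrative and are not taken from the described system.

```python
# Sketch of the classical ABCD dermatoscopy rule:
#   TDS = 1.3*A + 0.1*B + 0.5*C + 0.5*D
# with the conventional thresholds for the classical formula.

def tds(asymmetry, border, colors, structures):
    """Total Dermatoscopy Score from the four ABCD criteria.

    asymmetry:  0-2 (number of axes of asymmetry)
    border:     0-8 (number of segments with abrupt border cut-off)
    colors:     1-6 (number of colors present)
    structures: 1-5 (number of differential structures present)
    """
    return 1.3 * asymmetry + 0.1 * border + 0.5 * colors + 0.5 * structures

def classify(score):
    """Map a TDS value to the conventional diagnostic category."""
    if score < 4.75:
        return "benign"
    if score <= 5.45:
        return "suspicious"
    return "malignant"
```

The optimized ABCD formula cited in the abstract adjusts these coefficients; only the classical weights are shown here.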
In this paper, we address the potential offered by Virtual Reality for 3D modeling and immersive visualization of large genomic sequences. The representation of the 3D structure of DNA allows biologists to observe and analyze genomes interactively at different levels. We developed a powerful software platform, ADN-Viewer, that provides a new point of view for sequence analysis. However, a classical eukaryotic chromosome of 40 million base pairs requires about 6 Gbytes of 3D data. In order to manage these huge amounts of data in real time, we designed various scene management algorithms and immersive human-computer interaction techniques for user-friendly data exploration.
The viewpoint concept has received widespread attention recently. Its integration into a data model improves the flexibility of the conventional object-oriented data model and allows one to improve the modelling power of objects. The viewpoint paradigm can be used as a means of providing multiple descriptions of an object and as a means of mastering the complexity of current database systems enabling them to be developed in a distributed manner. The contribution of this paper is twofold: to define an object data model integrating viewpoints in databases and to present a federated database system integrating multiple sources following a local-as-extended-view approach.
This article presents the InterPARES Project, a multidisciplinary international research initiative aimed at developing the theoretical and methodological knowledge necessary for the long-term preservation of digital entities produced in the course of business or research activity so that their authenticity can be presumed or verified. The methodology, research activities, preliminary findings and projected products are discussed in the context of the issues that the project attempts to address.
In 1998 ENEA, the Italian National Agency for New Technologies, Energy and the Environment, launched an e-learning platform with the mission of sharing scientific knowledge with everyone, not just workers but also students and the unemployed, in order to use its research results to support competitiveness and sustainable development. In six years, more than 20,000 users have followed one or more of the 46 online courses. Many agreements with schools, universities, and private and public training organisations are now under way to improve the dissemination of scientific knowledge and to build an open database of scientific learning objects that anyone can use.
In Inertial Navigation Systems (INS), the attitude estimated from gyro measurements by the Kalman filter is subject to unbounded error growth during stand-alone mode, especially for land vehicle applications using low-cost sensors. To improve the attitude estimation of a land vehicle, this paper applies a fuzzy expert system to assist in multi-sensor data fusion from MEMS accelerometers, MEMS gyroscopes and a digital compass, based on their complementary motion detection characteristics. Field test results have shown that drift-free and smooth attitude estimation can be achieved, leading to a significant performance improvement for velocity and position estimation.
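The complementary characteristics exploited here, gyro integration for short-term accuracy blended with a drift-free gravity reference from the accelerometer, can be sketched as a simple complementary filter. The fixed blending weight `alpha` below is only a stand-in for the fuzzy expert system described in the abstract, which would adapt the weighting to the detected motion state; all names are illustrative.

```python
import math

# Minimal complementary-filter sketch for one attitude angle (pitch).
# The gyro is accurate over short intervals but drifts without bound;
# the accelerometer's gravity measurement is noisy but drift-free.

def fuse_pitch(pitch_prev, gyro_rate, accel_x, accel_z, dt, alpha=0.98):
    """One fusion step for the pitch angle, in radians.

    pitch_prev: previous pitch estimate (rad)
    gyro_rate:  measured pitch rate (rad/s)
    accel_x/z:  body-frame accelerometer readings (m/s^2), static vehicle
    dt:         time step (s)
    alpha:      blending weight; a fuzzy system would adapt this online
    """
    gyro_pitch = pitch_prev + gyro_rate * dt       # short-term: integrate gyro
    accel_pitch = math.atan2(-accel_x, accel_z)    # long-term: gravity reference
    return alpha * gyro_pitch + (1 - alpha) * accel_pitch
```

Because the accelerometer term is re-anchored to gravity at every step, the integrated gyro drift is continuously corrected rather than accumulating.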
Agreeing on a method to identify interfaces is a desirable step toward building a database of interfacial energies. So far, the expression Sigma X Interface has been used to identify interfaces. Unfortunately, this conventional method cannot express arbitrary interface geometries. In the present work, we review interface geometries and propose a systematic Orientation Method, based purely on geometrical considerations, that can express arbitrary twists and slipping of interfaces.
One of the JRC Petten tasks is to support European Research and Development (R&D) projects in the material- and energy-related areas with the management and dissemination of research results. By using XML technology, test data can be entered directly from the test machines into the Web-enabled 'Materials Database' (Mat-DB), which is an integral part of the On-line Data Information Network (ODIN, 2004) developed at JRC Petten. Test data, which are kept in XML format and sent by R&D project partners via the World Wide Web to the Petten server, are stored within the Mat-DB XML module. There they can be checked and updated on-line before they are uploaded into the database. After validation by the source administrator they can immediately be retrieved and evaluated by all project partners. A pilot test of the new XML-based data exchange module, from test machine into Mat-DB, has recently been started within the European TMF Standard R&D project.
The SEDO project develops a flexible and reusable platform combining fast access, user freedom, and coherence of results for presenting socio-economic data. Its first aim is to deliver on the Net the results of longitudinal surveys about life in Luxembourg. Several search methods are available: hierarchical browsing, query-engine search, and top-down navigation with minimal clicks for quick access to the main trends. Without using statistical tools or domain expertise, the user can perform advanced statistical calculations. Finally, a modular architecture guarantees the portability of the application.
Data sharing poses complex ethical questions for data management. Manifold conflicting and shifting values need to be reconciled in pursuing viable data-management policies. For example, how does one make data available in usable form to stakeholders including scientists, governments and businesses worldwide, while assuring confidentiality, satisfying one's research ethics committee, protecting intellectual property and national security, and containing costs? Increasingly, ethical problem solving requires integration of ethics with technological "know how" and empirical research on the presenting problem. Each problem is highly contextual; broad application of general ethical principles such as "always practice openness" or "prepare all data for sharing" may have harmful unintended consequences. Chaos theory provides a heuristic or vision for understanding and coping with complexity and uncertainty. It does not provide answers to problems of data management, but frames the issues and provides appropriate expectations and heuristics for considering data management problems.
Successful resource discovery across heterogeneous repositories is highly dependent on the semantic and syntactic homogeneity of the associated resource descriptions in each repository. Ideally, consistent resource descriptions are easily extracted from each repository, expressed using standard syntactic and semantic structures, and managed and accessed within a distributed, flexible, and scalable software framework. In practice, however, seldom do all three of these elements exist. To help address this situation, the Object Oriented Data Technology (OODT) project at the Jet Propulsion Laboratory has developed an extensible, standards-based resource description scheme that provides the necessary description and management facilities for the discovery of resources across heterogeneous repositories. The OODT resource description scheme can be used across scientific domains to describe any resource. It uses a small set of generally accepted, broadly scoped descriptors while also providing a mechanism for the inclusion of domain-specific descriptors. In addition, the OODT scheme can be used to capture hierarchical, relational and recursive relationships between resources. In this paper we expand on prior work and describe an intelligent resource discovery framework that consists of separate software and data architectures focusing on the standard resource description scheme. We illustrate intelligent resource discovery using a case study that provides efficient search across distributed repositories using common interfaces and a hierarchy of resource descriptions derived from a complex, domain-specific ontology.
Genomic data show characteristics that make them very difficult to interpret and to exploit. Such data constitute an important factual resource (GenBank, SwissProt, GeneOntology, or Decrypthon...); they are heterogeneous, huge in quantity, and geographically distributed. This paper presents Genome3DExplorer, a new modeling and software solution for visualizing textual and factual genomic data, based on an adapted federator description language. The exploration is based on a well-adapted graphical paradigm that automatically helps to build a graph-based representation and allows biologists to highlight global topological characteristics of the data that are not easily visible using traditional exploration tools. Finally, we present results produced by the Genome3DExplorer software on various sets of biological data.
Experimental studies of surface tension and density by the maximum bubble pressure method and the dilatometric technique were undertaken, and the accumulated data for liquid pure components and binary, ternary and multicomponent alloys were used to create the SURDAT database for Pb-free soldering materials. The database also enables comparison of the experimental results with those obtained from the Butler model and with existing literature data. This comparison has been extended by including the experimental data of Sn-Ag-Cu-Sb alloys.
Patents are a very useful source of technical information. The public availability of patents over the Internet, with the assurance of a constant format for some databases (e.g. Espacenet), allows the development of high-value-added products using this information source and provides an easy way to analyze patent information. This simple and powerful tool facilitates the use of patents in academic research, in SMEs and in developing countries, providing a way to use patents as an ideas resource and thus improving technological innovation.
This paper describes activities in the Czech Republic related to digital archiving. First, the general situation in the field is outlined in order to give insight into the state of the art in the Czech Republic. The key part of the paper describes the design and implementation of a pilot system for digital archiving of a certain kind of scientific information: MSc and PhD theses at the Czech Technical University in Prague. One reason for archiving this type of information is that these theses contain information about scientific and technological developments in a given period of time. Such information might be widely appreciated in the future by historians investigating the history of science and technology of a certain period. The research is oriented towards robust archiving systems that can be used in small-scale applications. These small systems do not offer universal solutions in the field of digital archiving; they solve problems that become urgent in various applications: saving current digital documents in a form that can later be transferred to general archiving systems. The described implementation is a pilot practical solution to this problem. The approach described in the paper also allows users to archive documents that contain non-textual information.
Huge volumes of bioinformatics data are emerging from sequencing efforts, gene expression assays, X-ray crystallography of proteins, and many other activities. High-throughput experimental methods produce masses of data, so that the whole of biology has changed from a data-light science into a data-driven science. Currently there are many databases and software tools dealing with these genomic data. In general, each tool and database uses a different data type in its exchange protocols, and usually offers specific services. These databases are designed with different languages and run on different operating systems. Biologists are therefore in a difficult situation: they have to use, process and store heterogeneous data while working with heterogeneous software tools and databases. Our framework, GenoMEDIA, provides two middleware components to help with this integration: Lydia and Antje. On the one hand, the Lydia middleware offers facilities for working simultaneously with a variety of services and databases. On the other hand, Antje, with its concept of remote views, is designed to allow users to manage multiple heterogeneous remote databases in a uniform way. The aim of this paper is to present GenoMEDIA and how heterogeneous databases and remote services are integrated, in particular how Antje was designed, implemented and tested with various genomic databases.
Government agencies and other organizations are required to manage and preserve records that they create and use to facilitate future access and reuse. The increasing use of geospatial data and related electronic records presents new challenges for these organizations, which have relied on traditional practices for managing and preserving records in printed form. This article reports on an investigation of current and future needs for managing and preserving geospatial electronic records on the part of local- and state-level organizations in the New York City metropolitan region. It introduces the study and describes organizational needs observed, including needs for organizational coordination and inter-organizational cooperation throughout the entire data lifecycle.