Ontology and Metadata Creation for the Poseidon Distributed Coastal Zone Management System

P.C.H. Wariyapola
Massachusetts Institute of Technology
Department of Ocean Engineering, Design Laboratory
Cambridge MA 02139-4307, [email protected]

S.L. Abrams
Massachusetts Institute of Technology
Department of Ocean Engineering, Design Laboratory

A.R. Robinson
Harvard University
Physical-Interdisciplinary Ocean Science Group

K. Streitlien
Massachusetts Institute of Technology
Sea Grant College Program, Autonomous Underwater Vehicles Laboratory

N.M. Patrikalakis
Massachusetts Institute of Technology
Department of Ocean Engineering, Design Laboratory
Cambridge MA 02139-4307, [email protected]

P. Elisseeff
Massachusetts Institute of Technology
Department of Ocean Engineering, Acoustics Group

H. Schmidt
Massachusetts Institute of Technology
Department of Ocean Engineering, Acoustics Group

Abstract

The objective of the Poseidon Coastal Zone Management System project is to develop a distributed data and software architecture to locate, retrieve, utilize (in simulations and analysis), and visualize information about the coastal ocean environment. The work described in this paper concentrates on the issues related to efficiently identifying and creating the data needed for a given task. Since scientific data sets often do not contain information about the environment in which the data was obtained, we need a method of distinguishing data based on external information, or metadata. This metadata needs to be standardized in order to facilitate searching. While many metadata standards exist, no one of them is adequate for representing all the information needed for coastal zone management. We have therefore implemented a method that incorporates three existing metadata standards in an expandable object-oriented structure known as the Warwick Framework. Furthermore, since metadata is expensive and tedious to produce, we have developed a web-based software tool that simplifies the process and reduces the storage of redundant information. While we can search the metadata for the information we need, we still need a common vocabulary, or ontology, to ensure that we can identify data unambiguously. Since no existing vocabulary encapsulates all aspects of the ocean sciences and ocean systems management, we have facilitated the production of such a resource by creating a web-based tool that will allow specialists with the requisite domain knowledge (e.g., oceanographers, acousticians, coastal zone managers, etc.) to populate the ontology independently. The tool handles all the logistical issues of storage, maintenance, and distribution.

1. Introduction

With the advent of new sensors, storage technologies, and widespread access to the Internet, the potential exists for a new era of ocean science investigation where scientists, students, resource managers, and government officials have frictionless access to oceanographic data, simulation results, and software. The objective of the Poseidon project is to develop a network-based architecture to locate, retrieve, analyze and visualize distributed heterogeneous data for scientific exploration and administration of the ocean environment. Poseidon will include distributed analysis, modeling, simulation and visualization software, and the capability to graphically create and execute complex workflows constructed from distributed data and software resources. (Workflows are represented as dependency graphs describing the flow of data between various software processing elements, such as simulation, analysis, visualization, etc.)

The Poseidon architecture assumes that the scientists who collect or create data will remain responsible for that data’s storage, maintenance, and accessibility by some on-line access mechanism. Similarly, the available software tools will be accessed in a service-oriented mode [7], running on remote hosts. The organizations and scientists that created these tools retain full ownership and control of the software. The Poseidon front-end consists of an executive system deployed as a Java applet [20] downloaded to a standard Java-enabled web browser (see Figure 1). This executive system supports user authentication, searching for and retrieving distributed data and software resources, creating workflows, and supervising their execution.

The Poseidon server interacts with distributed resources using the CORBA (Common Object Request Broker Architecture) protocol [16]. CORBA was selected because of its wide acceptance and its proven ability to facilitate cross-language, cross-platform integration. All distributed resources in the Poseidon system are encapsulated as CORBA-compliant objects. The creation of CORBA wrappers for legacy systems is performed in collaboration with the owners of those resources.

Figure 1: Poseidon architecture

[Diagram: a www browser hosts the Poseidon user interface (Java applets), which communicates through ORBs with the Poseidon server and its resource registry. The server provides Poseidon services (model management system, metadata creation, standard visualization, statistical analysis, etc.) and handles search, retrieval, data translation, authentication, and accounting over CORBA. Distributed resources (raw marine data and data acquisition modalities; documents such as text, image, audio, and video; simulation/analysis software; scientific results; metadata templates; and the marine ontology) are encapsulated as object resources, while legacy resources and native distributed data are reached through object wrappers for data search/retrieval and software execution.]

2. Overview of Related Activities

In order to search and locate data relevant to a particular task, we need an accurate description of that data. Many of the resources used in the Poseidon system are scientific data and simulation results that do not contain within themselves adequate information for accurate identification. Thus, we need a method of capturing and storing this information (or metadata, "data about data") external to the data itself. We conceptualize metadata as a set of descriptors that uniquely identify a specific entity, possibly using terminology drawn from an ontology (see Section 5). Metadata is necessary to capture the semantic content of data sets and software programs. An indexed metadata repository can be used to enable efficient resource discovery in the context of an information system utilizing distributed software and data resources.

Perhaps the most familiar use of metadata is bibliographic information (e.g., author, title, publisher, etc.), which has long been used in library and information retrieval systems [1]. The application of similar bibliographic metadata for web-based resources led to the creation of the Dublin Core metadata standard [12, 36, 37]. The Dublin Core was subsequently extended to include the Nordic Classification system as part of the Nordic Metadata Project [17]. The DIENST system [22] used by the NCSTRL (Networked Computer Science Technical Reference Library) project [9] also uses similar bibliographic metadata based on RFC 1807 [24]. The evolving Resource Description Framework (RDF) [25, 28] makes use of the web-based hyperlink capabilities of XML (Extensible Markup Language) [4] to provide a flexible model for defining the appropriate metadata for resources in a particular problem domain.

To provide a uniform standard for describing geospatially referenced data, the U.S. government established the Federal Geographic Data Committee (FGDC) under the aegis of the National Spatial Data Infrastructure (NSDI) initiative [8], resulting in the Content Standard for Digital Geospatial Metadata (CSDGM) [13]. CSDGM defines over 250 individual properties organized into seven categories: (1) Identification; (2) Data Quality; (3) Spatial Data Organization; (4) Spatial Reference; (5) Entity and Attribute; (6) Distribution; and (7) Metadata Reference. The CSDGM is used as the basis for interactive web-based search and retrieval for the Defense Modeling and Simulation Office’s Master Environmental Library (MEL) service [26]. The U.S. Geological Survey’s Biological Resources Division (BRD) has created a Biological Data Profile of the CSDGM as part of the National Biological Information Infrastructure (NBII) Metadata Standard [3]. The Biological Profile recently completed its public review stage.

Additional geophysical-based metadata efforts are also underway, including NASA’s Global Change Master Directory [15], a MEL-like service that uses the CSDGM-compatible Directory Interchange Format (DIF); the NASA-sponsored EOSDIS project [11] for disseminating satellite remote sensing data; and the Unidata project [10] for disseminating meteorological data. Geophysical metadata will be central to the Thetis project [18, 19, 29, 35] underway at ICS-FORTH, Crete, Greece. The Thetis system is intended to interconnect distributed collections of heterogeneous scientific data repositories, geographic information systems, simulation codes, and visualization tools via the Internet and the WWW. These data repositories are used for the management of coastal zones of the Mediterranean Sea.

There are several ongoing research efforts that have developed metadata creation tools for use in data warehousing and distribution [32]. The XTME metadata editor, MP metadata parser, and CNS metadata pre-processor [34], developed by the U.S. Geological Survey, are stand-alone tools that can be installed on a variety of operating platforms to allow the user to create CSDGM metadata, translate from unformatted to CSDGM format, and verify metadata conformance to the CSDGM standard. The BIG BIC Metadata Form [27], created by the Texas/Mexico Borderlands Information Center (BIC), allows users to create and submit metadata for their resources to the BIC data clearinghouse using an HTML Form-based web interface implementing CSDGM. The Nordic Metadata Creator [17] uses a similar web-based approach for creating Dublin Core metadata.

Several other research projects related to the aims of Poseidon are of interest. The ISAAC Internet Scout project [33] at the University of Wisconsin-Madison has developed tools for the transparent query of, access to, and visualization of information in on-line data repositories using bibliographic metadata. The TSIMMIS project [14] at Stanford University is developing a system to access distributed data sources. Both the ISAAC and TSIMMIS projects, however, are concerned primarily with document-like resources. MEL provides interactive web-based search and retrieval of geospatial data using HTML Form and Java-based query mechanisms. MEL can disseminate complex numerical data, but it only supports relatively simple query and report functions. Poseidon, on the other hand, aims to discover distributed resources with sufficient accuracy to allow automated pipelining through complex workflows.

3. Poseidon Metadata

While it would be possible to allow each data provider to create the information that he or she feels is relevant to the data, the use of heterogeneous metadata formats decreases the efficiency and effectiveness of searching. To prevent this, we need a well-defined metadata standard that can be followed when creating metadata and can subsequently be used to search for the data accurately and efficiently. On the basis of our analysis of existing metadata standards, no one of them satisfies the needs of the Poseidon system. Poseidon needs to support documents as well as scientific data, while providing sufficient accuracy to allow automated construction of workflows, ensuring compatibility and consistency between sequential operations. Thus, none of the document-oriented standards (Dublin Core, RFC 1807, RDF, etc.) is adequate for our needs. The FGDC standard for geospatial metadata, while being quite effective in classifying information relating to physical phenomena, is not suited to handle the biological aspects of many ocean processes and is too cumbersome for documents. The Biological Data Profile of the CSDGM, by itself, does not provide sufficient coverage of physical phenomena.

We decided to adopt existing standards to avoid duplicating the extensive work that has already gone into their development. We also wanted a format that will be widely accepted and implemented so that we can use data sets that are available on-line (such as MEL) without requiring any additional work from the producers of the data. We have therefore selected the FGDC standards for geospatial and biological metadata. Since these metadata sets are quite extensive, we felt that they are not quite appropriate for describing documents, which require only a small subset of the descriptive information. For these purposes we will use the Dublin Core metadata standard.

We are implementing these three distinct standards using the expandable container structure developed in the Warwick Framework [21]. (The Warwick Framework was developed to provide a mechanism to cope with the proliferation of incompatible metadata standards. It defines a hierarchical container model that can encapsulate multiple metadata representations for the same underlying data object.) Using this conceptual format we have developed a metadata container structure for the Poseidon system (see Figure 2).

Within the Poseidon architecture, all resources are treated as objects. Each data or software object in the system has a corresponding metadata object. Conceptually, each metadata object consists of multiple child objects. Using the Warwick Framework, the top-level children are the three metadata standards described above.

This can be recursively extended until each individual metadata element (i.e., a single field in a metadata standard) is treated as an object (see Figure 3). Since the differences in metadata among various data sets obtained in one experiment (or a set of related experiments) are generally small, this metadata model allows significant reduction in storage space by requiring only the unique metadata elements to be stored.
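The storage-saving idea above can be sketched as a set of metadata objects in which a derived set stores only the elements that differ from a base set, with all other lookups falling through to the base. This is a minimal illustration, not Poseidon's actual implementation; the class and field names are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of element sharing: a metadata set stores only the fields
 *  that differ from a base set; shared fields resolve through the base.
 *  Class and element names here are illustrative, not Poseidon's. */
public class SharedMetadata {
    private final SharedMetadata base;                    // null for a root set
    private final Map<String, String> unique = new HashMap<>();

    public SharedMetadata(SharedMetadata base) { this.base = base; }

    public void put(String element, String value) { unique.put(element, value); }

    /** Resolve an element, falling back to the base set when absent here. */
    public String get(String element) {
        String v = unique.get(element);
        if (v == null && base != null) return base.get(element);
        return v;
    }

    /** Number of element values actually stored in this set. */
    public int storedCount() { return unique.size(); }
}
```

A series of data sets from one experiment would then share a common base object, with each individual set holding only its unique elements (e.g., a title or timestamp).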

Figure 2: Metadata standard architecture

We have implemented a part of the above schema to verify its feasibility. Our system uses a Unix file system directory structure corresponding to the metadata framework to store the objects. The objects are currently stored as flat files. Since we are not using an object-oriented database, we have only implemented the top three levels of the tree: the top-level abstract metadata object, the second-level metadata standard-specific object, and the third level containing all metadata element values. Note that the Dublin Core has only two levels [12].
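The flat-file scheme can be sketched by mapping the three implemented tree levels (metadata object, standard, element values) onto a directory hierarchy. The root name, file layout, and method names below are assumptions made for the example, not the actual Poseidon layout.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch of flat-file metadata storage under a directory tree:
 *  <root>/<objectId>/<standard>/<element> holds one element value.
 *  The layout is an illustrative assumption. */
public class FlatFileStore {
    private final Path root;

    public FlatFileStore(Path root) { this.root = root; }

    /** Path of the flat file holding one metadata element value. */
    public Path pathOf(String objectId, String standard, String element) {
        return root.resolve(objectId).resolve(standard).resolve(element);
    }

    public void write(String objectId, String standard, String element, String value)
            throws IOException {
        Path p = pathOf(objectId, standard, element);
        Files.createDirectories(p.getParent());
        Files.writeString(p, value);
    }

    public String read(String objectId, String standard, String element)
            throws IOException {
        return Files.readString(pathOf(objectId, standard, element));
    }
}
```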

The metadata will be stored and accessed using a regional client/server architecture. All metadata will be duplicated at each regional server, allowing for scalability, fault-tolerance, and load-balancing within the system. Individual metadata objects are indexed at the time they are created; the full metadata collection is indexed daily using an automatic indexer. This allows the system to access all data as metadata is created, and to stay current by eliminating obsolete information.

All searching will be performed on the indices. Once a match has been found, the entire metadata record and its associated data file can be accessed. While the full range of capabilities of the search tool has not yet been defined, we expect that all the metadata fields will be searchable using exact text matching and numerical range matching. We are also investigating the use of fuzzy query capabilities. An extension of the DIENST system [22] may be used for this purpose.
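The two anticipated query types (exact text matching and numerical range matching) can be sketched over a toy in-memory index. The two-field record below is an invented illustration, not the real Poseidon index schema.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal sketch of the two query types the text anticipates: exact
 *  text match and numerical range match over indexed records. The
 *  record shape (title, depth) is an illustrative assumption. */
public class MetadataIndex {
    /** One indexed record: a text field and a numeric field. */
    public static class Record {
        final String title;
        final double depthMeters;
        public Record(String title, double depthMeters) {
            this.title = title;
            this.depthMeters = depthMeters;
        }
    }

    private final List<Record> records = new ArrayList<>();

    public void add(Record r) { records.add(r); }

    /** Exact text matching on the title field; returns the hit count. */
    public int matchTitle(String text) {
        int hits = 0;
        for (Record r : records) if (r.title.equals(text)) hits++;
        return hits;
    }

    /** Numerical range matching: depth within [lo, hi]; returns the hit count. */
    public int matchDepth(double lo, double hi) {
        int hits = 0;
        for (Record r : records) if (r.depthMeters >= lo && r.depthMeters <= hi) hits++;
        return hits;
    }
}
```

A production system would of course use inverted indices or a database rather than a linear scan; the point here is only the shape of the two query types.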

[Figure 2 diagram: a metadata container holding the FGDC Content Standard for Digital Geospatial Metadata, the FGDC Content Standard for Digital Biological Metadata, and the Dublin Core metadata standard for documents.]

Figure 3: Conceptual object-oriented metadata framework

4. Metadata Creation

In order for metadata to be useful, it needs to be complete and accurate. Since much of the information needed for the metadata is only available at data creation or acquisition time, metadata should ideally be created contemporaneously with the data. The quality of the metadata is enhanced if it is created by an individual with a significant level of knowledge about the data, its acquisition environment, and process; most likely, this will be the scientist responsible for the experiment.

Figure 4: Poseidon Metadata Creator design architecture

While the level of knowledge needed to create metadata is quite high, so too is the volume of metadata needed to describe a data set accurately. These requirements place a very high time and resource cost on metadata creation. Therefore, we obviously need some tools that can aid in this process. Ideally, we want a software utility that resides within the data acquisition system and automatically creates metadata for the data. Since this requires interfacing with diverse systems, we are unable to provide a general tool. In the Poseidon project, we are continuing ongoing collaboration with our partners to implement such a solution on a case-by-case basis. We are, however, able to create a web-based editor that can be accessed using a standard web browser and that can be used to create metadata efficiently and accurately.

[Figure 3 diagram: the Metadata object branches into FGDC Geospatial, FGDC Biological, and Dublin Core children; FGDC Geospatial in turn branches into Identification, Data Quality, Spatial Organization, Spatial Reference, and further elements, while Dublin Core branches into Author, Title, and so on.]

[Figure 4 diagram: the user interface presents an FGDC Geospatial Metadata form with a Submit button, a list of current metadata files (by title and filename), a metadata transfer panel, and metadata elements; a Perl script on the Poseidon server manages metadata sets (geospatial metadata: Identification, Data Quality, and so on) and metadata storage.]

Figure 4 shows the current design and implementation of the Poseidon Metadata Creator. It uses a standard HTML Form to accept user-input metadata, which is then submitted to the Poseidon server (see Figure 5). A Perl script residing on the server parses and stores the metadata as objects on the server. This information is returned to the user interface automatically using Perl and JavaScript functions. The user can browse existing metadata sets and copy any appropriate information to create new metadata. Since metadata for a series of data sets produced by a single user, or during a single experiment, generally only differ slightly, this allows significant time savings.
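The server-side parsing step (performed in Poseidon by a Perl CGI script) amounts to splitting a form-encoded submission into element/value pairs. As a language-neutral illustration, here is a hedged Java sketch of that step; the field names are invented, and the sketch assumes well-formed `key=value` pairs.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of parsing an application/x-www-form-urlencoded submission
 *  into metadata elements, as the Poseidon Perl script does server-side.
 *  Assumes each pair is well-formed ("key=value"). */
public class FormParser {
    public static Map<String, String> parse(String body) {
        Map<String, String> elements = new LinkedHashMap<>();
        for (String pair : body.split("&")) {
            int eq = pair.indexOf('=');
            // URLDecoder turns '+' into a space and decodes %XX escapes.
            String key = URLDecoder.decode(pair.substring(0, eq), StandardCharsets.UTF_8);
            String val = URLDecoder.decode(pair.substring(eq + 1), StandardCharsets.UTF_8);
            elements.put(key, val);
        }
        return elements;
    }
}
```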

Currently, existing metadata objects are displayed as a scrollable list. As the system continues to grow in size we propose to change this to a hierarchical list organized on the basis of a user-defined field (e.g., owner, location, keywords, etc.). We will also add a search feature to allow a user to find metadata sets without browsing the entire list. In the current implementation, the Metadata Creator only supports a partial list of fields from the FGDC Geospatial standard. We are in the process of supporting the entire three-part metadata architecture described above (see Figure 2).

A number of tools implement some relatively well-known metadata standard and simplify the creation of metadata by presenting the user with a template that matches that standard (e.g., XTME, BIG BIC, Nordic Metadata Creator, etc.). Most of them, however, require that the user individually input all relevant information by hand, although some do have more automated features, such as the ability to extract a small subset of the information from the data itself, or the ability of the user to pre-define some fields common to all of the metadata, e.g., ESRI’s Document.aml [2]. In contrast, the Poseidon Metadata Creator (which is partially based on the BIG BIC Metadata Form) adds the capability of browsing all metadata files stored in the system as well as the ability to copy any information already available. This significantly reduces the cost of developing metadata for multiple data sets with similar metadata properties.

Figure 5: Poseidon Metadata Creator user interface

5. Ontology

An ontology is a formal model of the entirety of abstract characteristics and properties that are applicable to all possible specific instances of entities existing within a problem domain [5, 6]. We conceptualize an ontology as a controlled formal vocabulary. While metadata allows us to search for data using auxiliary information, in order for this search to yield accurate results we need to have a consistent language that can be used by both the creators and the users of the metadata. If we can find or create such a language, it can be implemented as a controlled vocabulary within the metadata creator and the metadata search tool to allow consistency in defining and searching. Thus, a common ontology ensures consistency of meaning between information providers and information consumers.

Our research has not provided us with an ontology for the marine sciences and coastal zone management that is broad enough to encapsulate all, or even most, of the information found in the data sets that we envision. In this absence, we are creating just such a resource. Creating an ontology is a time-consuming task that requires a broad range of domain expertise, and significant administrative and editorial overhead. Since our research group does not by itself have the resources necessary to complete the task, we have taken a different approach: instead of directly creating the ontology, we have developed a web-based tool that can be used to populate a pre-defined vocabulary structure. This tool allows domain specialists to access the ontology from distributed sites and to populate the parts relevant to their specialization. These specialists are not burdened by the need to understand the entire ontology or its administration, and we do not bear the full responsibility for the scientific content or editorial process. We are not aware of any other research projects that have taken a similar approach to the creation of an ontology.

The Poseidon ontology has been implemented as a collection of objects. Each term, or element, in the vocabulary is treated as an object with the properties listed in Table 1.

The name and definition fields identify and describe the element, while the reference field is used to provide the source of this information. Since terms may have different meanings in different domains, the concatenation of an element’s lineage (the sequence of parents starting at the top level) and its name allows the element to be uniquely identified. The last three properties are used to maintain a history of the element’s creation and modification.

Table 1: Ontology element properties

Name: {Text}
Definition: {Text}
Reference: {Text}
Date: {Date}
Parent: {Text}
Definition History: {Text Array}
Reference History: {Text Array}
Date History: {Date Array}
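The element structure of Table 1 can be sketched as a small class: the concatenated lineage plus name serves as the unique identifier, and each redefinition is appended to a history. This is an illustrative sketch only; the method names and the history handling shown here are assumptions, and the date fields of Table 1 are omitted for brevity.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of an ontology element following Table 1: lineage + name
 *  gives a unique identifier, and superseded definitions accumulate
 *  in a history array. Method names are illustrative. */
public class OntologyElement {
    public final String name;
    public final String parentPath;          // lineage from the top level
    public String definition;
    public String reference;
    public final List<String> definitionHistory = new ArrayList<>();

    public OntologyElement(String parentPath, String name,
                           String definition, String reference) {
        this.parentPath = parentPath;
        this.name = name;
        redefine(definition, reference);
    }

    /** Replace the definition, keeping the old one in the history. */
    public void redefine(String definition, String reference) {
        if (this.definition != null) definitionHistory.add(this.definition);
        this.definition = definition;
        this.reference = reference;
    }

    /** Lineage plus name uniquely identifies the element across domains. */
    public String id() { return parentPath + "/" + name; }
}
```

This mirrors the text's point that the same term ("decibel", say) can carry different definitions under different parents without ambiguity.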

In our prototype implementation, these objects are stored in flat files in directories corresponding to the vocabulary structure illustrated in Figure 6. The depth of this hierarchy is currently limited to three levels to enable a broad structure that can be navigated easily. In future releases, we will increase the depth in order to accommodate growth and enhance visual browsing.

Figure 6: Poseidon Ontology structure

[Figure 6 diagram: the Poseidon Ontology root branches into Acoustical Oceanography, Biological Oceanography, Chemical Oceanography, Geological Oceanography, Management, Optical Oceanography, Physical Oceanography, and Regulatory; example leaf terms under Acoustical Oceanography include array gain, decibel, and hydrophone.]

The Poseidon Ontology Creator consists of a Java client applet downloadable to any standard Java-enabled web browser and a Java application daemon running on the Poseidon server. Communication between these two components is handled via a socket connection (see Figure 7).
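The applet/daemon split can be sketched with a plain Java socket: a daemon thread answers one request line, and a client sends a request and reads the reply. The one-line request/response protocol, the class name, and the acknowledgement format are assumptions for the sketch, not Poseidon's actual wire protocol.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

/** Sketch of the Ontology Creator's client/daemon socket link. The
 *  one-line echo protocol here is an illustrative assumption. */
public class SocketSketch {
    /** Start a daemon that answers one request with "ACK:" + the request
     *  line, then exits. Returns the ephemeral port it listens on. */
    public static int startDaemon() throws IOException {
        ServerSocket server = new ServerSocket(0);   // port 0 = ephemeral
        Thread t = new Thread(() -> {
            try (ServerSocket srv = server;
                 Socket s = srv.accept();
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(s.getInputStream()));
                 PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
                out.println("ACK:" + in.readLine());
            } catch (IOException ignored) { }
        });
        t.setDaemon(true);
        t.start();
        return server.getLocalPort();
    }

    /** Client side: send one request line and return the reply line. */
    public static String request(int port, String line) throws IOException {
        try (Socket s = new Socket("localhost", port);
             PrintWriter out = new PrintWriter(s.getOutputStream(), true);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(s.getInputStream()))) {
            out.println(line);
            return in.readLine();
        }
    }
}
```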

While the tool allows all users to access and view the existing ontology, editing functionality is reserved for authorized users. Once authenticated, these users can use the Ontology Creator (see Figure 8) to add elements to, and edit elements in, the vocabulary. The tool stores the new and modified elements on the Poseidon server and displays them via the Java user interface. The historical record of modification for each element is maintained, creating accountability and allowing scientists to collaboratively create and modify the ontology. Only authenticated users with editing privileges can delete an element.

While we recognize the possibility that the diverse contributors to the ontology may not agree on the meaning of a term, we do not have the requisite domain knowledge to act as an editorial arbiter, and to do so would contradict the purpose of the Ontology Creator tool. Therefore these disputes will have to be worked out amongst the domain experts, using the editorial capabilities of the tool (including the possibility to iteratively develop a definition and to monitor the history of that development).

Figure 7: Poseidon Ontology Creator architecture

6. Conclusions

We have implemented a web-based software tool to facilitate the creation of an ontology for the marine sciences and coastal zone management by appropriate domain specialists. The ontology has a pre-defined, but expandable, multi-level hierarchical structure. It has been implemented for three levels and partially populated using the creation tool. We have also developed a conceptual framework for the metadata used by the Poseidon distributed coastal zone management system. This includes provision for three distinct metadata standards combined using the Warwick Framework, their implementation using object-oriented concepts, and a scalable distributed architecture for metadata creation, storage, and searching. We have developed a metadata creation tool that allows distributed users to produce a partial metadata set using a web front-end. We are currently completing the implementation of the metadata creator by adding the metadata fields omitted in the first iteration and allowing hierarchical listing and browsing of the existing metadata files. We will also add a search mechanism to allow users to find metadata directly on the basis of keyword or value matching, without the necessity of manual browsing.

Figure 8: Poseidon Ontology Creator user interface

The metadata tool is most relevant to the creation of metadata for small sets of experimental data or for legacy data sets. It may not be appropriate for metadata for data sets produced by real-time automated sensors. Rather than use an interactive system, the metadata for such data should be produced in as automated a fashion as is possible by the data acquisition system itself.

[Figure 7 diagram: the Java applet (ontology browser and editor) in the web browser window communicates over a socket with the Java application daemon, which manages the ontology storage.]

To date, we have focused our efforts on metadata for data. Construction of complex oceanographic workflows will require resource discovery of both data and software. The problem of appropriate metadata for programs remains an open research topic [30, 31]. One promising approach is the use of semantic networks and a rule-based inference engine proposed by Thetis project researchers [23].

Acknowledgements

We appreciate useful discussions with C. Houstis, C. N. Nikolaou, S. Lalis, M. Marazakis, and A. Sidiropoulos of the University of Crete, Department of Computer Science, and ICS-FORTH, Crete, Greece, concerning Poseidon and the related Thetis project at their Institute. Funding for this work was obtained in part from the U.S. Department of Commerce (NOAA, Sea Grant) under grant NA86RG0074, the U.S. National Ocean Partnership Program via ONR grant N00014-97-1-1018, and the MIT Department of Ocean Engineering. NATO provided travel funds for exchanges between the Poseidon and Thetis groups under grant number CRG971523.

References

[1] American Library Association, ALCTS/LITA/RUSA,Machine-Readable Bibliographic Information Committee. TheUSMARC formats: Background and principles. Library ofCongress, Network Development and MARC Standards Office,Washington, DC, November 1996.<http://lcweb.loc.gov/marc/96principal.html>

[2] ARC/INFO Document.aml. ESRI Software, Redlands, CA,July 14, 1998.<http://www.esri.com/software/arcinfo/docaml721.html>

[3] Biological data profile of the content standard for digitalgeospatial metadata. Biological Data Working Group, FederalGeographic Data Committee and USGS Biological Re- sourcesDivision, July 1998.<http://www.nbii.gov/standards/biodata/bioprofi.pdf>

[4] T. Bray and C. M. Sperberg-McQueen. Extensible markuplanguage (XML): Part I. syntax. W3C Workding Draft 31-Mar-97, World Wide Web Consortium, March 1997.<http://www.w3.org/TR/WD-xml-lang-970331.html>

[5] M. Bunge. Treatise on Basic Philosophy: Vol. 3: Ontology I:The Furniture of the World. Reidel, Boston, 1977.

[6] M. Bunge. Treatise on Basic Philosophy: Vol. 3: Ontology II:A World of Systems. Reidel, Boston, 1979.

[7] H. Casanova, J. J. Dongarra, and K. Moore. Network-enabledsolvers and the NetSolve project. SIAM News, 31(1), January1998.

[8] Coordinating geographic data acquisition and access: TheNational Spatial Data Infrastructure. Executive Order 12906,National Spatial Data Infrastructure, April 11, 1994.

[9] J. R. Davis. Creating a networked computer science technicalreport library. D-Lib Magazine, September 1995.<http://www.dlib.org/dlib/september95/09davis.html>

[10] R. C. Dengel and J. T. Young. The Unidata recoverysystem. In Proceedings, Ninth International Conference onInteractive Information and Processing Systems for Mete-orology, Hydrology, and Oceanography, Anaheim, CA, January1993.

[11] T. Dopplick. The role of metadata in EOSDIS. TechnicalReport 160-TP-013- 001, Hughes Information TechnologySystems, Upper Marlboro, MD, March 1997.<http://edhs1.gsfc.nasa.gov/waisdata/sdp/pdf/tp1601301.pdf>

[12] Dublin Core metadata element set: Resource page, November 2, 1997. <http://purl.org/metadata/dublin_core/>

[13] Federal Geographic Data Committee. Content standard for digital geospatial metadata, June 4, 1994. <http://geology.usgs.gov/tools/metadata/standard/metadata.html>

[14] H. Garcia-Molina, J. Hammer, K. Ireland, Y. Papakonstantinou, J. Ullman, and J. Widom. Integrating and accessing heterogeneous information sources in TSIMMIS. In Proceedings of the AAAI Symposium on Information Gathering, pages 61-64, March 1995. <ftp://db.stanford.edu/pub/papers/tsimmis-abstract.aaai.ps>

[15] Global change master directory. NASA. <http://gcmd.gsfc.nasa.gov/>

[16] Object Management Group. OMG home page, June 12, 1998. <http://www.omg.org/>

[17] J. Hakala. The Nordic metadata project: Final report, July 1998. <http://linnea.helsinki.fi/meta/nmfinal.htm>

[18] C. Houstis, C. Nikolaou, S. Lalis, S. Kapidakis, V. Christophides, E. Simon, and A. Thomasic. Towards a next generation of open scientific data repositories and services. CWI Quarterly (Centrum voor Wiskunde en Informatica), 12(2), 1999. To appear.

[19] C. Houstis, C. Nikolaou, M. Marazakis, N. M. Patrikalakis, J. Sairamesh, and A. Thomasic. THETIS: Design of a data management and data visualization system for coastal zone management of the Mediterranean sea. D-Lib Magazine, November 1997. <http://www.dlib.org/dlib/november97/thetis/11thetis.html>

[20] Java technology home page, June 16, 1998. <http://www.javasoft.com/>

[21] C. Lagoze. The Warwick Framework: A container architecture for aggregating sets of metadata. D-Lib Magazine, July/August 1996. <http://www.dlib.org/dlib/july96/lagoze/07lagoze.html>

[22] C. Lagoze and J. Davis. Dienst: An architecture for distributed document libraries. Communications of the ACM, 38(4):45, April 1995.

[23] S. Lalis, C. Houstis, and V. Christophides. Exploring knowledge for the interactive specification of scientific workflows. Technical Report FORTH-ICS/TR-229, Institute of Computer Science, Foundation for Research and Technology Hellas, Heraklion, Crete, Greece, September 1998.

[24] R. Lasher and D. Cohen. A format for bibliographic records. RFC 1807, June 1995. <file://nic.merit.edu/documents/rfc/rfc1807.txt>

[25] O. Lassila. Introduction to RDF metadata. W3C Note 1997-11-13, Cambridge, MA, November 1997. <http://www.w3.org/TR/NOTE-rdf-simple-intro-971113.html>

[26] Master Environmental Library (MEL), technical reference guide, characteristics and performance, version 1.0. Defense Modeling and Simulation Office, Washington, DC, February 13, 1998. <http://www-mel.nrlmry.navy.mil/docs/trg04f.html>

[27] Metadata form (the easy way). Texas/Mexico Borderlands Data and Information Center, Austin, TX, October 17, 1997. <http://www.bic.state.tx.us/BICenglish/explanation.htm>

[28] E. Miller. An introduction to the Resource Description Framework. D-Lib Magazine, May 1998. <http://www.dlib.org/dlib/may98/miller/05miller.html>

[29] C. Nikolaou, C. Houstis, J. Sairamesh, and N. M. Patrikalakis. Impact of scientific advanced networks for the transfer of knowledge and technology in the field of coastal zones. In Euro-Mediterranean Workshop on Coastal Zone Management, Alexandria, Egypt, November 1996.

[30] N. M. Patrikalakis, editor. Proceedings of the NSF Workshop on Distributed Information, Computation and Process Management for Scientific and Engineering Environments (DICPM), Herndon, Virginia, May 15-16, 1998, November 1998. <http://deslab.mit.edu/DesignLab/dicpm/>

[31] N. M. Patrikalakis, P. J. Fortier, Y. Ioannidis, C. N. Nikolaou, A. R. Robinson, J. R. Rossignac, A. Vinacua, and S. L. Abrams. Distributed information and computation for scientific and engineering environments. Design Laboratory Memorandum 98-7, Cambridge, MA, December 1998. <http://deslab.mit.edu/DesignLab/dicpm/paper.ps>

[32] H. Phillips. Metadata tools for geospatial data, October 24, 1998. <http://badger.state.wi.us/agencies/wlib/sco/metatool/mtools.htm>

[33] M. Roszkowski and C. Lukas. A distributed architecture for resource discovery using metadata. D-Lib Magazine, June 1998. <http://www.dlib.org/dlib/june98/scout/06roszkowski.html>

[34] P. Schweitzer. Format metadata - info and tools. United States Geological Survey, November 6, 1998. <http://geology.usgs.gov/tools/metadata/>

[35] THETIS: A data management and data visualization system for supporting coastal zone management for the Mediterranean Sea. Institute of Computer Science, Foundation for Research and Technology - Hellas, Heraklion, Crete, Greece. <http://www.ics.forth.gr/pleiades/THETIS/thetis.html>

[36] S. Weibel. Metadata: The foundation of resource description. D-Lib Magazine, July 1995. <http://www.dlib.org/dlib/July95/07weibel.html>

[37] S. Weibel and J. Hakala. DC-5: The Helsinki metadata workshop: A report on the workshop and subsequent developments. D-Lib Magazine, February 1998. <http://www.dlib.org/dlib/february98/02weibel.html>
