Difference between revisions of "Metadata standards"
Latest revision as of 11:06, 10 July 2014
Here follows relevant information and discussions for the identification of sets of Metadata which could be supported by iMarine. A specific subset of Metadata, the Business metadata, should be identified for supporting the implementation of the EA-CoP Data Access and Sharing Policies.
The iMarine web infrastructure can fully or partially support different types of metadata according to requirements, i.e. standards that are widely used among the EA-CoP (Community of Practice), that are indispensable for optimized use of the iMarine infrastructure and its many data management and processing capacities, and that are backed by an Open Source community committed to supporting their further development with software components.
Dublin Core (DC)
The Dublin Core® Metadata Initiative, or "DCMI", is an open organization supporting innovation in metadata design and best practices across the metadata ecology.
See more at http://dublincore.org
Textual documents, such as articles, journals, papers, etc. are described using Dublin Core.
Dublin Core and Business metadata
The EA-CoP Data Access and Sharing Policies document contains a proposal for associating all shared datasets in iMarine with a descriptive and standard set of metadata, the business metadata.
- Which should be the most appropriate format?
- Which standards can iMarine support?
The Dublin Core vocabulary is a potential candidate for supporting the Business metadata in iMarine.
Here follows an initial proposal for possible utilization of DC elements:
- Owner* = dcterms:RightsHolder
- Context* = dcterms:Collection and/or dcterms:Dataset Authorship (Context or Subject area gives the scope in which a content is positioned. As a data collection, it indicates the aggregation of resources to which the content belongs.)
- Author* = dc:Creator
- Title* = dc:Title
- Publisher = dc:Publisher
- Creation date = dcterms:Created
- Last update date* = dc:Date
- Expiry date = dcterms:Valid
- Contact
- Copyright licenses = dc:Rights and/or dcterms:AccessRights and/or dcterms:RightsStatement and/or dcterms:License etc. (Rights management, Creative Commons License type or other licenses)
- Content description = dc:Description (e.g. Data aggregation level)
- Spatial Scale = dcterms:Spatial (Spatial characteristics of the resource. Geographic level at which the content is applied)
- Coverage = dcterms:Coverage and/or dc:Subject (Geographical coverage, Topic coverage, etc.)
- Language* = dc:Language
- Custom bibliographic citation = dcterms:bibliographicCitation
- Media type = dcterms:MediaType
- Identifier = dc:Identifier (e.g. URL of the resource)
"*" Mandatory metadata
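The proposal above can be sketched in code as a lookup table plus a check for the mandatory fields. This is only an illustration under assumptions: the names BUSINESS_TO_DC, MANDATORY, missing_mandatory and to_dublin_core are hypothetical, the sample record is invented, and fields for which the proposal lists several candidate terms (Context, Copyright licenses, Coverage) or none (Contact) are left out of the translation table.

```python
# Business metadata field -> candidate Dublin Core term, as written in the proposal.
# Fields with several candidate terms (Context, Copyright licenses, Coverage)
# or no term yet (Contact) are deliberately not mapped here.
BUSINESS_TO_DC = {
    "Owner": "dcterms:RightsHolder",
    "Author": "dc:Creator",
    "Title": "dc:Title",
    "Publisher": "dc:Publisher",
    "Creation date": "dcterms:Created",
    "Last update date": "dc:Date",
    "Expiry date": "dcterms:Valid",
    "Content description": "dc:Description",
    "Spatial Scale": "dcterms:Spatial",
    "Language": "dc:Language",
    "Custom bibliographic citation": "dcterms:bibliographicCitation",
    "Media type": "dcterms:MediaType",
    "Identifier": "dc:Identifier",
}

# Fields marked "*" (mandatory) in the proposal.
MANDATORY = {"Owner", "Context", "Author", "Title", "Last update date", "Language"}


def missing_mandatory(record):
    """Return the sorted names of mandatory fields that are absent or empty."""
    return sorted(field for field in MANDATORY if not record.get(field))


def to_dublin_core(record):
    """Translate the fields that have a single candidate term; skip the rest."""
    return {BUSINESS_TO_DC[k]: v for k, v in record.items() if k in BUSINESS_TO_DC}


# Invented sample record, for illustration only.
record = {
    "Owner": "Example Organisation",
    "Author": "J. Doe",
    "Title": "Example catch dataset",
    "Language": "en",
    "Identifier": "https://example.org/dataset/1",
}

print(missing_mandatory(record))
print(to_dublin_core(record))
```

A check of this kind could run before a dataset is shared, so that records lacking a mandatory field are flagged early.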
Darwin Core (DwC)
Darwin Core metadata http://rs.tdwg.org/dwc/ are supported by the iMarine web infrastructure for handling information related to taxa, their occurrence in nature as documented by observations, specimens, samples, and related information. As per DwC definition ".. It is meant to provide a stable standard reference for sharing information on biological diversity. As a glossary of terms, the Darwin Core is meant to provide stable semantic definitions with the goal of being maximally reusable in a variety of contexts...".
Biological and ecological data are made available in Darwin Core. DwC is a de-facto standard to represent data and metadata.
Taxonomic information is made available in Darwin Core Archive format. DwCA is a de-facto standard to deliver archives of data expressed in DwC.
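As an illustration of what data expressed in DwC can look like, the following minimal sketch writes one occurrence record as CSV, using a handful of Darwin Core terms as column names. The record values and the function name occurrences_to_csv are invented for the example. A complete Darwin Core Archive would additionally bundle a descriptor file alongside the data file, which is not shown here.

```python
import csv
import io

# A few Darwin Core terms used as column headers.
DWC_COLUMNS = [
    "occurrenceID",
    "basisOfRecord",
    "scientificName",
    "eventDate",
    "decimalLatitude",
    "decimalLongitude",
]


def occurrences_to_csv(rows):
    """Serialize occurrence dictionaries to CSV text with DwC terms as headers."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=DWC_COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented record, for illustration only.
rows = [
    {
        "occurrenceID": "example:occ:1",
        "basisOfRecord": "HumanObservation",
        "scientificName": "Thunnus albacares",
        "eventDate": "2014-07-10",
        "decimalLatitude": "-4.5",
        "decimalLongitude": "55.5",
    }
]

text = occurrences_to_csv(rows)
print(text)
```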
Geographic information (ISO/TC211, OGC)
Geo-referenced data are described using ISO 19115/19119 and made available through the OGC protocols: WMS (Web Map Service), WCS (Web Coverage Service), WFS (Web Feature Service), etc. This potentially targets all data having a geographic dimension, including biological and ecological data, which can also be enriched with ISO 19115/19119 geographic metadata if they include a geographic coverage.
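Each of these OGC services describes itself through a GetCapabilities request. The sketch below only builds the request URL from the standard key-value parameters (service, version, request); the endpoint https://example.org/geoserver/ows and the function name get_capabilities_url are hypothetical placeholders, not an actual iMarine address or API.

```python
from urllib.parse import urlencode


def get_capabilities_url(endpoint, service="WMS", version="1.3.0"):
    """Build a key-value-pair GetCapabilities URL for an OGC web service."""
    query = urlencode(
        {"service": service, "version": version, "request": "GetCapabilities"}
    )
    # Append to an existing query string if the endpoint already has one.
    separator = "&" if "?" in endpoint else "?"
    return f"{endpoint}{separator}{query}"


# Hypothetical endpoint, for illustration only.
url = get_capabilities_url("https://example.org/geoserver/ows")
print(url)
```

The same helper can be called with service="WFS" or service="WCS" and the version supported by the target server.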
ISO standards
Three main abstract geographic metadata standards from ISO/TC 211 (approved or draft) can be listed:
- ISO 19115:2003 / 19115-1:2014 - Dataset Metadata
- ISO 19119:2005 - Service metadata
- ISO 19110:2005 - Feature Cataloging
Abstract specifications
ISO 19115:2003 - Dataset Metadata
The ISO 19115:2003 standard is the metadata standard approved by OGC, and is composed of two parts:
- Part 1 (ISO 19115-1): base ISO metadata standard for the description of geographic information and services. A revision of this standard was recently released by ISO: ISO 19115-1:2014 and needs to be taken into consideration.
- Part 2 (ISO 19115-2): extension for imagery and gridded data & instrument-based data collection. These extensions also include improved descriptions of lineage and processing information.
The ISO 19115:2003 standard is structured as a set of metadata packages:
- Entity Set Information: main metadata package that contains information such as identifiers (file, parent), characterSet, language, hierarchical level, main contact (party in charge of the metadata), metadata standard name/version
- Identification Information: package that describes the resource that the metadata refers to. Part of this package is specialized for the data or service identification (ISO 19119). Key elements are: title, date, abstract, purpose, thesaurus & associated keywords, data use & access limitations and constraints, extent (geographic, temporal, vertical)
- Constraints Information: package required for managing rights to information including restrictions applied to the resource access and/or use. Can apply to both Entity Set (metadata) and Identification (resource)
- Data Quality Information: package required to give information on the quality of the resource.
- Maintenance Information: package that describes the maintenance and update procedure applied to the resource. Can apply to both Entity Set (metadata) and Identification (resource)
- Spatial Representation Information: package that aims to describe the spatial representation of the geographic information, either for vector or grid-based representation
- Reference System Information: package that describes the spatial and temporal reference systems used in the described resource
- Content Information: package that aims to describe the content of the resource, either the feature catalogue description (for vector data) or the coverage description (for grid data). Related to the ISO 19110 standard specification.
- Portrayal catalogue information: information about the portrayal catalogue(s) used to display the resource
- Distribution Information: package required to access the resource. This package especially aims to distribute the resources by pointing to the OGC Web-Service resource.
- Metadata extension Information: package to handle extended metadata elements
- Application schema information: package defining the application schema used
ISO 19119:2005 - Service metadata
The ISO 19119 standard describes web services, such as the OGC Web-Services (OWS), and is usable in association with the ISO 19115 standard.
ISO 19110:2005 - Feature Cataloging
The ISO 19110 standard addresses the description of feature types / coverages and is usable in association with the ISO 19115 standard.
Implementations
The ISO 19139 standard defines the XML schema implementation of the above abstract metadata standards.
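To give an idea of the ISO 19139 XML encoding, here is a minimal sketch that builds a few elements of a metadata record with the gmd and gco namespaces. It is deliberately incomplete and would not validate against the schema (contact, date stamp and other required elements are omitted); the helper names char_string and build_metadata and the sample values are invented for the example.

```python
import xml.etree.ElementTree as ET

# ISO 19139 namespaces for metadata elements (gmd) and basic types (gco).
GMD = "http://www.isotc211.org/2005/gmd"
GCO = "http://www.isotc211.org/2005/gco"
ET.register_namespace("gmd", GMD)
ET.register_namespace("gco", GCO)


def char_string(parent, tag, text):
    """Append <gmd:tag><gco:CharacterString>text</gco:CharacterString></gmd:tag>."""
    element = ET.SubElement(parent, f"{{{GMD}}}{tag}")
    ET.SubElement(element, f"{{{GCO}}}CharacterString").text = text
    return element


def build_metadata(file_identifier, title, abstract):
    """Build a partial MD_Metadata tree: identifier, citation title, abstract."""
    root = ET.Element(f"{{{GMD}}}MD_Metadata")
    char_string(root, "fileIdentifier", file_identifier)
    ident_info = ET.SubElement(root, f"{{{GMD}}}identificationInfo")
    data_ident = ET.SubElement(ident_info, f"{{{GMD}}}MD_DataIdentification")
    citation = ET.SubElement(data_ident, f"{{{GMD}}}citation")
    ci_citation = ET.SubElement(citation, f"{{{GMD}}}CI_Citation")
    char_string(ci_citation, "title", title)
    char_string(data_ident, "abstract", abstract)
    return root


# Invented values, for illustration only.
root = build_metadata("example-record-1", "Example layer", "Illustrative abstract.")
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```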
OGC standards
The OGC Web-Service GetCapabilities document is widely used as a reference for service metadata & the description of layers, as it can provide a lot of information when sufficiently enriched. One key element worth mentioning is the service/dataset relationship implemented through the MetadataURL links (analogous to the ISO 19119 service "operatesOn" relationship).
In addition, the OGC standard AuthorityURL and Identifier GetCapabilities elements are assets, as they provide a place for handling code/identifier mappings from different authoritative Information Systems. An example of a mapping that can be handled in the GetCapabilities document is the mapping between a MetadataURL (from a metadata catalogue) and a coded entity (from a Linked Open Data source), which can be used to enrich Linked Open Data with geographic references.
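The service/dataset relationship carried by MetadataURL can be read from a capabilities document with a few lines of code. The sketch below assumes a WMS 1.3.0 document; the embedded XML fragment, its layer name and catalogue URL are invented, and the function name metadata_links is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespaces used by WMS 1.3.0 capabilities documents.
NS = {
    "wms": "http://www.opengis.net/wms",
    "xlink": "http://www.w3.org/1999/xlink",
}

# Invented, heavily trimmed capabilities fragment, for illustration only.
CAPABILITIES = """<WMS_Capabilities xmlns="http://www.opengis.net/wms"
    xmlns:xlink="http://www.w3.org/1999/xlink" version="1.3.0">
  <Capability>
    <Layer>
      <Name>example_layer</Name>
      <MetadataURL type="ISO19115:2003">
        <Format>text/xml</Format>
        <OnlineResource xlink:type="simple"
            xlink:href="https://example.org/catalogue/record/123"/>
      </MetadataURL>
    </Layer>
  </Capability>
</WMS_Capabilities>"""


def metadata_links(capabilities_xml):
    """Map each named layer to the metadata record URLs it points to."""
    root = ET.fromstring(capabilities_xml)
    links = {}
    for layer in root.iter(f"{{{NS['wms']}}}Layer"):
        name = layer.findtext("wms:Name", namespaces=NS)
        if name is None:
            continue  # container layers may have no Name
        resources = layer.findall("wms:MetadataURL/wms:OnlineResource", NS)
        links[name] = [r.get(f"{{{NS['xlink']}}}href") for r in resources]
    return links


links = metadata_links(CAPABILITIES)
print(links)
```

The resulting layer-to-record mapping is the kind of link that could then be matched against identifiers from other information systems.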
Mapping with other metadata standards
- Considering that the Dublin Core® vocabulary is currently seen as the reference for supporting the business metadata in iMarine, it would be important to address how the different internationally recognized domain metadata standards are mapped to the Dublin Core metadata vocabulary.
- For more information on the mapping Dublin Core / ISO 19115, see:
- INSPIRE and Dublin Core / ISO15836
SDMX
Statistical Data are made available in SDMX that includes specific metadata for agencies, code lists, and datasets.
- What can SDMX hold in terms of metadata?
- Which fields of SDMX can be mapped to Business Metadata and how?