Difference between revisions of "Use Cases for EA-CoP Data Access and Sharing Policies"

From D4Science Wiki

Revision as of 07:44, 8 March 2013

Here follows a list of use cases for which the EA-CoP Data Access and Sharing Policies may apply.


Template for the Policy use cases

Strategy

The Strategy chapter positions the concerned Use Case in the broader context of iMarine objectives (draw a link to relevant Wiki page in case the strategy has already been defined elsewhere):

  • Define the initiative and set the Goal
  • Identify the benefits
  • Position the role of the iMarine platform with respect to the concerned Use Case

Policy

The Policy chapter:

  • Define the properties/set of quality(*) required to achieve the specific objective
  • Define the principles prevailing for the concerned use case; these principles refer to the general iMarine data sharing Policy (Disclaimer, Copyright, Posting Content, Shared Data, Public Data, Secondary Use, Derivative Work, and Data Citation) and might extend it
  • Define the responsibilities of the various actors involved
  • Define the type of collaborations required
  • Define the kind of support required

Guidelines

The policy is extended with Guidelines:

  • Metadata and models implementing the use case; this chapter includes a mapping with the business metadata
  • Editorial workflow
  • Roles and responsibilities in the workflow


(*) e.g. from FACP policy document: Objectivity; Reliability and timeliness; Length versus comprehensiveness; Hard copy – versus electronic format; Languages and translations; Partnerships

Code list management

Content to be provided by Yann Laurent and Anton Ellenbroek, based on the various sub-use cases (FAO, Smartfish, FishFrame, DG MARE, eRS-FLUX)

iMarine EA Linked Open Data Initiative

Content to be provided by Claudio Baldassarre, Julien Barde, and Anton Ellenbroek

Strategy

The Strategy chapter positions LOD in the broader context of iMarine objectives:

  • Define the initiative and set the Goal

The EA-iMarine-LOD initiative, promoted by FAO, is meant to develop the capacities needed in the infrastructure to instantiate a network of interlinked datasets of scientifically accurate data in the domain of EA to fisheries. Partners contribute portions of the EA-iMarine-LOD in return for mutual data enrichment. This distributed evolution paradigm fits well with the structure of a network of interlinked datasets.

The goal of this initiative is to supply what providers lack: resources, technical expertise, or suitable tools. To do so, the initiative will set a development plan for LOD engineering tools built on data access facilities. Internally, the goal is to help overcome the challenges of LOD engineering, which is demanding and requires actions beyond the simple creation of datasets (e.g. complex ETL workflows and commitment to the full dataset lifecycle).

  • Identify the benefits

LOD engineering and maintenance go well beyond publishing data in RDF (e.g. as WoRMS currently does via TDWG services). LOD datasets need to be as densely interlinked as possible, both internally and with external LOD datasets. The return on investment in good-quality LOD engineering is the possibility of becoming part of a fast-growing network of datasets in the EA to fisheries domain, created by institutions or even citizen scientists. On top of this network of distributed LOD datasets, scenarios such as interoperable systems or integrated information retrieval environments can be engineered.

    • SMARTFISH Regional Information System (RIS)

This web application is a one-stop shop for users who require a comprehensive view of fisheries in the SWIO area, encompassing conservation plans, fishery gears and vessels, marine species, catches and beyond. None of the partners participating in the SMARTFISH project produces information covering the initial requirements alone, but together they do. An unobtrusive way to develop a portal that publishes the integrated data from three information systems is to define the integration outside those systems, in a network of shared entities made globally unique and dereferenceable in the scope of the RIS. The network of relationships is defined to capture the knowledge of the domain (i.e. the SWIO fisheries), the application requirements (i.e. oriented to the presentation of high-level fisheries concepts), and ultimately a core of scientifically accurate data (i.e. codes and code lists).

The final result is a LOD-based knowledge repository used to annotate (i.e. represent) the information resources in the remote information systems, and to harmonize the heterogeneous terminology adopted locally by the systems that are part of the RIS. Because integration is achieved through LOD, extending it to data from a new information system essentially requires only an extension of the network of entities and relationships. Similarly, an evolution of the requirements for the RIS requires an update to the LOD-based knowledge repository, which sits apart from the integrated information systems.
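The harmonization of local terminology against shared entities can be pictured with a minimal Python sketch. The base URI, the system names and the mapping rows below are all hypothetical, invented for illustration; the source does not specify how the RIS stores or resolves its entities.

```python
# Hypothetical base URI for shared entities; not an actual RIS address.
BASE_URI = "http://example.org/ris/entity/"

# Hypothetical local terms used by three participating systems.
# Several local terms point at the same shared entity.
LOCAL_TO_SHARED = {
    ("system_a", "YFT"): "species/yellowfin-tuna",
    ("system_b", "Thunnus albacares"): "species/yellowfin-tuna",
    ("system_c", "yellowfin"): "species/yellowfin-tuna",
    ("system_a", "SKJ"): "species/skipjack-tuna",
}


def shared_uri(system, local_term):
    """Return the shared entity URI for a local term, or None if unmapped."""
    path = LOCAL_TO_SHARED.get((system, local_term))
    return BASE_URI + path if path is not None else None


def annotate(records):
    """Attach the shared entity URI to each (system, local_term) record."""
    return [
        {"system": system, "term": term, "entity": shared_uri(system, term)}
        for system, term in records
    ]
```

In this sketch, bringing in a fourth information system only means adding rows to the mapping; the participating systems themselves stay untouched.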


  • Position the role of the iMarine platform with respect to LOD

The iMarine platform, comprising both the VRE (Virtual Research Environment) and the infrastructure, plays a key role in the production and consumption of LOD inside and outside the project. The root motivation is that EA to fisheries requires the combination of data gathered from multiple contexts, and above all these data reference each other in the same way an ecosystem relates organisms. When this co-referencing is captured in a reusable and shared network of semantic computational relationships, it produces a knowledge asset that did not previously exist. Moreover, the processes taking place in the VRE generate additional knowledge to be published as an extension of the EA-iMarine-LOD.

As stated at the beginning, good-quality LOD engineering has a bootstrap phase that requires expertise and software tools to produce linked data and maintain them over time. Once these capabilities are delivered to the iMarine platform users, the project is in charge of instantiating the core of the EA-iMarine-LOD, a valuable set of mappings that consolidates the interconnection of entities through code lists. The alternative would be to create yet another massive data container that is poorly reusable across application requirements, because it requires high-impact schema refactoring, whereas LOD-based knowledge repositories offer pluggable schemas that accommodate extensions to the requirements.

By working with this core set of mappings, the project can lead activities on information system interoperability, information mash-up environments, data harmonization, and dissemination of public URIs for content annotation at web scale. In short, the key role of the iMarine platform is to act as mediator for iMarine partners that want to ship their data into the LOD cloud, and to produce a node of its own that serves as the reference hub for scientifically accurate data mappings.

Policy

The EA-iMarine-LOD is centered on a core of mapping relationships among the LOD datasets contributed by project partners. This definition requires a preparatory phase of LOD dataset engineering for those who do not have the resources, expertise or tools to underpin such LOD production. The configuration of the EA-iMarine-LOD components with respect to its core can be described in terms of two areas, or data realms.

The first realm holds the LOD datasets produced strictly within iMarine (e.g. WoRMS, EEZ, GBIF, CoL, etc.), because their LOD engineering is controlled and the specifications are negotiated among the people who are part of iMarine. The LOD datasets in the first realm are directly connected to and through the core of the EA-iMarine-LOD.

The second realm holds the LOD datasets produced outside the project (e.g. AGROVOC, data.fao.org, BIO-LOD, etc.). This realm is an interface to the LOD cloud and is filled with interconnections instantiated from the LOD datasets in the first realm outward into the cloud. The second realm is the gateway between the core of the EA-iMarine-LOD and the cloud: it allows external communities to refer to project and partner data, and so increases the concentration of incoming references into the EA-iMarine-LOD.
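As a rough illustration of the two realms, the sketch below classifies links between datasets as belonging to the core (first realm to first realm) or pointing outward (first realm to second realm). The dataset names are a subset of those given in the text; the links and relationship names are invented for the example and do not describe actual mappings.

```python
# Dataset names come from the text; realm membership follows its description.
FIRST_REALM = {"WoRMS", "EEZ", "CoL"}
SECOND_REALM = {"AGROVOC", "data.fao.org", "BIO-LOD"}

# Each link is (source dataset, relationship, target dataset).
# These links and relationship names are hypothetical.
LINKS = [
    ("WoRMS", "sameAs", "CoL"),
    ("WoRMS", "relatedTo", "AGROVOC"),
    ("WoRMS", "occursIn", "EEZ"),
    ("CoL", "relatedTo", "BIO-LOD"),
]


def classify(link):
    """Label a link as 'core', 'outward', or 'other' by the realms it joins."""
    source, _, target = link
    if source in FIRST_REALM and target in FIRST_REALM:
        return "core"
    if source in FIRST_REALM and target in SECOND_REALM:
        return "outward"
    return "other"


def summarize(links):
    """Count links per category."""
    counts = {"core": 0, "outward": 0, "other": 0}
    for link in links:
        counts[classify(link)] += 1
    return counts
```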

  • Define the principles prevailing for the implementation of LOD
  • Define the responsibilities of the various actors involved

Accomplishing the goal of the EA-iMarine-LOD involves FAO, CNR and a data provider (i.e. the partners committed to the initiative).

FAO is responsible for issuing the policies of participation, coordinating the design and development of the tools for LOD engineering, and finally leading the creation of the core of the EA-iMarine-LOD.

CNR is responsible for implementing, and providing as a service through the infrastructure, the capabilities shipped with the tools for LOD engineering; for hosting the storage facilities of the EA-iMarine-LOD; and for being the main publisher of its content.

The data provider is responsible for instantiating a data access mechanism for the data to be engineered as LOD. It will also be involved in choosing the relationships and concepts that best describe its domain, to the extent that is of interest to it. In addition, the data provider is responsible for providing a mechanism for persistent URIs and for choosing a publication channel for the engineered LOD dataset, whether in house or through the infrastructure.
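One possible shape for a persistent URI mechanism is sketched below in Python, under stated assumptions: URIs are minted only from stable keys (a dataset name and a local identifier), and a small resolver maps each URI to wherever the resource currently lives. The namespace, the function name `mint_uri` and the `Resolver` class are hypothetical; the source does not prescribe any particular mechanism.

```python
from urllib.parse import quote

# Hypothetical namespace; a real provider would use a domain it controls long term.
NAMESPACE = "http://example.org/id/"


def mint_uri(dataset, local_id):
    """Build a URI from stable keys only, percent-encoding unsafe characters."""
    return NAMESPACE + quote(dataset, safe="") + "/" + quote(str(local_id), safe="")


class Resolver:
    """Maps persistent URIs to the current location of each resource."""

    def __init__(self):
        self._locations = {}

    def publish(self, uri, location):
        """Register or update where the resource behind a URI is served."""
        self._locations[uri] = location

    def resolve(self, uri):
        """Return the current location for a URI, or None if unknown."""
        return self._locations.get(uri)
```

The point of the indirection is that the publication channel can change, in house or through the infrastructure, while the URI that others have linked to stays the same.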

  • Define the type of collaborations
  • Define the kind of support required

Guidelines

The policy will be extended with Guidelines:

  • Metadata and models
  • Editorial workflow
  • Roles and responsibilities in the workflow

Taxonomic data

The WoRMS use case

Policy to stem from the MoU to be elaborated between iMarine and VLIZ

Sharing taxonomic data

Edward VandenBerg and Nicolas Bailly as part of Edward's TORs

FAO Species fact sheet VRE (Virtual Research Environment) - App'lifish

Content to be input by Aureliano/Ellenbroek

Geospatial data and OGC Web-Services

  • The geospatial data will be shared through OGC Web-Services (OWS)
    • Access to geospatial data as OGC WMS (Web Map Service) / WFS (Web Feature Service) / WCS (Web Coverage Service) resources
    • Access to resources provided through ISO/TC 211 - OGC metadata, served by a CSW web service with 2 access levels:
      • service metadata (ISO 19119:2005 / 19139) describing WxS instances specific to the data collection
      • dataset metadata (ISO 19115:2003 / 19139) describing each dataset
    • Such access could be complemented later by a Feature Catalogue description (ISO 19110), for data processing needs.
    • The set of metadata will be published in a CSW catalogue shared in iMarine through a harvesting operation.
  • A metadata constraints section will be added to the metadata, and will specify the license applicable to the data collection.
  • Example: case of FAO aquatic species distributions, published through the FAO Geonetwork
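As a small illustration of how a client would start talking to the services listed above, the Python sketch below builds the key-value-pair GetCapabilities request that WMS, WFS, WCS and CSW all support. The endpoint address is a placeholder, not an actual iMarine or FAO server, and the function name is an assumption made for this example.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; replace with the address of an actual OWS or CSW server.
BASE_URL = "http://example.org/ows"

SUPPORTED_SERVICES = {"WMS", "WFS", "WCS", "CSW"}


def get_capabilities_url(base_url, service):
    """Build a key-value-pair GetCapabilities request for an OGC service."""
    if service not in SUPPORTED_SERVICES:
        raise ValueError("unsupported service: " + service)
    query = urlencode({"service": service, "request": "GetCapabilities"})
    return base_url + "?" + query
```

The capabilities document returned by such a request describes the layers, feature types, coverages or metadata records that a given service instance exposes.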

Others

  • ... to be input by Aureliano/Ellenbroek

Resources selector

Content to be provided by Ellenbroek

SmartFish

Content to be provided by Laurent