Geospatial cluster

From D4Science Wiki
Revision as of 13:15, 6 February 2013 by Herve.caumont (Talk | contribs) (Strategy and Actions (from Inputs to Outputs))


The main purpose of the Cluster work plan is to give the iMarine Board a management tool that serves both as a framework for planning activities and as a guide for carrying them out. Its scope is thus the interface between the Board and the activities of the project's Work Packages. Once drafted, a work plan needs approval from the iMarine Board, following the Board procedures.

Executive Summary

The iMarine Geospatial Cluster is maintaining and promoting a Work Plan (this document) aimed at:

  • organizing collections of requirements gathered from the iMarine Business Cases
  • providing recommendations for the implementation of the iMarine infrastructure.

The requirements are inputs for the cluster, from iMarine Business Cases that are grouped as follows:

  • the EU Common Fisheries Policy
  • the FAO deep seas fisheries programme
  • the UN Ecosystem Approach to Fisheries (EAF)

The recommendations are outputs from the cluster, primarily intended for the iMarine Board, the iMarine project partners (Work Packages) and the Communities of Practice (CoP) identified within the Ecosystem Approach. They are aimed at releasing infrastructure services such as:

  • Enrichment of Species Occurrences data with profiles of environmental parameters
  • Comparison of Species Distribution Maps

Such Infrastructure Services are needed by the iMarine eScience services (VREs & Apps).

Introduction and Background (The Problems)

“For many biological observations, we have no data on the prevailing environmental conditions; either this information was never recorded (as is the case for many museum specimens, especially older ones), or the data was collected, but by others than the biologists, and different data streams were never re-united. Digging through archives of sampling campaigns of many years ago is tedious, if not impossible, by loss of essential information on the sampling event”.

Edward Vanden Berghe, Executive Director, Ocean Biogeographic Information System, April 2012

Environmental conditions such as salinity, temperature or acidity are essential for conducting studies and developing applications related to Marine Species Distributions. Nevertheless, collecting such parameters and presenting them coherently to scientists is still a tedious task. Through the activities of its Geospatial Cluster, the iMarine project is developing an approach and infrastructure tools to tackle this issue. These activities aim at leveraging the spatial and temporal dimensions of environmental measurements, which are the bridges that can join environmental variables with the current assessments of species occurrences.

Way forward

Oceanographers have created large archives of data, the majority of them publicly accessible. Some examples discussed or assessed within the iMarine project:

Also, there are vast libraries of remotely-sensed data, some of them in the public domain. Mining these data sources makes it possible to reconstruct the environmental conditions in the neighbourhood and at the time of the biological observations.

It is unlikely that there will be a complete collection of environmental data for each and every point of interest corresponding to biological observations (e.g. latitude, longitude, depth and time for the OBIS observations). So we will need a step of interpolation between the existing environmental data. This interpolation can be either a statistical interpolation, or based on a model of the variable of interest.
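The simplest case of such an interpolation can be sketched for a single dimension: estimating a parameter at the depth of a biological observation from a vertical profile sampled at other depths. The following Python sketch uses plain linear interpolation; the function name and the profile values are hypothetical, chosen only for illustration, and do not come from any iMarine service or dataset.

```python
from bisect import bisect_left


def interpolate_profile(depths, values, target_depth):
    """Linearly interpolate a vertical profile at target_depth.

    depths must be sorted in strictly increasing order. Returns None when
    the target lies outside the sampled range (no extrapolation is done).
    """
    if not depths or len(depths) != len(values):
        raise ValueError("depths and values must be non-empty and of equal length")
    if target_depth < depths[0] or target_depth > depths[-1]:
        return None
    i = bisect_left(depths, target_depth)
    if depths[i] == target_depth:
        # The observation depth coincides with a sampled depth.
        return values[i]
    d0, d1 = depths[i - 1], depths[i]
    v0, v1 = values[i - 1], values[i]
    fraction = (target_depth - d0) / (d1 - d0)
    return v0 + fraction * (v1 - v0)


# Hypothetical temperature profile (degrees Celsius) at sampled depths (metres).
profile_depths = [0.0, 10.0, 50.0, 100.0]
profile_temps = [20.0, 18.0, 14.0, 10.0]
temp_at_30m = interpolate_profile(profile_depths, profile_temps, 30.0)
```

Returning None outside the sampled range is a deliberate choice in this sketch: it keeps gaps visible instead of silently extrapolating.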

Some further thoughts

A statistical interpolation will most probably be based on a weighted average of measurements of the parameter under consideration in the neighbourhood of our 4D point of interest. The difficulty is that the weighting has to be done over dimensions that do not all behave the same way; the most obvious difference is between the spatial dimensions and time. I am not sure which models exist for the parameters we are interested in, or how easily they would be available for our work.

There is a difference between remote sensing data and in-situ data. Remote sensing data are, first of all, spatially only 2D, which dramatically reduces complexity, and their geographic scope is usually very large, which means that we probably have measurements close to our points of interest. Another type of 2D data is bathymetry; here the data do not change much in time (at least not on the time scales we are interested in), so we have to deal with only a single 'layer'.

The main source of in-situ data I am aware of is the World Ocean Database (WOD), together with the World Ocean Atlas (WOA), which is derived from the WOD. Both are maintained by the World Ocean Data Center (WDC) in Silver Spring, near Washington DC; the WDC is operated by the National Oceanographic Data Center of the USA, which is part of NOAA. Obviously, the WDC people know how to create the WOA from the raw WOD data; we might want to seek their collaboration (I have some good contacts there).
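One possible way to handle dimensions that behave differently in a weighted average is to give each dimension its own length scale. The Python sketch below does this with Gaussian weights over horizontal distance, depth and time. It is an illustration under stated assumptions, not a method adopted by the cluster: the `Measurement` type, the function names and the default scales (100 km, 50 m, 30 days) are all hypothetical and would need tuning per parameter.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass
class Measurement:
    lat: float        # degrees north
    lon: float        # degrees east
    depth_m: float    # metres below the surface
    time_days: float  # days since an arbitrary common epoch
    value: float      # measured parameter, e.g. temperature


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def weighted_estimate(measurements, lat, lon, depth_m, time_days,
                      scale_km=100.0, scale_depth_m=50.0, scale_days=30.0):
    """Gaussian-weighted average around a 4D point of interest.

    Each dimension is divided by its own (assumed) scale before the
    weights are computed, so space, depth and time are not treated alike.
    """
    weight_sum = 0.0
    weighted_values = 0.0
    for m in measurements:
        dx = haversine_km(lat, lon, m.lat, m.lon) / scale_km
        dz = (depth_m - m.depth_m) / scale_depth_m
        dt = (time_days - m.time_days) / scale_days
        w = math.exp(-0.5 * (dx * dx + dz * dz + dt * dt))
        weight_sum += w
        weighted_values += w * m.value
    if weight_sum == 0.0:
        return None  # no measurement close enough to carry any weight
    return weighted_values / weight_sum
```

For example, `weighted_estimate(measurements, 0.0, 0.0, 10.0, 0.0)` estimates the parameter at the equator, at 10 m depth, on day 0. Smaller scales make the estimate more local in that dimension.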

What are the environmental variables of interest?

  • Bathymetry: easily available, and at a resolution that is sufficient for our purposes; several sources exist, such as ETOPO and GEBCO. We could derive some extra parameters from bathymetry, like distance from continent, rugosity or aspect, but these are lower priority.
  • Salinity and temperature: the classic in-situ data. A lot has been collected, mainly because these two parameters influence the speed of sound in water and so are needed for the interpretation of sonar signals; in other words, they have military implications. For this reason, some countries refuse to make the data from their coastal waters public, but there is still *a lot* of data around. For biodiversity and environmental envelope modelling, they are a priority.
  • pH: very important if we want to be able to deal with global change, including ocean acidification. It is an in-situ measurement, and not all that much is available.
  • Ocean colour, productivity: the first is used as a proxy for the second; ocean colour reflects chlorophyll. These are remote sensing data, so they should be relatively easy to deal with?
  • Nutrients: in situ, not as well measured as salinity and temperature, or even pH; lower priority? But they are available in WOA and WOD, just as temperature, salinity and pH are, so they might be low-hanging fruit.
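Because bathymetry is a single static 2D layer, attaching a depth value to an observation reduces to a lookup in a regular grid. The Python sketch below shows a bilinear interpolation for that case. The grid layout, the function name and the sample values are assumptions made for illustration; they do not reflect the actual file structure of ETOPO or GEBCO, and the sketch ignores wrapping at the antimeridian.

```python
def bilinear_lookup(grid, lat0, lon0, step_deg, lat, lon):
    """Bilinearly interpolate a regular lat/lon grid at (lat, lon).

    grid[i][j] is assumed to hold the value at latitude lat0 + i * step_deg
    and longitude lon0 + j * step_deg. Returns None outside the grid.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    if n_rows < 2 or n_cols < 2:
        raise ValueError("grid must have at least 2 rows and 2 columns")
    row = (lat - lat0) / step_deg
    col = (lon - lon0) / step_deg
    if not (0 <= row <= n_rows - 1 and 0 <= col <= n_cols - 1):
        return None
    # Clamp so that points on the last row or column reuse the last cell.
    i = min(int(row), n_rows - 2)
    j = min(int(col), n_cols - 2)
    fr, fc = row - i, col - j
    south = grid[i][j] * (1 - fc) + grid[i][j + 1] * fc
    north = grid[i + 1][j] * (1 - fc) + grid[i + 1][j + 1] * fc
    return south * (1 - fr) + north * fr


# Hypothetical 2 x 2 bathymetry grid (metres, negative below sea level),
# with its south-west corner at 0 N, 0 E and a spacing of 1 degree.
sample_grid = [[-100.0, -200.0],
               [-300.0, -400.0]]
depth_at_centre = bilinear_lookup(sample_grid, 0.0, 0.0, 1.0, 0.5, 0.5)
```

A nearest-cell lookup would be simpler still; bilinear interpolation is shown here because it avoids step changes between neighbouring observations.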

Goals and Objectives (The Outputs)

Outputs are Roadmaps, Tradeoff analysis and Guidelines for the development, deployment and maintenance of infrastructure services such as:

  • Enrichment of Species Occurrences data with profiles of environmental parameters
  • Comparison of Species Distribution Maps

Such Infrastructure Services are needed by the iMarine eScience services (VREs & Apps) and other web service endpoints.

A validation process aims at matching the cluster outputs with 'consuming' eScience services such as:

  • [VTI]: a VRE (Virtual Research Environment) for Vessel Transmitted Information management and modeling
  • [ICIS]: a VRE for Timeseries management
  • ICIS - SPREAD: a VRE extension for Geospatial Reallocation features
  • AquaMaps: a VRE for species predictive modeling
  • ...

To date, it remains a demanding task for the Geospatial cluster to identify the relevant iMarine eScience services and match its outputs to them. A view of the status of the iMarine eScience services is maintained here: http://wiki.i-marine.eu/index.php/Ecosystem_Approach_Community_of_Practice:_VRE_planning

Resources and Constraints (The Inputs)

The Business Cases requirements are inputs for the cluster. They come from three Business Cases that are grouped as follows:

  • the EU Common Fisheries Policy
  • the FAO deep seas fisheries programme
  • the UN Ecosystem Approach to Fisheries (EAF)

Other inputs

Data sources:

  • FAO Tuna Atlas (tropical tuna data)
  • IRD Tuna Atlas (tropical tuna data)
  • Other fisheries data (catches of fisheries targeting tuna, bycatch of tuna fisheries, scientific tagging data)
  • FAO Global and Regional datasets
  • Species distributions and occurrence data from other fisheries databases
  • Environmental data and model outputs (physical, chemical, biological parameters), both managed in netCDF: SST, wind, chlorophyll, etc.
  • Other GIS-relevant products, e.g. FAO Intersection Engine products

Constraints

Strategy and Actions (from Inputs to Outputs)

Building on the strengths and skills of the iMarine partners contributing to the Geospatial Cluster, the following action plans have been conducted or are underway:

  • Leveraging the THREDDS servers base
  • Implementing the OGC WPS specification
  • Leveraging the OGC OWS Context 1.0 for sharing resources of interest from a research activity or experiment, and that can be consumed by an application (processing, visualization...)
  • Leveraging the Hadoop Processing framework
  • Legacy applications: wrapping the processors (WPS Hadoop deployment Use Cases)
  • Leveraging and/or implementing OGC/ISO Metadata standards, metadata encoding towards data and services sharing
  • Thematic Mapping
  • ...

For each of them, it is envisioned (by January 2013) to review and benchmark their added value according to the following iMarine standard review:

  • Who are the Users
  • Who are the co-funding partners
  • What are the iMarine infrastructure resources involved
  • What are the outcomes that do match the iMarine Description of Work
  • How do they fit in the EA-CoP business cases
  • How do they contribute to the sustainability of an EA-CoP
  • How far are they re-usable with clear benefits to EA-CoP representatives, and proven compatibility with EA-CoP resources
  • How far are they consistent with EC regulations/strategies such as INSPIRE

Appendix A - Resources

Software

Computing resources

Connectivity

External Services endpoints

iMarine Partners Services endpoints

iMarine infrastructure and eScience services (development)

iMarine infrastructure and eScience services (validated)

Algorithms / Processors / Scientific Applications

Expertise areas

...

Appendix B - Budget

Appendix C - Schedule

The Geospatial Cluster aligns its work plan with the milestones of its primary 'customer', namely the iMarine Board meetings planned throughout the lifetime of the iMarine project:

  • Semester 1;
    • Mobilization phase: identification of opportunities for collaboration and technologies
    • Geospatial Cluster support:
  • Semester 2;
    • Stabilization phase: validation of opportunities and definition of the technology scope
    • Geospatial Cluster support:
  • Semester 3;
    • Experimentation phase: with technologies, and with expansion of the EA-CoP user base
    • Geospatial Cluster support:
  • Semester 4;
    • Validation phase: collaboration structures and EA-CoP requirements consolidation
    • Geospatial Cluster support:
  • Semester 5;
    • Exploitation phase: operations through EA-CoP collaboration frameworks
    • Geospatial Cluster support:

Appendix D - Documents

Working draft documents

12/12/03 - Thematic Mapping Engine, Time series Map visualization, v2.2, Emmanuel Blondel (FAO), Y.Laurent, F.Brito (Terradue)

13/01/22 - Guidelines for Data and Service Providers, Julien Barde (IRD), Norbert Billet (IRD), Emmanuel Blondel (FAO)

Approved documents

...

Appendix E - Other

iMarine Technical Guidelines (gCube Wiki)

Geospatial data Discovery https://gcube.wiki.gcube-system.org/gcube/index.php/Geospatial_Data_Discovery

Geospatial data Processing https://gcube.wiki.gcube-system.org/gcube/index.php/Geospatial_Data_Processing

Geospatial data Visualization https://gcube.wiki.gcube-system.org/gcube/index.php/Geospatial_Data_Visualization

Legacy Applications integration https://gcube.wiki.gcube-system.org/gcube/index.php/Legacy_applications_integration

OGC/ISO Publishing guidelines for Data Producers http://wiki.i-marine.eu/index.php/Geospatial_cluster_guidelines (draft page, needs to be renamed)

Biodiversity data Assessment, Harmonisation, and Certification https://gcube.wiki.gcube-system.org/gcube/index.php/Occurrence_Data_Enrichment_Service

Biodiversity data Assessment, Harmonisation, and Certification https://gcube.wiki.gcube-system.org/gcube/index.php/Occurrence_Data_Reconciliation

iMarine Governance Rules

iMarine Guidelines and Best Practices http://wiki.i-marine.eu/index.php/Ecosystem_Approach_Community_of_Practice:_iMarine_Guidelines_and_Best_Practices

iMarine Data Access and Sharing Policies http://wiki.i-marine.eu/index.php/EA-CoP_Data_Access_and_Sharing_Policies

iMarine Clusters and Boards

The Clusters' coordinated Work Plans for the Boards http://wiki.i-marine.eu/index.php/Ecosystem_Approach_Community_of_Practice_Overview:_Clusters#Cluster_Work_Plans_in_iMarine

iMarine CoP EA (Ecosystems Approach)

The business cases http://wiki.i-marine.eu/index.php/Ecosystem_Approach_Community_of_Practice:_iMarine_Business_Cases

The iMarine eScience services: VREs, Apps, Web Services endpoints http://wiki.i-marine.eu/index.php/Ecosystem_Approach_Community_of_Practice:_VRE_planning