Biodiversity Draft Work Plan Q2-Q3 2012
The expectations of the Biodiversity Cluster (mainly related to BC 2) were presented at the iMarine Board meeting in Rome in March. After the meeting they were further detailed through WP3 conference calls and meetings with EA-CoP representatives. The iMarine Vice-Chair (E. Vanden Berghe) was instrumental in collecting the descriptions.
Abstract or Executive Summary
One of the 4 currently identified clusters in iMarine is 'biodiversity'. The term is sufficiently vague to enable the collection of requirements that operate on, or benefit from, the 'biodiversity' domain. The boundary of this domain is far from sharp, and immediate relations with e.g. the geospatial and statistical clusters are evident.
The work plan is not domain specific and is technology neutral. It describes how the iMarine Board can be involved in the specification of use cases, data policies, and harmonization issues, among others.
Introduction and Background (The Problems)
The iMarine Board is responsible for the implementation of 2 Business Cases in the project, and brings a wealth of community expertise to the technical e-infrastructure. The EA-CoP needs to search over multiple resources, and extracted
The opportunity was presented and discussed at the iMarine Board meeting in Rome (March 19-21), and later further elaborated and discussed by project partners with EA-CoP input.
Goals and Objectives (The Outputs)
The cluster discussion at the iMarine Board meeting was summarized by OBIS (Edward Vanden Berghe). Five goals were identified as products that can be delivered. For each of these, an initial set of objectives emerged that requires further discussion. The goals are: Taxon name access, Taxon name reconciliation, Occurrence data access, Occurrence data reconciliation, and Occurrence data enrichment.
Taxon Names Access
The work of CNR will have to be reviewed and validated once the service is completed by the end of April 2012.
Taxon Name Reconciliation
Currently, OBIS (Edward) uses SQL statements to merge taxonomic lists. The merging is governed by a number of rules.
The proposed service will produce a list of pairs of taxa, each pair with a probability that the two taxa are the same. CNR and FIN will take the lead in specifying services for taxon data. FAO has developed a very similar tool for vessel disambiguation that includes a well-designed UI; this will be reviewed too.
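The pairwise output described above can be illustrated with a minimal sketch. It is not the planned service: the function names (`normalize`, `similarity`, `reconcile`) and the 0.8 threshold are hypothetical, and the score is a plain string-similarity ratio from Python's standard `difflib`, used here as a stand-in for a probability of similarity.

```python
from difflib import SequenceMatcher
from itertools import product


def normalize(name: str) -> str:
    """Lower-case a taxon name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def similarity(a: str, b: str) -> float:
    """Return a string-similarity ratio in [0, 1] (a stand-in, not a true probability)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def reconcile(list_a, list_b, threshold=0.8):
    """Return (name_a, name_b, score) for each cross-list pair scoring at least threshold.

    The threshold is an illustrative assumption; results are sorted best match first.
    """
    pairs = [(a, b, similarity(a, b)) for a, b in product(list_a, list_b)]
    matches = [p for p in pairs if p[2] >= threshold]
    return sorted(matches, key=lambda p: p[2], reverse=True)


if __name__ == "__main__":
    ours = ["Gadus morhua", "Thunnus albacares"]
    theirs = ["gadus  morhua", "Thunnus albacores", "Solea solea"]
    for a, b, score in reconcile(ours, theirs):
        print(f"{a} ~ {b}: {score:.2f}")
```

Comparing every name against every other name is quadratic in the list sizes, so a real service would probably need some form of blocking (for example, by genus) before scoring pairs.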
Data availability depends on the number of plugins the infrastructure is equipped with, one plugin for each data source / provider. It is therefore evident that the first Use Case has to be operational.
Occurrence data access
CNR is already implementing a first schedule:
- first, develop access to occurrence point data, i.e. work on services giving access to occurrence points from a number of data providers. The occurrence data service will be queried by species name, and by spatial and temporal parameters;
- then, in a second phase, the taxonomic data.
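As a rough illustration of the query parameters named in the first phase (a species name plus spatial and temporal constraints), the sketch below filters an in-memory list of records. The `Occurrence` fields, the `query_occurrences` function and the bounding-box convention are all assumptions made for this example; they do not describe the CNR service interface.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Occurrence:
    """Hypothetical occurrence point; field names are illustrative only."""
    species: str
    lat: float
    lon: float
    observed: date
    provider: str


def query_occurrences(records, species, bbox, start, end):
    """Filter records by species name, bounding box and inclusive date range.

    bbox is (min_lat, min_lon, max_lat, max_lon). Boxes that cross the
    antimeridian are not handled in this sketch.
    """
    min_lat, min_lon, max_lat, max_lon = bbox
    wanted = species.strip().lower()
    return [
        r for r in records
        if r.species.strip().lower() == wanted
        and min_lat <= r.lat <= max_lat
        and min_lon <= r.lon <= max_lon
        and start <= r.observed <= end
    ]


if __name__ == "__main__":
    records = [
        Occurrence("Gadus morhua", 54.0, 7.0, date(2011, 6, 1), "OBIS"),
        Occurrence("Gadus morhua", 10.0, 7.0, date(2011, 6, 1), "GBIF"),
    ]
    north_sea = (50.0, 0.0, 60.0, 10.0)
    for hit in query_occurrences(records, "Gadus morhua", north_sea,
                                 date(2011, 1, 1), date(2011, 12, 31)):
        print(hit)
```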
Occurrence Data Reconciliation
There may be overlaps and gaps between the datasets contained in 2 (or more) repositories. With millions of occurrence records, support is needed to identify both the gaps and the overlaps, not only at data level, but also at dataset level.
- OBIS and GBIF can serve data through an 'occurrences service';
- The project partners have to consider how to define an 'occurrence service' for 'singleton'/'duplicate' identification;
CNR has plans to initiate work in May.
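One possible reading of 'singleton'/'duplicate' identification is sketched below: records from several repositories are reduced to a comparison key, and each key is classified by how many repositories contain it. The key definition (normalized species name, coordinates rounded to 3 decimals, date) and the names `record_key` and `classify` are assumptions for illustration; the actual matching rules remain to be defined by the project partners.

```python
from collections import defaultdict


def record_key(rec, decimals=3):
    """Build a comparison key from a record dict (hypothetical field names)."""
    return (
        rec["species"].strip().lower(),
        round(rec["lat"], decimals),
        round(rec["lon"], decimals),
        rec["date"],
    )


def classify(repositories):
    """Split record keys into singletons and duplicates across repositories.

    repositories maps a repository name to a list of record dicts.
    Returns (singletons, duplicates): singletons maps key -> the one repository
    holding it; duplicates maps key -> sorted list of repositories holding it.
    Repeats of a key inside a single repository are counted once.
    """
    seen = defaultdict(set)
    for repo_name, records in repositories.items():
        for rec in records:
            seen[record_key(rec)].add(repo_name)
    singletons = {k: next(iter(v)) for k, v in seen.items() if len(v) == 1}
    duplicates = {k: sorted(v) for k, v in seen.items() if len(v) > 1}
    return singletons, duplicates


if __name__ == "__main__":
    repos = {
        "OBIS": [{"species": "Gadus morhua", "lat": 54.1234, "lon": 7.5678, "date": "2011-06-01"}],
        "GBIF": [{"species": "gadus morhua", "lat": 54.1231, "lon": 7.5679, "date": "2011-06-01"}],
    }
    singles, dups = classify(repos)
    print(len(singles), "singletons;", len(dups), "duplicates")
```

Counting singletons per repository gives a first indication of gaps, and counting duplicates per pair of repositories gives a first indication of overlap; the same keys could be grouped by dataset identifier to report at dataset level.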
Occurrence data enrichment
By the end of April, CNR expects to complete the activity on 'occurrence point access'. The enrichment will come in a subsequent phase, partly because it depends on the results of other clusters (namely the geospatial one).
With occurrence data enrichment, a user would call a service that, either on-line or in batch mode, takes a set of spatio-temporal parameters and a set of occurrence points, and queries an external environmental data repository to extract geospatially explicit information. For example, for 10,000 points, the nearest 1000 sea surface temperatures are interpolated over a 1 month period, and returned as average, max, min and std for each point.
For flagging outliers on land, gazetteers are available. In a marine environment, however, the notion of space is different, and iMarine can contribute truly innovative solutions.
CNR sees a role for the other project partners and the iMarine Board in guiding the classification of occurrence points, e.g. survey data as opposed to specimen data.
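The sea surface temperature example can be sketched as follows. This is a simplification under stated assumptions: instead of a true interpolation, it summarizes the k nearest environmental samples (by great-circle distance) that fall within a time window centred on the observation date. The function names, the tuple layouts, and the centring of the window are hypothetical choices made for this example.

```python
import math
import statistics
from datetime import date, timedelta


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def enrich(point, samples, k=1000, window_days=30):
    """Summarize the k nearest samples within a time window around one point.

    point is (lat, lon, date); samples is an iterable of (lat, lon, date, value).
    Returns a dict with n, mean, max, min and population std, or None when
    no sample falls inside the window.
    """
    lat, lon, when = point
    half = timedelta(days=window_days / 2)
    in_window = [s for s in samples if abs(s[2] - when) <= half]
    nearest = sorted(in_window, key=lambda s: haversine_km(lat, lon, s[0], s[1]))[:k]
    values = [s[3] for s in nearest]
    if not values:
        return None
    return {
        "n": len(values),
        "mean": statistics.fmean(values),
        "max": max(values),
        "min": min(values),
        "std": statistics.pstdev(values),
    }


if __name__ == "__main__":
    sst = [
        (54.0, 7.0, date(2011, 6, 1), 10.0),
        (54.1, 7.1, date(2011, 6, 5), 12.0),
        (54.2, 7.2, date(2011, 6, 9), 14.0),
    ]
    print(enrich((54.0, 7.0, date(2011, 6, 3)), sst, k=3))
```

Sorting all samples per point is adequate for a sketch; for 10,000 points against a large repository, a spatial index would probably be needed.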
Resources and Constraints (The Inputs)
The iMarine project was designed with a clear vision of the need for semantic technology support for challenging scenarios. It also anticipated that specialized resources would have to be identified after the project started, e.g. by establishing collaborations with specialized departments in project partners' institutions (FAO, IRD), and with related EA-CoP projects such as AgInfra.
A quick and complete assessment of needs and constraints can only be made once such collaborations have stabilized.
The resources from the project would include:
OBIS - Use case description, data provider and developer.
FAO - Use case description, data provider
CNR - Tools and application provider, developer
CRIA - Tools and application provider, developer
FIN - .....
Specific constraints are the low level of expertise in gCube technology development in the EA-CoP and among some partners that have developed biodiversity tools. In addition, many data are volatile or incomplete, and will require specialized curation.
Strategy and Actions (from Inputs to Outputs)
The goals and objectives were defined and discussed at the iMarine Board meeting in Rome in March. There, it was also decided that a biodiversity cluster be established to define objectives, and prepare outlines for VREs, applications and services. These will then be presented to the iMarine Board and the wider EA-CoP (May 2012).
In June, the results from the EA-CoP consultation will be discussed at the TCom, to establish the feasibility, usability, and usefulness of the identified Use Cases and components.
The feedback from the TCom and technical boards will then be discussed with the iMarine Board and selected EA-CoP representatives for follow-up actions.
Meanwhile, project partners can already spend effort on the first 3 Use Cases to support: data access to biodiversity data repositories, and discovery and download of species occurrence data.
Appendices (Planned Effort, Resources, Documents, Schedule and Others)
Planned effort & resources
FAO
OBIS
CNR
CRIA
FIN ...