Ecosystem Approach Community of Practice: NEAFC2ICES

NEAFC has been offered a sub-contract to assist the iMarine consortium with the development of a set of services that enable NEAFC to securely share VMS information with selected staff of ICES.

Some of these facilities cannot be provided by Virtual Research Environments, as the data and logic that feed these services cannot leave the NEAFC premises. Therefore, a facility is required that can be installed, configured and controlled by NEAFC staff in the NEAFC infrastructure. This facility would be the 'gatekeeper' to the NEAFC repository.

Similar 'gatekeepers' are more and more often needed, and a generic approach to communicating with RDBMS systems is proposed. Examples of RDBMS connectors include:

  • In the iMarine OpenSDMX initiative, example connectors have already been delivered (DB2SDMX);
  • The FAO VME-DB project, where an editing phase for VME reports in the iMarine reporting tool is foreseen;
  • The French Data Collection Framework (DCF), where DCF to FishFrame converters are available in Java from IRD;
  • The VALID proposal to use iMarine services to assess the contents of single records or larger recordsets;
  • Biodiversity datasets such as from OBIS where a direct JDBC connector was used.
The NEAFC subcontract is proposed to further develop these 'connectors', and can commence in Q3 2013.
Product
The services sought by NEAFC should implement secure data exchange with ICES. They will result in a Virtual Research Environment (VRE) for data browsing and storing for registered ICES staff. The main functions will be to:
# develop a query tool to generate anonymized datasets from data stored in the NEAFC infrastructure, with time parameters:
## Time period, as only semestral reports are produced by NEAFC. The details are described in the MarineDataExchange page.
In a second stage, ICES staff will have to identify how these semestral reports in the infrastructure should be filtered and aggregated to produce the data products they require. These could include:
## Target species filtering;
## Fishing area (EEZ / BBox / FAO);
## Vessel group identifiers such as flag state;
## Activity identifiers such as specific measure;
## Other group characteristics, such as gear, target species, or landing site.
# transform data to an ICIS recognizable format, as per ICES requirements;
# publish these anonymized datasets in an access controlled repository in the iMarine infrastructure (encrypted);
# ensure that datasets can be analyzed in a variety of other VREs, e.g. for trend analysis, plotting, graphing and tabulation.
The activities of NEAFC will focus on requirements engineering, validation of results, and use of the services. NEAFC will cover all data extraction through queries, as well as aggregation to a level that may be shared under this use case. This will be implemented by parametrized queries or procedures on / in the NEAFC RDBMS infrastructure.
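As an illustration of what a parametrized query for one semestral period could look like, the following is a minimal sketch only. The table and column names (`vms_position`, `reported_at`, `flag_state`, `fishing_area`, `gear`) and the class `SemesterQuery` are hypothetical and are not taken from the NEAFC schema; the actual extraction logic is for NEAFC to define.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: builds a parametrized aggregate query for one semester. */
final class SemesterQuery {
    final String sql;          // SQL text with '?' placeholders only
    final List<Object> params; // values to bind, in placeholder order

    private SemesterQuery(String sql, List<Object> params) {
        this.sql = sql;
        this.params = params;
    }

    /**
     * @param year      reporting year
     * @param semester  1 (January to June) or 2 (July to December)
     * @param flagState optional filter; null means all flag states
     */
    static SemesterQuery forSemester(int year, int semester, String flagState) {
        if (semester != 1 && semester != 2) {
            throw new IllegalArgumentException("semester must be 1 or 2");
        }
        LocalDate start = LocalDate.of(year, semester == 1 ? 1 : 7, 1);
        LocalDate end = start.plusMonths(6); // exclusive upper bound

        List<Object> params = new ArrayList<>();
        // Assumed table and column names, for illustration only.
        StringBuilder sql = new StringBuilder(
            "SELECT fishing_area, gear, COUNT(*) AS positions "
            + "FROM vms_position WHERE reported_at >= ? AND reported_at < ?");
        params.add(start);
        params.add(end);
        if (flagState != null) {
            sql.append(" AND flag_state = ?");
            params.add(flagState);
        }
        // Only aggregated rows are returned, never individual vessel records.
        sql.append(" GROUP BY fishing_area, gear");
        return new SemesterQuery(sql.toString(), params);
    }
}
```

The values in `params` would then be bound to a JDBC `PreparedStatement` with `setObject(i + 1, params.get(i))`, so that user-supplied filter values never become part of the SQL text itself.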
The activities of iMarine will be to supply and / or further develop, under the oversight of iMarine partners:
# A component that can forward parameters to the NEAFC data extraction logic, and collect and return the result sets;
# A component that allows NEAFC to register the extraction logic in such an extraction component;
# A component that allows NEAFC to register users and use of the data extraction component, and to grant / revoke access to individual users and groups thereof;
# A component that logs and alerts on the (ab)use of the data extraction component.
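To make the third and fourth components listed above more concrete, the following minimal sketch shows one possible shape for granting and revoking access while logging every request. The class `ExtractionGate` and its method names are hypothetical and are not part of any existing iMarine or NEAFC API; a real component would also persist the log and raise alerts.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of a gatekeeper with access control and an audit trail. */
final class ExtractionGate {
    private final Set<String> granted = new HashSet<>();
    private final List<String> auditLog = new ArrayList<>();

    /** NEAFC staff grant a user access to the extraction component. */
    void grant(String user) {
        granted.add(user);
        auditLog.add("GRANT " + user);
    }

    /** NEAFC staff revoke a user's access. */
    void revoke(String user) {
        granted.remove(user);
        auditLog.add("REVOKE " + user);
    }

    /** Records every request, allowed or denied, and returns the decision. */
    boolean request(String user, String queryName) {
        boolean allowed = granted.contains(user);
        auditLog.add((allowed ? "ALLOW " : "DENY ") + user + " " + queryName);
        return allowed;
    }

    /** Read-only view of the audit trail, oldest entry first. */
    List<String> auditLog() {
        return Collections.unmodifiableList(auditLog);
    }
}
```

Denied requests are logged as well as allowed ones, since repeated denials are the kind of event an alerting mechanism would most likely act on.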
Both NEAFC and iMarine activities will be covered under the sub-contract.
To ensure sustainable use and maintenance of the software delivered to NEAFC, NEAFC will have to contribute hardware and software (Java) skills, and oversee the development of the connector to their confidential data.
Priority to EA-CoP, and more specifically, NEAFC
The proposed solution meets the following iMarine Board priority setting criteria:
  • The target community is marine scientists and policy makers in need of aggregate effort reports;
  • The users will be ICES staff who require such information; NEAFC will manage the credentials for each class:
    • Class 1 users (editors) will be allowed to download data in aggregated format, and will be charged with ensuring confidentiality;
    • Class 2 users (reviewers) will be allowed to download PDF reports;
    • Class 3 users (consumers) will have access to some released PDF reports with
  • Potential for co-funding; A confidential work-flow from RDBMS database to open access is often required, and can be offered as a service in StatsCube. The VALID and DCF work-flows will require a very similar service;
  • Structural allocation of resources; The Sub-contract is mentioned in the DoW.
  • Referred in DoW; Yes.
  • Business Cases; BC1.
  • How does the proposed action generally support sustainability aspects; The action will result in a managed access to RDBMS and will open proprietary RDBMS systems in a confidential yet transparent manner.
  • How consistent is it with EC regulations/strategies (e.g. INSPIRE); To be discussed. NEAFC has a confidentiality requirement that will determine the design and scope of this effort.
  • Java component considerations
    • Re-usability –
    • Benefits -
    • Compatibility - ;
Parentage
Relation to NEAFC Software
This paragraph is for NEAFC to fill in.
Relation to D4S technologies
The proposed solution will enrich the StatsCube with a data extraction tool, and re-use existing StatsCube services.
* This will be a ticket for a top level activity to make the progress traceable through TRAC.
* There can be more detailed tickets implementing more specific functionality. 
* Statistical Service; ONLY if NEAFC agrees to allow advanced aggregations on their source data will the Statistical Service be a candidate, as it requires that data are physically present in the infrastructure;
* SDMX Registry; Aggregated statistics can be stored in the iMarine SDMX Registry, if required;
* Statistical Service; will be offered to ICES and NEAFC staff to perform advanced analysis through data mining; 
* Tabular Data Manager;
* Tabular Data Flow Manager; 
* Time Series Manager;
* CLM; will be essential to produce (PDF) reports based on species names instead of codes;
The proposed solution will use existing services from e.g. GeosCube:
* GeoExplorer; Foreseen to offer visualization over aggregate statistics
Finally, the tool will use several generic iMarine services:
* Workspace; Datasets in other formats can be stored in the secure workspace. 
Productivity
The proposed measures reduce search effort and improve reliability for ICES staff, and reduce the support time NEAFC staff spend writing / running ad-hoc queries and exchanging the data.
They thus reduce the workload for both ICES and NEAFC staff.
Presentation
How must the component be delivered to users? (UI Design / on-line help / training material / support)
Policy
Are there any policies available that describe data access and sharing?
The NEAFC data are subject to a stringent confidentiality policy, and should be disclosed to authorized ICES staff only, or, when aggregated, be subject to a release work-flow with authorization given by NEAFC.
Have the Copyright / attribution / metadata / legal aspects been addressed from a user and technology perspective?
Work will have to be distributed amongst NEAFC and iMarine staff to ensure and evidence these aspects at validation time.