Difference between revisions of "ICES SGVMS"

From D4Science Wiki
Line 1: Line 1:
 +
=== Hypothesis and Thesis ===
 +
The premise of this activity was a review of the ICES procedure for interpolating vessel routes. A feasibility study was produced on the basis of the 2012 report on vessel data analysis by the Study Group on VMS data (SGVMS).
  
=== Hypothesis ===
+
Our review is available at the following address: http://goo.gl/risQre
  
A review of the ICES procedure is described here: http://goo.gl/risQre
+
The scope of the SGVMS is to supply ICES expert groups with information and highlights. Interested groups work in fields such as bird ecology, marine mammal ecology, spatial planning and socio-economics. The products of the SGVMS analyses include (i) spatially detailed maps of fishing effort by métier, (ii) trends in effort over time and (iii) identification of regions unimpacted by certain gears.
A review of the memory consumption showed large usage, up to 20 GB for 100000 points: http://goo.gl/f3Y3je
+
Starting from this point, the scope of this experiment is to show what the i-Marine e-Infrastructure can add to the SGVMS procedures.
  
Using the iMarine infrastructure, users will see improvements in performance
+
* What enhancements does importing SGVMS tools into the i-Marine e-Infrastructure bring?
Multi-user synchronous calls
+
* What is the performance of the resulting process?
  
In addition, the management of
+
In this experiment we give an answer to the above questions.
* Sharing facilities
+
* Provenance management
+
  
=== Prediction ===
+
=== Outcome ===
  
 +
The results of this experiment highlight that there are advantages in integrating SGVMS tools in the e-Infrastructure. We demonstrate this with a practical example on vessel point interpolation.
  
=== Experimentation ===
+
We can summarize these with the following points:
The experimental details are described here: https://issue.imarine.research-infrastructures.eu/ticket/2861
+
 +
 
 +
# The e-Infrastructure enables multi-tenancy and synchronous interrogation of a standalone procedure with hardcoded inputs and outputs
 +
# A graphical user interface is automatically generated on top of the procedure
 +
# The e-Infrastructure allows for executing R scripts on powerful machines
 +
# The script can potentially be fed with datasets already uploaded on the e-Infrastructure
 +
# The integration allows non-R programmers to use an R script
 +
# The system enables automatic provenance management: the history of the experiments, the used inputs and the produced outputs are automatically recorded
 +
# The system allows inputs, outputs and parameters to be shared easily
 +
# Information is stored on high-availability, distributed storage systems
 +
# The procedure can be used by external people, if the process is allowed to be published under the WPS standard. Connection is even possible by means of Java thin clients
 +
 
 +
=== Activity Workflow ===
 +
A logbook of the activity, from the requirements to the implementation can be found here: https://issue.imarine.research-infrastructures.eu/ticket/2861
  
 
=== Conclusion ===
 
=== Conclusion ===
  
 +
The weak points of the sequential solution by SGVMS can be summarized in the following points:
 +
 +
# The SGVMS proposes several approaches to vessel track interpolation. Nevertheless, they state that these methods should be compared to each other and tested against a high-resolution dataset. This would be useful to assess which of the methods most closely reflects reality. They expect that different methods might appear most suitable depending on gear or fleet;
 +
# The users of their platform should be able to (i) understand the contents of the data being analyzed, (ii) work with a command-line interface environment, (iii) use adequate resources to ensure standardized but meaningful outputs;
 +
# The SGVMS also reports and encourages other approaches to VMS analysis, e.g. Bayesian models to investigate fishing patterns and models to understand the effect of the resolution of VMS analysis on benthic impact assessments. On the other hand, this requires cross-domain knowledge;
 +
# No mention is made of intersecting ecological models with VMS data;
 +
# The procedures cannot manage synchronous calls by different users that produce different outputs.
 +
 +
 +
i-Marine is endowed with a framework to import vessel processing scripts written in the R language. Furthermore, it accommodates the above requirements by means of input/output standardization and e-Infrastructure facilities.
 +
These can be summarized in the following:
 +
 +
# separation between final users and developers of the process;
 +
# multi-tenancy and multi-user facilities;
 +
# resource sharing;
 +
# input/output dataset reusability;
 +
# applicability of models from other domains (Bayesian models);
 +
# intersection with models developed in other domains (e.g. Aquamaps).
 +
 +
=== Future Development ===
 +
Future development on top of the integration presented here can involve the following points:
 +
 +
# Porting of other SGVMS tools onto the e-Infrastructure;
 +
# Using i-Marine to extend the SGVMS tools, e.g. for producing FishFrame compliant documents;
 +
# Using i-Marine to interpolate large vessel tracks in practice;
 +
# Using time series analysis tools on vessel tracks;
 +
# Extracting backward and forward fishing indicators;
 +
# Distributing VMS-related aggregated data products through the infrastructure;
 +
# Intersecting VMS/FishFrame data with niche models (Aquamaps) to retrieve the list of species possibly involved in catches;
 +
# Applying clustering to vessel tracks to detect similar behaviors by vessels;
 +
# Using Bayesian models to automatically classify fishing activity.
 +
 +
=== Experimentation ===
 +
 +
A review of the memory consumption showed large usage, up to 20 GB for 100000 points: http://goo.gl/f3Y3je
 +
 +
=== Related links ===
 +
[http://gcube.wiki.gcube-system.org/gcube/index.php/Statistical_Manager_Tutorial A tutorial on the Statistical Manager]
 +
 +
[http://gcube.wiki.gcube-system.org/gcube/index.php/How-to_Implement_Algorithms_for_the_Statistical_Manager Implementing algorithms with the Statistical Manager Framework]
 +
 +
[http://gcube.wiki.gcube-system.org/gcube/index.php/How_to_Interact_with_the_Statistical_Manager_by_client Using the Statistical Manager by external thin clients]
  
=== References and links ===
+
[https://i-marine.d4science.org/group/biodiversitylab/processing-tools The Statistical Manager interface on the i-Marine Portal (Biodiversity Lab VRE)]

Revision as of 01:05, 16 June 2014

Hypothesis and Thesis

The premise of this activity was a review of the ICES procedure for interpolating vessel routes. A feasibility study was produced on the basis of the 2012 report on vessel data analysis by the Study Group on VMS data (SGVMS).

Our review is available at the following address: http://goo.gl/risQre

The scope of the SGVMS is to supply ICES expert groups with information and highlights. Interested groups work in fields such as bird ecology, marine mammal ecology, spatial planning and socio-economics. The products of the SGVMS analyses include (i) spatially detailed maps of fishing effort by métier, (ii) trends in effort over time and (iii) identification of regions unimpacted by certain gears. Starting from this point, the scope of this experiment is to show what the i-Marine e-Infrastructure can add to the SGVMS procedures.

  • What enhancements does importing SGVMS tools into the i-Marine e-Infrastructure bring?
  • What is the performance of the resulting process?

In this experiment we give an answer to the above questions.

Outcome

The results of this experiment highlight that there are advantages in integrating SGVMS tools in the e-Infrastructure. We demonstrate this with a practical example on vessel point interpolation.

We can summarize these with the following points:

  1. The e-Infrastructure enables multi-tenancy and synchronous interrogation of a standalone procedure with hardcoded inputs and outputs
  2. A graphical user interface is automatically generated on top of the procedure
  3. The e-Infrastructure allows for executing R scripts on powerful machines
  4. The script can potentially be fed with datasets already uploaded on the e-Infrastructure
  5. The integration allows non-R programmers to use an R script
  6. The system enables automatic provenance management: the history of the experiments, the used inputs and the produced outputs are automatically recorded
  7. The system allows inputs, outputs and parameters to be shared easily
  8. Information is stored on high-availability, distributed storage systems
  9. The procedure can be used by external people, if the process is allowed to be published under the WPS standard. Connection is even possible by means of Java thin clients
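
As an illustration of the last point, the following is a minimal sketch of how an external client might compose a WPS Execute request. It assumes the key-value-pair (GET) encoding of WPS 1.0.0; the endpoint URL, the process identifier and the input names are hypothetical placeholders, not the actual i-Marine ones.

```python
from urllib.parse import urlencode


def build_wps_execute_url(endpoint, identifier, inputs):
    """Compose a WPS 1.0.0 Execute request in key-value-pair (GET) form.

    endpoint   -- base URL of the WPS service (hypothetical in this sketch)
    identifier -- name of the published process (hypothetical in this sketch)
    inputs     -- dict mapping input names to literal values
    """
    # WPS 1.0.0 joins literal inputs as "name=value" pairs separated by ';'
    data_inputs = ";".join(f"{name}={value}" for name, value in inputs.items())
    query = urlencode({
        "service": "WPS",
        "version": "1.0.0",
        "request": "Execute",
        "Identifier": identifier,
        "DataInputs": data_inputs,
    })
    return f"{endpoint}?{query}"


# Placeholder values only: none of these names come from the i-Marine deployment.
url = build_wps_execute_url(
    "https://example.org/wps",
    "HYPOTHETICAL_INTERPOLATION",
    {"InputFile": "track.csv", "Resolution": 10},
)
print(url)
```

The resulting URL could be issued by any HTTP client, which is what makes a published process reachable from outside the portal.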

Activity Workflow

A logbook of the activity, from the requirements to the implementation can be found here: https://issue.imarine.research-infrastructures.eu/ticket/2861

Conclusion

The weak points of the sequential solution by SGVMS can be summarized in the following points:

  1. The SGVMS proposes several approaches to vessel track interpolation. Nevertheless, they state that these methods should be compared to each other and tested against a high-resolution dataset. This would be useful to assess which of the methods most closely reflects reality. They expect that different methods might appear most suitable depending on gear or fleet;
  2. The users of their platform should be able to (i) understand the contents of the data being analyzed, (ii) work with a command-line interface environment, (iii) use adequate resources to ensure standardized but meaningful outputs;
  3. The SGVMS also reports and encourages other approaches to VMS analysis, e.g. Bayesian models to investigate fishing patterns and models to understand the effect of the resolution of VMS analysis on benthic impact assessments. On the other hand, this requires cross-domain knowledge;
  4. No mention is made of intersecting ecological models with VMS data;
  5. The procedures cannot manage synchronous calls by different users that produce different outputs.
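
The first point calls for testing interpolation methods against a high-resolution dataset. The sketch below shows what such a check might look like, assuming plain linear interpolation in time between two consecutive VMS positions. Linear interpolation is used here only because it is the simplest baseline; it is not claimed to be one of the SGVMS methods. The function names and the sample coordinates are hypothetical.

```python
import math


def interpolate_linear(ping_a, ping_b, t):
    """Linearly interpolate (lat, lon) at time t between two pings.

    Each ping is a (time_in_seconds, lat_degrees, lon_degrees) tuple.
    """
    t0, lat0, lon0 = ping_a
    t1, lat1, lon1 = ping_b
    if t0 == t1 or not t0 <= t <= t1:
        raise ValueError("t must lie between two pings with distinct times")
    w = (t - t0) / (t1 - t0)  # fraction of the interval elapsed at time t
    return (lat0 + w * (lat1 - lat0), lon0 + w * (lon1 - lon0))


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def mean_error_km(ping_a, ping_b, reference):
    """Mean distance between interpolated and reference positions.

    reference is a list of (time, lat, lon) tuples from a high-resolution
    track recorded between the two pings.
    """
    errors = [
        haversine_km(interpolate_linear(ping_a, ping_b, t), (lat, lon))
        for t, lat, lon in reference
    ]
    return sum(errors) / len(errors)


# Made-up positions two hours apart, with one reference fix in between.
a = (0, 50.0, 2.0)
b = (7200, 51.0, 3.0)
midpoint = interpolate_linear(a, b, 3600)
error = mean_error_km(a, b, [(3600, 50.6, 2.5)])
print(midpoint, round(error, 1))
```

Running the same error measure over several interpolation methods and several gears or fleets would give the kind of comparison the SGVMS asks for.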


i-Marine is endowed with a framework to import vessel processing scripts written in the R language. Furthermore, it accommodates the above requirements by means of input/output standardization and e-Infrastructure facilities. These can be summarized in the following:

  1. separation between final users and developers of the process;
  2. multi-tenancy and multi-user facilities;
  3. resource sharing;
  4. input/output dataset reusability;
  5. applicability of models from other domains (Bayesian models);
  6. intersection with models developed in other domains (e.g. Aquamaps).

Future Development

Future development on top of the integration presented here can involve the following points:

  1. Porting of other SGVMS tools onto the e-Infrastructure;
  2. Using i-Marine to extend the SGVMS tools, e.g. for producing FishFrame compliant documents;
  3. Using i-Marine to interpolate large vessel tracks in practice;
  4. Using time series analysis tools on vessel tracks;
  5. Extracting backward and forward fishing indicators;
  6. Distributing VMS-related aggregated data products through the infrastructure;
  7. Intersecting VMS/FishFrame data with niche models (Aquamaps) to retrieve the list of species possibly involved in catches;
  8. Applying clustering to vessel tracks to detect similar behaviors by vessels;
  9. Using Bayesian models to automatically classify fishing activity.

Experimentation

A review of the memory consumption showed large usage, up to 20 GB for 100000 points: http://goo.gl/f3Y3je
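
To put the figure in perspective, the short calculation below converts it into an average memory footprint per interpolated point. It is only a back-of-the-envelope sketch: it assumes decimal units (1 GB = 10^9 bytes) and treats the reported peak as if it were spread evenly over the points, which the review does not state.

```python
def memory_per_point_kb(total_gb, n_points):
    """Average memory per point in kB, assuming 1 GB = 1e9 bytes."""
    total_bytes = total_gb * 1e9
    return total_bytes / n_points / 1e3


# Peak figure reported above: 20 GB for 100000 points.
per_point_kb = memory_per_point_kb(20, 100000)
print(per_point_kb)
```

An average of this order per point suggests that the sequential procedure keeps large intermediate structures in memory, which is consistent with the case for running it on more powerful machines.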

Related links

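A tutorial on the Statistical Manager: http://gcube.wiki.gcube-system.org/gcube/index.php/Statistical_Manager_Tutorial

Implementing algorithms with the Statistical Manager Framework: http://gcube.wiki.gcube-system.org/gcube/index.php/How-to_Implement_Algorithms_for_the_Statistical_Manager

Using the Statistical Manager by external thin clients: http://gcube.wiki.gcube-system.org/gcube/index.php/How_to_Interact_with_the_Statistical_Manager_by_client

The Statistical Manager interface on the i-Marine Portal (Biodiversity Lab VRE): https://i-marine.d4science.org/group/biodiversitylab/processing-tools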