Blue Hackathon iMarine Data Challenges

From D4Science Wiki

Latest revision as of 10:10, 5 July 2013

Data Challenges

Challenge #1

Enrich HTML web content with RDF annotation, and enable annotation-based document discovery

Background

Most information resources in the "Blue" domain were created without advanced search and discovery mechanisms in mind. They therefore lack the semantic richness that would improve their visibility, usefulness, and quality.

One cost-effective way to overcome this limitation may be to add RDFa to existing datasets. This can be achieved by a mechanism that extracts concepts from an HTML text, aligns them with concepts from a semantic knowledge base (KB), and returns the URIs, which can then be attached to the source off-line, as header metadata, or in-line.

Proving that such a mechanism can effectively enrich a 'flat' resource with interpretable RDF will give data owners in the "Blue" domain evidence that they can add value to their resources at limited cost, with the help of semantechnicians.

Objectives

We ask the hackathon participants to find a technical solution to enrich the factsheets (http://www.fao.org/fishery/species/search/en) of the FIGIS portal with annotations in RDFa format. The annotations will consist at least of the URIs of the entities referenced in the factsheet, and of a set of relevant relations provided with the datasets.

We ask the hackathon participants to:

  1. GOAL: Provide an RDFa client to
    1. extract concepts from fact-sheets, e.g. by accessing the fact-sheet content through the service provided at http://figisapps.fao.org/vrmf/samples/species/FS/,
    2. identify URIs from several KBs,
    3. create the RDF annotations, and
    4. expose these RDF annotations.
  2. GOAL: Use the annotations produced in item 1 as input to an online search of factsheets (publications, GIS maps, images, statistical time series), to create an enhanced discovery facility that complements the information content of the web page.
  3. GOAL: Retrieve a set of fact-sheets via online search services.
  4. GOAL: Write RDFa to these factsheets.
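
The annotation step of the first goal could be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the class name RdfaAnnotator, the choice of dcterms:subject as the linking property, and the example URI are all assumptions, and the input is treated as plain text rather than parsed HTML.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: wraps known terms in RDFa-annotated <span> elements.
final class RdfaAnnotator {

    // Escapes characters that are unsafe inside an HTML attribute value.
    static String escapeAttr(String value) {
        return value.replace("&", "&amp;").replace("\"", "&quot;")
                .replace("<", "&lt;").replace(">", "&gt;");
    }

    // Annotates every whole-word occurrence of a known term in a single pass,
    // so that text inside an inserted span is never matched again.
    static String annotate(String text, Map<String, String> termToUri) {
        if (termToUri.isEmpty()) {
            return text;
        }
        // Longest terms first, so a binomial name wins over its genus alone.
        List<String> terms = new ArrayList<>(termToUri.keySet());
        terms.sort((a, b) -> Integer.compare(b.length(), a.length()));
        StringBuilder alternation = new StringBuilder();
        for (String term : terms) {
            if (alternation.length() > 0) {
                alternation.append('|');
            }
            alternation.append(Pattern.quote(term));
        }
        Matcher m = Pattern.compile("\\b(?:" + alternation + ")\\b").matcher(text);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String term = m.group();
            // Assumption: dcterms:subject links the page to the entity URI.
            String span = "<span property=\"dcterms:subject\" resource=\""
                    + escapeAttr(termToUri.get(term)) + "\">" + term + "</span>";
            m.appendReplacement(out, Matcher.quoteReplacement(span));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> kb = new LinkedHashMap<>();
        // Placeholder URI; a real client would obtain it from a KB lookup.
        kb.put("Thunnus albacares", "http://example.org/species/thunnus-albacares");
        System.out.println(annotate("Thunnus albacares is caught with longlines.", kb));
    }
}
```

The concept extraction and KB alignment steps are deliberately left out; here the term-to-URI map stands in for their result.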

Challenges

TBD

Datasets

  • FAO Species: Aquatic Species Fact Sheets
  • KBs: TLO, FLOD

APIs

See below.


Challenge #2

Search Results presentation exploitation.

Search results regarding marine data could be enriched to give the user a more advanced experience. Derived information could be injected into the results by identifying special keywords (related to the query) and combining them with results retrieved through OpenSearch and other external(?) data sources. Exploration of the results could also move from simple browsing towards information discovery, by providing accumulated information, filtering, suggestions, etc.

Background

(Why this is relevant to blue-er world)

Objectives

  1. We ask the hackathon participants to enrich the search results retrieved from iMarine Collections by identifying special keywords (related to the topic) and adding results retrieved from OpenSearch and other external(?) data sources.
  2. We ask the hackathon participants to explore the database by performing a number of predefined queries and keeping statistics on them, in order to enhance the existing browsing methods.

Challenges

TBD

Datasets

Ecoscope

APIs

gCube Search client

Challenge #3

Processing and Visualization of data sets

Exploit the geolocation of real-world data to calculate and visualize geographical information and trends (e.g. the migration of species). Support interactive map search over multiple sources, with combined and enriched results. Search results will be presented on a map, with options such as clustering and filtering. Users could also interact with the results: for example, clicking on a result or a location would show related results and other helpful information.

Background

(Why this is relevant to blue-er world)

Objectives

  1. Exploit the species occurrence data to calculate and visualize geographical trends (e.g. the migration of species).
  2. Interactive Map Search. Search over data from multiple sources, combine them, and enrich the results. Search results can be presented on a map, with support for:
    1. clustering and filtering
    2. trend identification
    3. interaction with the results, e.g. clicking on a result or a location would show related results and other helpful information

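
One simple way to approach the trend objective is to reduce the occurrence points of a species to one centroid per year and then plot the centroids in sequence. The sketch below illustrates only that reduction; the class and field names are hypothetical, and averaging longitudes arithmetically is an assumption that breaks down for species whose range crosses the antimeridian.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: yearly centroids of species occurrence points.
final class OccurrenceTrend {

    static final class Occurrence {
        final int year;
        final double lat;
        final double lon;

        Occurrence(int year, double lat, double lon) {
            this.year = year;
            this.lat = lat;
            this.lon = lon;
        }
    }

    static final class Centroid {
        final int year;
        final double lat;
        final double lon;
        final int count;

        Centroid(int year, double lat, double lon, int count) {
            this.year = year;
            this.lat = lat;
            this.lon = lon;
            this.count = count;
        }

        @Override
        public String toString() {
            return String.format(Locale.ROOT, "%d: (%.3f, %.3f) from %d points",
                    year, lat, lon, count);
        }
    }

    // Groups points by year and averages their coordinates, sorted by year.
    static List<Centroid> yearlyCentroids(List<Occurrence> points) {
        // year -> {sum of latitudes, sum of longitudes, number of points}
        Map<Integer, double[]> sums = new TreeMap<>();
        for (Occurrence o : points) {
            double[] s = sums.computeIfAbsent(o.year, y -> new double[3]);
            s[0] += o.lat;
            s[1] += o.lon;
            s[2] += 1;
        }
        List<Centroid> result = new ArrayList<>();
        for (Map.Entry<Integer, double[]> e : sums.entrySet()) {
            double[] s = e.getValue();
            result.add(new Centroid(e.getKey(), s[0] / s[2], s[1] / s[2], (int) s[2]));
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample points, for illustration only.
        List<Occurrence> points = Arrays.asList(
                new Occurrence(2000, 10.0, 20.0),
                new Occurrence(2000, 20.0, 40.0),
                new Occurrence(2001, 30.0, 50.0));
        for (Centroid c : yearlyCentroids(points)) {
            System.out.println(c);
        }
    }
}
```

The same grouping idea, keyed by a grid cell instead of a year, would give a basic form of the clustering mentioned above.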
Challenges

TBD

Datasets

iMarine GeoNetwork

APIs

iMarine GeoNetwork Client

Datasets and APIs

Datasets

TLO based SPARQL endpoint

Data Graph

  • http://www.ics.forth.gr/isl/TLObasedDataWarehouse

Description

The description of the MarineTLO can be found here:

http://wiki.i-marine.eu/index.php/Top_Level_Ontology

Exploitation Example

(How it can be used within a challenge)

FAO FLOD

  • http://www.fao.org/figis/flod/

FLOD SPARQL endpoint

  • http://www.fao.org/figis/flod/endpoint/flod (Jena Joseki)

Description

Fisheries Linked Open Data (FLOD) stems from the Linked Open Data initiative. It aims to create a dense network of relationships among the entities of the fishery domain, and to serve them programmatically to semantic and traditional application environments. It started with the objective of identifying and interlinking equivalent codes from the different code lists in use by FIGIS, in order to consolidate the information referenced by each code, and then expanded to include external data sources such as NAFO, EU, and ICCAT. Currently the FLOD network includes entities and relationships from the domains of Marine Species, Water Areas, Land Areas, and Exclusive Economic Zones. It serves software applications in the domains of statistics and GIS. The FLOD content is exposed either via SPARQL endpoints (suitable for semantic applications) or via a Java API to be embedded in consumers' application code.

Exploitation Example

Query for entities of kind:

  • Gear types
  • Vessel types
  • Marine species
  • Fishing Areas
  • Statistical countries (Flagstate)
  • Regional Fisheries Bodies
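
A query for such entities can be sent to a SPARQL endpoint as an HTTP GET request with the query text in the query parameter, as defined by the SPARQL protocol. The sketch below only builds the request URL. The helper class is hypothetical, the endpoint is the FLOD one published for this hackathon (it may no longer be reachable), and since the FLOD class URIs are not given here, the sample query simply lists the entity types the endpoint knows about.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds a SPARQL protocol GET URL for a query.
final class SparqlQueryUrl {

    // Appends the URL-encoded query as the "query" parameter of the endpoint.
    static String build(String endpoint, String query) {
        String separator = endpoint.contains("?") ? "&" : "?";
        return endpoint + separator + "query="
                + URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String endpoint = "http://www.fao.org/figis/flod/endpoint/flod";
        // Generic discovery query: which entity types (classes) exist?
        String query = "SELECT DISTINCT ?type WHERE { ?entity a ?type } LIMIT 50";
        System.out.println(build(endpoint, query));
    }
}
```

Once the relevant type URIs are known, the discovery query can be narrowed to one kind of entity, for example gear types or fishing areas.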

Ecoscope

http://www.ecoscopebc.ird.fr

Description

A knowledge base on Exploited Marine Ecosystems. The repository gives access to information related to species, fishing vessels, agents, and information resources (images, databases, spatial information, publications, and plots).

The information can be accessed through SPARQL or OpenSearch.

SPARQL endpoint:

  • http://ecoscopebc.mpl.ird.fr/joseki/ecoscope.html

OpenSearch description document:

  • http://d4science.web.cern.ch/d4science/OpenSearch/OpenSearchRSS.xml
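
An OpenSearch description document declares URL templates containing placeholders such as {searchTerms}, with a trailing question mark marking optional ones (for example {startIndex?}). The sketch below shows how a client might fill such a template. The class name and the example template are assumptions; the real template has to be read from the Url element of the description document.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: substitutes values into an OpenSearch URL template.
final class OpenSearchTemplate {

    // Matches {name} and {name?}; group 2 is set when the parameter is optional.
    private static final Pattern PARAM =
            Pattern.compile("\\{([A-Za-z][A-Za-z0-9:]*)(\\?)?\\}");

    static String fill(String template, Map<String, String> values) {
        Matcher m = PARAM.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String name = m.group(1);
            boolean optional = m.group(2) != null;
            String value = values.get(name);
            if (value == null) {
                if (!optional) {
                    throw new IllegalArgumentException("Missing required parameter: " + name);
                }
                value = ""; // optional parameters without a value become empty
            }
            // Percent-encode the value; URLEncoder uses '+' for spaces, so fix that up.
            String encoded = URLEncoder.encode(value, StandardCharsets.UTF_8)
                    .replace("+", "%20");
            m.appendReplacement(out, Matcher.quoteReplacement(encoded));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Placeholder template, not the one served by Ecoscope.
        String template = "http://example.org/search?q={searchTerms}&start={startIndex?}&format=rss";
        Map<String, String> values = new HashMap<>();
        values.put("searchTerms", "yellowfin tuna");
        System.out.println(fill(template, values));
    }
}
```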

Genesi DEC

http://www.genesi-dec.eu/

Description

The Genesi DEC project established open access to data and services, allowing European and worldwide Digital Earth Communities to seamlessly access, produce, and share data, information, products, and knowledge. This creates a multi-dimensional, multi-temporal, and multi-layer information facility of huge value in addressing global challenges such as biodiversity, climate change, pollution, and economic development.

http://www.genesi-dec.eu/search/

Exploitation Example

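iMarine GeoNetwork

  • http://geonetwork.d4science.org/geonetwork/

Description

The iMarine GeoNetwork service is the entry point for the discovery of and access to many types of georeferenced data for the marine field. The service is backed by a cluster of GeoServer and THREDDS services, which physically host the data. In particular, the following data can be queried and retrieved:

  • AquaMaps distribution maps (http://aquamaps.org)
  • FAO GeoNetwork maps (http://www.fao.org/geonetwork)
  • MyOcean environmental data (http://www.myocean.eu/)
  • WorldClim global climate layers (http://www.worldclim.org/)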

Exploitation Example

The GeoNetwork service can be used to retrieve GIS information for a given marine species. Data can be accessed through standard protocols such as WMS (Web Map Service) and WFS (Web Feature Service).
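
As an illustration of the WMS access path, the sketch below assembles a WMS 1.1.1 GetMap request from the standard parameters (SERVICE, VERSION, REQUEST, LAYERS, STYLES, SRS, BBOX, WIDTH, HEIGHT, FORMAT). The helper class, the base URL, and the layer name are placeholders; the actual values come from the metadata records found through GeoNetwork.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds a WMS 1.1.1 GetMap URL for a PNG image.
final class WmsGetMapUrl {

    // In WMS 1.1.1 with EPSG:4326 the bounding box is minLon,minLat,maxLon,maxLat.
    static String build(String baseUrl, String layer,
                        double minLon, double minLat, double maxLon, double maxLat,
                        int width, int height) {
        return baseUrl
                + "?SERVICE=WMS&VERSION=1.1.1&REQUEST=GetMap"
                + "&LAYERS=" + URLEncoder.encode(layer, StandardCharsets.UTF_8)
                + "&STYLES="                       // empty value: server default style
                + "&SRS=EPSG:4326"
                + "&BBOX=" + minLon + "," + minLat + "," + maxLon + "," + maxLat
                + "&WIDTH=" + width
                + "&HEIGHT=" + height
                + "&FORMAT=" + URLEncoder.encode("image/png", StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Placeholder server and layer name, for illustration only.
        System.out.println(build("http://example.org/geoserver/wms", "demo:species_distribution",
                -180.0, -90.0, 180.0, 90.0, 800, 400));
    }
}
```

A WFS GetFeature request follows the same key-value pattern but returns the vector features themselves rather than a rendered image.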

iMarine Biodiversity Data Service

Description

The Species Product Discovery web service (WS) gives access to biodiversity data coming from several providers (OBIS, GBIF, CoL, ...).

The client API automatically discovers the endpoint of the service from the iMarine Information System.

Exploitation Example

The service can be used to retrieve occurrence points and taxon information for a given marine species from the available data providers. Data can be extracted in CSV and DwC-A formats.

Aquatic Species Fact Sheets

Description

Aquatic Species Fact Sheets provided by FAO.

  • http://www.fao.org/fishery/species/search/en

VRMF species factsheet data extraction API

  • http://figisapps.fao.org/vrmf/samples/species/FS/

Exploitation Example

This service can be used to extract aquatic species fact sheets in either JSON or CSV format.

APIs

SPARQL Client

Any SPARQL client available on the Web

GeoNetwork Client

Description

Wiki

  • https://gcube.wiki.gcube-system.org/gcube/index.php/GeoNetwork_library

A library to interact with GeoNetwork's REST interface to publish, modify, delete, and search for metadata. The library is built on top of the geoserver-manager library, developed by GeoSolutions. Metadata objects managed by the library comply with the ISO 19115:2003/19139 standard specifications.

Exploitation Example

Javadoc

SPD Client

Description

Wiki

  • https://gcube.wiki.gcube-system.org/gcube/index.php/Species_Product_Discovery:_client_library

The SPD Client can be used to access the SPD service, a biodiversity data broker implemented in iMarine. More details about the architecture of the service are available at:

https://gcube.wiki.gcube-system.org/gcube/index.php/Biodiversity_Access

Exploitation Example

The client can be used, for example, to query the OBIS data source and return the taxonomic information related to 'shark':


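// Set the infrastructure scope in which the SPD service is looked up.
ScopeProvider.instance.set("/d4science.research-infrastructures.eu/gCubeApps");
// Build a client proxy; the service endpoint is discovered automatically.
Manager manager = manager().withTimeout(3, TimeUnit.MINUTES).build();

// Search by common name (CN) 'shark', resolving and expanding in OBIS, and return taxa.
Stream<ResultElement> taxa = manager.search("SEARCH BY CN 'shark' RESOLVE WITH OBIS EXPAND IN OBIS  RETURN Taxon");

while (taxa.hasNext()) {
	TaxonomyItem taxon = (TaxonomyItem) taxa.next();
	// Print the matched taxon, then walk up its chain of parent taxa.
	System.out.println(taxon.getAuthor() + " " + taxon.getRank() + " " + taxon.getScientificName());
	while ((taxon = taxon.getParent()) != null) {
		System.out.println(taxon.getScientificName() + " -- " + taxon.getRank());
	}
}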

gCube Search client

Description

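Wiki

  • https://gcube.wiki.gcube-system.org/gcube/index.php/Search_2_Framework_(NEW)

Exploitation Example

Javadoc

  • http://www.gcube-system.org/javadocs/2-14-0/search-client-library_1.0.1-2.14.0/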

Artifacts

The software distributed by iMarine (gCube) is available through Maven repositories. The following settings.xml configuration file should be set up:

<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
	xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0 http://maven.apache.org/xsd/settings-1.0.0.xsd">

	<profiles>
		<profile>
			<id>gcube</id>
			<repositories>
				<repository>
					<id>gcube-releases</id>
					<name>gCube Releases</name>
					<url>http://maven.research-infrastructures.eu/nexus/content/repositories/gcube-releases</url>
					<releases>
						<enabled>true</enabled>
					</releases>
					<snapshots>
						<enabled>false</enabled>
					</snapshots>
				</repository>
				<repository>
					<id>gcube-externals</id>
					<name>gCube Externals</name>
					<url>http://maven.research-infrastructures.eu/nexus/content/repositories/gcube-externals</url>
					<snapshots>
						<enabled>false</enabled>
					</snapshots>
					<releases>
						<enabled>true</enabled>
					</releases>
				</repository>
			</repositories>

			<pluginRepositories>
				<pluginRepository>
					<id>gcube-releases</id>
					<name>gCube Releases</name>
					<url>http://maven.research-infrastructures.eu/nexus/content/repositories/gcube-releases</url>
					<releases>
						<enabled>true</enabled>
					</releases>
					<snapshots>
						<enabled>false</enabled>
					</snapshots>
				</pluginRepository>
				<pluginRepository>
					<id>gcube-externals</id>
					<name>gCube Externals</name>
					<url>http://maven.research-infrastructures.eu/nexus/content/repositories/gcube-externals</url>
					<snapshots>
						<enabled>false</enabled>
					</snapshots>
					<releases>
						<enabled>true</enabled>
					</releases>
				</pluginRepository>
			</pluginRepositories>
			
		</profile>
	</profiles>

	<activeProfiles>
		<activeProfile>gcube</activeProfile>
	</activeProfiles>
</settings>

Alternatively, the same repository settings can be included in your pom.xml file. The Maven coordinates of the components to use for the challenges are documented in the related wikis.

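External Links

  • Blue Hackathon Event Home Page (http://wiki.agroknow.gr/agroknow/index.php/BlueHackathon2013)
  • BlueHackathon2013-datasets (http://wiki.agroknow.gr/agroknow/index.php/BlueHackathon2013-datasets)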