7.12.2012-TLO FLOD Ontological Analysis

From D4Science Wiki

Agenda

Time: Fri, Dec 07, 2012, 12:00 - 13:40 CEST


1) Quick introduction of the TLO-activity methodology

  • Looking for any comments and feedback from all the participants

2) Motivation-Goal-Requirements of the TLO

  • Looking for any comments and feedback from all the participants

3) FLOD Ontological Analysis: First Iteration

  • Looking for any comments and feedback from all the participants, in particular from Claudio

4) Next steps

  • Looking for any comments and feedback from all the participants

5) Any related issue if needed

Participants

  • Carlo Allocca (FORTH)
  • Chryssoula Bekiari (FORTH)
  • Claudio Baldassarre (FIPS)
  • Pasquale Pagano (CNR Istituto di Scienza e Tecnologie)
  • Yannis Marketakis (FORTH)
  • Anton Ellenbroek (FIPS)

Discussion Summary

1) Quick introduction of the TLO-activity methodology


2) Motivation-Goal-Requirements of the TLO

  • Generally, we agreed on the Motivation-Goal-Requirements of the TLO as described at http://wiki.i-marine.eu/index.php/Top_Level_Ontology#Motivation_-_Goal_-_Requirements.
  • To sum up:
    • MOT: Semantic Web technologies, applications and services for biodiversity mostly rely on a network of knowledge models interconnected at the syntactic and semantic levels to provide a solid basis for the interoperability of biodiversity systems.
    • GOAL: Our goal in modelling and formalising a TLO is to integrate and semantically extend the underlying models of existing marine data sources such as FLOD and ECOSCOPE.
    • REQ: The main requirements that the TLO should satisfy are:
      • R1: Focusing on the Ecosystem Approach to Fisheries and Marine Resources, the TLO should be generic enough to provide a consistent abstraction or specification of the concepts included in all data models or ontologies of marine data sources.
      • R2: Focusing on the Ecosystem Approach to Fisheries and Marine Resources, the TLO should be generic enough to provide the properties necessary to make these distributed knowledge bases (FLOD, ECOSCOPE and so on) a coherent source of facts, relating observation data to the respective spatio-temporal context and categorical domain knowledge.
  • Some Remarks:
    • Better clarify the use of DwC and IBIS, adding references to each of them. In addition, instead of the line "FLOD, Ecoscope, DwC", we should provide more detail using a schema such as <Source, Format, Vocabularies>, e.g. <FAO, FLOD-ontology, ASFIS, WoRMS> and <OBIS, DarwinCore, ISO3, WoRMS, CoL>.
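
The <Source, Format, Vocabularies> schema suggested in the remark above could be recorded as a small data structure. The following is only a minimal sketch: the type name `SourceDescription` and the helper `shared_vocabularies` are hypothetical, and the reading of each tuple (first element the source, second the format, the rest the vocabularies) is an assumption based on the two examples given in the remark.

```python
from typing import List, NamedTuple, Tuple


class SourceDescription(NamedTuple):
    """One marine data source described as <Source, Format, Vocabularies>.

    Hypothetical type, introduced here for illustration only.
    """
    source: str
    format: str
    vocabularies: Tuple[str, ...]


# The two example entries quoted in the remark.
SOURCES = [
    SourceDescription("FAO", "FLOD-ontology", ("ASFIS", "WoRMS")),
    SourceDescription("OBIS", "DarwinCore", ("ISO3", "WoRMS", "CoL")),
]


def shared_vocabularies(a: SourceDescription, b: SourceDescription) -> List[str]:
    """Return the vocabularies that two sources have in common, sorted."""
    return sorted(set(a.vocabularies) & set(b.vocabularies))


if __name__ == "__main__":
    # Vocabularies listed for both example sources.
    print(shared_vocabularies(SOURCES[0], SOURCES[1]))
```

Listing the sources this way makes it easy to see which vocabularies two sources share, which may be a natural starting point when looking for integration points.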


3) FLOD Ontological Analysis: First Iteration

  • We clarified why we need such an analysis
    • In developing a TLO that should satisfy R1 and R2, the first questions we would like to answer are: WHY does FLOD not cover the requirements above? What do we need to introduce to make FLOD compatible with them? In other words, which concepts and properties will complement the FLOD model towards the TLO design? To answer these questions, we had to carry out an ontological analysis of FLOD.
  • We pointed out why FLOD cannot be considered the preferred model
    • We discussed the following competency query: "Give me all the Species of the Benguela Ecosystem". In general, FLOD does not have the complete data structure needed to answer this type of query. In particular, FLOD does not provide Species and Ecosystem as concepts and therefore does not have Benguela as an instance of Ecosystem.
    • For this reason, we want to build a knowledge model, such as the TLO, that is generic and expressive enough to be able to support such a query.
  • Once we have a TLO that satisfies R1 and R2, the next question discussed was:
    • HOW does FLOD benefit from the TLO? To answer this question, we discussed and agreed on an example of a "Mapping Pattern" between FLOD and the TLO, based on Fig. 8 in the document https://portal.i-marine.d4science.org/group/data-e-infrastructure-gateway/workspace?itemid=04d81d92-d3e3-41d9-aeaf-605bd5c4ea33. The main idea is to use the TLO to classify appropriately, with respect to the domain of discourse, the instances of the CodedEntity, CodedEntitySet and Code classes from FLOD. We also agreed that, as FLOD is based on a modular approach and makes use of the ontology design pattern (shown in Fig. 10 in the document https://portal.i-marine.d4science.org/group/data-e-infrastructure-gateway/workspace?itemid=04d81d92-d3e3-41d9-aeaf-605bd5c4ea33), we could use similar mapping patterns to semantically extend other parts of the FLOD model (such as gear, vessel, port, gearCode, VesselCode, PortCode) using the TLO.
    • Related to the integration of FLOD and ECOSCOPE, we also discussed the problem of co-reference resolution across the several marine data sources. We all agreed that this is an issue for long-term future work, as the TLO currently addresses the integration of FLOD and ECOSCOPE at the "schema level" and not at the "instance level"; in our case, integration at the instance level could also be considered a requirement.
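
The competency query and the mapping pattern discussed above can be illustrated with a toy example. This is a sketch under stated assumptions, not the agreed design: the instance identifiers (`flod:entity_1` and so on), the `tlo:` class names and the property `tlo:livesIn` are all hypothetical, and a plain Python set of triples stands in for a real triple store.

```python
# Triples are (subject, predicate, object). All identifiers are hypothetical.
TYPE = "rdf:type"
LABEL = "rdfs:label"

# FLOD-like data: instances are only known as coded entities, so there is
# no Species or Ecosystem concept to query against.
flod = {
    ("flod:entity_1", TYPE, "flod:CodedEntity"),
    ("flod:entity_2", TYPE, "flod:CodedEntity"),
    ("flod:entity_3", TYPE, "flod:CodedEntity"),
    ("flod:entity_3", LABEL, "Benguela"),
}

# Mapping pattern (assumed shape): classify the FLOD instances under TLO
# classes and relate them with a TLO property.
mapping = {
    ("flod:entity_1", TYPE, "tlo:Species"),
    ("flod:entity_2", TYPE, "tlo:Species"),
    ("flod:entity_3", TYPE, "tlo:Ecosystem"),
    ("flod:entity_1", "tlo:livesIn", "flod:entity_3"),
}


def species_of_ecosystem(triples, ecosystem_label):
    """Answer "Give me all the Species of the <label> Ecosystem"."""
    ecosystems = {
        s for (s, p, o) in triples
        if p == TYPE and o == "tlo:Ecosystem"
        and (s, LABEL, ecosystem_label) in triples
    }
    species = {s for (s, p, o) in triples if p == TYPE and o == "tlo:Species"}
    return sorted(
        s for (s, p, o) in triples
        if p == "tlo:livesIn" and s in species and o in ecosystems
    )


if __name__ == "__main__":
    print(species_of_ecosystem(flod, "Benguela"))            # FLOD alone
    print(species_of_ecosystem(flod | mapping, "Benguela"))  # FLOD plus mapping
```

On the FLOD-like data alone the query returns nothing, because no instance is typed as a Species or an Ecosystem. Once the mapping triples are added, the same query returns the species linked to the ecosystem labelled "Benguela".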


4) Next steps

  • Based on the agreed example of the mapping pattern, we need to make progress on developing the TLO in order to implement and test it.
  • Anton's comment: My prime scenarios for the exploitation of the TLO would be:
    • 2a. Support the compilation of Species Fact Sheets, including links to related content, such as maps, food-webs, and images
    • 2b. Exploit semantic data through ontologies that contain taxonomic data (e.g. DwC structure as an ontology)
    • 2c. Use semantic tools to find data from different sources, and 'do' something: make a map, discover a trend, project a species abundance (e.g. how many penguins are eaten by polar bears. 10 seconds for the correct answer!)
      • The reason why I gave these (but please discuss with Julien / Claudio and the semantic cluster), is that the definition of the TLO should also look at the expected exploitation. You now seem to be modeling the domain itself, but that might be too limited. Can the TLO support each scenario?
      • I believe the semantic fact sheets can be produced in this project, as we have all the partners needed. For b and c I believe the TLO has to be much more versatile, e.g. what happens if I call a species Sarda sarda, while you call it Sarda (Sardensis) sarda, or if your definition of a term in a DwC element is different from mine (e.g. the French definition for "Overwintering" is different from the English, which is different from the German etc. So if you need to define a protection for "Overwintering" species, you must rely on access to the exact definition).
  • Pasquale's comment on Anton's comment:
    • This is a concrete problem we are facing continuously. We now provide access to several sources, but this does not answer the main questions Anton is raising, precisely because of the issue reported above. This is why we are placing high expectations on the results of the Semantic Cluster.
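
Anton's naming example (Sarda sarda versus Sarda (Sardensis) sarda) can be made concrete with a small string-level heuristic. This is only a sketch of one simple approach, assuming the parenthesised part is a subgenus that can be ignored for comparison; it is not something agreed in the meeting, the helper names are hypothetical, and real co-reference resolution would need authoritative identifiers rather than string matching.

```python
import re


def normalise_name(name: str) -> str:
    """Drop a parenthesised part, collapse whitespace and ignore case."""
    without_parenthesised = re.sub(r"\([^)]*\)", " ", name)
    return " ".join(without_parenthesised.split()).casefold()


def same_taxon_name(a: str, b: str) -> bool:
    """Heuristic check: do two name strings normalise to the same form?"""
    return normalise_name(a) == normalise_name(b)


if __name__ == "__main__":
    print(same_taxon_name("Sarda sarda", "Sarda (Sardensis) sarda"))
```

A heuristic like this only handles spelling variants of one name. It says nothing about differing definitions of a term, such as the "Overwintering" case, which needs access to the exact definitions.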


5) Any related issue if needed

  • None.

Actions

  • Based on the agreed example of the mapping pattern, we need to make progress on developing the TLO in order to implement and test it.
  • Working on the ECOSCOPE ontological analysis.
  • Discuss Anton's and Pasquale's comments with the rest of the Semantic Cluster and consider them (to some extent) already for the next TLO design.