Top Level Ontology
Latest (June 2013) slide presentation
Person responsible for editing/maintaining this page
- Carlo Allocca
- e-mail: carlo@ics.forth.gr
TLO-Development activity
General Description
This activity concerns the development of a top level ontology (called MarineTLO) that will integrate the concepts currently existing in marine-domain knowledge bases (in particular the FLOD and ECOSCOPE knowledge bases). The MarineTLO-development activity is divided into six sub-activities (or tasks), which relate to each other as shown in the diagram in Fig 1.
Methodology
The methodology follows an iterative and incremental development approach: each iteration involves all the tasks in Fig 1, which are described at http://wiki.i-marine.eu/index.php/Top_Level_Ontology. Every iteration will be documented, and each one will release an evolved version of the MarineTLO ontology based on the marine domain knowledge acquired during that iteration.
Activities scheduled with deadlines
Each iteration is planned to be monitored by opening related tickets.
Related Cluster
http://wiki.i-marine.eu/index.php/Semantic_cluster_achievements
Related Wiki Pages
http://wiki.i-marine.eu/index.php/XSearch
Motivation - Goal - Requirements
At a very abstract level, the picture below gives a preview of the motivation, goal and requirements for having such an ontology on top of marine-domain knowledge bases.
Motivation. Semantic technologies, applications and services for biodiversity mostly rely on the rise of an interconnected and shared tree-of-life-like dataset scaling on the web. Various communities, including the marine one, contribute to this joint effort, which aims to share domain data and their meaning and so provide a solid basis for the interoperability of biodiversity systems. One of the challenges in iMarine is how users can experience a coherent source of facts about marine resources rather than a bag of contributed content.
Goal. The goal of modelling and formalising the MarineTLO ontology is to integrate and semantically extend the underlying models of existing marine data sources. Specifically, MarineTLO sits on top of a number of real, heterogeneous marine data sources, including FLOD and ECOSCOPE, and acts as a knowledge mediator for representing, manipulating and reasoning upon and across them.
Requirements. Focusing on the Ecosystem Approach to Fisheries and Marine Resources, the MarineTLO ontology should be generic enough
- to provide a consistent abstraction or specification of the concepts included in the data models or ontologies of marine data sources
- to provide the properties necessary to make these distributed knowledge bases (FLOD, ECOSCOPE and others) a coherent source of facts, relating observation data to the respective spatio-temporal context and categorical domain knowledge
Related Meetings (and their slides)
TCOM in Italy, Rome, 03.11.2012
FLOD Ontological Analysis, 7.12.2012, meeting online
- Minutes, http://wiki.i-marine.eu/index.php/7.12.2012-TLO_FLOD_Ontological_Analysis.
- Ticket, https://issue.imarine.research-infrastructures.eu/ticket/224#comment:15
TCOM in Belgium, Ostende, 29.01.2013
Semantic Cluster Meeting, online, 13.02.2013
- Minutes
- Minutes from Leo: http://wiki.i-marine.eu/index.php/13.02.2013_Semantic_Cluster
TCOM in Italy, Pisa, 21.03.2013
TCOM in Italy, Rome, 26.03.2013
TCOM in Greece, Skiathos, 18.06.2013 (LATEST)
- Presentation, https://portal.i-marine.d4science.org/group/data-e-infrastructure-gateway/workspace?itemid=60ecbc87-fee1-4d98-9acb-beb9596289a4
- Minutes, http://wiki.i-marine.eu/index.php/6th_TCom_Meeting:_19th_June_2013_Discussions_and_Notes
Analysis, Design and Implementation
FLOD Ontological Analysis
The primary goal of this activity is to provide a common understanding of the underlying model of the FLOD source, which is considered necessary for the development of the MarineTLO. The associated ticket is https://issue.imarine.research-infrastructures.eu/ticket/888
- FLOD Ontological Analysis, 04-Dec-2012
- FLOD Ontological Analysis, 07-Dec-2012
- FLOD Ontological Analysis, 31-Dec-2012
ECOSCOPE Ontological Analysis
The primary goal of this activity is to provide a common understanding of the underlying model of the ECOSCOPE source, which is considered necessary for the development of the MarineTLO. The associated ticket is https://issue.imarine.research-infrastructures.eu/ticket/889
- ECOSCOPE Ontological Analysis, 13-Dec-2012
- ECOSCOPE Ontological Analysis, 11-Jan-2013, meeting, online
MarineTLO Design
This activity concerns the design of the MarineTLO ontology. The associated ticket is https://issue.imarine.research-infrastructures.eu/ticket/890
- Describing the MarineTLO design, first draft, 21-Dec-2012
- Describing the MarineTLO design, second draft, 24-Dec-2012
- Describing the MarineTLO design, a complete report is going to be available soon.
MarineTLO Implementation
This activity concerns the implementation of the MarineTLO ontology in the OWL 2 language. The associated ticket is https://issue.imarine.research-infrastructures.eu/ticket/891
- Please look at http://wiki.i-marine.eu/index.php/Top_Level_Ontology#TLO_Products for the versions of the MarineTLO
- SPARQL endpoints (Virtuoso and OWLIM-Lite)
- Competency queries
MarineTLO Usage
This activity concerns the identification of use cases that motivate the need for harmonized, integrated information. It is associated with the ticket https://issue.imarine.research-infrastructures.eu/ticket/900. Currently, we evaluate the MarineTLO ontology for the following use cases:
- Fact Sheet Generator. In this case, MarineTLO is used as the top model for FLOD, ECOSCOPE and WoRMS to support the development of FactSheetsGenerator applications. A concrete example is the one provided by IRD (SpeciesFactSheetsGenerator), which aims to provide factual knowledge about the marine domain by mashing up relevant knowledge distributed across several data sources.
- Semantic post-processing of the results of keyword search queries. In this case, MarineTLO is used as the knowledge model for semantic search in the X-Search meta-engine. In more detail, suppose that a user is looking for publications about tuna; specifically, he wants to find experiments that were applied to several species of tuna. He submits the query tuna and gets a sorted list of results together with various categories of entities such as Regional Fisheries Body, Species, FAO Country, etc. The user realizes that the category Species may contain interesting entities. He notices an entity with the label skipjack tuna, a medium-sized fish in the tuna family found in tropical and warm-temperate waters, and wants to learn more about that species: specifically, he would like to see the other species for which the skipjack tuna is predator or prey. By clicking the icon next to the entity's name, the user can retrieve this information instantly (in real time). In the back end, a SPARQL query asking for that information is sent to the MarineTLO endpoint. Note that the 'Species' have been derived from FLOD, while the properties 'is predator of' and 'is prey of' have been derived from the ECOSCOPE knowledge base. This would be impossible without the MarineTLO.
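The back-end query in the skipjack tuna scenario could look roughly like the following sketch. It is only an illustration: the namespace `http://example.org/marinetlo#` and the names `tlo:Species`, `tlo:isPredatorOf` and `tlo:isPreyOf` are assumptions made for this example, and the actual MarineTLO IRIs and the query used by X-Search may differ.

```python
from textwrap import dedent

# Hypothetical namespace: the real MarineTLO IRIs are not given on this page.
TLO_NS = "http://example.org/marinetlo#"


def predator_prey_query(species_label: str) -> str:
    """Build a SPARQL query for the species a given species preys on or is prey of."""
    # Lowercase for a case-insensitive match, then escape backslashes and
    # double quotes so the label is a valid SPARQL string literal.
    safe = species_label.lower().replace("\\", "\\\\").replace('"', '\\"')
    return dedent(f"""\
        PREFIX tlo: <{TLO_NS}>
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        SELECT DISTINCT ?relation ?other WHERE {{
          ?species a tlo:Species ;
                   rdfs:label ?label .
          FILTER (LCASE(STR(?label)) = "{safe}")
          {{ ?species tlo:isPredatorOf ?other . BIND("is predator of" AS ?relation) }}
          UNION
          {{ ?species tlo:isPreyOf ?other . BIND("is prey of" AS ?relation) }}
        }}
        """)


if __name__ == "__main__":
    print(predator_prey_query("skipjack tuna"))
```

The point of the sketch is that a single query touches both the species (from FLOD) and the predator/prey properties (from ECOSCOPE), which is only possible because both are expressed under one top level model.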
MarineTLO Evaluation
This activity is related to the evaluation of the MarineTLO ontology. The associated ticket is https://issue.imarine.research-infrastructures.eu/ticket/892
A required activity for the MarineTLO evaluation is to populate it with concrete instances.
- To populate just the class Species, please follow the link
- To populate the class Species with predator and prey relationships, please follow the link
- Competency Queries, 22-03-2013
- MarineTLO Populating, 22-03-2013
- MarineTLO Evaluation Report, 22-04-2013
The Notion of MarineTLO Version
Each MarineTLO version consists of
- A release number
- OWL files
- A document that contains the scope notes of each class
- A set of competency queries
- A short description describing the changes
- It could also contain a set of mappings between data source and MarineTLO, each of them described as an OWL file
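The parts of a version listed above can be pictured as a simple record. The sketch below is illustrative only: the class name `MarineTLOVersion`, its field names and the `missing_parts` helper are hypothetical and do not correspond to any artifact of the project.

```python
from dataclasses import dataclass, field


@dataclass
class MarineTLOVersion:
    """One MarineTLO release, mirroring the list of parts above (hypothetical model)."""

    release_number: str            # e.g. "1.0.0"
    owl_files: list                # paths or URLs of the OWL files
    scope_notes_document: str      # document with the scope notes of each class
    competency_queries: list       # the set of competency queries
    change_description: str        # short description of the changes
    # Optional: data source name -> OWL file with the mapping to MarineTLO.
    mappings: dict = field(default_factory=dict)

    def missing_parts(self):
        """Return the names of required parts that are empty."""
        required = [
            "release_number",
            "owl_files",
            "scope_notes_document",
            "competency_queries",
            "change_description",
        ]
        return [name for name in required if not getattr(self, name)]
```

A release checklist could then call `missing_parts()` before publishing; mappings are left out of the check because they are optional.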
Evolution Process
Since the last meeting in Rome (26-03-2013), the plan has been to release a new version every two months (to correct errors and to respond to requirements, priorities, usage needs, etc.).
Version 1.0.0 released on 26-03-2013
- Release Number: 1.0.0
- Documentation of the MarineTLO version 1.0.0 https://issue.imarine.research-infrastructures.eu/ticket/1764#comment:1
- The version 1.0.0 of the MarineTLO ontology contains classes, properties and instances to cover the species fact sheet
- Competency queries covering the species fact sheets
- Mappings in OWL: FLOD-TLO, Ecoscope-TLO and WoRMS-TLO
- Mappings descriptions
Version 2.0.0 planned to be released by end of July 2013
- Release Number: 2.0.0
- Documentation of the MarineTLO version 2.0.0
- http://goo.gl/vmWVSj and Ticket #1764
- The version 2.0.0 of the MarineTLO ontology in OWL
- Competency queries
- Mappings in OWL: FLOD-TLO, Ecoscope-TLO and WoRMS-TLO
- A new WoRMS Schema (Ver2.0.0)
- Mappings descriptions
Version 3.0.0 planned to be released by middle of October 2013
- Requirements, Management and Plan
- Release Number: 3.0.0
- Documentation of the MarineTLO version 3.0.0
- The version 3.0.0 of the MarineTLO ontology
- Competency queries
- Mappings in OWL: FLOD-TLO, Ecoscope-TLO and WoRMS-TLO
- Mappings descriptions
Version 4.0.0
- Release
- Version: 4.0.0
- Date: July 2014
- MarineTLO ontology in OWL
- Documentation of the MarineTLO version 4.0.0
- Requirements, Competency Queries
- Mappings
- Description: http://goo.gl/SniVs8
- OWL implementation: http://goo.gl/WU3xkG
Previous TLO Versions
- TLO Version 13/11/2012, https://portal.i-marine.d4science.org/group/data-e-infrastructure-gateway/workspace?itemid=832af2c7-70d2-4ac3-8985-ed1efcaa0a40
MarineTLO-based Warehouses
Warehouse 1 (June 2013)
- Virtuoso Repository
- Virtuoso Repository Browsing
- OWLIM Lite Repository
Warehouse 2 (By the end of July 2013)
The warehouse is available at http://62.217.127.213:8890/sparql and the graph name is <http://www.ics.forth.gr/isl/TLObasedDataWarehouseV2>. Some plugins for browsing the warehouse are also installed: http://62.217.127.213:8890/fct/ and http://62.217.127.213:8890/fct/demo_queries.vsp, which offers some demo queries (use “Run with iSPARQL”, because the “Run in SPARQL endpoint” plugin prunes whatever follows the ‘#’ character). The total number of results this plugin returns is limited to 50.
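As a rough sketch of how a client might address this warehouse, the snippet below builds a GET request URL from the endpoint and graph name given above, using the standard SPARQL Protocol parameters `query` and `default-graph-uri`. The helper name `build_request_url` and the sample query are assumptions for illustration, and the endpoint may not be reachable at the time of reading.

```python
from urllib.parse import urlencode

# Endpoint and graph name as stated on this page.
ENDPOINT = "http://62.217.127.213:8890/sparql"
GRAPH = "http://www.ics.forth.gr/isl/TLObasedDataWarehouseV2"

# Illustrative query: count the triples in the selected graph.
COUNT_TRIPLES = "SELECT (COUNT(*) AS ?triples) WHERE { ?s ?p ?o }"


def build_request_url(query, endpoint=ENDPOINT, graph=GRAPH):
    """Return a SPARQL Protocol GET URL that runs `query` against `graph`."""
    # urlencode percent-encodes the query text and the graph IRI.
    params = {"default-graph-uri": graph, "query": query}
    return endpoint + "?" + urlencode(params)


if __name__ == "__main__":
    # The URL can be fetched with any HTTP client; an Accept header such as
    # "application/sparql-results+json" asks for JSON results.
    print(build_request_url(COUNT_TRIPLES))
```

Building the URL separately from sending it keeps the sketch testable without network access.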
A summary of the warehouse contents follows:
- TLO version 2
- FLOD
- ECOSCOPE
- part of WoRMS (information about the taxonomies of approximately 1,100 species, obtained through the Species Discovery Service and the wrapping software developed)
- (marine) part of DBpedia (containing various information about marine species)
In numbers, this warehouse contains approximately 1.6 million triples about 19,000 distinct marine species.
A paper describing the process of creating the MarineTLO-based warehouse can be found here.
EVALUATION and USAGE
The new warehouse has been evaluated using a new set of competency queries.
X-Search now uses this warehouse (instead of FLOD) and can identify 25,000 marine species (this number includes species, genus and family names). Furthermore, each species in the MarineTLO-based warehouse has on average 30 properties, while in FLOD each species has on average only 6 properties.
We are also in continuous collaboration with IRD, who are testing the warehouse and providing requirements for the needs of FactSheetGenerator.
FUTURE
Continuous inspection of the warehouse contents, exploitation issues (new queries, etc.) and documentation. The next version of the warehouse will be constructed once MarineTLO v3 is reached (scheduled for Oct 2013), based on the requirements that we continuously receive from IRD.
Warehouse 3 (ONGOING, by the end of October 2013)
- Virtuoso Repository
- NameGraph
- Virtuoso Repository Browsing
Warehouse 3+ (ONGOING, by the middle of January 2014)
- Virtuoso Repository
- NameGraph
- Virtuoso Repository Browsing
MarineTLO Related Tickets
First Iteration
- FLOD Ontological Analysis: https://issue.imarine.research-infrastructures.eu/ticket/888 CLOSED
- Ecoscope Ontological Analysis: https://issue.imarine.research-infrastructures.eu/ticket/889 CLOSED
- MarineTLO Design: https://issue.imarine.research-infrastructures.eu/ticket/890 CLOSED
- MarineTLO Implementation: https://issue.imarine.research-infrastructures.eu/ticket/891 CLOSED
- MarineTLO Results: TLO Usage https://issue.imarine.research-infrastructures.eu/ticket/900 CLOSED
- MarineTLO Evaluation, https://issue.imarine.research-infrastructures.eu/ticket/892 CLOSED
- MarineTLO Population, https://issue.imarine.research-infrastructures.eu/ticket/1220 CLOSED
Second Iteration
- MarineTLO Version 2.0.0, https://issue.imarine.research-infrastructures.eu/ticket/1603 CLOSED
- MarineTLO Version 2.0.0 Documentation, https://issue.imarine.research-infrastructures.eu/ticket/1764 CLOSED
- Documentation of the process used for creating MarineTLO-based warehouses, https://issue.imarine.research-infrastructures.eu/ticket/1848 OPEN
- Creation of a MarineTLO-based warehouse 2 CLOSED
- Actions for exploiting the MarineTLO-based Warehouse 2 OPEN
Third Iteration
- MarineTLO Version 3.0.0, https://issue.imarine.research-infrastructures.eu/ticket/2046 CLOSED
Fourth Iteration
- MarineTLO Version 4.0.0, https://issue.imarine.research-infrastructures.eu/ticket/2319 OPEN
Related Papers
- Y. Tzitzikas, C. Allocca, C. Bekiari, Y. Marketakis, P. Fafalios, M. Doerr, N. Minadakis, T. Patkos and L. Candela, “Integrating Heterogeneous and Distributed Information about Marine Species through a Top Level Ontology”, 7th Metadata and Semantics Research Conference, MTSR 2013, Thessaloniki, Greece, November 2013.