8th TCom Meeting: 4th February 2014 Discussions and Notes
8th TCom Meeting: Participants
Join Live: https://plus.google.com/hangouts/_/7ecpiv9rbp9k7j23bktfanoiqs?authuser=0
WP9 - Data Access
Presenter: Massimiliano Assante (CNR) - Slides
- New Home Library features in gCube 3.0
This has implications for all the clients of the HL: clients have to deal with ACLs.
The folder created to store search results should be a "non-removable" folder; applications can set the privileges they want.
Think about the possibility of indexing the user workspace via the NKUA machinery. An approach is needed to (incrementally) harvest the workspace content and to cope with the evolving workspace.
The new version of the HL is backward compatible; however, the requested changes are minor.
The medium-term plan is to replace the Storage Management Lib with the HL, because of a number of facilities offered by the HL including policy management.
WP9 - Data Transfer
Presenter: M. Simon (CERN) - Slides
- Overview of the enhancement on the Storage Manager portlet and demo
- Presentation of the Service and API in order to stimulate integration with the rest of the system
The facility should be revised to deal with the new version of the HL, namely ACLs.
The layout should be revised with the goal of serving an average user, e.g. by simplifying the offered options.
Since WP9 is over, they are planning to report the activity in WP11 and WP5 (to check).
In the case of Scheduled Transfer it would be nice to implement a filter on the transferred data to only transfer changed files for every scheduled run.
CERN is going to check the effort to implement this change.
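The changed-files filter discussed above could be sketched as follows. This is only an illustration under stated assumptions, not the Data Transfer service's API: the function names `file_digest` and `select_changed` are hypothetical, and change detection via SHA-256 digests kept from the previous run is one possible choice among others (e.g. modification times).

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's content."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def select_changed(files, previous):
    """Pick the files that are new or changed since the previous run.

    `previous` maps a path string to the digest recorded at the last
    scheduled run (hypothetical state kept by the scheduler).
    Returns (paths_to_transfer, digests_to_store_for_next_run).
    """
    current = {str(p): file_digest(Path(p)) for p in files}
    to_transfer = [p for p, d in current.items() if previous.get(p) != d]
    return to_transfer, current
```

On the first run `previous` is empty, so every file is selected; on later runs only files whose digest differs from the stored one are transferred.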
WP10 - Data Retrieval
Presenter: A. Antoniadis (NKUA) - Slides
- Federated Search Status and Plan
WP10 - Data Ingestion
Presenter: J. Gerbesiotis (NKUA) - Slides
It might be useful to decouple DB access information (e.g. JDBC) from data schema information while configuring the SQL2XML adapter.
- A ticket has been created for this: #2600;
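The decoupling suggested above could look roughly like the sketch below. It does not reflect the actual SQL2XML adapter configuration: the class names `ConnectionConfig` and `SchemaMapping`, their fields, and the example JDBC URLs are all hypothetical and only illustrate keeping access information apart from schema information.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ConnectionConfig:
    """How to reach the database (hypothetical fields)."""
    jdbc_url: str
    username: str
    password: str


@dataclass(frozen=True)
class SchemaMapping:
    """What to read from the database, independent of where it lives."""
    table: str
    id_column: str
    columns: Tuple[str, ...]

    def select_statement(self) -> str:
        """Build the SELECT statement implied by this mapping."""
        fields = ", ".join((self.id_column,) + self.columns)
        return f"SELECT {fields} FROM {self.table}"


# The same schema description can be reused against different databases.
mapping = SchemaMapping(table="species", id_column="id", columns=("name", "family"))
test_db = ConnectionConfig("jdbc:postgresql://test.example.org/db", "user", "secret")
prod_db = ConnectionConfig("jdbc:postgresql://prod.example.org/db", "user", "secret")
```

With this split, moving a data source to another host only changes the connection object, while the schema mapping stays untouched.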
An approach to access the actual payload should be envisaged. It is currently based on IDs.
The priority should be given to:
- data sources that we do not serve yet including:
- Biodiversity data, via SPD and DarwinCore-Archive files;
- GIS data sources, e.g. via GeoNetwork, which means CSW;
- These options should be investigated.
WP10 - Data Publishing
Presenter: N. Laskaris (NKUA) - Slides
- new OAI-PMH, OAI-ORE
A long discussion took place on making data / metadata publicly available. It was clarified that we should not mix the actual data / resource with the metadata that are exposed via OAI-PMH.
From a technical point of view, the machinery is there. It should be up to WP3 to discuss the policy to be implemented.
WP10 - XSearch
Presenter: Pavlos Fafalios (Forth) (remotely) - Slides
- XSearch and X-Link related activities
The configuration of entity linking (regarding X-Link) is based on categories of entities. Specifically, for each category the user/admin can provide the SPARQL endpoint and the SPARQL template query.
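A per-category configuration of this kind might be sketched as below. This is not X-Link's actual configuration format: the dictionary layout, the `build_query` helper, the endpoint URL, and the template query are hypothetical and only illustrate the idea of pairing each entity category with an endpoint and a template query.

```python
from string import Template

# Hypothetical configuration: one entry per entity category.
ENTITY_CATEGORIES = {
    "fish": {
        "endpoint": "http://example.org/sparql",  # placeholder endpoint
        "template": Template(
            "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
            'SELECT ?uri WHERE { ?uri rdfs:label "$entity"@en } LIMIT 10'
        ),
    },
}


def build_query(category: str, entity: str):
    """Return (endpoint, query) for an entity name of the given category."""
    cfg = ENTITY_CATEGORIES[category]
    # Escape backslashes and double quotes so the name stays a valid literal.
    safe = entity.replace("\\", "\\\\").replace('"', '\\"')
    return cfg["endpoint"], cfg["template"].substitute(entity=safe)


endpoint, query = build_query("fish", "Thunnus albacares")
```

Adding a new category then amounts to adding one more entry with its own endpoint and template, without touching the linking code.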
The services are not yet implemented. Currently, FORTH is working on the output format.
Although the analysis of a collection of documents is not currently supported by X-Link, adding this support is straightforward.
Comments from Claudio Baldassarre (FAO)
- X-Link has good potential to support use case applications like SmartFish, although X-Link has not yet been adopted by any use case partner in iMarine on whose feedback we could base more positive expectations.
- The capacity to scan a document for entity mining at runtime is interesting; in the case of SmartFish it should be available in a form that can target a document collection and run as a batch, with an output that avoids running live on a single document from the result set.
- The capacity to customize the source SPARQL endpoint(s) and the set of relationships linking the entities together is interesting for an adoption of X-Link by SmartFish.
- The design of the X-Link low-level library, the generic client library, and the XSearch client library is also promising for an adoption of X-Link by SmartFish.
- A deeper investigation of the subgraph selection algorithm should be considered, to see if it can be adopted as an additional asset by SmartFish.
- I would like to have PDFs of the publications mentioned during the presentation.
WP10 - Semantic Data Analysis
Presenters: Carlo Allocca (Forth), Nikos Minadakis (Forth), Yannis Tzitzikas (Forth) - Slides
- Marine TLO
- Warehouse
- Warehouse construction process
An Android application Ichthys has been developed. FORTH should check if the "About" part contains enough acknowledgments.
On TLO future versions:
- CNR (Gianpaolo) provided FORTH with a revised version of a number of data sources produced via SPD discovery facility;
- This was integrated, although FORTH focused on "marine fishes" ... to be clarified;
- Provenance management should be reinforced by relying on existing "standards", e.g.:
- it is important to capture when a given piece of information has been collected;
- it is important to capture how a given piece of information has been collected / produced;
- data providers have their own policies characterising how their data should be "cited";
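The three provenance aspects listed above (when, how, and how to cite) could be captured in a record such as the one sketched below. This is only an illustration: the `ProvenanceRecord` class and its field names are hypothetical and do not correspond to any specific provenance standard or to the MarineTLO model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ProvenanceRecord:
    """Minimal provenance attached to a piece of information (hypothetical)."""
    source: str            # data provider the information comes from
    collected_at: datetime  # when the information was collected
    method: str            # how it was collected / produced
    citation: str          # citation text required by the provider's policy

    def describe(self) -> str:
        """Return a one-line, human-readable provenance statement."""
        day = self.collected_at.date().isoformat()
        return (f"{self.source}, collected {day} via {self.method}. "
                f"Cite as: {self.citation}")


record = ProvenanceRecord(
    source="ExampleSource",
    collected_at=datetime(2014, 2, 4, tzinfo=timezone.utc),
    method="SPD discovery",
    citation="Example citation text",
)
```

Keeping the citation text per record lets each provider's own citation policy travel with the data it applies to.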
On ByCatch modeling:
- it is very important to have feedback from IRD (because of their institutional mandate);
- FAO will also analyse the proposal;
WP10 - OWS
Presenter: Hervé Caumont (Terradue) (remotely) - Slides
- OWS Context API and Visualization tools
- Plan for exploitation in iMarine
CNR cannot allocate effort to this before the middle of March. What about the others?
- T2 can allocate Francesco, so that during February he can do what is needed to further serve iMarine needs;
- a telco should be organised between CNR and T2 (at least) to reach a common understanding;
WP9/10 WPS and SOS service
Presenter: Hervé Caumont (Terradue) (remotely) - Slides
- WPS status and plans
- Using OGC SOS and O&M in iMarine
Is SOS powerful enough to capture "big data", e.g. 1 billion records such as the GEBCO measures?
- in particular, GP is facing problems related to the environmental enrichment process (where the scale is different);
- Hervé replied that this is probably not the case / scenario expected to be served; SOS seems to be better suited to indexing services that host sensor data than to providing their full content.
A telco should be organised to agree on the next steps.
On WPS deployment in iMarine production:
- the current service (including the cluster) is hosted by Terradue;
- one of the open issues is the incompatibility between Hadoop clusters;
- a ticket should be created to monitor this activity;
- Actually, a ticket already exists: #1274
WP9 - Tabular Data Manager
Presenter: P. Pagano (CNR) - Slides
- Tabular Data Manager and its libraries
- Release status: features and capabilities
- Plan for the next releases
WP10 - Data Mining and Visualization
Presenter: G. Coro (CNR) - Slides
- Status of the Statistical Manager platform
- Algorithms currently supported
- Distributed R support
- Missing bits