24.10.2013 BioDiv

From D4Science Wiki

Taxon name matching

Hangout meeting Thursday 24th October 2013 – 11:00-12:00 CET

Participants: Caselyn Aldemita, Nicolas Bailly, Gianpaolo Coro, Anton Ellenbroek, Fabio Fiorellato


Matchers vs Matchlets

After a discussion between Gianpaolo and Fabio on Wednesday 23rd Oct. 2013, a difference between the workflow approaches of SpecinaME and BiOnym was clarified:

  • Matchlets: one matching method is applied to one atomic component of a full scientific name (G, S, A, Y) (SpecinaME’s approach).
  • Matchers: one matching method is applied to the full scientific name (BiOnym’s approach).

The Matcher behavior can be emulated in SpecinaME by properly setting the parameters that activate the relevant Matchlets.
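
The distinction above can be illustrated with a minimal Python sketch. It is not the SpecinaME or BiOnym code: all class and function names are hypothetical, and the reading of G, S, A, Y as genus, species, authority and year is an assumption. It only shows how a Matcher on the full name might be emulated by activating the same Matchlet on every component.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, Dict

# Assumption: G, S, A, Y stand for genus, species, authority, year.
COMPONENTS = ("genus", "species", "authority", "year")


@dataclass(frozen=True)
class ParsedName:
    genus: str
    species: str
    authority: str
    year: str


# A Matchlet scores ONE atomic component; a Matcher scores the FULL name.
Matchlet = Callable[[str, str], float]


def exact(a: str, b: str) -> float:
    """Case-insensitive exact comparison: 1.0 on equality, else 0.0."""
    return 1.0 if a.lower() == b.lower() else 0.0


def fuzzy(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] from the standard library's SequenceMatcher."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def full_name(name: ParsedName) -> str:
    return " ".join(getattr(name, c) for c in COMPONENTS)


def exact_matcher(a: ParsedName, b: ParsedName) -> float:
    """Matcher style: one method applied to the whole name string."""
    return exact(full_name(a), full_name(b))


def run_matchlets(a: ParsedName, b: ParsedName,
                  config: Dict[str, Matchlet]) -> float:
    """Matchlet style: one method per activated component.

    The overall score is the weakest component score (a hypothetical
    aggregation rule chosen for this sketch).
    """
    scores = [m(getattr(a, comp), getattr(b, comp)) for comp, m in config.items()]
    return min(scores) if scores else 0.0


# Emulating the exact Matcher: activate the exact Matchlet on all components.
EMULATE_EXACT_MATCHER = {comp: exact for comp in COMPONENTS}
```

With this configuration, `run_matchlets` returns the same result as `exact_matcher` for any pair of parsed names, while a configuration such as `{"genus": exact, "species": fuzzy}` expresses something a single full-name Matcher cannot.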


Transformations

In Edward’s R emulation, some Matchers also contain transformations (and indexations) of names that in principle should be run 1) independently for the TAFs, and 2) in the pre-processing phase for the datasets submitted by users.

However, the sequence of transformations may vary, because the user can modify the setup during the run configuration phase.

Thus the transformations of the TAFs have to be run on the fly, which seems feasible for the 250,000 names from OBIS/WoRMS, but may take too much time for CoL, for instance (or GNI/GNUB).

After discussion, it was decided to first implement ‘in’ SpecinaME the Matchers that are quick and easy to implement, and to select one of a few usual sequences of transformations, or later to choose a sequence based on test results for the Matchers that include such transformations. The issue will be revisited in a second round (e.g., by parallelising the transformation of the TAFs when it is run on the fly).
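
As a rough illustration of the point above, the following Python sketch applies a user-configured sequence of transformations to TAF names on the fly and caches the result per sequence, so that a few usual sequences are only computed once. The transformation names, the `taf_id` key and the cache are assumptions made for this example, not part of SpecinaME or BiOnym, and parallelisation is left out.

```python
import unicodedata
from typing import Callable, Dict, Iterable, List, Sequence, Tuple


def lowercase(name: str) -> str:
    return name.lower()


def collapse_spaces(name: str) -> str:
    """Trim the name and reduce runs of whitespace to single spaces."""
    return " ".join(name.split())


def strip_diacritics(name: str) -> str:
    """Decompose characters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Hypothetical registry: the user picks names from it, in any order.
TRANSFORMATIONS: Dict[str, Callable[[str], str]] = {
    "lowercase": lowercase,
    "collapse_spaces": collapse_spaces,
    "strip_diacritics": strip_diacritics,
}


def apply_sequence(name: str, sequence: Sequence[str]) -> str:
    """Apply the configured transformations in the order given."""
    for step in sequence:
        name = TRANSFORMATIONS[step](name)
    return name


# Cache keyed by (TAF identifier, transformation sequence).
_taf_cache: Dict[Tuple[str, Tuple[str, ...]], Dict[str, List[str]]] = {}


def index_taf(taf_id: str, taf_names: Iterable[str],
              sequence: Sequence[str]) -> Dict[str, List[str]]:
    """Map each transformed name to the original TAF names producing it."""
    key = (taf_id, tuple(sequence))
    if key not in _taf_cache:
        index: Dict[str, List[str]] = {}
        for original in taf_names:
            index.setdefault(apply_sequence(original, sequence), []).append(original)
        _taf_cache[key] = index
    return _taf_cache[key]
```

The same `apply_sequence` call would be used for the user-submitted names in the pre-processing phase, so that both sides of the comparison go through an identical sequence.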


BiOnym interface

Casey and Gianpaolo resolved the issue of where to put the interface for viewing and improvement (not yet connected to functionalities). They will send the URL once it is installed. In any case, the final implementation will be a portlet in the Biodiversity Research Environment, calling the Workflow/Matchers/Matchlets from the Statistical Manager VRE (Virtual Research Environment).

TDWG presentation

Deadline for a final version: Tuesday 29 Oct. 2013, evening. The remaining work was distributed among the participants. A new version of the presentation will be posted in the workspace with the new indications (TDWG_BiOnym_131024-1.pptx in folder [1]).