16.01.2014 BiolDiv
Meeting 16 January 2014, 11:00 am
Google Hangout
Present: Edward, Anton, Fabio, GP
Notes
BiOnym Performance
GP presents the calculated performance of the BiOnym workflow (WF) for varying numbers of outputs and for each of the two available parsers (Dima's GNI and Fabio's SIMPLE). The report highlights the following points:
- BiOnym outperforms the WoRMS Taxamatch in all configurations
- the WF using Fabio's SIMPLE parser performs better than the one using the GNI parser
- the WF using the SIMPLE parser and only the Levenshtein distance is the best system at recognizing the automatically generated species names (this does not hold for the real names)
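For reference, the Levenshtein-only matching mentioned in the last point can be sketched as below. This is a minimal illustration of the standard edit distance applied to name lookup, not the BiOnym implementation; the function names, the lowercase normalization, and the `max_distance` threshold are assumptions made for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Standard edit distance: minimum number of single-character
    insertions, deletions, and substitutions turning a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def best_matches(query, reference_names, max_distance=2):
    """Hypothetical helper: return (distance, name) pairs within
    max_distance of the query, closest first. Comparison is
    case-insensitive (an assumption of this sketch)."""
    scored = [(levenshtein(query.lower(), name.lower()), name)
              for name in reference_names]
    return sorted((d, n) for d, n in scored if d <= max_distance)
```

For example, `best_matches("Gadus morua", ["Gadus morhua", "Thunnus albacares"])` keeps only "Gadus morhua", which is one insertion away from the misspelled input.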
Fabio explains the rationale behind the different behaviour of the SIMPLE and GNI parsers: the GNI parser is better on complex, but well formatted, inputs. Edward notes that this behaviour is not surprising. Edward suggests contacting other people involved in taxa matching to explore collaborations.
Anton asks to have one BiOnym recognizer available on both the FAO and the i-Marine websites.
Next Steps
GP has to:
- double check the WoRMS Taxamatch performance (GP just noticed there was a mistake in the description of the WoRMS WebService which could have influenced the evaluation)
- produce a matrix reporting the number of complementary errors made by pairs of Matchers inside BiOnym (to estimate to what extent Levenshtein can replace the entire WF)
- introduce Beam-Search options on the Statistical Manager
- develop a local/fast version of BiOnym to be used from the websites
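One possible shape for the complementary-error matrix in GP's list is sketched below. It assumes "complementary errors" means inputs that one Matcher gets wrong and the other gets right, and that each Matcher's results are available as a list of per-input correctness flags; both the interpretation and the names (`complementary_error_matrix`, `matcher_a`, `matcher_b`) are assumptions for illustration, not part of BiOnym.

```python
def complementary_error_matrix(results):
    """results maps a matcher name to a list of booleans, one per test
    input, True when that matcher found the correct name.
    matrix[a][b] counts inputs where a failed and b succeeded."""
    names = list(results)
    matrix = {a: {b: 0 for b in names} for a in names}
    for a in names:
        for b in names:
            if a == b:
                continue  # a matcher cannot complement itself
            matrix[a][b] = sum(
                1 for ok_a, ok_b in zip(results[a], results[b])
                if not ok_a and ok_b
            )
    return matrix


# Toy data with hypothetical matcher names, four test inputs each.
example = {
    "matcher_a": [True, False, True, False],
    "matcher_b": [True, True, False, False],
}
matrix = complementary_error_matrix(example)
```

A row of zeros for a given Matcher would mean that no other Matcher recovers any of its errors, which is the kind of evidence needed to judge whether it could stand in for the whole WF.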
Fabio has to:
- check with FIN whether their implementation of GSAy with the YASMEEN framework can replace the currently running one
FIN has to:
- give an update on the status of the BiOnym web interface
- give feedback on the effectiveness of their GSAy implementation with the YASMEEN framework
Next meeting
Thursday 23 January 2014, 11 am.