16.01.2014 BiolDiv
Latest revision as of 14:36, 16 January 2014
Meeting 16 January 2014, 11:00 am
Google Hangout
Present: Edward, Anton, Fabio, GP
Notes
BiOnym Performance
GP presents the performance of the BiOnym workflow on the benchmark datasets provided by Edward. The performance is reported for several lengths of the output list and using two parsers (Dima's GNI and Fabio's SIMPLE). The report highlights the following points:
- the BiOnym WF outperforms WoRMS Taxamatch in all configurations
- the WF using Fabio's SIMPLE parser performs better than the one using the GNI parser
- the WF using the SIMPLE parser with the Levenshtein distance as its only matcher is the best system for recognizing the automatically generated species names (this does not hold for real names)
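To illustrate what a Levenshtein-only matcher with a bounded output list looks like, here is a minimal Python sketch. It is not the BiOnym implementation: the function names, the lower-casing step and the toy reference list are assumptions made for this example only.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def top_k_matches(query: str, reference_names: list[str], k: int = 10) -> list[str]:
    """Return the k reference names closest to the query (hypothetical helper).

    Ties are broken alphabetically so the output is deterministic.
    """
    ranked = sorted(
        reference_names,
        key=lambda name: (levenshtein(query.lower(), name.lower()), name),
    )
    return ranked[:k]


if __name__ == "__main__":
    # Toy reference list, invented for the example.
    reference = ["Gadus morhua", "Gadus macrocephalus", "Salmo salar"]
    print(top_k_matches("Gadus morua", reference, k=2))
```

The `k` parameter corresponds to the length of the output list discussed in the report; a real workflow would run this against the full reference taxonomy rather than a three-name list.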
Fabio gives a possible explanation for the different behavior of the SIMPLE and the GNI parsers: the GNI parser is better on complex but well-formatted inputs. Edward says the observed behavior of the parsers is not surprising. Edward suggests contacting other people involved in taxa matching to explore collaborations; more specifically, we should seek collaboration with the people around GNI if we want, as per Yde de Jong's suggestion, to keep track of all the misspellings we have resolved.
Anton asks to have one BiOnym recognizer available on both the FAO and the i-Marine websites.
Next Steps
GP has to:
- double check the WoRMS Taxamatch performance (GP just noticed there was a mistake in the description of the WoRMS Web Service which could have influenced the evaluation)
- produce a matrix reporting the number of complementary errors made by two pairs of Matchers inside BiOnym (to estimate to what extent Levenshtein can replace the entire WF)
- expand on the matrix with the performance of single-matcher WFs; the table currently reports the performance for 10 names returned; we need similar tables for 6, 2 and 1 names returned
- introduce Beam-Search options on the Statistical Manager
- develop a local/fast version of BiOnym to be used by the websites
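The two evaluation items above (performance at several output-list lengths, and complementary errors between matchers) could be computed along the following lines. This is only a sketch under stated assumptions: the data layout (a dictionary from each input name to its ranked list of returned names), the function names and the toy data are all hypothetical and not taken from the BiOnym code base.

```python
def accuracy_at_k(ranked_outputs: dict, gold: dict, k: int) -> float:
    """Fraction of input names whose correct match is among the first k returned."""
    hits = sum(
        1 for query, correct in gold.items()
        if correct in ranked_outputs.get(query, [])[:k]
    )
    return hits / len(gold)


def complementary_errors(results_a: dict, results_b: dict, gold: dict, k: int) -> dict:
    """Count, at cutoff k, the inputs missed by only one of two matchers or by both."""
    counts = {"only_a_fails": 0, "only_b_fails": 0, "both_fail": 0}
    for query, correct in gold.items():
        a_ok = correct in results_a.get(query, [])[:k]
        b_ok = correct in results_b.get(query, [])[:k]
        if not a_ok and not b_ok:
            counts["both_fail"] += 1
        elif not a_ok:
            counts["only_a_fails"] += 1
        elif not b_ok:
            counts["only_b_fails"] += 1
    return counts


if __name__ == "__main__":
    # Toy data invented for the example: input name -> expected match / ranked output.
    gold = {"q1": "A", "q2": "B", "q3": "C"}
    levenshtein_only = {"q1": ["A", "X"], "q2": ["X", "B"], "q3": ["X", "Y"]}
    full_workflow = {"q1": ["A"], "q2": ["B"], "q3": ["Y", "C"]}

    for k in (10, 6, 2, 1):  # the list lengths mentioned in the notes
        print(k,
              accuracy_at_k(levenshtein_only, gold, k),
              accuracy_at_k(full_workflow, gold, k),
              complementary_errors(levenshtein_only, full_workflow, gold, k))
```

A small `only_b_fails` count together with a small `only_a_fails` count would suggest the two configurations make largely the same mistakes; a large `only_a_fails` count for the Levenshtein-only configuration would suggest the other matchers still add value.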
Fabio has to:
- check with FIN whether their implementation of GSAy with the YASMEEN Framework can replace the currently running one
FIN has to:
- give updates about the status of the BiOnym web interface
- give feedback on the effectiveness of their GSAy implementation with the YASMEEN framework
Next meeting
Thursday 23 January 2014, 11 am.