28.11.2013 BiolDiv

Meeting 28 November 2013, 11:00 am

Google Hangout

Present: Nicolas, Casey, GP, Fabio, Lino, Edward

Notes

TDWG meeting

Nicolas briefed us on the TDWG meeting. He observed that, in contrast to previous meetings, where most presentations described plans rather than finished work, many more functional, ready-to-use systems and applications were presented this time. With BiOnym we still seem to be in the planning phase. However, iMarine is in a good position to make a significant contribution to the biodiversity informatics community thanks to the infrastructure that has been developed.

Other talks relevant to name reconciliation were given by people from GBIF, HCMR and the Berlin Botanical Garden. In a different session, GP presented work he did on Bayesian statistics in collaboration with Rainer Froese.

HCMR, Forth and iMarine seem to share an interest in Parallel R, and in deploying it for LifeWatch. To be followed up, involving Christos Arvanitidis.


Future of BiOnym

From the report of the TDWG meeting, it was again very clear that we need to 'package' BiOnym and find external users. Next steps should be to develop a user-friendly interface, develop a training programme and documentation, and to publicise.

The potential of the EU as a source of funding for BiOnym (and other iMarine activities) was discussed. The EU plans to have fewer but bigger projects, with the emphasis on services. A business plan will be required at submission, not as a deliverable at the end. The Horizon 2020 meeting in Rome was organised in part to start forging partnerships. More information on the meeting is on http://conference.lifewatch.unisalento.it/index.php/EBIC/BIH2013; more on potential follow-up and possible partnerships on http://h2020.myspecies.info/ (go to 'potential partners' and 'potential projects' on the main menu).

Development of BiOnym

Fabio reduced the size of the YASMEEN jar from 12 MB to 3 MB. He also had a close look at a Java implementation of Tony Rees' Taxamatch, finding inspiration to make YASMEEN more efficient. See https://code.google.com/p/ala-nsl/source/browse/taxamatch/trunk/src/au/org/biodiversity/services/taxamatch/impl/TaxamatchServiceImpl.java?r=4#75 for the Java code.

Nicolas will contact Aaike De Wever to check whether the taxonomy of FADA can be made available as well. Ideally the format would be DwCA.

We want to compare performance of BiOnym against Tony Rees' Taxamatch and VLIZ Taxon Matcher; we might also compare with Dima's tools.

We need a user-friendly interface before we can ask third parties to try our taxon name matching tools. Defining this interface will be one of the priorities for the near future. There should be a dual interface, separating functionality for novice users from that meant for more sophisticated users.

The interface can take the form of a separate VRE (Virtual Research Environment). There were already plans to create a 'Biodiversity Lab'; BiOnym will be added.

The interface developed by Casey at FIN is currently available only locally. Casey will work with iMarine staff (Angela, Maximiliano) to bring her work to the infrastructure. As there are some differences in the algorithms, this integration could bring some complications. There are pages on the Wiki that explain how to build a user interface, and how to use the interface of the statistical manager.

The statistical manager now has tools to upload lists of names to be tested. The tools to upload Taxonomic Authority Files in DwCA format are still to be built. Customisation of the character substitutions to be used in the comparisons, and of the rules for pre-processing and data cleaning, still has to be implemented. Fabio documented the XML format that is used by YASMEEN on http://wiki.i-marine.eu/index.php/YASMEEN_input_data_parser#Pre-parsing_rules; an interface will be created to edit these transformations and create the corresponding XML file on the fly.
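
To illustrate what customisable character substitutions and pre-processing rules could look like in practice, here is a minimal Python sketch. It is not the YASMEEN implementation and does not use its XML format; the rule list, the function names preprocess and normalise, and the specific substitutions are all hypothetical, chosen only to show how ordered rules can be applied to a name before comparison.

```python
import re

# Hypothetical rule set: (regex pattern, replacement) pairs applied in order.
# Illustrative only; these are NOT the rules shipped with YASMEEN.
SUBSTITUTION_RULES = [
    (r"ae", "e"),
    (r"oe", "e"),
    (r"ph", "f"),
    (r"y", "i"),
    (r"(.)\1", r"\1"),  # collapse doubled characters
]


def preprocess(name):
    """Lower-case a name, replace non-letters by spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z ]", " ", name.lower())
    return " ".join(cleaned.split())


def normalise(name, rules=SUBSTITUTION_RULES):
    """Apply the pre-processing step, then each substitution rule in order."""
    result = preprocess(name)
    for pattern, replacement in rules:
        result = re.sub(pattern, replacement, result)
    return result


if __name__ == "__main__":
    # Two spellings that differ only by the substitutions above
    # end up with the same normalised form.
    print(normalise("Haemulon") == normalise("Hemulon"))
```

Keeping the rules as data rather than code is what would allow an editing interface to change them without touching the matching logic; in this sketch the list could just as well be loaded from a file.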

Next meeting

Monday 9 December 2013, 11 am.