YASMEEN input-output filter

From D4Science Wiki
Revision as of 18:48, 31 October 2013 by Fabio.fiorellato (Talk | contribs) (-resultFile)


"Yet Another Species Matching Execution ENvironment" - input-output filter CLI tool

Purposes

This is an optional YASMEEN CLI tool that extracts non-matching parsed input data, computed as the difference between an initial parsed input dataset and the results of a matching process run on those same inputs.

It is particularly useful in an iterative matching workflow, when non-matching input data need to be re-processed either by different matchers (assuming these can ingest input data in the YASMEEN parsed input data format) or by another run of the YASMEEN matching engine with a different configuration.
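
The filtering idea described above can be illustrated with plain shell tools. This is only a sketch of the logic, not of the tool's implementation or file formats: it assumes, hypothetically, that each input can be reduced to a one-line ID and that the matching results can be reduced to the list of IDs that did match. The file names and IDs are made up for the example.

```shell
#!/usr/bin/env bash
# Illustration only: one hypothetical input ID per line.
# These are NOT the YASMEEN parsed input or result file formats.
workdir="$(mktemp -d)"
printf '%s\n' in-1 in-2 in-3 in-4 > "$workdir/all-inputs.txt"
printf '%s\n' in-1 in-3 > "$workdir/matched-inputs.txt"

# Keep the inputs that do not appear, as whole lines, among the matched ones.
# -F: fixed strings, -x: whole-line match, -v: invert, -f: read patterns from file.
non_matching="$(grep -vxFf "$workdir/matched-inputs.txt" "$workdir/all-inputs.txt")"
echo "$non_matching"

rm -rf "$workdir"
```

The remaining lines (here `in-2` and `in-4`) are the inputs that a later matching iteration would have to process again.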

Command line

java -jar YASMEEN-inout-filter-<version>.jar <options>

This CLI tool can be launched with the '-h' option to get a report of the available options:

java -jar YASMEEN-inout-filter-<version>.jar -h

This prints:

 -h                        Print this message
 -outFile <arg>            Specify the path to the file that will contain the filtered subset of the provided parsed input data
                           according to filtering configuration
 -outFileFormat <arg>      Specify the format of the file that will contain the filtered subset of the provided parsed input
                           data according to filtering configuration. Possible values are: {rawInput, parsedInput}
 -parsedInFile <arg>       Specify a path to a file containing YASMEEN input data in parsed input format
 -resultFile <arg>         Specify a path to the file containing YASMEEN matching results for the provided parsed input file
 -resultFileFormat <arg>   Specify the format of the file containing YASMEEN matching results for the provided parsed input
                           file. Possible values are: {rawInput, parsedInput}
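
A typical invocation might look like the sketch below. It only uses options listed in the help message above. The file names are hypothetical, the `<version>` placeholder is left unresolved, and `-resultFileFormat` is omitted because this page does not describe it; whether the tool accepts the command without it is an assumption. The script prints the command rather than running it, so it can be tried without the JAR at hand.

```shell
#!/usr/bin/env bash
# Placeholder kept as in the text above; replace it with the real JAR name.
JAR="YASMEEN-inout-filter-<version>.jar"

# All file names below are hypothetical examples.
cmd=(
  java -jar "$JAR"
  -parsedInFile  input.parsed          # parsed input that was already matched
  -resultFile    results.out           # matching results for that input
  -outFile       non-matching.parsed   # where the filtered subset is written
  -outFileFormat parsedInput           # one of the values listed in the help
)

# Dry run: show the command that would be executed.
echo "${cmd[*]}"

# Uncomment to actually launch the tool:
# "${cmd[@]}"
```

Writing the output as `parsedInput` fits the iterative workflow described under Purposes, since the filtered file can then be fed to another matching run.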

General command line options

-h

This option takes no arguments. When set, the tool prints the help message and exits.

Input file command line options

-parsedInFile

Mandatory.

Specifies the path to an input dataset file (in the YASMEEN parsed input data format) that has already been processed and has produced a matching result output file in one of the formats available out-of-the-box in the YASMEEN matching engine.

Result file command line options

-resultFile

Mandatory.

Specifies the path to a matching results output file in any of the formats available out-of-the-box in the YASMEEN matching engine.

-resultFileFormat

Output file command line options

-outFile

-outFileFormat

Appendix

Download

You can download the YASMEEN input-output filter from one of these URLs:

Changelog

  • v1.1.1: first working implementation