CodelistManager

From D4Science Wiki

Context

Statistical cluster

CodelistManagerDesign

CotrixBuild

Cotrix configuration and deployment scenarios

Vox

Domain Model

This is a work in progress and very much a draft.


Core

CoreDomainModel.jpg

Collection: An aggregation of code lists and hierarchies. In SDMX a hierarchy is called a HierarchicalCodelist.


TODO: Leave, Join and Merge need to be modeled better.

TODO: MasterConcepts can have cardinalities with themselves as well.

Note: A simplified version of this model is implemented in cotrix-tabular: https://github.com/cotrix/cotrixrep/tree/master/cotrix/cotrix-tabular/src/main/java/org/cotrix/domain The domain model in cotrix-tabular can be considered as the implementation model.

Documentation

Documentation.jpg

Workflow of an artefact

Workflow.jpg WorkflowClassDiagram.jpg

The possible values of WorkflowStatus are the possible artefact states.

Note that the workflow is aware of an artefact, not the other way around. This way, the workflow can be a pluggable module.
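The direction of this dependency can be illustrated with a minimal Python sketch. It is not the cotrix implementation: the class and method names are hypothetical, and the state names are borrowed from the FAO areas example further down this page, since the full set of states is not yet documented.

```python
from dataclasses import dataclass, field
from enum import Enum


class WorkflowStatus(Enum):
    # Assumed states, taken from the FAO areas example on this page.
    IMPORTED = "Imported"
    INTERPRETED = "Interpreted"
    IMMUTABLE = "Immutable"
    FINAL = "Final"
    APPROVED = "Approved"
    PUBLISHED = "Published"


@dataclass
class Artefact:
    # The artefact holds no reference to a workflow.
    name: str


@dataclass
class Workflow:
    # The workflow points at the artefact it tracks, never the reverse.
    artefact: Artefact
    status: WorkflowStatus = WorkflowStatus.IMPORTED
    history: list = field(default_factory=list)

    def move_to(self, new_status: WorkflowStatus) -> None:
        # Remember the previous status, then switch to the new one.
        self.history.append(self.status)
        self.status = new_status


asfis = Artefact("ASFIS List of Species")
workflow = Workflow(asfis)
workflow.move_to(WorkflowStatus.INTERPRETED)
workflow.move_to(WorkflowStatus.FINAL)
```

Because Artefact knows nothing about Workflow, the workflow module could be replaced or left out without touching the domain classes.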


Types (of an Artefact or Code) and their descriptions:

  • LocallyCreated: This is the default type. The artefact or code is created or imported, and its further lifecycle and management take place within the system.
  • ImportedImmutable: Imported from outside and cannot be changed. Will have a CodeLife. The lifecycle and management are outside of the system but will be followed and monitored in the system.
  • LifeLinked: Is only linked, not stored, and at most cached. Works in principle only while the outside link is available. Will not have a CodeLife; the lifecycle and management are outside of the system.
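As an illustration, the three types could be captured as an enumeration with a few traits each. This is a sketch only: the trait names are hypothetical, and the assumption that LocallyCreated has a CodeLife is derived from the rule that creating a Code also creates a CodeLife, not from the descriptions above.

```python
from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class TypeTraits:
    stored_in_system: bool    # content is kept in the system (not just linked)
    managed_in_system: bool   # lifecycle and management happen in the system
    has_code_life: bool       # a CodeLife is kept for the codes


class ArtefactType(Enum):
    # has_code_life for LOCALLY_CREATED is an assumption (see lead-in).
    LOCALLY_CREATED = TypeTraits(True, True, True)
    IMPORTED_IMMUTABLE = TypeTraits(True, False, True)
    LIFE_LINKED = TypeTraits(False, False, False)


def can_edit(artefact_type: ArtefactType) -> bool:
    # Only artefacts managed inside the system may be changed there.
    return artefact_type.value.managed_in_system
```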

TODO: document all states.

Statechart of a Code

StatechartDiagramCode.jpg

The possible values of CodeStatus are the possible code states.

Codestatus.jpg

  • A code can become final when:
    • it is published in a code list
    • it is made final
  • A code becomes non-final when it was final and has been changed
  • A code is non-final when it is created
  • A code can only change from final to non-final when it has not yet been published in a code list
  • Changing the validityPeriod, wellKnownText or value of a final code results in a copy of that code. The new code is non-final.
  • Creating a new Code also creates a new CodeLife
  • Making a copy of a Code results in adding a link from that new Code to its CodeLife
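The rules above can be sketched in Python as follows. This is one possible reading, not the implemented model: all function names are hypothetical, and it is assumed that a copy shares the CodeLife of the code it was copied from.

```python
from dataclasses import dataclass, field, replace
from typing import List, Optional


@dataclass(eq=False)
class CodeLife:
    codes: List["Code"] = field(default_factory=list)


@dataclass(eq=False)
class Code:
    value: str
    life: CodeLife
    validity_period: Optional[str] = None
    well_known_text: Optional[str] = None
    final: bool = False       # a code is non-final when it is created
    published: bool = False


# Changing one of these on a final code produces a copy.
VERSIONED_FIELDS = {"value", "validity_period", "well_known_text"}


def create_code(value: str) -> Code:
    # Creating a new Code also creates a new CodeLife.
    life = CodeLife()
    code = Code(value=value, life=life)
    life.codes.append(code)
    return code


def make_final(code: Code) -> None:
    code.final = True


def publish(code: Code) -> None:
    # Publishing in a code list makes the code final.
    code.final = True
    code.published = True


def make_non_final(code: Code) -> None:
    # Only allowed while the code has not been published.
    if code.published:
        raise ValueError("a published code cannot become non-final")
    code.final = False


def change(code: Code, **changes) -> Code:
    unknown = set(changes) - VERSIONED_FIELDS
    if unknown:
        raise ValueError(f"unsupported fields: {sorted(unknown)}")
    if not code.final:
        for name, new_value in changes.items():
            setattr(code, name, new_value)
        return code
    # Final code: leave it untouched and return a non-final copy
    # linked to the same CodeLife (assumption, see lead-in).
    copy = replace(code, final=False, published=False, **changes)
    code.life.codes.append(copy)
    return copy


original = create_code("TUN")
publish(original)
revised = change(original, value="TUX")
```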


PartialCodelist

PartialCodelist.jpg


Union

Union.jpg

A Union is similar to a Collection and a PartialCodelist, but differs from both.

A Collection will be published as 1 Artefact. Within the published artefact you may still be able to see the original artefacts from which it was constructed.

A Union, by contrast, will act like one Codelist. A Union can only be created from other code lists. In its published form it is not always possible to relate back to the original artefacts from which it was constructed.
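The contrast between the two can be sketched as follows. The classes are hypothetical, and the handling of a code value that occurs in more than one source list (the first occurrence wins) is an assumption; the model does not define it.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Codelist:
    name: str
    codes: Dict[str, str]  # code value -> description


@dataclass
class Collection:
    # The member code lists stay visible inside the collection.
    name: str
    members: List[Codelist]


def union(name: str, codelists: List[Codelist]) -> Codelist:
    # The result is a plain Codelist; the source lists are not kept.
    merged: Dict[str, str] = {}
    for codelist in codelists:
        for value, description in codelist.codes.items():
            merged.setdefault(value, description)  # first occurrence wins
    return Codelist(name, merged)


first = Codelist("First list", {"A": "Alpha", "B": "Beta"})
second = Codelist("Second list", {"B": "Beta again", "C": "Gamma"})

as_collection = Collection("Both lists", [first, second])
as_union = union("Both lists", [first, second])
```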

Use Cases

CoreUseCases.jpg


UseCase import csv

  • User selects CSV file to import from
  • System interprets the CSV file
  • User chooses to accept the interpretation or decides to manually intervene
  • User can manually intervene by assigning the columntype to each column of the original CSV (columntypes are code, description or annotation)
  • User can manually intervene by changing the cardinalities between the codecolumns (1-n, 1-1, n-1)
  • User gives version name
  • User makes the artefact(s) final and sends the artefact(s) for approval
  • Approver approves or denies the finalised artefact(s)
  • Approver sends the artefact(s) for publication
  • Publisher publishes the approved artefact(s)
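The manual intervention steps can be illustrated with a small Python sketch: it checks a user-supplied column type assignment and derives the cardinality between two code columns from the data. The function names, the sample rows and the extra "n-n" outcome are assumptions made for the illustration.

```python
import csv
import io
from collections import defaultdict

COLUMN_TYPES = {"code", "description", "annotation"}


def check_assignment(header, assignment):
    """Validate the column types and return the code columns in order."""
    for column in header:
        if assignment.get(column) not in COLUMN_TYPES:
            raise ValueError(f"column {column!r} needs one of {sorted(COLUMN_TYPES)}")
    return [column for column in header if assignment[column] == "code"]


def cardinality(rows, left, right):
    """Return '1-1', '1-n', 'n-1' or 'n-n' for two code columns."""
    rights_per_left = defaultdict(set)
    lefts_per_right = defaultdict(set)
    for row in rows:
        rights_per_left[row[left]].add(row[right])
        lefts_per_right[row[right]].add(row[left])
    left_fans_out = any(len(v) > 1 for v in rights_per_left.values())
    right_fans_out = any(len(v) > 1 for v in lefts_per_right.values())
    if left_fans_out and right_fans_out:
        return "n-n"
    if left_fans_out:
        return "1-n"
    if right_fans_out:
        return "n-1"
    return "1-1"


# Invented sample data, not taken from a real code list.
SAMPLE = (
    "family,alpha3,english_name\n"
    "FAM1,AAA,First species\n"
    "FAM1,BBB,Second species\n"
    "FAM2,CCC,Third species\n"
)

reader = csv.DictReader(io.StringIO(SAMPLE))
rows = list(reader)
code_columns = check_assignment(
    reader.fieldnames,
    {"family": "code", "alpha3": "code", "english_name": "description"},
)
family_to_alpha3 = cardinality(rows, "family", "alpha3")
```

With these rows, one family relates to several alpha3 codes while each alpha3 code belongs to one family, so the relation family - alpha3 comes out as 1-n.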

Note: In the future we can think of importing from SDMX, JDBC or any other source.

UseCase import csv Example

A good example for the CSV import is the ASFIS species list: http://www.fao.org/fishery/collection/asfis/en. The ASFIS species list is a zip file containing the file ASFIS_sp_Feb_2011.txt, which is a CSV file. The implicit hierarchies in this file are documented here: http://km.fao.org/FIGISwiki/index.php/ASFIS_SDMX_Codelist

After having imported the ASFIS file, the following code lists are interpreted:

  • ASFIS Species Alpha 3 Codelist
  • ASFIS Species Taxonomic Codelist
  • ASFIS Species Family Taxonomic code list
  • ASFIS Species Order Taxonomic code list

and hierarchies:

  • Relation ASFIS Species Taxonomic code - Alpha 3 code
  • Relation ASFIS Family - Species
  • Relation ASFIS Order - Family

and collections:

  • ASFIS List of Species

Interpreted means that the system is capable of understanding all the implicit relations in a tabular file such as ASFIS_sp_Feb_2011.txt and shows them in the UI as distinct code lists, hierarchies and collections. The ASFIS_sp_Feb_2011.txt file therefore results in 4 code lists, 3 hierarchies and 1 collection.

The collection ASFIS List of Species contains the same information as the original ASFIS_sp_Feb_2011.txt file.
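A minimal sketch of such an interpretation, in Python, is given below. The column names and rows are invented for the illustration and do not reproduce the real ASFIS file; only the resulting shape (4 code lists, 3 hierarchies, 1 collection) follows the description above.

```python
def interpret(rows, code_columns, relations):
    """Split tabular rows into code lists and parent-child hierarchies."""
    codelists = {
        column: sorted({row[column] for row in rows})
        for column in code_columns
    }
    hierarchies = {
        (parent, child): sorted({(row[parent], row[child]) for row in rows})
        for parent, child in relations
    }
    return codelists, hierarchies


# Invented rows with hypothetical column names.
rows = [
    {"order": "ORD1", "family": "FAM1", "taxocode": "T001", "alpha3": "AAA"},
    {"order": "ORD1", "family": "FAM1", "taxocode": "T002", "alpha3": "BBB"},
    {"order": "ORD1", "family": "FAM2", "taxocode": "T003", "alpha3": "CCC"},
]

codelists, hierarchies = interpret(
    rows,
    code_columns=["alpha3", "taxocode", "family", "order"],
    relations=[("taxocode", "alpha3"), ("family", "taxocode"), ("order", "family")],
)

# The collection keeps everything together, like the original file does.
collection = {"codelists": codelists, "hierarchies": hierarchies, "rows": rows}
```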

UseCase create new version of an Artefact

  • Start from scratch, import, or copy an existing Artefact in order to work on a new version of an Artefact.
  • Delete codes/hierarchies
  • Add codes/hierarchies
  • Edit codes/hierarchies
  • View deleted codes/hierarchies
  • View added codes/hierarchies
  • View edited codes/hierarchies
  • Make Artefact final

UseCase publish

A Collection, Codelist or Hierarchy can be published through SDMX or CSV:

  • Codelists are published as SDMX code lists according to the SDMX REST specifications.
  • Hierarchies are published as SDMX hierarchical code lists according to the SDMX REST specifications.
  • Collections are published as zip, txt, zip containing a txt file or zip containing a csv file. Such a collection would represent for instance the original ASFIS txt file.
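As an illustration of the last option, the following Python sketch writes a collection as a zip archive containing a single csv file, using only the standard library. The function name, file name and sample rows are hypothetical.

```python
import csv
import io
import zipfile


def publish_collection_as_zip(file_name, header, rows) -> bytes:
    """Return the bytes of a zip archive holding one csv file."""
    text = io.StringIO(newline="")
    writer = csv.writer(text)
    writer.writerow(header)
    writer.writerows(rows)

    archive = io.BytesIO()
    with zipfile.ZipFile(archive, "w") as zip_file:
        zip_file.writestr(file_name, text.getvalue())
    return archive.getvalue()


# Invented sample content.
data = publish_collection_as_zip(
    "species.csv",
    ["alpha3", "english_name"],
    [["AAA", "First species"], ["BBB", "Second species"]],
)
```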


UseCase Union

  • select 2 or more code lists
  • publish them as 1 code list or layer

UseCase DiffReport

  1. User selects an artefact (Codelist, HierarchicalCodelist or Collection).
  2. User selects a certain version of that artefact.
  3. User selects another version of that same artefact.
  4. User clicks on generate DiffReport and views the DiffReport.

The report shows:

  • Codes added.
  • Codes deleted.
  • Number of codes in the first and second selected version.
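The content of such a report can be computed with plain set operations. The sketch below assumes that each version is available as a set of code values; the DiffReport class and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class DiffReport:
    added: List[str]     # codes only in the second version
    deleted: List[str]   # codes only in the first version
    first_count: int
    second_count: int


def diff_report(first: Set[str], second: Set[str]) -> DiffReport:
    return DiffReport(
        added=sorted(second - first),
        deleted=sorted(first - second),
        first_count=len(first),
        second_count=len(second),
    )


report = diff_report({"AAA", "BBB", "CCC"}, {"BBB", "CCC", "DDD", "EEE"})
```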


UseCase publish layer as code list

ImportLayer.jpg

  • Import layer (shapefile)
  • .... process generic edit and approve functions
  • Publish as CSV and SDMX

Low priorities:

  • Publish as WFS (Web Feature Service) and WMS (Web Mapping Service) (format shape)
  • Publish in PostGis
  • Publish in Oracle Locator


  • The geometry is expressed as Well-known text (WKT): http://en.wikipedia.org/wiki/Well-known_text
  • Language dependent attributes from the shapefile are expressed as descriptions
  • Non language dependent attributes from the shapefile are expressed as annotations
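The three mapping rules above could look as follows for a single feature of a layer. This is a sketch under assumptions: reading the shapefile itself is left out, the attribute names and values are invented, and which attributes are language dependent is supplied by the caller rather than detected.

```python
def feature_to_code(attributes, wkt, code_attribute, language_attributes):
    """Map one layer feature to a code.

    language_attributes maps an attribute name to its language tag,
    for example {"NAME_EN": "en"}.
    """
    descriptions = {
        language: attributes[name]
        for name, language in language_attributes.items()
    }
    annotations = {
        name: value
        for name, value in attributes.items()
        if name != code_attribute and name not in language_attributes
    }
    return {
        "value": attributes[code_attribute],
        "wellKnownText": wkt,
        "descriptions": descriptions,
        "annotations": annotations,
    }


# Invented feature, not taken from a real layer.
code = feature_to_code(
    attributes={"AREA_CODE": "X1", "NAME_EN": "Example area",
                "NAME_FR": "Zone d'exemple", "SURFACE": 12.5},
    wkt="POLYGON ((0 0, 10 0, 10 10, 0 10, 0 0))",
    code_attribute="AREA_CODE",
    language_attributes={"NAME_EN": "en", "NAME_FR": "fr"},
)
```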


  • The geo-code list end-product should handle source layer provenance information, i.e. from a tabular data column curated with such a geo-code list, we must be able to know the GIS layer provenance information. The layer provenance information should be enough to point back to the layer. This information should include at least (1) the Geoserver base URL and (2) the layer name.

Such layer provenance information is required in the SPREAD scenario, for intersection Data Discovery.
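A minimal way to carry this information is sketched below. The class names, the example URL and the layer name are hypothetical; the only requirement taken from the text is that the Geoserver base URL and the layer name can be recovered from a data column.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class LayerProvenance:
    geoserver_base_url: str
    layer_name: str


@dataclass
class GeoCodelist:
    name: str
    provenance: LayerProvenance
    codes: List[str] = field(default_factory=list)


def provenance_of_column(column_codelists: Dict[str, GeoCodelist], column: str) -> LayerProvenance:
    # Go from a curated data column to the layer behind its geo-code list.
    return column_codelists[column].provenance


areas = GeoCodelist(
    name="Example areas",
    provenance=LayerProvenance("http://geoserver.example.org/geoserver", "example_areas"),
    codes=["X1", "X2"],
)
found = provenance_of_column({"area": areas}, "area")
```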

UseCase publish layer as code list Example

The practical case behind this usecase is the FAO major areas:

http://km.fao.org/FIGISwiki/index.php/FMA_SDMX_Codelist

After having imported the FAO areas layer, the following code lists are interpreted:

  • FAO Production Area code list (from major area to sub-unit)
  • FAO Major Water Area code list
  • FAO Major Water Area Subarea code list
  • FAO Major Water Area Division code list
  • FAO Major Water Area Subdivision code list
  • FAO Major Water Area Subunit code list

and hierarchies:

  • Relation Area code - Subarea code
  • Relation Subarea code - Division code
  • Relation Division code - Subdivision code
  • Relation Subdivision code - Subunit code

This practical case will follow this Workflow of an artefact: Imported, Interpreted, Immutable, Final, Approved and Published. The editing work will be done in ArcGis. The reference will be the shapefile edited in ArcGis.

Eventually this usecase can replace the shp2Oracle and re-index functionality, currently used by Fabio Carocci.

Core Rules

  • A code can become final when:
    • it is published in a code list
    • it is made final
  • A code becomes non-final when it was final and has been changed
  • A code is non-final when it is created
  • A code can only change from final to non-final when it has not yet been published in a code list
  • Changing the validityPeriod, wellKnownText or value of a final code results in a copy of that code. The new code is non-final.
  • Creating a new Code also creates a new CodeLife
  • Making a copy of a Code results in adding a link from that new Code to its CodeLife

Note: This table is not yet integrated in the model.

Nice to haves

  • Integration with SharePoint
  • Support for CMIS
  • Export to OWL
  • Export to SKOS
  • Export to RDF
  • Merging
  • Mapping

Links

[1] TaxoTools