Ecosystem Approach Community of Practice: VME iMarine

From D4Science Wiki
Revision as of 12:49, 28 August 2013 by Erik.vaningen (Talk | contribs) (Introduction)


Introduction

The VME project has started to investigate how the iMarine infrastructure could work in combination with the VME-DB on the FAO side. This page tries to make all assumptions explicit and will therefore probably be heavily used for discussion.

The Reports function of iMarine was initially conceived in the context of the FIMES XML Schema for generating factsheets. This page instead explains how it could work when targeting an object graph like the VME-DB.


You can find the object graph for the VME-DB here, in order to have an idea of the VME domain model (ask Erik or Anton for the name/password).


This page talks about key-values as the solution for transferring data from iMarine into the VME-DB. The naked truth is that it is a composite-key-value solution. The composite key is composed of:

  • the type of the object (KeyType)
  • the id of the object (id)
  • the name of the attribute that is to be manipulated (attribute)

A composite-key-value solution makes it easier on both sides to manage the data and works particularly well to manipulate an object graph like the VME-DB.
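
As an illustration only, the composite key could be sketched like this in Python. The class name CompositeKey, the flat dictionary standing in for the object graph, and the attribute "name" on a Vme are assumptions made for this example and are not taken from the VME-DB.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class CompositeKey:
    """Hypothetical composite key: object type, object id, attribute name."""
    key_type: str   # type of the object, e.g. "Vme"
    id: str         # id of the object within that type
    attribute: str  # name of the attribute to be manipulated


def attributes_of(store: Dict[CompositeKey, str], key_type: str, object_id: str) -> Dict[str, str]:
    """Collect all attribute values that belong to one object."""
    return {k.attribute: v for k, v in store.items()
            if k.key_type == key_type and k.id == object_id}


# A flat mapping of composite key -> value, standing in for the object graph.
store: Dict[CompositeKey, str] = {}
store[CompositeKey("Vme", "10", "name")] = "Corner Seamounts"
# Writing to the same composite key overwrites the attribute value.
store[CompositeKey("Vme", "10", "name")] = "Corner Seamounts (updated)"

vme_10 = attributes_of(store, "Vme", "10")
```

Because the key is frozen and hashable, each (KeyType, id, attribute) triple addresses exactly one value, which is what makes the data manageable on both sides.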

To distinguish must-have from nice-to-have, the desired functions are listed in order of priority:

  • stateless iMarine Templates/KeyvalueEditor, Target webservice and publishing without approval chain
  • preview
  • integration with master data services
  • approval chain
  • restore




The big picture

IMarineIngestion.png


Objectives

In order of priority:

  • design a key/value transfer protocol that updates a remote object graph
  • the key/value transfer protocol is unaware of the domain (domain knowledge is added by the user through the template editor)
  • event-driven model
  • thread-safe and thread-aware interface



UML iMarine

TemplateEditor

The VME-DB is an object graph. Every object in the VME-DB is represented in iMarine through a KeyType. Since an object has attributes, a KeyType has attributes as well.


Examples of KeyTypes are Vme, Rfmo, GeneralMeasures, SpecificMeasures, etc.



TemplateEditor.jpg

  • Template: holds a collection of KeyTypes. A Template defines the form used to manipulate the objects in the VME-DB.
  • KeyType: the name of the type of an object in the VME-DB
  • Attribute: the name of an attribute of a certain object
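
A minimal sketch of these three concepts, assuming simple Python dataclasses. The class layout, the template name "vme-factsheet" and the helper form_fields are hypothetical; the KeyType and attribute names are borrowed from the Multilingual example on this page.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class KeyType:
    """Name of an object type in the target, plus its attribute names."""
    name: str
    attributes: List[str] = field(default_factory=list)


@dataclass
class Template:
    """A collection of KeyTypes that together define an editing form."""
    name: str
    key_types: List[KeyType] = field(default_factory=list)

    def form_fields(self) -> List[Tuple[str, str]]:
        """List the (KeyType, attribute) pairs the form would expose."""
        return [(kt.name, attr) for kt in self.key_types for attr in kt.attributes]


template = Template("vme-factsheet", [
    KeyType("Vme", ["Impacts"]),
    KeyType("MultiLingualString", ["content", "lang"]),
])
fields = template.form_fields()
```

Note that the Template carries no values and no ids. It only describes which attributes of which object types can be edited.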


  • TODO: Engineer and explain better the concept of Parent and Child. The more I think about it, the more I believe that such a concept is not needed. See also the example in Multilingual



Report, Delta and KeyValue

This diagram describes conceptually the model of a Report and the format of a Delta message.


A Report is a collection of KeyValue(s). The Delta is a collection of KeyValue(s), representing a change of content.


KeyValue has the following attributes:

  • KeyType: see above TemplateEditor
  • id: the id of the object in the VME-DB, within a certain type (KeyType)
  • attribute: see above TemplateEditor
  • value: the value for that attribute
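
One possible reading of Report and Delta, sketched in Python: a Report is a list of KeyValue records, and a Delta holds only the KeyValues that are new or changed between two states of a Report. The function compute_delta is hypothetical, and it does not cover deleted objects.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class KeyValue:
    key_type: str   # see TemplateEditor
    id: str         # id of the object of that type
    attribute: str  # see TemplateEditor
    value: str      # value for that attribute


def compute_delta(before: List[KeyValue], after: List[KeyValue]) -> List[KeyValue]:
    """Return the KeyValues of `after` that are new or changed relative to `before`."""
    old = {(kv.key_type, kv.id, kv.attribute): kv.value for kv in before}
    return [kv for kv in after
            if old.get((kv.key_type, kv.id, kv.attribute)) != kv.value]


report_before = [
    KeyValue("Vme", "10", "Impacts", "20"),
    KeyValue("MultiLingualString", "20", "lang", "en"),
]
report_after = [
    KeyValue("Vme", "10", "Impacts", "20"),              # unchanged
    KeyValue("MultiLingualString", "20", "lang", "fr"),  # changed
    KeyValue("MultiLingualString", "20", "content", "texte"),  # new
]
delta = compute_delta(report_before, report_after)
```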


Content.jpg

Target

From an iMarine point of view, the target (VME-DB) is a collection of KeyValue(s).

Target.jpg

Webservice

Webservice.jpg


UML Target(VME)

In order to implement previews or to go back to older versions, a dedicated mechanism is needed on the target site.

Requirements are:

  • target system needs to be able to show a preview of draft content
  • target system does not need to be duplicated on DB level, in order to distinguish between different versions of the content
  • target system needs to be able to replace a published version with an older version
  • target system is not aware of the workflow context of iMarine.

This diagram shows how this would look conceptually. An object is the result of an older version of that object plus an applied delta. A business object would be a reference to a VME.
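
The rule "object = older version + applied delta" can be sketched as follows. This is only an illustration under assumptions: the class VersionedObject, its method names and the in-memory list of versions are invented for the example and say nothing about how the VME-DB would actually store versions.

```python
from typing import Dict, List, Tuple

Attributes = Dict[str, str]
ObjectDelta = List[Tuple[str, str]]  # (attribute, value) pairs for one object


def apply_delta(version: Attributes, delta: ObjectDelta) -> Attributes:
    """Build a new version from an older version plus a delta; the old one is untouched."""
    new_version = dict(version)
    new_version.update(delta)
    return new_version


class VersionedObject:
    """Hypothetical business object that keeps every version it has had."""

    def __init__(self, initial: Attributes) -> None:
        self.versions: List[Attributes] = [dict(initial)]
        self.published = 0  # index of the version shown on the public site

    def ingest(self, delta: ObjectDelta) -> int:
        """Append a draft version and return its index."""
        self.versions.append(apply_delta(self.versions[-1], delta))
        return len(self.versions) - 1

    def publish(self, index: int) -> None:
        self.published = index

    def published_content(self) -> Attributes:
        return self.versions[self.published]


vme = VersionedObject({"Impacts": "20"})
draft = vme.ingest([("Impacts", "21")])
preview = vme.versions[draft]            # draft content, available for preview
still_public = vme.published_content()   # the public site is unchanged so far
vme.publish(draft)                       # publish the draft
vme.publish(0)                           # replace it again with the older version
```

Keeping versions per object is one way to satisfy the requirements above without duplicating the target system at DB level.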

Workflow4object.jpg


Note1: The above proposal implies that the target has 2 websites and 1 database with objects. One website for published approved content, one website to show the draft content. It implies also that the website needs to be extended with the knowledge that it can only publish approved content. The other option could be to have 2 websites and 2 databases. Doing so, the website and its database do not need to know anything about iMarine. This other option is probably cleaner and needs to be designed and explained more.
Note2: The idea is to have 2 instances in the target, Publish and Staging. Publish holds only published business objects. Staging holds the preview and the old business objects. The preview website would show the published business objects, unless there is a preview business object available.
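
The lookup rule of Note2 could look roughly like this. The dictionaries standing in for the Publish and Staging instances and the "Vme/10" style of object reference are assumptions made for the example.

```python
from typing import Dict, Optional

BusinessObject = Dict[str, str]


def preview_lookup(object_id: str,
                   publish: Dict[str, BusinessObject],
                   staging_previews: Dict[str, BusinessObject]) -> Optional[BusinessObject]:
    """Return the preview object if Staging has one, else the published object."""
    if object_id in staging_previews:
        return staging_previews[object_id]
    return publish.get(object_id)


# Hypothetical content of the two instances.
publish = {"Vme/10": {"Impacts": "20"}, "Vme/11": {"Impacts": "30"}}
staging_previews = {"Vme/10": {"Impacts": "21"}}

with_preview = preview_lookup("Vme/10", publish, staging_previews)
without_preview = preview_lookup("Vme/11", publish, staging_previews)
```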

Id management

When creating a new object on the iMarine side, a reference or id for this object is needed. The responsibility for id generation lies primarily on the target side (VME-DB). What can be done is to generate an id on the iMarine side, according to this convention:

ID = LOCAL_{localID}

Example of such an id would be: VmeId = LOCAL_452.

A delta can define a number of new objects, all with local ids. Within the delta, the local ids have a meaning. When the ingestion webservice creates the new object in the target, the local id is replaced with the new id. The local ids are discarded after the delta has been ingested.
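
A minimal sketch of how an ingestion service might swap local ids for target ids. It rests on two assumptions that this page does not settle: local ids are unique within one delta regardless of KeyType, and a value that exactly equals a local id is a reference to that new object. The function resolve_local_ids and the counter used as id generator are hypothetical.

```python
import itertools
from typing import Callable, Dict, List, Tuple

LOCAL_PREFIX = "LOCAL_"
Row = Tuple[str, str, str, str]  # (keyType, id, attribute, value)


def resolve_local_ids(delta: List[Row], new_id: Callable[[], str]) -> List[Row]:
    """Replace every LOCAL_ id in a delta with an id generated by the target."""
    mapping: Dict[str, str] = {}
    # First pass: give each distinct local id one target id.
    for _, obj_id, _, _ in delta:
        if obj_id.startswith(LOCAL_PREFIX) and obj_id not in mapping:
            mapping[obj_id] = new_id()
    # Second pass: rewrite ids, and values that refer to a local id.
    return [(key_type, mapping.get(obj_id, obj_id), attribute, mapping.get(value, value))
            for key_type, obj_id, attribute, value in delta]


counter = itertools.count(1000)  # stand-in for the id generation of the target
delta = [
    ("Vme", "LOCAL_1", "Impacts", "LOCAL_2"),
    ("MultiLingualString", "LOCAL_2", "content", "some text"),
    ("MultiLingualString", "LOCAL_2", "lang", "en"),
]
resolved = resolve_local_ids(delta, lambda: str(next(counter)))
```

The mapping lives only for the duration of one call, which matches the idea that local ids have no meaning once the delta has been ingested.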

KeyValue, Delta and Status

A status probably needs to be sent at key/value level. Possible values are:

  • new
  • update
  • delete

It could also be done implicitly through the id:

  • new: id=null
  • update: id=LOCAL_{localID}
  • delete: to be invented

but having explicit status values is probably cleaner.
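
With explicit status values, the ingestion logic could be as simple as the sketch below. The semantics chosen here (new fails if the key already exists, update fails if it does not, delete removes a single attribute) are assumptions for illustration and are still open for discussion.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional, Tuple


class Status(Enum):
    NEW = "new"
    UPDATE = "update"
    DELETE = "delete"


@dataclass(frozen=True)
class KeyValue:
    key_type: str
    id: str
    attribute: str
    value: Optional[str]  # may be None for a delete
    status: Status


Target = Dict[Tuple[str, str, str], Optional[str]]


def ingest(target: Target, kv: KeyValue) -> None:
    """Apply one key/value to the target according to its explicit status."""
    key = (kv.key_type, kv.id, kv.attribute)
    if kv.status is Status.DELETE:
        target.pop(key, None)
    elif kv.status is Status.NEW:
        if key in target:
            raise ValueError(f"{key} already exists")
        target[key] = kv.value
    else:  # Status.UPDATE
        if key not in target:
            raise KeyError(f"{key} does not exist")
        target[key] = kv.value


target: Target = {}
ingest(target, KeyValue("Vme", "10", "Impacts", "20", Status.NEW))
ingest(target, KeyValue("Vme", "10", "Impacts", "21", Status.UPDATE))
after_update = dict(target)
ingest(target, KeyValue("Vme", "10", "Impacts", None, Status.DELETE))
```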

As described above, a Delta contains a collection of key/values. A delta could distinguish in itself different collections for new, update and delete, in order to make the delta more compact. Since compression can be done on a webservice level, it is probably not advisable to implement compression on delta level, because it introduces more complexity to the protocol as a whole.
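
To illustrate why compression can be left to the transport, the sketch below serialises a delta as JSON and compresses it with gzip from the Python standard library. The JSON field names are an assumption; this page does not define a wire format.

```python
import gzip
import json

# A delta as a plain list of key/values with an explicit status (hypothetical wire format).
delta = [
    {"keyType": "Vme", "id": "10", "attribute": "Impacts", "value": "20", "status": "update"},
    {"keyType": "MultiLingualString", "id": "20", "attribute": "lang", "value": "en", "status": "update"},
] * 50  # repeat the rows to mimic a larger delta

payload = json.dumps(delta).encode("utf-8")
compressed = gzip.compress(payload)  # what the webservice layer would put on the wire
restored = json.loads(gzip.decompress(compressed).decode("utf-8"))
```

The repeated keyType, attribute and status strings compress well, so the delta format itself can stay flat and simple.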

Multilingual

The assumption is that multilingual aspects can also be covered by key/values alone.


This implies that the Reports do not have to implement multilingual aspects. If the target (VME-DB) is multilingual, then Reports can handle that through key/values.


Please verify this example:

KeyType=Vme

id=10 attribute=Impacts value=20

KeyType=MultiLingualString

id=20 attribute=content value="Fishing activity has existed on the Corner Seamounts for many years, and catches over 10 000 tons were removed in the 1970s by USSR vessels, on seamounts both inside and outside the NAFO Convention Area, using a combination of bottom and mid-water trawling."

KeyType=MultiLingualString

id=20 attribute=lang value=en

Note: the above example also describes well in general how the key-value solution is envisaged. It also explains how a parent-child relation can be managed. The convention for parent-child would be that a delta containing an update for a child should also contain the parent.
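
The example can be replayed in a few lines of Python to check that the object graph can indeed be rebuilt from key/values alone. The function build_objects is hypothetical, the content string is shortened, and the knowledge that Impacts points to a MultiLingualString is domain knowledge that the code below simply assumes.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Tuple[str, str, str, str]  # (keyType, id, attribute, value)

# Shortened version of the content value from the example above.
IMPACT_TEXT = "Fishing activity has existed on the Corner Seamounts for many years"

key_values: List[Row] = [
    ("Vme", "10", "Impacts", "20"),
    ("MultiLingualString", "20", "content", IMPACT_TEXT),
    ("MultiLingualString", "20", "lang", "en"),
]


def build_objects(rows: List[Row]) -> Dict[Tuple[str, str], Dict[str, str]]:
    """Group key/values into objects addressed by (keyType, id)."""
    objects: Dict[Tuple[str, str], Dict[str, str]] = defaultdict(dict)
    for key_type, obj_id, attribute, value in rows:
        objects[(key_type, obj_id)][attribute] = value
    return dict(objects)


objects = build_objects(key_values)
# Follow the parent-child link: the value of Vme.Impacts is the id of the child.
impacts_id = objects[("Vme", "10")]["Impacts"]
impacts = objects[("MultiLingualString", impacts_id)]
```

The parent (Vme 10) and the child (MultiLingualString 20) travel in the same list, which is the convention proposed in the note above.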

Open Issues

  • How to handle reference data, lookup tables (integration with the Cotrix Master Data Manager through a Master Data Client)
  • How to handle validation rules
  • The name of this iMarine function, Reports, can be confusing. A report generally reports about 'something', about a primary process. This function does not report but generates 'something', e.g. content. This function is all about editing content, to be sent to a subsequent process. Alternative names could be UniversalEditor, InfiniteEditor, ContentCreator, KeyValueEditor, etc.
  • What to do if a delta corrupts the VME-DB? A solution could be to 'version' the VME-DB, so a previous version can be used in order to restore the website.
  • How to implement the preview function of draft content? Also here versioning of the VME-DB could be a solution.


Stateful or Stateless?

The above picture assumes that the iMarine infrastructure will not hold state regarding the VME-DB. Content will flow from the VME-DB into iMarine, will be subject to manipulation, and a stream of deltas will flow from iMarine into the VME-DB through the webservice. In this scenario, iMarine is stateless with respect to the content: it will only hold state for the deltas and the approval chain.


The stateful scenario would imply that the lifecycle of the content starts only within iMarine. The VME-DB is just an endpoint where the data is sent to. The VME-DB would not hold primary responsibility for keeping the data together in a meaningful and consistent way.


The stateless scenario looks like the preferred scenario, but FAO needs to discuss with the iMarine experts which scenario fits best.



Similar initiatives

Maybe there is no need to develop this ourselves. Similar initiatives found out there are: