Difference between revisions of "Ecosystem Approach Community of Practice: OBIS"

From D4Science Wiki

Revision as of 18:45, 23 February 2012

OBIS Current Situation

The Ocean Biogeographic Information System (OBIS, www.iobis.org) is a complete data-management environment built on a PostgreSQL database.

OBIS manages the upload of data from a variety of sources (list) and formats (csv, ...), and offers many database functions and procedures for sorting, filtering, merging, etc. It also includes advanced analytical support in the form of PostgreSQL queries and PL/pgSQL functions. The resulting OBIS RDBMS feeds data to the OBIS website and to upstream aggregators such as GBIF and EOL. It can also:

  • generate geospatially explicit (georeferenced) data;
  • display data in an integrated MapViewer;
  • export data over R-ODBC to a stand-alone R environment;
  • etc.

An extensive description of OBIS can be found on-line ....

The user-guide of OBIS is available here: guide

OBIS DB

The OBIS aims to provide the .


A progress report

OBIS VRE (Virtual Research Environment)

Many unfinished sections

  • OBIS Postgres database management
  • OBIS pgAdmin
  • OBIS R-ODBC
  • OBIS Users Management
  • OBIS Data visualization (Table / Chart / Map)
  • OBIS Data Export
  • …


IRD Data Access

IPT

IRD has discussed the services it expects to contribute to OBIS. Last year, IRD set up the GBIF IPT on one of its servers to provide access to part of its data:

  • metadata in the EML metadata format;
  • data in the Darwin Core data format.

http://vmirdgbif-proto.mpl.ird.fr:8080/ipt/

In iMarine, a workflow has to be defined that allows these data to be reused to populate the OBIS database. The data can be obtained from GBIF or by connecting the OBIS system to IRD's IPT instance. The obvious challenge is to avoid duplicating data that reach the OBIS system through different data flows.
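
As a rough illustration of the duplication problem, the sketch below merges records arriving through two flows (directly from the IRD IPT and via GBIF) and keeps one copy per record. It is only a sketch under assumptions: it assumes each record carries the Darwin Core term occurrenceID, or failing that the institutionCode / collectionCode / catalogNumber triplet, and that the first flow listed takes priority. The function names and the priority rule are hypothetical and are not part of OBIS.

```python
from typing import Dict, Iterable, List, Tuple

Record = Dict[str, str]  # one occurrence record, keyed by Darwin Core term name


def record_key(rec: Record) -> Tuple[str, ...]:
    """Build an identity key for a record (assumed convention, not an OBIS rule)."""
    occurrence_id = rec.get("occurrenceID", "").strip()
    if occurrence_id:
        return ("id", occurrence_id)
    # Fall back to the Darwin Core triplet when no occurrenceID is present.
    return (
        "triplet",
        rec.get("institutionCode", "").strip().lower(),
        rec.get("collectionCode", "").strip().lower(),
        rec.get("catalogNumber", "").strip().lower(),
    )


def merge_flows(*flows: Iterable[Record]) -> List[Record]:
    """Merge flows in priority order; the first flow to supply a key wins."""
    seen: Dict[Tuple[str, ...], Record] = {}
    for flow in flows:
        for rec in flow:
            seen.setdefault(record_key(rec), rec)
    return list(seen.values())


# Hypothetical sample data: the same record arrives through both flows.
direct_from_ipt = [{"occurrenceID": "ird:1", "source": "ipt"}]
via_gbif = [
    {"occurrenceID": "ird:1", "source": "gbif"},
    {"occurrenceID": "ird:2", "source": "gbif"},
]
merged = merge_flows(direct_from_ipt, via_gbif)
```

Listing the direct IPT flow first means that a record harvested directly is preferred over the copy that arrives via GBIF; reversing the argument order would reverse that preference.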

HIT

Currently, OBIS does not interact with IPT, even though IPT increasingly replaces DiGIR providers. iMarine WP6 should validate the harvesting toolkit developed by GBIF (HIT, if I'm not mistaken) and connect it to the PostgreSQL instance of OBIS on the D4Science infrastructure. That would kick-start collaboration with WP6 and provide a powerful tool to harvest not only IRD data but also data from other providers.

To avoid 'crop circles' (data flowing in circles between providers and aggregators), the metadata should carry a flag that states whether a dataset can flow upstream or not. Also, data that have already been submitted to GBIF should be recognizable, so that OBIS in iMarine does not pass those data to GBIF again and does not need to harvest them from GBIF if they have already been harvested directly from IRD.
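
The flag logic described above could look roughly like the following sketch. All field and function names are hypothetical; neither EML nor the OBIS Schema is known here to define such flags, so this only illustrates the two decisions involved (forward to GBIF or not, harvest from GBIF or not).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetFlags:
    """Hypothetical per-dataset metadata flags (names are assumptions)."""
    dataset_id: str
    allow_upstream: bool           # may the dataset flow on to upstream aggregators?
    submitted_to_gbif: bool        # has the provider already submitted it to GBIF?
    harvested_from_provider: bool  # has OBIS already harvested it directly (e.g. from IRD)?


def may_forward_to_gbif(flags: DatasetFlags) -> bool:
    """Forward only datasets that may flow upstream and are not in GBIF already."""
    return flags.allow_upstream and not flags.submitted_to_gbif


def needs_gbif_harvest(flags: DatasetFlags) -> bool:
    """Harvest from GBIF only what is there and was not harvested directly."""
    return flags.submitted_to_gbif and not flags.harvested_from_provider


# Hypothetical examples.
already_shared = DatasetFlags("ird-a", True, True, True)
new_dataset = DatasetFlags("ird-b", True, False, True)
restricted = DatasetFlags("ird-c", False, False, True)
```

With these flags, a dataset that IRD has already sent to GBIF and that OBIS has harvested directly is neither forwarded nor harvested a second time.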


IRD data accuracy

The IPT of IRD can expose much more data than currently shared with GBIF. IRD currently does not share these data because of location accuracy issues. OBIS / iMarine has addressed this issue in the past (e.g. for trawlnets) and IRD seeks a solution through the iMarine collaboration.

IRD collects data from landings of purse seiners, and these do not always reflect exactly where the fish were caught (various possible fishing operations over months and huge areas). Instead of a set of accurate points, IRD has many possible points, resulting in a polygon that can be small or big. IRD has developed a range of different algorithms to transform polygons into points that could be used in OBIS.
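
The IRD algorithms are not described here. Purely as an illustration of what a polygon-to-point transformation can look like, the sketch below computes the area-weighted centroid of a simple polygon with the shoelace formula. It treats longitude and latitude as planar coordinates, which is an assumption that only holds approximately for small polygons away from the poles and the antimeridian; the function name is hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (longitude, latitude) in decimal degrees


def polygon_centroid(vertices: List[Point]) -> Point:
    """Area-weighted centroid of a simple polygon (planar approximation)."""
    n = len(vertices)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")
    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if abs(twice_area) < 1e-12:
        # Degenerate polygon (all points on a line): fall back to the vertex mean.
        return (
            sum(x for x, _ in vertices) / n,
            sum(y for _, y in vertices) / n,
        )
    # Centroid = sum / (6 * area), and 6 * area equals 3 * twice_area.
    return (cx / (3.0 * twice_area), cy / (3.0 * twice_area))


# Hypothetical 2 x 2 degree fishing area.
square = [(50.0, -10.0), (52.0, -10.0), (52.0, -8.0), (50.0, -8.0)]
centroid = polygon_centroid(square)
```

A centroid discards the size of the polygon, so a record derived this way would normally also need some statement of positional uncertainty to remain usable alongside accurately located points.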

By the end of this year, IRD can share more data that are located with polygons instead of points.

For OBIS, harvesting directly is important because it provides the option to be more specific for marine data, for instance by ingesting transects as start and end points rather than as a single point. OBIS can provide details on the differences between Darwin Core and the OBIS Schema. Other extension requirements concern polygons and sets of points.


IRD biological parameters

For some of its datasets, IRD can also share biological parameters such as weight and length, as well as the sampling / survey method used to collect the data.

Lengths and weights would be an interesting extension, but we'd have to extend the OBIS Schema in order to deal with those. I'm a bit reluctant to do this directly. I think we should first investigate rewriting the OBIS Schema as a formal extension to the new extensible Darwin Core, and then do length and weight as an extension of the extension, or as a direct extension of GBIF's Darwin Core (assuming we can have several extensions in parallel).


IRD data coverage

Spatial

IRD data coverage OBIS data coverage

EurOBIS is not in charge of the Indian Ocean data.

Temporal

The IRD datasets start in (list sets). They are updated continuously / monthly / annually.


OBIS Website

OBIS VRE (Virtual Research Environment) Profile

Product

Describe the proposed solution in maximum 3 sentences:

With ICIS capture time-series can be

Priority to CoP (Community of Practice)

List proposed solution priority following the iMarine Board priority setting criteria:

  • Identified community: Users now:
  • Potential for co-funding:
  • Structural allocation of resources:
  • Referred in DoW:
  • Business Cases:
  • How does the proposed action generally support sustainability aspects:
  • How consistent it is with EC regulations/strategies (e.g. INSPIRE, ...):
  • Re-usability, benefits, compatibility:

Parentage

Relation to CoP Software

Relation to D4S technologies

Does the proposed solution solve other problems associated with EA-CoP Business Cases?

If the proposed solution can be used in another SW scenario (not users!) please describe.

Public

How big is the expected user community after delivery?

Productivity

Are the proposed measures effective?

Does it reduce a known workload?

Price

Is the proposed solution cheap?

Expected effort in PM:

Presentation

How is the component delivered to users? (Design / on-line help / training material / support). The OBIS VRE will be a VRE that builds a data-load and validation interface around an existing PostgreSQL database.

The VRE will not replace all existing data structures and services, and pgAdmin access is still expected, even if only with very restricted grants and rights on the DB instance.

The VRE will offer:

  • Data discovery through DiGIR, IPT, and HIT
  • Data loading to the DB
  • Data visualization in a map
  • Interactive data management with a tabular and map interface
  • ...


Privacy

Are they safe?

The PostgreSQL database stores data ...

Access is only possible through ....

Does the proposed solution need to manage confidential information at the data / dataset / organizational level? None of the data is confidential.

Describe security and privacy issues:


Policy

Are there any policies available that describe data access and sharing?

Yes.

Are these really needed?

Yes. OBIS combines data from many different resources, and it is important to keep track of ownership and to provide proper attribution in all products. Without a well-defined framework of data access and sharing rules, it is also very difficult to identify replicates, e.g. whether a dataset has already been uploaded to GBIF and whether it has already been reviewed.

Copyright / attribution / metadata / legal

The attribution records are most important.

Perils

Do they introduce moral hazard? (A hazard here is the risk that users will behave more recklessly if they are insulated from the effects of the software, or if they do not understand what it produces, where the data come from, what they represent, etc.)

The OBIS VRE carries no risks to users or developers.