Difference between revisions of "MaxEnt"

From D4Science Wiki

Revision as of 11:46, 16 October 2014

Description

This page explains how to use the MaxEnt algorithm on the Statistical Manager with the i-Marine portal. The algorithm is hosted by the D4Science e-Infrastructure that supports i-Marine. It is a Maximum-Entropy model for species habitat modeling, based on the implementation by Schapire et al. (v 3.3.3k) at Princeton University.

In this adaptation, the software accepts a table following the Species Product Discovery service model of i-Marine and a set of environmental layers in various formats (NetCDF, WFS (Web Feature Service), WCS (Web Coverage Service), ASC, GeoTiff), supplied via direct links or GeoExplorer UUIDs.

The user can also set the bounding box and the spatial resolution (in decimal degrees) of both the training and the projection. The application adapts the layers to that resolution if it is higher than the native one.
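
As a rough illustration of how a bounding box and a resolution in decimal degrees define a projection grid, the sketch below enumerates the cell centres. The function name `grid_centres` and the cell-centre convention are assumptions made for this example and do not describe the actual implementation on the Statistical Manager.

```python
import math


def grid_centres(min_lon, min_lat, max_lon, max_lat, x_res, y_res):
    """Return the (lon, lat) centres of the cells covering a bounding box."""
    if x_res <= 0 or y_res <= 0:
        raise ValueError("resolutions must be positive")
    # Number of cells along each axis; partial cells at the edge are included.
    n_x = math.ceil((max_lon - min_lon) / x_res)
    n_y = math.ceil((max_lat - min_lat) / y_res)
    return [
        (min_lon + (i + 0.5) * x_res, min_lat + (j + 0.5) * y_res)
        for j in range(n_y)
        for i in range(n_x)
    ]


# A global grid at 1 degree on both axes has 360 x 180 cells.
world = grid_centres(-180, -90, 180, 90, 1, 1)
print(len(world))  # prints 64800
```

Halving the resolution value on both axes quadruples the number of cells, which is worth keeping in mind when choosing XResolution and YResolution.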

The output is made up of the following components:

  • a thumbnail map of the projected model,
  • the ROC curve,
  • the Omission/Commission chart,
  • a table containing the raw assigned values,
  • a threshold to transform the table into a 0-1 probability distribution,
  • a report of the importance of the used layers in the model,
  • ASCII representations of the input layers to check their alignment.
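
The threshold can be applied to the raw values along the following lines. This is only a minimal sketch: the `(longitude, latitude, value)` row layout, the function name, and the rule that a value equal to the threshold counts as presence are assumptions, and the sample numbers are made up for illustration.

```python
def apply_threshold(rows, threshold):
    """Map each raw value to 1 (at or above the threshold) or 0 (below it)."""
    return [(lon, lat, 1 if value >= threshold else 0) for lon, lat, value in rows]


# Made-up raw values, not real MaxEnt output.
raw = [(12.5, 41.5, 0.82), (13.5, 41.5, 0.10), (14.5, 41.5, 0.35)]
print(apply_threshold(raw, 0.35))
# prints [(12.5, 41.5, 1), (13.5, 41.5, 0), (14.5, 41.5, 1)]
```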

Starting from this output, other processes of the Statistical Manager can later be applied to the raw values, for example to produce a GIS map (e.g. the "Statistical Manager Points to Map" process). Finally, the results can be shared with other participants in the e-Infrastructure using the i-Marine workspace.

Demo Video

Here is a demonstration of the usage of the MaxEnt algorithm on the Statistical Manager: http://goo.gl/TYYnTO

Inputs

An example of the MaxEnt process configuration

The inputs of the MaxEnt procedure are the following:


{| class="wikitable" style="margin: 1em auto 1em auto;"
|+ MaxEnt input parameters.
|-
! Parameter Name
! Description
! Example
|-
| SpeciesName
| The name of the species to model, to which the occurrence records refer. If the name is not important, this can be a generic string.
| Latimeria chalumnae
|-
| MaxIterations
| The number of learning iterations of the MaxEnt algorithm
| 1000
|-
| DefaultPrevalence
| A priori probability of presence at ordinary occurrence points. Ref. [http://onlinelibrary.wiley.com/doi/10.1111/j.1466-8238.2010.00581.x/abstract Santika 2010]
| 0.5
|-
| OccurrencesTable
| A geospatial table containing occurrence records, following the template of the Species Products Discovery datasets. See the section below for more details.
| LatimeriaPointsTable
|-
| LongitudeColumn
| The table column containing longitude values
| decimallongitude
|-
| LatitudeColumn
| The table column containing latitude values
| decimallatitude
|-
| Z
| Value of Z. The default is 0, which means that the environmental layers are processed at surface level or at the first available Z value in the layer
| 0
|-
| TimeIndex
| Time index. The default is the first time indexed in the input environmental datasets
| 0
|-
| XResolution
| Model projection resolution on the X axis in decimal degrees
| 1
|-
| YResolution
| Model projection resolution on the Y axis in decimal degrees
| 1
|-
| Layers
| The list of environmental layers to use for enriching the points. Each entry is a layer title, a UUID, or an HTTP link. See the section below for further details.
| https://dl.dropboxusercontent.com/u/12809149/wind1.tif
|}
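
Before filling in the form, it can help to collect the parameters in one place and run a few sanity checks. The sketch below is a hypothetical helper, not a D4Science API: the class name, the field names, and the validation rules are assumptions. The default values are simply taken from the Example column above and are not documented defaults, apart from Z being 0.

```python
from dataclasses import dataclass


@dataclass
class MaxEntInputs:
    """Container mirroring the parameter table (hypothetical helper)."""
    occurrences_table: str
    layers: list
    species_name: str = "generic"
    max_iterations: int = 1000
    default_prevalence: float = 0.5
    longitude_column: str = "decimallongitude"
    latitude_column: str = "decimallatitude"
    z: float = 0
    time_index: int = 0
    x_resolution: float = 1
    y_resolution: float = 1

    def validate(self):
        """Return a list of problems; an empty list means the values look plausible."""
        problems = []
        if self.max_iterations < 1:
            problems.append("MaxIterations must be at least 1")
        # Assumed rule: a prior probability strictly between 0 and 1.
        if not 0 < self.default_prevalence < 1:
            problems.append("DefaultPrevalence must lie strictly between 0 and 1")
        if self.x_resolution <= 0 or self.y_resolution <= 0:
            problems.append("XResolution and YResolution must be positive")
        if not self.layers:
            problems.append("Layers must contain at least one entry")
        return problems


inputs = MaxEntInputs(
    occurrences_table="LatimeriaPointsTable",
    layers=["https://dl.dropboxusercontent.com/u/12809149/wind1.tif"],
    species_name="Latimeria chalumnae",
)
print(inputs.validate())  # prints []
```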

SPD Input Format

The algorithm needs a table to be uploaded to the Statistical Manager. To use the upload facilities, refer to the Statistical Manager Tutorial page. The uploaded table should follow the Species Product Discovery (SPD) template and can be generated by the SPD service:


{| class="wikitable" style="margin: 1em auto 1em auto;"
|+ SPD Input Template.
|-
! Field name
! Format
|-
| institutioncode
| string
|-
| collectioncode
| string
|-
| catalognumber
| string
|-
| dataset
| string
|-
| dataprovider
| string
|-
| datasource
| string
|-
| scientificnameauthorship
| string
|-
| identifiedby
| string
|-
| credits
| string
|-
| recordedby
| string
|-
| eventdate
| timestamp without time zone
|-
| modified
| timestamp without time zone
|-
| scientificname
| string
|-
| kingdom
| string
|-
| family
| string
|-
| locality
| string
|-
| country
| string
|-
| citation
| string
|-
| decimallatitude
| double precision
|-
| decimallongitude
| double precision
|-
| coordinateuncertaintyinmeters
| string
|-
| maxdepth
| double precision
|-
| mindepth
| double precision
|-
| basisofrecord
| string
|}


All fields may be left empty except decimallatitude and decimallongitude. This makes it possible to apply MaxEnt to domains other than species distribution modelling. Note that the template closely follows the Darwin Core format.
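
A table can be checked against the template before uploading it. The sketch below assumes the table is available as CSV text with a header row, which this page does not state; the function name is hypothetical and the sample coordinates are made up. It reports missing template columns and rows whose coordinates are empty or not numeric.

```python
import csv
import io

SPD_FIELDS = [
    "institutioncode", "collectioncode", "catalognumber", "dataset",
    "dataprovider", "datasource", "scientificnameauthorship", "identifiedby",
    "credits", "recordedby", "eventdate", "modified", "scientificname",
    "kingdom", "family", "locality", "country", "citation",
    "decimallatitude", "decimallongitude", "coordinateuncertaintyinmeters",
    "maxdepth", "mindepth", "basisofrecord",
]
REQUIRED = ("decimallatitude", "decimallongitude")


def check_spd_rows(csv_text):
    """Return a list of (row_number, message) problems found in an SPD-style CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    present = reader.fieldnames or []
    problems = [(0, f"missing column: {name}") for name in SPD_FIELDS if name not in present]
    for number, row in enumerate(reader, start=1):
        for name in REQUIRED:
            try:
                float(row.get(name) or "")
            except ValueError:
                problems.append((number, f"{name} is empty or not numeric"))
    return problems


def make_row(lat, lon):
    """Build one CSV line with only the species name and the coordinates filled in."""
    values = {"scientificname": "Latimeria chalumnae",
              "decimallatitude": lat, "decimallongitude": lon}
    return ",".join(values.get(name, "") for name in SPD_FIELDS)


# Made-up coordinates; the second record lacks a latitude.
sample = "\n".join([",".join(SPD_FIELDS), make_row("-12.3", "44.1"), make_row("", "44.1")])
print(check_spd_rows(sample))
# prints [(2, 'decimallatitude is empty or not numeric')]
```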

Feeding the algorithm with Input Maps

In the layers box, users can insert links to maps that will be used to associate environmental values with species occurrence records. The + button adds a new layer.

Input Examples

Example of environmental layers used as inputs

Input from i-Marine GeoExplorer

Using the i-Marine GeoExplorer application:

  • search for an environmental layer (e.g. temperature)
  • click on one of the layers found by the search
  • in the "Summary Layer Info" panel on the right scroll down and select the Metadata UUID string (e.g. cd048cb5-dbb6-414b-a3b9-1f3ac512fbff)
  • paste the UUID in the layers box in MaxEnt
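
Since a layer entry can be a title, a UUID, or an HTTP link, a small check can catch a mistyped UUID before the process is submitted. The following is a sketch with a hypothetical function name; how the Statistical Manager itself tells the three kinds of entry apart is not documented here.

```python
import uuid
from urllib.parse import urlparse


def classify_layer_entry(entry):
    """Return 'link', 'uuid' or 'title' for one entry of the layers box."""
    text = entry.strip()
    if urlparse(text).scheme in ("http", "https"):
        return "link"
    try:
        uuid.UUID(text)  # raises ValueError if the text is not a well-formed UUID
        return "uuid"
    except ValueError:
        return "title"


print(classify_layer_entry("cd048cb5-dbb6-414b-a3b9-1f3ac512fbff"))  # prints uuid
```

A UUID with a missing character falls through to 'title', which is a useful hint that the string was not copied in full.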

Input from a WFS link

MaxEnt can import WFS (Web Feature Service) links residing on either a GeoServer or a MapServer. The server must be able to produce map details in JSON format. In this case, you can insert the direct WFS link in the layers box, without specifying the bounding box.

E.g.: http://geoserver-dev.d4science-ii.research-infrastructures.eu/geoserver/ows?service=wfs&version=1.0.0&request=GetFeature&srsName=urn:x-ogc:def:crs:EPSG:4326&TYPENAME=aquamaps:worldborders

Please use EPSG:4326 as the projection.
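
A WFS link of the same shape as the example can be assembled from its parts, which makes it easier to swap in another layer name. This is a sketch using only the Python standard library; the function name is hypothetical and the parameter set is copied from the example link rather than from a server specification.

```python
from urllib.parse import urlencode


def wfs_getfeature_url(base_url, type_name, srs="urn:x-ogc:def:crs:EPSG:4326"):
    """Assemble a WFS 1.0.0 GetFeature link like the example above."""
    params = {
        "service": "wfs",
        "version": "1.0.0",
        "request": "GetFeature",
        "srsName": srs,
        "TYPENAME": type_name,
    }
    # safe=":" keeps the colons unescaped, as in the example link.
    return base_url + "?" + urlencode(params, safe=":")


url = wfs_getfeature_url(
    "http://geoserver-dev.d4science-ii.research-infrastructures.eu/geoserver/ows",
    "aquamaps:worldborders",
)
print(url)
```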

Input from a WCS link

You can input a direct WCS (Web Coverage Service) link.

E.g.:

http://geoserver-dev.d4science-ii.research-infrastructures.eu/geoserver/wcs/wcs?service=wcs&version=1.0.0&request=GetCoverage&coverage=aquamaps:WorldClimBio2&CRS=EPSG:4326&RESPONSE_CRS=EPSG:4326

Please use EPSG:4326 as the projection.
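
Whether a WFS or WCS link declares EPSG:4326 can be checked by inspecting its query string. The sketch below is an assumption-laden helper: the function name is hypothetical, and the list of parameter names it inspects (`CRS`, `RESPONSE_CRS`, `srsName`) is taken from the two example links on this page. A link that declares no projection at all is reported as not conforming.

```python
from urllib.parse import parse_qs, urlparse


def uses_epsg_4326(link):
    """Return True if every CRS-like query parameter in the link names EPSG:4326."""
    query = parse_qs(urlparse(link).query)
    crs_keys = [k for k in query if k.lower() in ("crs", "response_crs", "srsname")]
    return bool(crs_keys) and all(
        value.upper().endswith("EPSG:4326") for k in crs_keys for value in query[k]
    )


wcs_link = (
    "http://geoserver-dev.d4science-ii.research-infrastructures.eu/geoserver/wcs/wcs"
    "?service=wcs&version=1.0.0&request=GetCoverage&coverage=aquamaps:WorldClimBio2"
    "&CRS=EPSG:4326&RESPONSE_CRS=EPSG:4326"
)
print(uses_epsg_4326(wcs_link))  # prints True
```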

Input from a NetCDF-GRID file

You can input the OPeNDAP link to a NetCDF file, but only if the file contains one single dimension layer.

E.g.:

http://thredds.research-infrastructures.eu/thredds/dodsC/public/netcdf/WOA2005TemperatureAnnual_CLIMATOLOGY_METEOROLOGY_ATMOSPHERE_.nc

ASC ESRI-GRID files

You can input a direct HTTP link to an ESRI-GRID file, including one published with common file-sharing tools (e.g. Dropbox).

E.g.:

http://thredds.research-infrastructures.eu/thredds/fileServer/public/netcdf/ph.asc

https://dl.dropboxusercontent.com/u/12809149/layer1.asc
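
Before publishing an ASC file, its header can be inspected to confirm the extent and cell size of the layer. The sketch below parses the usual ESRI ASCII grid header keywords (`ncols`, `nrows`, `xllcorner`, `yllcorner`, `cellsize`, `NODATA_value`). The function names are hypothetical, the sample grid is made up, and the extent calculation assumes a corner-based origin rather than `xllcenter`/`yllcenter`.

```python
def read_asc_header(text):
    """Parse the header lines of an ESRI ASCII grid into a dict of numbers."""
    known = {"ncols", "nrows", "xllcorner", "yllcorner",
             "xllcenter", "yllcenter", "cellsize", "nodata_value"}
    header = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2 or parts[0].lower() not in known:
            break  # first data row reached
        header[parts[0].lower()] = float(parts[1])
    return header


def asc_extent(header):
    """Return (min_lon, min_lat, max_lon, max_lat), assuming a corner-based origin."""
    min_lon, min_lat = header["xllcorner"], header["yllcorner"]
    size = header["cellsize"]
    return (min_lon, min_lat,
            min_lon + header["ncols"] * size,
            min_lat + header["nrows"] * size)


# Made-up 4 x 2 grid covering the globe with 90-degree cells.
sample = """ncols 4
nrows 2
xllcorner -180
yllcorner -90
cellsize 90
NODATA_value -9999
1 2 3 4
5 6 7 -9999
"""
header = read_asc_header(sample)
print(asc_extent(header))  # prints (-180.0, -90.0, 180.0, 90.0)
```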

GeoTiffs

HTTP links to GeoTiff files are allowed, including links published with common file-sharing tools (e.g. Dropbox). E.g.:

https://dl.dropboxusercontent.com/u/12809149/wind1.tif

Contacts

For questions and bug alerts use this form.