Difference between revisions of "Catalogue:Applications"


Revision as of 16:51, 23 July 2013

The D4Science (an e-Infrastructure operated by the D4Science.org initiative) / iMarine infrastructure combines the functionality of more than 500 components into a coherent and centrally managed infrastructure of hardware, software, and data resources. Together, these offer a platform that can host a variety of applications. These applications share a common theme: they provide a service to a Community of Practice. Unlike other infrastructures that boast size, power, performance, or the latest technology, D4Science puts the community first. In the context of iMarine, this is taken even further, quite literally, as the Ecosystem Approach Community of Practice is spread around the globe. No other infrastructure equals iMarine in developing support for real-life scenarios that must overcome 'low' hurdles: low resources, low training, low connectivity, low data quality. We are glad to leave the high hurdles to specialists; we would rather serve communities that work to achieve the UN Millennium Development Goals. This does not imply that we make concessions on quality or performance, but we see it as our mission to offer quality and performance to communities that have no resources of their own to jump the hurdles.

The infrastructure resembles an archipelago where applications emerge as islands of services, resting on an underlying infrastructure bedrock. The islands specialize in one or more domains, yet are not isolated 'atolls'. Every island is well connected to others, and island-hopping is strongly encouraged. Each island offers a standard set of features that can be extended by selecting services from several topical bundles.

The iMarine infrastructure currently offers 4 main domain bundles that can be customized and / or enriched into flexible, purpose-built applications. Each application in the infrastructure is tightly integrated with the underlying gCube enabling software, and can access and re-purpose data from other iMarine applications.

FYI Examples of other offers

Through the enabling environment of gCube, all users benefit from Infrastructure Services.

The four key applications that iMarine has delivered and continues to enrich are:

  • BiolCube; focuses on the management and interpretation of biological and ecological data in the environment.
  • StatsCube; a complete full life-cycle data framework, from observational data to aggregated data repositories enriched with validation and analytical tools.
  • GeosCube; tightly connected to BiolCube, this framework, based on OGC-compliant tools and services, manages the storage and interpretation of geospatially explicit information, including WPS processing.
  • PoliCube; brings semantic technologies for publishing structured data so that it can be interlinked and become more useful to end-users, enabling them to produce Linked Open Data (LOD) and to share information in a way that can be read automatically by computers. This enables data from different sources to be connected and queried.

The bundle approach, by itself an abstraction over a host of services, is expected to offer more 'flavors' in the near future. For instance, a focused approach to organizing access to computing resources and a support infrastructure for mobile apps are foreseen:

  • IceCube; will offer non-specialists access to infrastructure, cloud and grid based computing resources.
  • AppsCube; will offer an integrated approach to mobile app development. The infrastructure organizes the content and data exchange with mobile apps. Please note that the app itself is not developed with iMarine; rather, it relies on the infrastructure to maintain and manage the data collected with and exposed through the app.

BiolCube

BiolCube is available as a suite that packs many useful features into one environment, giving biologists who work with occurrence data and review species names a complete work-space. It offers services in two main areas: taxonomic and occurrence data discovery and management, and modeling and analysis of distribution data.

Taxonomic and occurrence data discovery and management

  • Occurrence data finder; download public datasets from world class biodiversity repositories to your private environment. In this private environment you can prepare data sets for use in further analysis with iMarine tools for data sanitation, filtering, merging and duplicate detection. Occurrence data can be directly visualized on maps using the geo-explorer, downloaded in several formats, and shared with or sent to other environments.
  • Species name finder; not sure about a species name? iMarine offers tools to search, download and verify taxonomic and vernacular names of marine species.
  • Species name matcher; correcting spelling mistakes or incomplete names can be very time-consuming. With iMarine tools you can validate the species names in your data to ensure they comply with the standard of your choice. iMarine offers powerful matching and reconciliation services, already in use at FAO, to identify close matches to the names in your datasets. The infrastructure makes several key reference datasets available for consultation and reconciliation. These include the FAO ASFIS species list and the WoRMS register of marine species. If you wish, you can add your own reference list.
  • Environmental enrichment of data. In a shared service with GeosCube, this service adds environmental information to occurrence data to improve their quality and usefulness in modeling and analytical exercises. The service returns an estimate for a range of dynamically computed environmental parameters such as water temperature, ocean color, salinity, aragonite, or BOD. It can identify the nearest observations in space and time, and returns a computed average or the nearest observation, which can then be added to an occurrence record. The iMarine tools let you specify what 'nearest' means, e.g. a distance, a distance over a gradient, a seasonal average, or a depth range.
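
The 'nearest observation in space and time' idea behind the environmental enrichment service can be illustrated with a minimal Python sketch. It is not the iMarine implementation: the names `EnvObservation` and `enrich`, the distance and time-window parameters, and the spherical-Earth distance are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import asin, cos, radians, sin, sqrt
from typing import List, Optional


@dataclass
class EnvObservation:
    """Hypothetical environmental measurement (e.g. water temperature)."""
    lat: float
    lon: float
    time: datetime
    value: float


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km, assuming a spherical Earth of radius 6371 km."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def enrich(lat: float, lon: float, when: datetime,
           observations: List[EnvObservation],
           max_km: float, max_dt: timedelta,
           mode: str = "nearest") -> Optional[float]:
    """Return the nearest value, or the mean of all candidate values.

    Candidates are observations within `max_km` and `max_dt` of the
    occurrence record; None is returned when there are no candidates.
    """
    candidates = []
    for obs in observations:
        if abs(obs.time - when) > max_dt:
            continue  # outside the time window
        dist = haversine_km(lat, lon, obs.lat, obs.lon)
        if dist <= max_km:
            candidates.append((dist, obs))
    if not candidates:
        return None
    if mode == "mean":
        return sum(obs.value for _, obs in candidates) / len(candidates)
    # Default: value of the spatially closest candidate.
    return min(candidates, key=lambda c: c[0])[1].value
```

Other definitions of 'nearest' mentioned above (a distance over a gradient, a seasonal average, a depth range) would replace the simple distance and time filters used here.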

Modeling and analysis of distribution data

  • Biodiversity mapping tools. The first iMarine species distribution and biodiversity mapping tools enabled the production of the well-known AquaMaps. With iMarine, the generation became faster and more robust, and results are shared in a collaborative environment. In addition to AquaMaps, many other biodiversity analytical and predictive tools are available. These include the openModeller toolset and custom-built neural-network-driven analytical services.
  • Species fact-sheets generator. With scientists spread over the globe, generating consistent information sheets on marine species is no sinecure. That is why the FishFinderVRE was designed. It offers a complete templating and reporting work-flow operated by scientists, for scientists. The results, species fact-sheets, can be disseminated in a variety of formats.
  • Trend-analysis of data. In a shared service with StatsCube, Trendylyzer offers services to identify and visualize trends in time-series of data. Trendylyzer was developed specifically to address skewness and gaps in datasets.
  • Spatial analysis of data. In a shared service with StatsCube, clustering, probability, and other spatial analytical features are offered.

BiolCube is an independent yet not isolated bundle of specialized services for biologists. Well embedded in the iMarine e-Infrastructure, it provides access to auxiliary services that turn BiolCube into a multi-purpose toolbox for biodiversity data analysis. iMarine enables near-seamless access to powerful statistical analysis software through StatsCube, and to advanced plotting and geospatial data production through GeosCube.

With BiolCube and StatsCube services combined, developers are now building an integrated environment where species distribution can be studied in space and over time, with occurrence data analyzed using measured environmental observations rather than estimated large-scale average values.

If you wish to learn more about using BiolCube or specific services, please contact us.

StatsCube

Example 'competitor' GSIM

StatsCube offers a complete data suite to manage the entire data-cycle from collection to archiving. With iMarine technologies, exciting new capabilities are added to life-cycle management, especially for time-series data. StatsCube is developed using state-of-the-art open-source components that are brought together in a managed infrastructure. This enables a very cost-effective offer to resource-poor institutes in need of sophisticated data services. Other benefits are the availability of shared services for reference data management, and the harmonization of data repository services.

StatsCube relies on continued support and ongoing development of a bundle of services. This bundle offers services that together support a complete life-cycle for statistical data, but can also connect to services offered through other bundles to establish a network of cross-domain services.

StatsCube offers a set of services from which VRE (Virtual Research Environment) managers select to compose one or more VREs, and for which they determine who can access them. This allows for a fine-grained approach to sometimes complex data work-flows, where data flow from detailed field-level data through several aggregation and review stages until summary statistics can be produced. At each stage of such a work-flow, other resources can be mobilized in support of specific activities such as geo-referencing, enrichment with environmental data, statistical modeling or analysis. With StatsCube, iMarine implements key data services:

Data Work-flow

If you need to manage data-flows, iMarine offers life-cycle support where data enter the system as observations or batch data, and can then be validated and harmonized before being added to a repository. Not only are data well described by metadata during this process, but the processing steps are also captured as process metadata. The entire process is under the control of a 'visor' that protects the data from unauthorized access and modifications.
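
As an illustration of capturing processing steps as process metadata, here is a minimal Python sketch. It does not reflect the actual StatsCube services: the `Dataset` structure, the `run_step` helper, the `validate` and `harmonize` rules, and the record fields `species` and `catch_t` are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass
class Dataset:
    """Records plus a log describing every processing step applied to them."""
    records: List[Record]
    process_log: List[Dict[str, object]] = field(default_factory=list)


def run_step(ds: Dataset, name: str,
             step: Callable[[List[Record]], List[Record]]) -> Dataset:
    """Apply one step and append a process-metadata entry for it."""
    result = step(ds.records)
    entry = {
        "step": name,
        "records_in": len(ds.records),
        "records_out": len(result),
        "finished_at": datetime.now(timezone.utc).isoformat(),
    }
    return Dataset(result, ds.process_log + [entry])


def validate(records: List[Record]) -> List[Record]:
    """Hypothetical rule: keep records with a non-negative numeric catch."""
    return [r for r in records
            if isinstance(r.get("catch_t"), (int, float)) and r["catch_t"] >= 0]


def harmonize(records: List[Record]) -> List[Record]:
    """Hypothetical rule: normalize species codes to trimmed upper case."""
    return [{**r, "species": str(r["species"]).strip().upper()} for r in records]


raw = Dataset([{"species": " yft ", "catch_t": 12.5},
               {"species": "SKJ", "catch_t": -1}])
clean = run_step(run_step(raw, "validate", validate), "harmonize", harmonize)
```

After the two steps, `clean.process_log` lists what was done and how many records each step consumed and produced, alongside the harmonized records themselves.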

Data Analysis

iMarine excels in offering advanced data analysis facilities to users. The clear separation of data and analytical resources makes it easier to use analytical tools. The infrastructure is used to store the data, and no complicated steps are needed other than to select and filter the data, and load them into the required analytical environment. For analysis, several environments are proposed, ranging from a bare-bones RStudio, parallelized R-servers, and VRE-based analytical and predictive algorithms such as AquaMaps, to the Statistical manager, where users can integrate their own logic. This logic can exploit infrastructure computing resources, or interact with external Cloud or Hadoop clusters. With iMarine, the threshold for exploiting such resources is lowered considerably, making them accessible to a much wider, geographically dispersed EA-CoP (Ecosystem Approach Community of Practice).

Data reporting


For more information on getting started with and using StatsCube, the iMarine website offers many resources. You can also register here to experience some of the components.

Examples of StatsCube implementations are

  • ICIS; for the collection and
  • Tuna Atlas;
  • TimeSeries Environment;

GeosCube

GeosCube is the iMarine answer to the large and complex issue of understanding fisheries and biodiversity data in the spatial domain. Through GeosCube, spatial services are offered to consumers of the iMarine infrastructure, be they other iMarine tools or VREs, or external organizations wishing to use iMarine's web-services.

Through GeosCube, iMarine aims to offer an INSPIRE directive compliant toolset that will enable the generation and management of geospatially explicit data for practitioners who have no resources to develop and maintain their own spatial data infrastructure. From the outset of iMarine, GeosCube was seen as a service provider to several business cases, and not as a complete Spatial Data Infrastructure. GeosCube bundles a range of OGC-compliant resources that can be made available either in their entirety, or as a selection of services that can be mounted in a customized environment, such as a VRE. The bundle currently offers several tools:

  1. Data discovery, access, and visualization
  2. Compare maps, download and share. Enrich datasets emanating from other bundles with environmental information
  3. Analyze geospatial information: transects, mathematical comparison, modeling

OGC-based WXS environment:

  • WMS (Web Map Service)
  • WFS (Web Feature Service)
  • WCS (Web Coverage Service)
  • WPS (Web Processing Service)
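
To show what a call to one of these OGC services looks like, the Python sketch below builds a WMS 1.3.0 `GetMap` request URL from the standard request parameters. The endpoint and layer name in the usage line are placeholders, not actual iMarine addresses, and the helper `wms_getmap_url` is a name invented for this illustration.

```python
from typing import Tuple
from urllib.parse import urlencode


def wms_getmap_url(endpoint: str, layer: str,
                   bbox: Tuple[float, float, float, float],
                   width: int, height: int,
                   fmt: str = "image/png") -> str:
    """Build a WMS 1.3.0 GetMap URL.

    `bbox` is (min_lon, min_lat, max_lon, max_lat); CRS:84 is used so that
    the axis order is longitude, latitude.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "CRS": "CRS:84",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": fmt,
    }
    return f"{endpoint}?{urlencode(params)}"


# Placeholder endpoint and layer name, for illustration only.
url = wms_getmap_url("https://example.org/geoserver/wms", "demo:species_map",
                     (-180, -90, 180, 90), 800, 400)
```

WFS, WCS and WPS requests follow the same key-value pattern with their own `SERVICE` and `REQUEST` values.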

GeosCube services are made available through iMarine portlets, VREs, remote services and OGC-compliant tools for discovery and access. These can be either accessed as individual components or services (see the detailed descriptions here), or pre-configured in a bundle that supports a range of services. Some examples of such bundles include:

  • GeoExplorer;
  • GisViewer;

Smaller components that leverage a specific task at infrastructure level are:

  • Apache Hadoop MapReduce, used to offer an OGC Web Processing Service (WPS) through a web interface for the dynamic deployment of geospatial processes;
  • TiffUploader Algorithm: a visualization-oriented process that uploads each layer from a map file to a GeoServer WMS instance;
  • Intersection Algorithm: a simple process, based on the 52°North WPS algorithm, that computes the intersection of two input polygons;
  • Resampler Algorithm: a process that performs a resampling of a geospatial layer in netCDF-CF.
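
For readers unfamiliar with polygon intersection, the Python sketch below shows one classic technique, Sutherland-Hodgman clipping. It is not the 52°North algorithm used by the Intersection Algorithm listed above, and it is only valid under the stated assumption that the clipping polygon is convex with counter-clockwise vertices. The function names are chosen for this illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def clip_polygon(subject: List[Point], clip: List[Point]) -> List[Point]:
    """Intersect `subject` with a convex, counter-clockwise `clip` polygon."""

    def inside(p: Point, a: Point, b: Point) -> bool:
        # True when p lies on or to the left of the directed edge a -> b.
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def intersect(p: Point, q: Point, a: Point, b: Point) -> Point:
        # Intersection of the line through p, q with the line through a, b.
        dx1, dy1 = q[0] - p[0], q[1] - p[1]
        dx2, dy2 = b[0] - a[0], b[1] - a[1]
        denom = dx1 * dy2 - dy1 * dx2
        t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / denom
        return (p[0] + t * dx1, p[1] + t * dy1)

    output = list(subject)
    for i in range(len(clip)):
        a, b = clip[i], clip[(i + 1) % len(clip)]
        current, output = output, []
        for j in range(len(current)):
            p, q = current[j - 1], current[j]  # edge from previous vertex to this one
            if inside(q, a, b):
                if not inside(p, a, b):
                    output.append(intersect(p, q, a, b))  # entering the clip region
                output.append(q)
            elif inside(p, a, b):
                output.append(intersect(p, q, a, b))  # leaving the clip region
        if not output:
            break  # the polygons do not overlap
    return output


def polygon_area(points: List[Point]) -> float:
    """Absolute area of a simple polygon (shoelace formula)."""
    total = 0.0
    for i in range(len(points)):
        x1, y1 = points[i - 1]
        x2, y2 = points[i]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


square_a = [(0, 0), (2, 0), (2, 2), (0, 2)]
square_b = [(1, 1), (3, 1), (3, 3), (1, 3)]
overlap = clip_polygon(square_a, square_b)
```

Production geospatial services typically rely on a full geometry library that also handles concave polygons, holes and coordinate reference systems.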

For users who only need a dataset, search and retrieve facilities are available:

  • GeoNetwork
  • THREDDS
  • ..

Example datasets that can be discovered already in the iMarine infrastructure are:

  • Species distribution map-products,
  • Species occurrence geospatial datasets (KML / GML)

PoliCube

The primary aim of PoliCube is to deliver information to policy makers from a variety of sources as an integrated view generated using a variety of approaches, including semantic technologies.

PoliCube offers flexible reporting, search and retrieval, aggregation and projection facilities. These are primarily offered as data-driven indicators and topical fact sheets. These facilities can only be effective if a modern toolset is available to enrich or annotate existing data with relevant information in the form of e.g. URIs. This

Using an infrastructure makes it possible to focus on the needs of policy makers, who need to rely on dynamic reports extracted in near real time from data coming in from multiple directions, with varying quality and accessibility policies attached to these data flows.

  1. Organizational features of iMarine; Workspace, messaging, mailing, user management
  2. Social tool
  3. Semantic search and factsheets
  4. Plugins for remote information (OAI, OpenSearch)

IceCube

A key benefit of iMarine is the ease of setting up scalable data processing solutions. A scalable solution may be needed because you have to manage any combination of a lot of users, a lot of data, a lot of processing, and a lot of new functionality. This requires expertise that is usually not found in one place. An infrastructure can offer more than one solution; offering a dedicated computing environment, parallelization, access to a grid or cloud environment, or outsourcing computations to external infrastructures are all options to consider. With iMarine expertise, you can ask for a technology solution, where several options can be discussed. The services available on demand can be separated into several categories:

  • Manage administrative scalability
    • Manage users
    • Manage virtual Organizations
  • Manage Functionality
  • Manage Load scalability
  • Manage geographic scalability
    • Keep your data and processes together to reduce
    • Bring your computation to your data to reduce bandwidth use

AppsCube

The quickly growing use of mobile apps for data collection and dissemination requires that content and reference data are managed from an integrated data perspective. With ever more versatile and demanding apps, data often cannot be kept in one central repository that fits all sizes. Very often, apps mash up data from e.g. geospatial and statistical data resources or, when collecting data, rely on constantly updated reference data, such as names of species, vessel characteristics, or local reporting requirements.

To manage a multitude of data collection and dissemination apps, an infrastructure that offers ...

In iMarine