Difference between revisions of "Catalogue:Applications"

From D4Science Wiki
(StatsCube)
 
(56 intermediate revisions by 6 users not shown)
Line 1: Line 1:
The D4Science / iMarine infrastructure combines the functionality of more than 500 components into a coherent and centrally managed infrastructure of hardware, software, and data resources. Together, these offer a platform that can host a variety of applications. These applications share a common theme; Provide a service to a Community of Practice. Other than other infrastructures that boast size, power, performance, or latest technology, D4Science puts the community first. In the context of iMarine, this is taken even further, quite literally, as the Ecosystem Approach Community of Practice is spread around the globe. No other infrastructure equals iMarine in developing support the real-life scenarios overcoming 'low' hurdles; low resources, low training, low connectivity, low data quality. We are glad to leave the high hurdles to specialists, we rather serve communities that work to achieve the UN Millennium Development Goals. This does not imply we make concessions on quality or performance, but we see it as our mission to offer quality and performance to communities that have no resources of their own to jump the hurdles.
+
{| align="right"
 +
|| __TOC__
 +
|}
 +
The D4Science infrastructure combines the functionality of more than 500 components into a coherent and centrally managed infrastructure of hardware, software, and data resources. Together, these offer a platform that can host a variety of applications. These applications share a common theme: providing a service to a Community of Practice. Unlike other infrastructures that boast size, power, performance, or the latest technology, D4Science puts the community first. This does not imply that we make concessions on quality or performance; rather, we see it as our mission to offer quality and performance to communities that lack the resources to clear high hurdles on their own.
  
 
The infrastructure resembles an archipelago where applications emerge as islands of services, resting on an underlying infrastructure bedrock. The islands specialize in one or more domains, yet are not isolated 'atolls'. Every island is well connected to others, and island-hopping is strongly encouraged. Each island offers a standard set of features that can be extended by selecting services from several topical bundles.  
 
  
The iMarine infrastructure currently offers 4 main domain bundles that can be customized and / or enriched into flexible, purpose-built applications. Each application in the infrastructure is tightly integrated with the underlying gCube enabling software, and can access and re-purpose data from other iMarine applications.  
+
The infrastructure currently offers 4 main domain bundles that can be customized and/or enriched into flexible, purpose-built applications. Each application in the infrastructure is tightly integrated with the underlying gCube enabling software, and can access and re-purpose data from other applications.  
  
FYI Examples of other offers
+
Through the enabling environment of [https://gcube.wiki.gcube-system.org/gcube/index.php/GCube_Wiki gCube], all users benefit from [[Catalogue:Infrastructure | Infrastructure Services]], but where to start?
* [http://www.cloudera.com/content/cloudera/en/products.html Cloudera]
+
For new users, D4Science offers several domain-oriented solutions for 4 categories of users: data managers and analysts, biologists, spatial data managers, and policy-oriented 'omnivores'.
* [http://virtuoso.openlinksw.com/offers/ Virtuoso]
+
For each of these, a bundle of relevant [https://gcube.wiki.gcube-system.org/gcube/index.php/GCube_Wiki gCube software components] is available in a 'Cube'.
 +
This bundle can be limited to receive (and pay for) only those resources actually needed or consumed.
 +
A bundle can also be extended with resources coming from other bundles; our aim is to offer bundles characterized by the domain tools and not by domain boundaries.
 +
In our experience, most experts prefer to manage their information in a bundle of domain-specific software and are only consumers of data from other bundles.
 +
Thus, in most use scenarios, a user would be a data manager in a bundle, but only a consumer in another.
  
Through the enabling environment of gCube, all users benefit from [http://wiki.i-marine.eu/index.php/Catalogue:Infrastructure Infrastructure Services], 
+
The 4 key-applications that D4Science has delivered and continues to enrich are:
  
The 4 key-applications that iMarine has delivered and continues to enrich are:
+
{|
* '''BiolCube'''; focuses on the management and interpretation of biological and ecological data in the environment.
+
|-
* '''StatsCube'''; a complete full life-cycle data framework, from observational data to aggregated data repositories enriched with validation and analytical tools.
+
|| [[File:BiolCube.png|100px]]
* '''GeosCube'''; tightly connected to the BiolCube, the framework, based on OGC compliant tools and services manage the storage and interpretation of geospatial explicit information, including WPS processing.
+
|| '''[[#BiolCube | BiolCube]]'''; focuses on the management and interpretation of biodiversity data.
* '''PoliCube'''; brings semantic technologies for publishing structured data so that it can be interlinked and become more useful to end-users, enabling them to produce LOD, to share information in a way that can be read automatically by computers. This enables data from different sources to be connected and queried.
+
|-
 +
|| [[File:StatsCube.png|100px]]
 +
|| '''[[#StatsCube | StatsCube]]'''; a full life-cycle data framework, from observational data to aggregated data repositories enriched with validation and analytical tools.
 +
|-
 +
|| [[File:GeosCube.png|100px]]
 +
|| '''[[#GeosCube | GeosCube]]'''; tightly connected to BiolCube, this framework, based on OGC-compliant tools and services, manages the storage and interpretation of geospatially explicit information, including WPS processing.
 +
|-
 +
|| [[File:ConnectCube.png|100px]]
 +
|| '''[[#ConnectCube | ConnectCube]]'''; brings semantic technologies for publishing structured data so that it can be interlinked and become more useful to end-users, enabling them to produce Linked Open Data (LOD) and to share information in a form that computers can read automatically. This enables data from different sources to be connected and queried.
 +
|}
  
The bundle approach, by itself an abstraction over a host of services, is expected to offer more 'flavors' in the near future. For instance a focused approach on organizing access to computing resources, or a support infrastructure for Mobile Apps are foreseen:  
+
The bundle approach, by itself an abstraction over a host of services, is expected to offer more 'flavors' in the near future.  
* '''IceCube'''; will offer access to infrastructure, cloud and grid based computing resources for dummies.
+
For instance, a focused approach for infrastructure support for Mobile Apps is foreseen:  
* '''AppsCube'''; will offer an integrated approach to mobile app development. The infrastructure organizes the content and data-exchange with mobile apps, Please note that the App itself is not developed with iMarine, rather it relies on the infrastructure to maintain and manage the data collected with and exposed through this App.  
+
 
 +
* '''[[#AppsCube | AppsCube]]'''; offers an integrated approach to mobile app development. The infrastructure organizes the content and data exchange with mobile apps. Please note that the app itself is not developed with D4Science; rather, it relies on the infrastructure to maintain and manage the data collected with and exposed through the app.
 +
* '''[[#IceCube | IceCube]]'''; an Integrated Computing Environment that offers access to infrastructure and cloud computing resources "as a Service". An instance of such a Cube will offer users access to predefined data and to algorithms that can be applied to these data.
  
 
== BiolCube ==
 
  
'''BiolCube''' is available as a suite that packs many useful features in one environment to have a complete work-space for biologists working with occurrence data and reviewing species names. It offers services in two main areas; taxonomic and occurrence data discovery and management, and modeling and analysis of distribution data:  
+
'''BiolCube''' is available as a suite that packs many useful features into one research environment, where marine ecologists are offered a complete private work-space to manage species names and occurrence data. The main areas where '''BiolCube''' offers services are:
  
 
'''Taxonomic and occurrence data discovery and management'''
 
* Occurrence data finder; download public datasets from world class biodiversity repositories to your private environment. In this private environment you can prepare data sets for use in further analysis with iMarine tools for data sanitation, filtering, merging and duplicate detection. Occurrence data can be directly visualized on maps using the geo-explorer, downloaded in several formats, and shared with / send to other environments.  
+
* Occurrence data finder: Download public datasets from world-class biodiversity occurrence data repositories to your private environment, where you can prepare datasets for further analysis with D4Science tools for data curation, filtering, merging and duplicate detection. Occurrence data can be directly visualized on maps using the geo-explorer, downloaded in several formats, and shared with or sent to other environments.
* Species name finder; not sure about a species name? Then iMarine offers tools to search, download and verify taxonomic and vernacular names of marine species.  
+
* Species name finder: Not sure about a species name? Then D4Science offers tools to search, download and verify taxonomic and vernacular names of marine species.  
* Species name matcher; correcting spelling mistakes or incomplete names can be very time-consuming. With iMarine tools you can validate the names of species names in your data to ensure they comply with the standard of your choice. iMarine offer powerful matching and reconciliation services, already in use at FAO, to identify close matches the names in your datasets. The infrastructure makes several key reference datasets available for consultation and reconciliation. These include the FAO ASFIS species list, and WoRMS register of marine species. If you wish, you can add your own reference list.  
+
* Species name matcher: Correcting spelling mistakes or incomplete names can be very time-consuming. With D4Science tools you can validate the species names in your data to ensure they comply with the standard of your choice. D4Science offers powerful matching and reconciliation services, already in use at FAO, to identify close matches to the names in your datasets. The infrastructure makes several key reference datasets available for consultation and reconciliation. These include the FAO ASFIS species list, FishBase for finfishes, and WoRMS, the World Register of Marine Species. If you wish, you can add your own reference list.
* Environmental enrichment of data. In a shared service with GeosCube, this service adds environmental information to occurrence data to improve their quality and usefulness in modeling and analytical exercises. The service allows to obtain an estimate of a range of dynamically computed environmental parameters such as water temperature, ocean color, salinity, argonite, or BOD. The services can identify the nearest observations in space and time, and will return a computed average or nearest observation that can be added to an observation. The iMarine innovative tools allow to specify what the 'nearest' means; i.e. a distance, a distance over a gradient, a seasonal average, or a depth range.  
+
* Environmental enrichment of data: In a shared service with [[#GeosCube | GeosCube]], this service adds environmental information to occurrence data to improve their quality and usefulness in modelling and analytical exercises. The service provides estimates of a range of dynamically computed environmental parameters such as water temperature, ocean color, salinity, aragonite content, or BOD. It can identify the nearest observations in space and time and returns either a computed average or the nearest observation, which can then document an occurrence. The D4Science tools let you specify what 'nearest' means, e.g. a distance, a distance over a gradient, a seasonal average, or a depth range.
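
The idea of 'nearest in space and time' from the enrichment service above can be illustrated with a minimal sketch. This is not the D4Science service: the record layout, the function names <code>haversine_km</code> and <code>enrich</code>, and the thresholds <code>max_km</code> and <code>max_days</code> are assumptions made for illustration only. For one occurrence, the sketch averages the environmental observations that fall within a distance and a time window.

```python
import math
from datetime import date


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def enrich(occurrence, observations, max_km=50.0, max_days=30):
    """Mean value of the observations near the occurrence in space and time, or None.

    'Near' is defined here (as an assumption) by a distance and a time window.
    """
    near = [
        obs["value"]
        for obs in observations
        if haversine_km(occurrence["lat"], occurrence["lon"], obs["lat"], obs["lon"]) <= max_km
        and abs((obs["date"] - occurrence["date"]).days) <= max_days
    ]
    return sum(near) / len(near) if near else None


# Invented example data: one occurrence and four sea-temperature observations.
occurrence = {"lat": 43.70, "lon": 10.40, "date": date(2013, 6, 15)}
sea_temperature = [
    {"lat": 43.72, "lon": 10.41, "date": date(2013, 6, 10), "value": 21.0},
    {"lat": 43.65, "lon": 10.35, "date": date(2013, 6, 20), "value": 23.0},
    {"lat": 40.00, "lon": 5.00, "date": date(2013, 6, 15), "value": 25.0},  # too far away
    {"lat": 43.70, "lon": 10.40, "date": date(2012, 1, 1), "value": 13.0},  # too long ago
]
result = enrich(occurrence, sea_temperature)
```

Other definitions of 'nearest', such as a depth range or a seasonal average, would replace the filter condition inside <code>enrich</code>.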
  
 
'''Modeling and analysis of distribution data'''
 
* Biodiversity mapping tools. The first iMarine species distribution and biodiversity mapping tools enabled the production of the well-known AquaMaps. With iMarine, the generation became faster, more robust, and results are shared in a collaborative environment. In addition to AquaMaps, many other biodiversity analytical and predictive tools are available. These include the toolset of OpenModeler and custom build Neural Network driven analytical services.  
+
* Biodiversity mapping tools: The first D4Science species distribution and biodiversity mapping tool enabled the production of the well-known AquaMaps. With D4Science, map generation became faster and more robust, and results are shared in a collaborative environment. In addition to AquaMaps, many other biodiversity analytical and predictive tools are available. These include the openModeller toolset and custom-built, neural-network-driven analytical services.
* Species fact-sheets generator. With scientist spread over the globe, generating consistent information sheets on marine species is no sinecure. That is why the FishFinderVRE was designed. It offers a complete templating and reporting work-flow operated by scientists, for scientists. The results, species fact-sheets, can be disseminated in a variety of formats.
+
* Species fact-sheets generator: With scientists spread over the globe, generating consistent information sheets on marine species is no sinecure. That is why the [https://i-marine.d4science.org/group/fishfindervre FishFinderVRE] was designed. It offers a complete templating and reporting work-flow operated by scientists, for scientists. The results, species fact-sheets, can be disseminated in a variety of formats, in particular those established by FAO for its now famous species and regional catalogues, field guides, and the more recent pocket guides.
* Trend-analysis of data. In a shared service with StatsCube, Trendylyzer offers services to identify and vizualize trends in time-series of data. Trendylyzer was developed to specifically address skewness and gaps in datasets.   
+
* Trend-analysis of data: In a shared service with [[#StatsCube | StatsCube]], Trendylyzer offers services to identify and visualize trends in time-series of data. Trendylyzer was developed to specifically address skewness and gaps in datasets.   
* Spatial analysis of data. In a shared service with StatsCube, clustering, probability, and other spatial analytical features.  
+
* Spatial analysis of data: In a shared service with [[#StatsCube | StatsCube]], this offers clustering, probability, and other spatial analytical features.
  
BiolCube is an independent yet not isolated bundle of specialized services for biologist. Well embedded in the iMarine e-Infrastructure, it provides access to auxiliary services that turn BiolCube in a multi-purpose toolbox for biodiversity data analysis. iMarine enables a near-seamless access to powerful statistical analysis software through StatsCube, advanced plotting and geospatial data production through GeosCube.  
+
BiolCube is an independent yet not isolated bundle of specialized services for marine ecologists and natural aquatic resource managers. Well embedded in the D4Science e-Infrastructure, it provides access to auxiliary services that turn BiolCube into a multi-purpose toolbox for biodiversity data analysis. D4Science enables near-seamless access to powerful statistical analysis software through [[#StatsCube | StatsCube]], and to advanced plotting and geospatial data production through [[#GeosCube | GeosCube]].
  
With BiolCube and StatsCube services combined, developers are now working to develop an integrated environment where species distribution can be studied in space and over time, with occurrence data analyzed using measured environmental observations, rather than estimated large scale average values.  
+
With BiolCube and [[#StatsCube | StatsCube]] services combined, developers are now building an integrated environment where species distribution can be studied in space and over time, with occurrence data analyzed using measured environmental observations rather than estimated large-scale average values.
 +
 
 +
The services that are most characteristic of this bundle are:
 +
* [https://gcube.wiki.gcube-system.org/gcube/index.php/Biodiversity_Access Species Product Discovery service]
 +
* [https://gcube.wiki.gcube-system.org/gcube/index.php/Occurrence_Data_Reconciliation Occurrence Data Reconciliation]
 +
* [https://gcube.wiki.gcube-system.org/gcube/index.php/Occurrence_Data_Enrichment_Service Occurrence Data Enrichment Service]
 +
* [https://gcube.wiki.gcube-system.org/gcube/index.php/Taxon_Names_Reconciliation_Service Taxon Names Reconciliation Service]
  
 
If you wish to learn more about using BiolCube or specific services, please contact us.
 
Line 45: Line 69:
 
== StatsCube ==
 
  
Example 'competitor' [http://www1.unece.org/stat/platform/pages/viewpage.action?pageId=59703371 GSIM]
+
'''StatsCube''' offers a complete data suite to manage the entire data cycle from collection to archiving. With D4Science technologies, exciting new capabilities are added to the life-cycle management and analysis of data, especially time-series data. StatsCube is developed using state-of-the-art open-source components that are brought together in a managed infrastructure. This enables a very cost-effective offer to resource-poor institutes in need of sophisticated data services. Other benefits are the availability of shared services for reference data management, and harmonization of data repository services.
 
+
''StatsCube'' offers a complete data suite to manage the entire data-cycle from collection to archiving. With iMarine technologies exiting new capabilities are added to the life-cycle management of especially time-series data. StatsCube is developed using state-of-the-art OpenSource components that are brought together in a managed infrastructure. This enables a very cost-effective offer to resource poor institutes in need of sophisticated data services. Other benefits are the availability of shared services for reference data management, and harmonization of data repository services.  
+
  
 
StatsCube relies on continued support and ongoing development of a bundle of services. This bundle offers services that together support a complete life-cycle for statistical data, but can also connect to services offered through other bundles to establish a network of cross-domain services.
 
  
StatsCube offers a set of services that VRE managers select to compose one or more VRE's, and who can access such services. This allows for a fine-grained approach to sometimes complex data-workflows, where data flow from detailed field level data through several aggregation and review stages until an summary statistics can be produced. At each stage of such work-flow, other resources can be mobilized in support of specific activities such as geo-referencing, enrichment with environmental data, statistical modeling or analysis. With Statscube, iMarine implements key data services:
+
The StatsCube bundle offers a set of services available to VRE managers. They can select from this bundle to compose one or more VREs, and decide who can access such services. This allows for a fine-grained approach to sometimes complex data work-flows, where data flow from detailed field-level data through several aggregation and review stages until summary statistics can be produced. At each stage of such a work-flow, other resources can be mobilized in support of specific activities such as geo-referencing, enrichment with environmental data, statistical modelling or analysis. With StatsCube, D4Science implements key data services:
  
 
'''Data Work-flow'''
 
If you need to manage data-flows, iMarine offers a life-cycle support where data enter the system as observations or batch data, and can then be validated and harmonized before being added to a repository. Not only are data well described by metadata during this process, but also the processing steps are captured as process metadata. The entire process is under the control of a 'visor' that protect the data from unauthorized access and modifications.  
+
If you need to manage data-flows, D4Science offers life-cycle support where data enter the system as observations or batch data, and can then be harmonized and validated before being added to a repository. Not only are data well described by metadata during this process, but the processing steps are also captured as process metadata. The entire process is under the control of a 'visor' that protects the data from unauthorized access and modifications.
 +
The harmonization can rely on powerful matching features that establish matches between datasets which would be very time-consuming to establish manually. Just as one would expect in a work-flow, the matching results are kept for re-use and reference. The matching is usually performed against a (long) code list that is fully managed through the D4Science infrastructure. A specialized code list manager enables the ingestion of existing SDMX code lists, as well as the creation and maintenance of reference lists.
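
Matching raw values against a code list, and keeping the results for re-use, can be sketched as follows. This is only an illustration built on Python's standard <code>difflib</code> module, not the matching engine used in the infrastructure; the code list contents, the function name <code>match_names</code> and the <code>cutoff</code> value are assumptions.

```python
import difflib

# Hypothetical reference code list (code -> label); not an actual D4Science or FAO list.
CODE_LIST = {
    "COD": "Gadus morhua",
    "HER": "Clupea harengus",
    "SKJ": "Katsuwonus pelamis",
}


def match_names(names, code_list, cutoff=0.8, cache=None):
    """Map each raw name to (code, label) of its closest reference label, or None.

    Results are stored in `cache`, so names that were matched earlier are not
    matched again.
    """
    cache = {} if cache is None else cache
    codes_by_label = {label.lower(): code for code, label in code_list.items()}
    for name in names:
        if name in cache:  # earlier matching results are re-used
            continue
        hits = difflib.get_close_matches(
            name.strip().lower(), list(codes_by_label), n=1, cutoff=cutoff
        )
        if hits:
            code = codes_by_label[hits[0]]
            cache[name] = (code, code_list[code])
        else:
            cache[name] = None  # no acceptable match; left for manual review
    return cache


raw_names = ["Gadus morhua", "Clupea harengas", "Katsuwonus pelamys", "Unknown fish"]
matches = match_names(raw_names, CODE_LIST)
```

Passing the returned dictionary back in as <code>cache</code> on a later run re-uses earlier decisions, which mirrors the idea of keeping matching results for reference.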
  
 
'''Data Analysis'''
 
iMarine excels in offering advanced data analysis facilities to users. The clear separation of data and analytical resources makes it easier to use analytical tools. The infrastructure is used to store the data, and no complicated steps are needed other than to select and filter the data, and load these to the required analytical environment. For analysis, several environments are proposed, ranging from a bare-bone R-studio, parallelized R-servers, VRE-based analytical and predictive algorithms such as AquaMaps, to the Statistical manager, where users can integrate their own logic. This logic can exploit infrastructure computing resources, or interact with external Cloud or Hadoop clusters. With iMarine, the threshold for exploiting such resources is lowered considerably, making them accessible to a much wider, geographically dispersed EA-CoP.  
+
D4Science excels in offering advanced data analysis facilities to users. The clear separation of data and analytical resources also makes it easy to work with these analytical tools. The infrastructure stores the data, and no complicated steps are needed other than to select and filter the datasets and load them into the required analytical environment. For analysis, several environments are proposed, ranging from a bare-bones RStudio and parallelized R servers, through VRE-based analytical and predictive algorithms such as AquaMaps, to the Statistical Manager, where users can integrate their own logic. This logic can exploit infrastructure computing resources, or interact with external Cloud or Hadoop clusters. With D4Science, the threshold for exploiting such resources is lowered considerably, making them accessible to a much wider, geographically dispersed Ecosystem Approach Community of Practice (EA-CoP).
 +
Examples of analytical features implemented in D4Science are:
 +
* Tools include R, WPS, Hadoop, WEKA data mining and access to Cloud resources;
 +
* Algorithms in the statistical service include DBSCAN, Neural Networks, Clustering, and trend analysis.
  
'''Data reporting'''
+
'''Data reporting and visualization'''
 +
After a dataset has been added to the infrastructure, or once an analysis has been performed, the results are available in the same infrastructure to enrich reports, repositories or other infrastructure resources that can access them.
 +
Datasets are easily enriched and re-used in sometimes surprising new contexts. Some advanced facilities to work with statistical data are:
 +
* geo-referencing time-series, and display these on maps;
 +
* include time-series in reports;
 +
* data-graphs;
 +
* infrastructure services for download, sharing and sending datasets.
  
 
+
A few key services of this bundle are:
For more information on getting started with and using StatsCube, the iMarine website offers many resources. You can also register here to experience some of the components.
+
* Tabular Data
 +
** [https://gcube.wiki.gcube-system.org/gcube/index.php/Tabular_Data_Flow_Manager Tabular Data Flow Manager]
 +
** [https://gcube.wiki.gcube-system.org/gcube/index.php/Tabular_Data_Manager Tabular Data Manager]
 +
* Time Series
 +
** [https://gcube.wiki.gcube-system.org/gcube/index.php/TimeSeries TimeSeries Manager]
 +
** [https://gcube.wiki.gcube-system.org/gcube/index.php/Codelist_Manager CodeList Manager]
 +
* Data manipulation, mining and modelling
 +
** [https://gcube.wiki.gcube-system.org/gcube/index.php/Data_Transformation_Service_Specification  Data Transformation Service]
 +
** [https://gcube.wiki.gcube-system.org/gcube/index.php/Geospatial_Data_Processing  WPS-Hadoop Service]
 +
** [https://gcube.wiki.gcube-system.org/gcube/index.php/Legacy_applications_integration Legacy Application Integration]
 +
** [https://gcube.wiki.gcube-system.org/gcube/index.php/Statistical_Manager Statistical Manager]
 +
** [https://gcube.wiki.gcube-system.org/gcube/index.php/Ecological_Modeling Ecological Modeling]
 +
** [https://gcube.wiki.gcube-system.org/gcube/index.php/Signal_Processing Signal Processing]
  
 
Examples of StatsCube implementations are:
 
* ICIS; for the collection and  
+
* ICIS; a complete solution for the collection and dissemination of fisheries capture data.
* Tuna Atlas;
+
* Tuna Atlas; a focused ICIS implementation, with extended mapping capabilities provided through GeosCube.
* TimeSeries Environment;
+
* TimeSeries Environment; an open, free-to-use private solution of ICIS.
 +
* Trendylyzer; a trend-analysis toolkit for time series that have evolved over time and have accumulated inconsistencies, gaps, and discrepancies. Trendylyzer employs a range of mining and manipulation techniques to first prepare a harmonized dataset, and then discover trends, if the data allow.
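
To make the notion of a trend in a gappy time series concrete, the following sketch fits an ordinary least-squares line while skipping missing values. It does not reproduce Trendylyzer's techniques, which are not described here; the function name <code>linear_trend</code> and the sample catch figures are invented for illustration.

```python
def linear_trend(series):
    """Least-squares slope and intercept of (year, value) pairs.

    Pairs whose value is None (gaps in the series) are skipped.
    """
    points = [(x, y) for x, y in series if y is not None]
    n = len(points)
    if n < 2:
        raise ValueError("need at least two observed values")
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("all observations share the same x value")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented yearly series with two gaps (None).
catches = [(2000, 10.0), (2001, None), (2002, 14.0), (2003, 16.0), (2004, None), (2005, 20.0)]
slope, intercept = linear_trend(catches)
```

Skipping gaps is only one possible choice; interpolating them first, or using a method that is robust to skewed data, may be more appropriate for real datasets.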
  
 
== GeosCube ==
 
GeosCube is the iMarine answer to the large and complex issue of understanding fisheries and biodiversity data in the spatial domain. Through GeosCube, spatial services are offered to consumers of the iMarine infrastructure, be they other iMarine tools or VRE's, or external organizations wishing to use iMarine's web-services.  
+
'''GeosCube''' is the D4Science answer to the large and complex issue of understanding fisheries and biodiversity data in the spatial domain. Through GeosCube, spatial services are offered to consumers of the infrastructure, be they other D4Science tools or VREs, or external organizations wishing to use D4Science web services.
  
Through GeosCube iMarine aims to offer an INSPIRE directive compliant toolset that will enable the generation and management of geospatial explicit data for practioners who have no resources to develop and maintain their own spatial data infrastructure. From the onset of iMarine GeosCube was seen as a service provider to several business cases, and not as a complete Spatial Data Infrastructure. The GeosCube bundles a range of OGC compliant resources that can be either made available in it's entirety, or as a selection of services that can be mounted is a customized environment, such a VRE. The bundle currently offers several tools:
+
Through GeosCube, D4Science aims to offer an INSPIRE Directive-compliant bundle of services that will enable the generation and management of geospatially explicit data for practitioners who have no resources to develop and maintain their own spatial data infrastructure. From the outset of D4Science, GeosCube was seen as a service provider to several business cases. The services, standards and protocols that together make up the bundle rely on OGC web services (W*S), GeoNetwork, GeoServer, and THREDDS. In D4Science, a catalogue is implemented using the CSW protocol through a GeoNetwork service. GeosCube bundles a range of OGC-compliant resources that can be made available either in their entirety or as a selection of services that can be mounted in a customized environment, such as a VRE.
 +
These VREs are vertically integrated and horizontally interoperable. They rest on the gCube infrastructure and are thus managed through a well-defined environment, while at the same time benefiting seamlessly from the data and processing resources made available through that infrastructure.
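
As an illustration of how an external client might address one of the OGC services in this bundle, the sketch below builds a WMS 1.3.0 <code>GetMap</code> request URL. The parameter names are those of the WMS standard; the endpoint <code>https://example.org/geoserver/wms</code>, the layer name <code>demo:species_distribution</code> and the function name <code>wms_getmap_url</code> are placeholders, not actual D4Science resources.

```python
from urllib.parse import urlencode


def wms_getmap_url(endpoint, layer, bbox, width=800, height=400):
    """Build a WMS 1.3.0 GetMap URL for one layer.

    `bbox` is (min_lon, min_lat, max_lon, max_lat); CRS:84 is used so that
    the axis order is longitude, latitude.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the server's default style
        "CRS": "CRS:84",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return endpoint + "?" + urlencode(params)


# Placeholder endpoint and layer name, for illustration only.
url = wms_getmap_url(
    "https://example.org/geoserver/wms",
    "demo:species_distribution",
    (-180, -90, 180, 90),
)
```

The resulting URL can be opened in a browser or passed to any HTTP client; the layers a real server offers are listed in its <code>GetCapabilities</code> response.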
  
# Data discovery, access, and vizualization
+
GeosCube bridges the gap between powerful infrastructure-based geospatial tools and data, and lightweight web map solutions with limited processing capacity. It thus makes these powerful tools available to resource-limited users and organizations.
# Compare maps, download and share. Enrich datasets emanating from other bundles with env. info
+
# analyze geosatial information. transects, mathematical comparison, modeling
+
  
OGC-based WXS environment;
+
GeosCube bundles the tools to:
* WMS
+
* WFS
+
* WCS
+
* WPS
+
  
GeosCube services are made available through iMarine portlets, VRE's, remote services and OGC compliant tools for discovery and access. These can be either accessed as individual components or services (see the detailed descriptions here), or pre-configured in a bundle that supports a range of services. Some examples of such bundles include:
+
* Upload large datasets and overlay them with thousands of other layers;
 +
* Share edit or view access with small or large groups;
 +
* Export data to standard formats;
 +
* Make use of powerful online geospatial tools;
 +
** Predictive mapping using world-class algorithms such as AquaMaps;
 +
** Analytical features such as clustering and trend-analysis with the custom-built Statistical Manager;
 +
** Legacy applications for e.g. interpolation and map comparison using WPS/Hadoop;
 +
** Use our DIY approach to convert and host your application;
 +
* Georeference statistical data, occurrence data, fact-sheets, and documents online;
 +
* Publish one’s data to the world or to just a few collaborators.
  
* GeoExplorer;  
+
GeosCube is constantly being enriched with features. We are working hard on:
* GisViewer;  
+
* Annotation and commenting on maps;
 +
* Create and edit maps and link map features to rich media content including LOD;
 +
* Validation of geospatially explicit data such as names, location, and movements;
 +
* Interpolate environmental data sets to add information to occurrence data; 
 +
* Mobile client;
 +
* Field-data collection.
  
Smaller components that leverage a specific task at infrastructure level are:
+
Interested users can select services from this bundle described in detail here:  
* Apache Hadoop MapReduce to offer OGC Web Processing Service (WPS) through a web interface for the dynamic deployment of geospatial processes;
+
* [https://wiki.i-marine.eu/index.php/Catalogue:Services#Data_Visualization Data Visualization]
* TiffUploader Algorithm: a visualization purpose process to upload each layer from a map file, to a GeoServer WMS instance;
+
* Intersection Algorithm: a simple process based on the 52 North WPS algorithm, to make an intersection of two Polygons in input;
+
* Resampler Algorithm: a process that performs a resampling of a geospatial layer in netCDF-CF.
+
  
For users that only need a dataset, search and retrieve facilites are available
+
Example products that rely on services made available through this bundle in the D4Science infrastructure are:
* GeoNetwork
+
* AquaMaps; use this State-of-the-art suite to generate predictive species distribution maps;
* THREDDS
+
* ICIS; Georeference Statistical datasets;
* ..  
+
* Species Products Discovery species occurrence geospatial datasets disovery and sharing (KML / GML);
 +
* GeoExplorer; Vizualize species information, environmental information, borders and competence areas and other geospatial explicit data. View details, select layers of information and share the results.
  
Example datasets that can be discovered already in the iMarine infrastructure are:
+
== ConnectCube ==
* Species distribution map-products,
+
* Species occurrence geospatial datasets (KML / GML)
+
  
== PoliCube ==
+
'''ConnectCube''' aims to deliver information to policymakers from a variety of sources as an integrated view. These are generated using a variety of approaches, including semantic technologies.
  
The primary aim of PoliCube is to deliver information to policy makers from a variety of sources as an integrated view generated using a variety of approaches, including semantic technologies.  
+
ConnectCube offers flexible sharing, storage, reporting, search and retrieval, aggregation and projection facilities. These are primarily offered as data-driven indicators and topical fact sheets. These facilities can only be effective if a modern toolset is available to enrich or annotate existing data with relevant information in the form of e.g. uri's.  
  
PoliCube offers flexible reporting, search and retrieval, aggregation and projection facilities. These are primarily offered as data-driven indicators and topical fact sheets. These facilities can only be effective if a modern toolset is available to enrich or annotate existing data with relevant information in the form of e.g. uri's. This
+
ConnectCube includes several semantic technologies. One important objective is to identify and link equivalent concepts from different resources, in order to allow a harmonized search over datasets. The current semantic network includes entities and relationships from the domains of marine species, water areas, land areas, exclusive economic zones, and capture. It serves software applications in the domain of statistics, and GIS. The main information outlets are currently semantic factsheets. The content is also exposed via either SPARQL endpoints (suitable for semantic applications), or via JAVA API to be embedded in consumers' application code (one could also see the [http://wiki.i-marine.eu/index.php/Semantic_technologies_cluster Semantic Cluster technologies wiki page]).
  
The use of an infrastructure enables to focus on the needs of policy makers, that need to rely on dynamic reports, extracted near-real time from data coming in from multiple directions, and with varying quality and accessiblity policies attached to these data flows.
+
The use of infrastructure enables to focus on the needs of policy makers, that need to rely on dynamic reports, extracted near-real-time from data coming in from multiple directions, and with varying quality and accessibility policies attached to these data flows.
  
# Organizational features of iMarine; Workspace, messaging, mailing, user management
+
* Organizational features of D4Science; Workspace, messaging, emailing, user management
# Social tool
+
* Social tool;
# Semantic search and factsheets
+
* Semantic search and fact-sheets;
# Plugins for remote information (OAI, OpenSearch)
+
* Ontology engineering and use, especially in the fisheries domain;
 +
* Linked Open Data engineering and maintenance;
 +
* Plugins for remote information (OAI, OpenSearch)
 +
 
 +
Expected products that use semantic services from the ConnectCube bundle are:
 +
* Ecoscope; semantic fact sheets for tuna fisheries;
 +
* Smartfish; semantic factsheets on top of 3 data repositories;
 +
* FishFinder; factsheets of marine species enriched with semantic annotations.
 +
 
 +
Some of the most indicative services for this bundle are:
 +
* [https://gcube.wiki.gcube-system.org/gcube/index.php/X-Search X-Search]
 +
* [https://gcube.wiki.gcube-system.org/gcube/index.php/Search_Planning_and_Execution_Specification Search Planning and Execution]
 +
* [https://gcube.wiki.gcube-system.org/gcube/index.php/Data_Sources_Specification Data Sources Specification]
 +
 
 +
Examples of products that already rely on services offered through this bundle are:
 +
* The reporting VRE's [https://i-marine.d4science.org/group/fcpps FCPPS] and [https://i-marine.d4science.org/group/fishfindervre FishFinderVRE];
 +
* The [https://i-marine.d4science.org/group/isearch iSearch VRE];
 +
* All VRE's equipped with the social tools and workspace.
 +
 
 +
== AppsCube ==
 +
 
 +
The rapidly growing use of mobile apps for data collection and dissemination requires that content and reference data are managed from an integrated data perspective. With ever more versatile and demanding apps, data often cannot be kept in one central repository that fits all sizes. Very often, apps mash-up data from e.g. geospatial and statistical data resources, or, when used in data collection, rely on constantly updated reference data, such as of names of species, vessel characteristics, or local reporting requirements.
 +
 
 +
Modern apps D4Science an infrastructure that was designed specifically to deal with data discovery, access, and manipulation features in mind, and combines this with search and retrieval functionality over multiple resources. With the D4Science infrastructure, D4Science offers a very powerful backbone to mobile apps.
 +
 
 +
In D4Science, mobile apps are considered as data clients for data managed through the infrastructure, which are exposed to the apps (or vice-versa) through web-services. Examples are map-display in the AppliFish mobile app, and the infrastructure search enabled in the search mobile app. The D4Science infrastructure can make data available to apps through reliable connectors and can offer services that collect and validate mobile application data.
 +
 
 +
The first mobile applications in D4Science that provide evidence of the suitability of the infrastructure are:
 +
* AppliFish; The FAO species fact sheets enriched with domain-specific data (+4000 downloads!)
 +
* MobileSearch;
  
 
== IceCube ==
 
== IceCube ==
  
A key benefit of iMarine is the ease to set up scalable data processing solutions. A scalable solution may be needed because you have to manage any combination of a lot of users, a lot of data, a lot of processing, and a lot of new functionality. This requires expertise that is usually not found in one place. An infrastructure can offer more than one solution; offering a dedicated computing environment, parallelization, access to a grid or cloud environment, or outsourcing computations to external infrastructures are all options to consider. With iMarine expertise, you can ask for a technology solution, where several options can be discussed. the services available on demand can be separeated in several categories:
+
A key benefit of D4Science is the ease to set up scalable data processing solutions. A scalable solution may be needed because you have to manage any combination of a lot of users, a lot of data, a lot of processing, and a lot of new functionality. This requires expertise that is usually not found in one place. An infrastructure can offer more than one solution; offering a dedicated computing environment, parallelization, access to a grid or cloud environment, or outsourcing computations to external infrastructures are all options to consider. With D4Science expertise, you can ask for a technology solution, where several options can be discussed.  
 +
 
 +
The D4Science Integrated Computation Environment Bundle (ICE-Cube) aims to speed up not only the computational processes, but also the administrative and organizational process to select, tune, and test a new infrastructure.
 +
 
 +
The services available on-demand can be separated into several categories:
  
 
* Manage administrative scalability
 
* Manage administrative scalability
** Manage users
+
** Manage users;
** Manage virtual Organizations
+
** Manage virtual Organizations.
  
 
* Manage Functionality
 
* Manage Functionality
**
+
** Manage data in a pre-processing environment;
**
+
** Select the processes you wish to apply to your data;
 +
** Perform the computation and monitor progress, intervene if needed;
 +
** Share the results, or use in another process in the same infrastructure, eliminating the need to transfer data;
 +
** Keep a trail of the applied processes with the data results, boosting reproducibility and credibility.
  
 
* Manage Load scalability
 
* Manage Load scalability
 +
** If your computations take more time then expected, or are growing fast in number or size, more resources can be dynamically added;
 +
** If your computation is complex or unstable, D4Science can offer expertise from trained computer experts to analyze the code and propose alternative solutions.
  
 
* Manage geographic scalability
 
* Manage geographic scalability
** Keep your data an processes together to reduce
+
** Keep your data and processes together to ensure confidentiality;
** Bring your computation to your data to reduce bandwith use
+
** Bring your computation to your data to reduce bandwidth use.  
 
+
== AppsCube ==
+
 
+
The quickly growing use of mobile apps for data collection and dissemination requires that content and reference data are managed from an integrated data perspective. With ever more versatile and demanding apps, data often cannot be kept in one central repository that fits all sizes. Very often, apps mash up data from e.g. geospatial and statistical data resources, or when collection data, rely on constantly updated refrence data, such as on names of species, vessel characteristics, or local reporting requiremetns.
+
 
+
To manage a multitude of data collection and dissemination apps, an infrastructure that offers ...  
+
  
In iMarine
+
ICE-Cube is available and ready to be further exploited.

Latest revision as of 18:59, 3 December 2020

The D4Science infrastructure, an e-Infrastructure operated by the D4Science.org initiative, combines the functionality of more than 500 components into a coherent and centrally managed infrastructure of hardware, software, and data resources. Together, these offer a platform that can host a variety of applications. These applications share a common theme: they provide a service to a Community of Practice, a group of individuals held together by their connections with each other and by shared problems or areas of interest. Unlike other infrastructures that boast size, power, performance, or the latest technology, D4Science puts the community first. This does not imply we make concessions on quality or performance, but we see it as our mission to offer quality and performance to communities that have no resources of their own to jump high hurdles.

The infrastructure resembles an archipelago where applications emerge as islands of services, resting on an underlying infrastructure bedrock. The islands specialize in one or more domains, yet are not isolated 'atolls'. Every island is well connected to others, and island-hopping is strongly encouraged. Each island offers a standard set of features that can be extended by selecting services from several topical bundles.

The infrastructure currently offers 4 main domain bundles that can be customized and/or enriched into flexible, purpose-built applications. Each application in the infrastructure is tightly integrated with the underlying gCube enabling software, and can access and re-purpose data from other applications.

Through the enabling environment of gCube, all users benefit from Infrastructure Services, but where to start? For new users, D4Science offers several domain-oriented solutions for 4 categories of users: data managers and analysts, biologists, spatial data managers, and policy-oriented 'omnivores'. For each of these, a bundle of relevant gCube software components is available in a 'Cube'. This bundle can be limited so that users receive (and pay for) only those resources actually needed or consumed. A bundle can also be extended with resources coming from other bundles; our aim is to offer bundles characterized by the domain tools and not by domain boundaries. In our experience, most experts prefer to manage their information in a bundle of domain-specific software and are only consumers of data from other bundles. Thus, in most use scenarios, a user would be a data manager in one bundle, but only a consumer in another.

The 4 key applications that D4Science has delivered and continues to enrich are:

  • BiolCube; focuses on the management and interpretation of biodiversity data.
  • StatsCube; a complete full life-cycle data framework, from observational data to aggregated data repositories enriched with validation and analytical tools.
  • GeosCube; tightly connected to BiolCube, this framework, based on OGC-compliant tools and services, manages the storage and interpretation of geospatial explicit information, including WPS processing.
  • ConnectCube; brings semantic technologies for publishing structured data so that it can be interlinked and become more useful to end-users, enabling them to produce Linked Open Data (LOD) and to share information in a way that can be read automatically by computers. This enables data from different sources to be connected and queried.

The bundle approach, by itself an abstraction over a host of services, is expected to offer more 'flavors' in the near future. For instance, a focused approach for infrastructure support for Mobile Apps is foreseen:

  • AppsCube; offers an integrated approach to mobile app development. The infrastructure organizes the content and data exchange with mobile apps. Please note that the app itself is not developed with D4Science; rather, it relies on the infrastructure to maintain and manage the data collected with and exposed through the app.
  • IceCube; an Integrated Computing Environment that offers access to infrastructure and cloud computing resources "as a Service". An instance of this Cube will offer users access to predefined data and to algorithms that can be applied to these data.

BiolCube

BiolCube is available as a suite that packs many useful features into one research environment, where marine ecologists are offered a complete private work-space to manage species names and occurrence data. The main areas where BiolCube offers services are:

Taxonomic and occurrence data discovery and management

  • Occurrence data finder: Download public datasets from world-class biodiversity occurrence data repositories to your private environment, where you can prepare datasets for use in further analysis with D4Science tools for data curation, filtering, merging and duplicate detection. Occurrence data can be directly visualized on maps using the geo-explorer, downloaded in several formats, and shared with or sent to other environments.
  • Species name finder: Not sure about a species name? D4Science offers tools to search, download and verify taxonomic and vernacular names of marine species.
  • Species name matcher: Correcting spelling mistakes or incomplete names can be very time-consuming. With D4Science tools you can validate the species names in your data to ensure they comply with the standard of your choice. D4Science offers powerful matching and reconciliation services, already in use at FAO, to identify close matches to the names in your datasets. The infrastructure makes several key reference datasets available for consultation and reconciliation. These include the FAO ASFIS species list, FishBase for finfishes, and WoRMS, the World Register of Marine Species. If you wish, you can add your own reference list.
  • Environmental enrichment of data: In a shared service with GeosCube, this service adds environmental information to occurrence data to improve their quality and usefulness in modelling and analytical exercises. The service provides an estimate of a range of dynamically computed environmental parameters such as water temperature, ocean color, salinity, aragonite content, or BOD. It can identify the nearest observations in space and time and will return a computed average or nearest observation that can document an occurrence. The innovative D4Science tools let you specify what 'nearest' means, i.e. a distance, a distance over a gradient, a seasonal average, or a depth range.
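To give a feel for what reconciling species names against a reference list involves, the following is a minimal sketch in Python. It is not the D4Science matching service: it uses the standard-library `difflib` module as a stand-in, and the reference list, the function name `match_species_name` and the similarity cutoff are all assumptions made for illustration.

```python
from difflib import get_close_matches

# Hypothetical reference list; a real one would come from a source
# such as the FAO ASFIS list, FishBase or WoRMS.
REFERENCE_NAMES = [
    "Thunnus albacares",
    "Thunnus alalunga",
    "Katsuwonus pelamis",
    "Gadus morhua",
]


def match_species_name(name, reference=REFERENCE_NAMES, cutoff=0.8):
    """Return the closest reference name, or None if nothing is similar enough."""
    # Compare case-insensitively, but return the reference spelling.
    lookup = {ref.lower(): ref for ref in reference}
    hits = get_close_matches(name.strip().lower(), list(lookup), n=1, cutoff=cutoff)
    return lookup[hits[0]] if hits else None


if __name__ == "__main__":
    for raw in ["Thunnus albacore", "gadus morhua", "zzzz"]:
        print(raw, "->", match_species_name(raw))
```

A higher cutoff makes the matcher stricter; names that find no match above the cutoff are returned as `None` so that they can be reviewed by hand.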

Modeling and analysis of distribution data

  • Biodiversity mapping tools: The first D4Science species distribution and biodiversity mapping tool enabled the production of the well-known AquaMaps. With D4Science, the generation became faster and more robust, and results are shared in a collaborative environment. In addition to AquaMaps, many other biodiversity analytical and predictive tools are available. These include the openModeller toolset and custom-built analytical services driven by neural networks.
  • Species fact-sheets generator: With scientists spread over the globe, generating consistent information sheets on marine species is no sinecure. That is why the FishFinderVRE was designed. It offers a complete templating and reporting work-flow operated by scientists, for scientists. The results, species fact-sheets, can be disseminated in a variety of formats, in particular those established by FAO for its now famous species and regional catalogues, field guides, and the more recent pocket guides.
  • Trend-analysis of data: In a shared service with StatsCube, Trendylyzer offers services to identify and visualize trends in time-series of data. Trendylyzer was developed to specifically address skewness and gaps in datasets.
  • Spatial analysis of data: A shared service with StatsCube offers clustering, probability, and other spatial analytical features.

BiolCube is an independent yet not isolated bundle of specialized services for marine ecologists and natural aquatic resource managers. Well embedded in the D4Science e-Infrastructure, it provides access to auxiliary services that turn BiolCube into a multi-purpose toolbox for biodiversity data analysis. D4Science enables near-seamless access to powerful statistical analysis software through StatsCube, and to advanced plotting and geospatial data production through GeosCube.

With BiolCube and StatsCube services combined, developers are now building an integrated environment where species distribution can be studied in space and over time, with occurrence data analyzed using measured environmental observations rather than estimated large-scale average values.

The services that are most characteristic of this bundle are:

If you wish to learn more about using BiolCube or specific services, please contact us.

StatsCube

StatsCube offers a complete data suite to manage the entire data cycle from collection to archiving. With D4Science technologies, exciting new capabilities are added to the life-cycle management and analysis of data, especially time-series data. StatsCube is developed using state-of-the-art open-source components that are brought together in a managed infrastructure. This enables a very cost-effective offer to resource-poor institutes in need of sophisticated data services. Other benefits are the availability of shared services for reference data management, and harmonization of data repository services.

StatsCube relies on continued support and ongoing development of a bundle of services. This bundle offers services that together support a complete life-cycle for statistical data, but can also connect to services offered through other bundles to establish a network of cross-domain services.

The StatsCube bundle offers a set of services available to VRE (Virtual Research Environment) managers. They can select from this bundle to compose one or more VREs, and decide who can access such services. This allows for a fine-grained approach to sometimes complex data work-flows, where data flow from detailed field-level data through several aggregation and review stages until summary statistics can be produced. At each stage of such a work-flow, other resources can be mobilized in support of specific activities such as geo-referencing, enrichment with environmental data, statistical modelling or analysis. With StatsCube, D4Science implements key data services:

Data Work-flow: If you need to manage data-flows, D4Science offers life-cycle support where data enter the system as observations or batch data, and can then be harmonized and validated before being added to a repository. Not only are data well described by metadata during this process, but the processing steps are also captured as process metadata. The entire process is under the control of a 'visor' that protects the data from unauthorized access and modifications. The harmonization can rely on powerful matching features that establish matches between datasets that would be very time-consuming to establish manually. Just as one would expect in a work-flow, the matching results are kept for re-use and reference. The matching is usually performed against a (long) code list that is fully managed through the D4Science infrastructure. A specialized code list manager enables the ingestion (of existing SDMX code lists), creation, and maintenance of reference lists.

Data Analysis: D4Science excels in offering advanced data analysis facilities to users. The clear separation of data and analytical resources also makes it easy to work with these analytical tools. The infrastructure stores the data, and no complicated steps are needed other than to select and filter the datasets and load these into the required analytical environment. For analysis, several environments are proposed, ranging from a bare-bones RStudio and parallelized R servers, through VRE-based analytical and predictive algorithms such as AquaMaps, to the Statistical manager, where users can integrate their own logic. This logic can exploit infrastructure computing resources, or interact with external Cloud or Hadoop clusters. With D4Science, the threshold for exploiting such resources is lowered considerably, making them accessible to a much wider, geographically dispersed EA-CoP (Ecosystem Approach Community of Practice). Examples of analytical features implemented in D4Science are:

  • Tools include R, WPS, Hadoop, WEKA data mining and access to Cloud resources;
  • Algorithms in the statistical service include DBSCAN, Neural Networks, Clustering, and trend analysis.

Data reporting and visualization: After a dataset has been added to the infrastructure, or once an analysis has been performed, the results are available in the same infrastructure to enrich reports, repositories or other infrastructure resources that can access them. Datasets are easily enriched and re-used in sometimes surprising new contexts. Some advanced facilities to work with statistical data are:

  • geo-referencing time-series, and display these on maps;
  • include time-series in reports;
  • data-graphs;
  • infrastructure services for download, sharing and sending datasets.

A few key services of this bundle are:

Examples of StatsCube implementations are:

  • ICIS; a complete solution for the collection and dissemination of fisheries capture data.
  • Tuna Atlas; a focused ICIS implementation, with extended mapping capabilities provided through GeosCube.
  • TimeSeries Environment; An open free-to-use private solution of ICIS.
  • Trendylyzer; A trend-analysis toolkit for time series that have evolved over time, and have incorporated inconsistencies, gaps, and discrepancies. Trendylyzer employs a range of mining and manipulation techniques to first prepare a harmonized data-set, and then discover trends, if the data allow.

GeosCube

GeosCube is the D4Science answer to the large and complex issue of understanding fisheries and biodiversity data in the spatial domain. Through GeosCube, spatial services are offered to consumers of the infrastructure, be they other D4Science tools or VREs, or external organizations wishing to use D4Science web-services.

Through GeosCube, D4Science aims to offer an INSPIRE Directive-compliant bundle of services that will enable the generation and management of geospatial explicit data for practitioners who have no resources to develop and maintain their own spatial data infrastructure. From the outset of D4Science, GeosCube was seen as a service provider to several business cases. The set of services, standards and protocols that together comprise the bundle relies on OGC W*S services (WMS, WFS, WCS, WPS), GeoNetwork, GeoServer, and THREDDS. In D4Science, a catalogue is implemented using the CS-W protocol through a GeoNetwork service. GeosCube bundles a range of OGC-compliant resources that can be made available either in their entirety or as a selection of services that can be mounted in a customized environment, such as a VRE. These VREs are vertically integrated and horizontally interoperable. They rest on the gCube infrastructure, and are thus managed through a well-defined environment, while at the same time seamlessly benefiting from data and processing resources made available through that infrastructure.
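As an illustration of how an external client might consume one of these OGC services, the following minimal Python sketch builds a WMS 1.3.0 `GetMap` request URL. The request parameters are those defined by the WMS standard; the endpoint URL and layer name are placeholders invented for this example and do not refer to an actual D4Science service.

```python
from urllib.parse import urlencode


def build_wms_getmap_url(base_url, layer, bbox, width=800, height=400,
                         crs="CRS:84", image_format="image/png"):
    """Build a WMS 1.3.0 GetMap URL.

    bbox is (min_lon, min_lat, max_lon, max_lat); CRS:84 uses
    longitude/latitude axis order.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",  # empty value requests the default style
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical endpoint and layer name, for illustration only.
    print(build_wms_getmap_url(
        "https://example.org/geoserver/wms",
        "demo:species_distribution",
        (-180, -90, 180, 90),
    ))
```

The resulting URL can be opened in a browser or passed to any HTTP client; a lightweight web map only needs to issue such requests, while the rendering work stays on the server side.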

GeosCube bridges the gap between powerful infrastructure-based geospatial tools and data, and lightweight web map solutions with limited processing capacity. It thus enables the use of these powerful tools by resource-limited users and organizations.

GeosCube bundles the tools to:

  • Upload large datasets and overlay them with thousands of other layers;
  • Share edit or view access with small or large groups;
  • Export data to standard formats;
  • Make use of powerful online geospatial tools;
    • Predictive mapping using world-class algorithms such as AquaMaps;
    • Analytical features such as clustering and trend-analysis with the custom-built statistical manager;
    • Legacy applications for e.g. interpolation and map comparison using WPS/Hadoop;
    • Use our DIY approach to convert and host your application;
  • Georeference statistical data, occurrence data, fact-sheets, and documents online;
  • Publish one’s data to the world or to just a few collaborators.

GeosCube is constantly being enriched with features. We are working hard on:

  • Annotation and commenting on maps;
  • Create and edit maps and link map features to rich media content including LOD;
  • Validation of geospatial explicit data such as names, location, and movements;
  • Interpolate environmental data sets to add information to occurrence data;
  • Mobile client;
  • Field-data collection.

Interested users can select services from this bundle, described in detail here:

  • Data Visualization (https://wiki.i-marine.eu/index.php/Catalogue:Services#Data_Visualization)

Example products that rely on services made available through this bundle in the D4Science infrastructure are:

  • AquaMaps; use this state-of-the-art suite to generate predictive species distribution maps;
  • ICIS; georeference statistical datasets;
  • Species Products Discovery; discovery and sharing of species occurrence geospatial datasets (KML / GML);
  • GeoExplorer; visualize species information, environmental information, borders and competence areas, and other geospatial explicit data. View details, select layers of information and share the results.

ConnectCube

ConnectCube aims to deliver information to policymakers from a variety of sources as an integrated view. These views are generated using a variety of approaches, including semantic technologies.

ConnectCube offers flexible sharing, storage, reporting, search and retrieval, aggregation and projection facilities. These are primarily offered as data-driven indicators and topical fact sheets. These facilities can only be effective if a modern toolset is available to enrich or annotate existing data with relevant information in the form of e.g. URIs.

ConnectCube includes several semantic technologies. One important objective is to identify and link equivalent concepts from different resources, in order to allow a harmonized search over datasets. The current semantic network includes entities and relationships from the domains of marine species, water areas, land areas, exclusive economic zones, and capture. It serves software applications in the domains of statistics and GIS. The main information outlets are currently semantic factsheets. The content is also exposed either via SPARQL endpoints (suitable for semantic applications) or via a Java API to be embedded in consumers' application code (see also the Semantic technologies cluster wiki page, http://wiki.i-marine.eu/index.php/Semantic_technologies_cluster).
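For readers unfamiliar with SPARQL endpoints, the following minimal Python sketch shows how a client could prepare a query request according to the SPARQL 1.1 Protocol (a GET request with a `query` parameter and a JSON results media type). The endpoint URL is a placeholder and the query is a generic label lookup; neither reflects the actual ConnectCube endpoints or vocabulary, which are not described here.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Generic example query: list ten resources with their labels.
EXAMPLE_QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?concept ?label
WHERE { ?concept rdfs:label ?label }
LIMIT 10
"""


def build_sparql_request(endpoint, query):
    """Prepare (but do not send) a SPARQL query as an HTTP GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for results in the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


if __name__ == "__main__":
    # Hypothetical endpoint, for illustration only.
    request = build_sparql_request("https://example.org/sparql", EXAMPLE_QUERY)
    print(request.get_method(), request.full_url)
```

Sending the prepared request, for example with `urllib.request.urlopen`, would return the result bindings as JSON, which an application can then merge with its own statistical or GIS data.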

Using the infrastructure makes it possible to focus on the needs of policy makers, who need to rely on dynamic reports extracted in near real time from data that come in from multiple directions, with varying quality and accessibility policies attached to these data flows.

  • Organizational features of D4Science; Workspace, messaging, emailing, user management
  • Social tool;
  • Semantic search and fact-sheets;
  • Ontology engineering and use, especially in the fisheries domain;
  • Linked Open Data engineering and maintenance;
  • Plugins for remote information (OAI, OpenSearch)

Expected products that use semantic services from the ConnectCube bundle are:

  • Ecoscope; semantic fact sheets for tuna fisheries;
  • Smartfish; semantic factsheets on top of 3 data repositories;
  • FishFinder; factsheets of marine species enriched with semantic annotations.

Some of the most indicative services for this bundle are:

Examples of products that already rely on services offered through this bundle are:

  • The reporting VREs FCPPS and FishFinderVRE;
  • The iSearch VRE;
  • All VREs equipped with the social tools and workspace.

AppsCube

The rapidly growing use of mobile apps for data collection and dissemination requires that content and reference data are managed from an integrated data perspective. With ever more versatile and demanding apps, data often cannot be kept in one central repository that fits all sizes. Very often, apps mash up data from, for example, geospatial and statistical data resources, or, when used in data collection, rely on constantly updated reference data, such as names of species, vessel characteristics, or local reporting requirements.

Modern apps need an infrastructure that was designed specifically with data discovery, access, and manipulation features in mind, and that combines these with search and retrieval functionality over multiple resources. With its infrastructure, D4Science offers a very powerful backbone to mobile apps.

In D4Science, mobile apps are considered data clients for data managed through the infrastructure, which are exposed to the apps (or vice versa) through web services. Examples are the map display in the AppliFish mobile app and the infrastructure search enabled in the search mobile app. The D4Science infrastructure can make data available to apps through reliable connectors and can offer services that collect and validate mobile application data.
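
To make the idea of validating mobile application data against reference data concrete, here is a minimal sketch. The record fields, the species list, and the validation rules are all hypothetical and chosen for illustration; they are not the checks performed by any D4Science service. In practice the reference data would be fetched from an infrastructure web service rather than hard-coded.

```python
from dataclasses import dataclass

# Hypothetical reference data, hard-coded here only for illustration.
REFERENCE_SPECIES = {"Thunnus albacares", "Katsuwonus pelamis"}


@dataclass
class CatchRecord:
    """A hypothetical record submitted by a data-collection app."""
    species: str
    weight_kg: float
    latitude: float
    longitude: float


def validate(record: CatchRecord, reference_species: set) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if record.species not in reference_species:
        errors.append(f"unknown species: {record.species}")
    if record.weight_kg <= 0:
        errors.append("weight_kg must be positive")
    if not -90 <= record.latitude <= 90:
        errors.append("latitude out of range")
    if not -180 <= record.longitude <= 180:
        errors.append("longitude out of range")
    return errors


good = CatchRecord("Thunnus albacares", 12.5, -4.6, 55.4)
bad = CatchRecord("Unknown fish", -1.0, 95.0, 10.0)
print(validate(good, REFERENCE_SPECIES))
print(validate(bad, REFERENCE_SPECIES))
```

Returning a list of problems instead of raising on the first one lets an app show the user every issue with a record in a single round trip.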

The first mobile applications in D4Science that provide evidence of the suitability of the infrastructure are:

  • AppliFish; The FAO species fact sheets enriched with domain-specific data (+4000 downloads!)
  • MobileSearch;

IceCube

A key benefit of D4Science is the ease of setting up scalable data processing solutions. A scalable solution may be needed because you have to manage any combination of a lot of users, a lot of data, a lot of processing, and a lot of new functionality. This requires expertise that is usually not found in one place. An infrastructure can offer more than one solution; offering a dedicated computing environment, parallelization, access to a grid or cloud environment, or outsourcing computations to external infrastructures are all options to consider. With D4Science expertise, you can ask for a technology solution, where several options can be discussed.

The D4Science Integrated Computation Environment Bundle (ICE-Cube) aims to speed up not only the computational processes, but also the administrative and organizational process to select, tune, and test a new infrastructure.

The services available on-demand can be separated into several categories:

  • Manage administrative scalability
    • Manage users;
    • Manage virtual Organizations.
  • Manage Functionality
    • Manage data in a pre-processing environment;
    • Select the processes you wish to apply to your data;
    • Perform the computation and monitor progress, intervene if needed;
    • Share the results, or use in another process in the same infrastructure, eliminating the need to transfer data;
    • Keep a trail of the applied processes with the data results, boosting reproducibility and credibility.
  • Manage Load scalability
    • If your computations take more time than expected, or are growing fast in number or size, more resources can be dynamically added;
    • If your computation is complex or unstable, D4Science can offer expertise from trained computer experts to analyze the code and propose alternative solutions.
  • Manage geographic scalability
    • Keep your data and processes together to ensure confidentiality;
    • Bring your computation to your data to reduce bandwidth use.
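
The idea of keeping a trail of applied processes alongside the results can be sketched as follows. This is an illustrative sketch only and not the mechanism ICE-Cube uses: the helper `run_with_trail`, the two example processes, and the record layout are all hypothetical. Each step records the process name, its parameters, and SHA-256 fingerprints of its input and output, so that a chain of steps can be checked and repeated later.

```python
import hashlib
import json


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest identifying a piece of data."""
    return hashlib.sha256(data).hexdigest()


def run_with_trail(name, func, data: bytes, trail: list, **params) -> bytes:
    """Apply func to data and append a provenance record to trail."""
    result = func(data, **params)
    trail.append({
        "process": name,
        "params": params,
        "input_sha256": fingerprint(data),
        "output_sha256": fingerprint(result),
    })
    return result


# Two hypothetical processes standing in for real computations.
def to_upper(data: bytes) -> bytes:
    return data.upper()


def keep_first(data: bytes, n: int) -> bytes:
    return b"\n".join(data.split(b"\n")[:n])


trail = []
raw = b"tuna,12\ncod,7\nhake,3"
step1 = run_with_trail("to_upper", to_upper, raw, trail)
step2 = run_with_trail("keep_first", keep_first, step1, trail, n=2)
print(json.dumps(trail, indent=2))
```

Because the output fingerprint of one step equals the input fingerprint of the next, a reader of the trail can confirm that the steps were chained on exactly the data claimed.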

ICE-Cube is available and ready to be further exploited.