CAPAS Business Case: results & outlook

Tragsa has developed the CAPAS service, which integrates multi-sectoral data to support better assignment of Common Agriculture Policy (CAP) funds to farmers and land owners. Several external datasets, such as LiDAR, Copernicus Sentinel-2 and Protected Sites from the Spanish Environment and Agriculture Ministry, among others, have been used to improve the Spanish Land Parcel Identification System (LPIS).

Products from LiDAR data

LiDAR files are collections of points stored as tuples that represent longitude, latitude, and elevation. The data, provided by the Spanish National Geographic Institute (IGN), was processed with automatic algorithms to detect landscape elements (copses and isolated trees) within agricultural parcels.

Protected sites and ecological value report

On the one hand, the density of isolated trees and the presence of copses were evaluated with a score named Landscape Elements Value. On the other hand, the presence or absence of protected areas that intersect subplots was evaluated with a score named Protected Sites Value. The sum of the Protected Sites Value and the Landscape Elements Value is the Ecological Value.
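As a minimal sketch of how the two scores combine, the Python snippet below sums a Landscape Elements Value and a Protected Sites Value for one subplot. The class, the field names and the numeric scores are hypothetical; the actual scoring scale used by CAPAS is not given in this post.

```python
from dataclasses import dataclass


@dataclass
class Subplot:
    """One LPIS subplot with its two partial scores (illustrative fields)."""
    subplot_id: str
    landscape_elements_value: int  # from isolated-tree density and copses
    protected_sites_value: int     # from intersection with protected areas


def ecological_value(subplot: Subplot) -> int:
    """Ecological value = Protected Sites Value + Landscape Elements Value."""
    return subplot.protected_sites_value + subplot.landscape_elements_value


# Hypothetical scores, chosen only to show the arithmetic.
example = Subplot("demo-001", landscape_elements_value=2, protected_sites_value=1)
print(ecological_value(example))
```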

The full description of these products, how they were generated, and how they were validated is explained here.

Products from Sentinel2 data

The Sentinels are a fleet of satellites for land monitoring which is part of the European Copernicus program. The products generated from satellite data were explained in a previous blogpost.

Every week, the images with a low cloud cover percentage were downloaded and processed to generate three single products (true colour image, false colour image and NDVI). For the pilot area, the Castile and Madrid regions, a total of 168 tiles were processed during 2017 (until the 31st of August). Irrigation maps were generated in two pilot areas. They were evaluated and proved helpful for identifying crops in control tasks.
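For reference, NDVI is conventionally computed per pixel as (NIR - Red) / (NIR + Red); for Sentinel-2 the near-infrared and red bands are usually B8 and B4. The sketch below applies that formula to plain Python lists. The function name and the sample reflectance values are illustrative and are not taken from the CAPAS processing chain.

```python
from typing import List, Optional


def ndvi(nir: List[float], red: List[float]) -> List[Optional[float]]:
    """Per-pixel NDVI = (NIR - Red) / (NIR + Red).

    Returns None for pixels where NIR + Red is zero (no signal).
    """
    if len(nir) != len(red):
        raise ValueError("band arrays must have the same length")
    result: List[Optional[float]] = []
    for n, r in zip(nir, red):
        total = n + r
        result.append((n - r) / total if total != 0 else None)
    return result


# Made-up reflectances: dense vegetation, bare soil, empty pixel.
print(ndvi([0.5, 0.3, 0.0], [0.1, 0.3, 0.0]))
```

Values close to 1 indicate dense green vegetation, while values near 0 indicate bare soil or built-up surfaces.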

Other products generated by CAPAS have been used to update the LPIS database. For example, the grassland layer displays actual grassland areas. The change detection layer highlights the changes that have occurred since the last LPIS update, focusing on changes between agricultural land, forests, and grassland areas.

Change detection layer in TAEJ
Legend changes

Grassland layer in LUPI
 Legend grassland

Conclusion

Many innovative products were generated by the CAPAS business case by leveraging previously under-used data. The different methodologies and derived products showed a high success rate in several tests, and all the resulting data can be obtained and visualized on the proDataMarket platform.

New release of the proDataMarket marketplace!

We are glad to announce the second release of the proDataMarket marketplace!

What is proDataMarket Marketplace?

The proDataMarket marketplace is a virtual space that connects providers of open and proprietary real-estate and related contextual data with consumers of this data. On one hand, the marketplace aims at making it easier for data providers to publish, distribute and eventually reach out to potential consumers of their data. On the other hand, it helps data consumers discover and easily access data published at the Marketplace.

The marketplace can be accessed through its landing page at http://prodatamarket.eu.

Landing page of the proDataMarket marketplace

 

Conceptually, there are two areas in the marketplace: Consumer site (area dedicated to data consumers in the marketplace) and Producer site (area dedicated to data producers in the marketplace).

Consumer site

Services available to data consumers have been deployed at the Consumer Portal: https://store.prodatamarket.eu.

Landing page of the proDataMarket Consumer Marketplace Portal

New look’n’feel

The data consumer services have seen further development since their initial release in the first period of the project. The Portal has been redesigned following feedback from the business case providers. The new design includes a landing page (see the screenshot above) with easy access to, and search over, the whole catalogue of data published in the marketplace.

Interactive geospatial data exploration

The Portal’s geospatial data analytics, based on the Amerigo Data Visualisation Service, have been improved. Since the first release, the map widgets have been migrated from Leaflet to CartoDB to support fast map rendering and to provide a UI for map configuration. New data exploration capabilities were added to the maps in the form of data filtering widgets developed on top of CartoDB. To accommodate the different types of data from the business case providers, two types of filters have been implemented: discrete and continuous.
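The difference between the two filter types can be sketched in a few lines of Python. This is not the CartoDB widget code; the function names, record fields and sample rows are hypothetical and only illustrate that a discrete filter selects from a set of categories, while a continuous filter selects a numeric range.

```python
from typing import Any, Dict, Iterable, List, Set

Record = Dict[str, Any]


def discrete_filter(rows: Iterable[Record], field: str, allowed: Set[Any]) -> List[Record]:
    """Keep rows whose categorical field is one of the allowed values."""
    return [row for row in rows if row.get(field) in allowed]


def continuous_filter(rows: Iterable[Record], field: str, low: float, high: float) -> List[Record]:
    """Keep rows whose numeric field lies in the closed range [low, high]."""
    return [
        row for row in rows
        if row.get(field) is not None and low <= row[field] <= high
    ]


# Hypothetical sample rows.
rows = [
    {"municipality": "A", "land_use": "grassland", "area_ha": 12.5},
    {"municipality": "B", "land_use": "forest", "area_ha": 40.0},
    {"municipality": "C", "land_use": "grassland", "area_ha": 3.2},
]

grassland = discrete_filter(rows, "land_use", {"grassland"})
mid_sized = continuous_filter(grassland, "area_ha", 5.0, 20.0)
print([row["municipality"] for row in mid_sized])
```

Chaining the two functions mirrors how several filter widgets narrow the same map layer at once.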

proDataMarket marketplace data visualisation

Access to open and proprietary data

User profile management has been added to the Portal. Users who are not signed in can browse all open data available in the marketplace, as well as samples of proprietary datasets if their owners have made them publicly available. To get access to proprietary data, users have to sign up.

Purchase of proprietary data

Finally, authorised users can now buy proprietary data in the marketplace through the payment component, a new feature of the Portal.

The purchase itself is implemented in the Purchase Management component, which takes as input instructions about which data or data subsets are sold at which price. These instructions are passed to the component via a subscription configuration file. At the moment, this file is prepared by the technical partners (SpazioDati, SINTEF and Ontotext) based on configuration options received from data producers (see data publisher instructions). In the future, data producers will be able to generate the configuration file automatically using the Data Pricing Setup component.

Please note that this feature is available only for proprietary paid datasets, such as “Social Network Thermometer by Municipality”, as shown in the screenshot below. Open data (e.g., “State-owned buildings by municipality”) is public and free; hence, no subscription options are shown.

Proprietary dataset with subscription options
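To make the idea of a subscription configuration more concrete, here is a minimal sketch in Python. The actual file format used by the Purchase Management component is not described in this post, so every key, plan name and price below is hypothetical; only the two dataset titles come from the text above. The sketch illustrates the rule stated earlier: proprietary paid datasets carry subscription options, while open datasets carry none.

```python
from typing import Dict, List, TypedDict


class Plan(TypedDict):
    name: str          # hypothetical plan label
    subset: str        # which data subset the plan covers
    price_eur: float   # hypothetical price


class DatasetConfig(TypedDict):
    access: str        # "open" or "proprietary"
    plans: List[Plan]


# Hypothetical configuration; this is not the real file format.
CONFIG: Dict[str, DatasetConfig] = {
    "Social Network Thermometer by Municipality": {
        "access": "proprietary",
        "plans": [
            {"name": "single-municipality", "subset": "one municipality", "price_eur": 10.0},
            {"name": "full", "subset": "all municipalities", "price_eur": 100.0},
        ],
    },
    "State-owned buildings by municipality": {"access": "open", "plans": []},
}


def subscription_options(title: str) -> List[Plan]:
    """Return purchasable plans; open data is free, so it has none."""
    entry = CONFIG[title]
    return entry["plans"] if entry["access"] == "proprietary" else []


print(len(subscription_options("Social Network Thermometer by Municipality")))
print(subscription_options("State-owned buildings by municipality"))
```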

Producer site

Services available to data producers have been deployed at http://publish.prodatamarket.eu, the DataGraft portal. The latest release of the DataGraft portal was announced in a recent blog post.

The current release of the marketplace includes a tutorial for data producers that describes the data publication process, from setting up a database, cleaning data and populating the database, to cataloguing the data and configuring its visualisation at the Consumer Portal. The tutorial is available at https://store.prodatamarket.eu/publisher_help/.

proDataMarket marketplace help for data producers

Marketplace Platform overview

The technical platform of the marketplace is composed of the tools, services and infrastructure developed to support two types of users: producers and consumers. The diagram below gives an overview of the marketplace and the services it provides for data producers and data consumers.

Overview of the marketplace platform

New Demo Papers at ISWC 2017

Sukhobok, D., H. Sanchez, J. Estrada, D. Roman. Linked Data for Common Agriculture Policy: Enabling Semantic Querying over Sentinel-2 and LiDAR Data. International Semantic Web Conference. Demo paper. 2017. To appear.

  • Abstract: The amount of open and free satellite earth observation data combined with available data from other sectors (e.g. biodiversity, landscape elements, cadaster data) has the potential to enhance decision-making processes in various domains. An example of such a domain is agriculture, where the ability to objectively and automatically identify different types of agricultural features (e.g., irrigation patterns and landscape elements) can lead to more effective agriculture management. In this paper we show the possibility to publish and integrate multi-sectoral data from several sources into an existing data-intensive service targeting better and fairer Common Agriculture Policy (CAP) funds assignments to farmers and land owners. We show an end-to-end approach for integrating multi-sectoral data and publishing the result as Linked Data with the help of the DataGraft platform. To demonstrate the use of the resulted dataset, we developed a visualization system prototype showing various information about agricultural parcel features.
  • Download paper

Sukhobok, D., Nikolov, N., Lech, T. C., Moberg, A.-H., Frantsvag, R., Bergaas, H. R., Roman, D. Interacting with Subterranean Infrastructure Linked Data using Augmented Reality. International Semantic Web Conference. Demo paper. 2017. To appear.

  • Abstract: Subterranean infrastructure damages caused by excavation works of all kinds are costly and potentially dangerous for workers. Such damages are often caused by poor subterranean data or inappropriate use of the existing data. We aim to provide solutions and services that will hinder obstacles related to the use of subterranean infrastructure data to ensure less damage and less time spent on finding and integrating data about subterranean infrastructure. The result of the work reported in this paper is an augmented reality application that can provide users the ability to see what subterranean infrastructure is located at a given physical location. In this paper we demonstrate a method to create such an application using Linked Data technologies.
  • Download paper

D. Sukhobok, D. Djordjevic, D. Sanvito and D. Roman. Publishing Socio-Economic Territory Indices as Linked Data and their Visualization for Real Estate Valuation. International Semantic Web Conference. Demo paper. 2017. To appear.

  • Abstract: The correct estimation of the real estate value facilitates decision making in various sectors, such as public administration or the real estate market. In this paper we demonstrate a method to manage territory scores and property valuation estimations as Linked Data with the help of the proDataMarket technical framework. The demo illustrates how the proDataMarket technical framework can be used to generate, maintain and serve territory and property valuation estimation data with the help of semantic technologies.
  • Download paper

Shi, L., Pettersen, B. E., Sukhobok, D., Nikolov N., and Roman, D. Linked Data for the Norwegian State of Estate Reporting Service. International Semantic Web Conference. Demo paper. 2017. To appear.

  • Abstract: The Norwegian State of Estate (SoE) report includes information about all Norwegian state-owned properties and buildings in the public sector and aims to assist government decision makers to allocate resources more effectively. A Linked Data based approach is presented here to increase the transparency in the government administration, improve the report generating process and also the report quality. Cross-domain government data originated from the business entity register, the cadastral system, the building accessibility register and the old SoE report are acquired, prepared, cleaned, transformed to Linked Data format and published. The source datasets are then integrated, augmented and interlinked before the results are published as a SPARQL endpoint, used for data visualization and report generation.
  • Download paper

Roman, D., Paniagua, J., Tarasova, T., Georgiev, G., Sukhobok, D., Nikolov, N., and Lech, T. C. proDataMarket: A Data Marketplace for Monetizing Linked Data. International Semantic Web Conference. Demo paper. 2017. To appear.

  • Abstract: Linked data has emerged as an interesting technology for publishing structured data on the Web but also as a powerful mechanism for integrating disparate data sources. Various tools and approaches have been developed in the semantic Web community to produce and consume linked data, however little attention has been paid to monetization of linked data. In this paper we introduce a data marketplace – proDataMarket – that enables data providers to generate, advertise, and sell linked data, and data consumers to purchase linked data on the marketplace. The marketplace was originally designed with a focus on geospatial linked data (targeting property-related data providers and consumers) but its capabilities are generic and can be used for data in various domains. This demo will highlight the capabilities offered to the providers and consumers of the data made available on the marketplace.
  • Download paper

Nikolov, N., Sukhobok, D., Dragnev, S., Dalgard, S., Elvesæter, B., von Zernichow, B. M., and Roman, D. DataGraft beta v2: New Features and Capabilities. International Semantic Web Conference. Demo paper. 2017. To appear.

  • Abstract: In this demonstrator, we will introduce the latest features and capabilities added to DataGraft – a Data-as-a-Service platform for data preparation and knowledge graph generation. DataGraft provides data transformation, publishing and hosting capabilities that aim to simplify the data publishing lifecycle for data workers (i.e., Open Data publishers, Linked Data developers, data scientists). This demonstrator highlights the recent features added to DataGraft by exemplifying data publication of statistical data – going from the raw data published at a public portal to published and accessible Linked Data with the help of the tools and features of the platform.
  • Download paper

New Papers at ODBASE 2017

L. Shi, D. Sukhobok, N. Nikolov and D. Roman. Norwegian State of Estate Report as Linked Open Data. To appear in the proceedings of ODBASE 2017 – The 16th International Conference on Ontologies, DataBases, and Applications of Semantics, Springer, 24-25 October 2017, Rhodes, Greece.

  • Abstract: This paper presents the Norwegian State of Estate (SoE) dataset containing data about real estates owned by the central government in Norway. The dataset is produced by integrating cross-domain government datasets including data from sources such as the Norwegian business entity register, cadastral system, building accessibility register and the previous SoE report. The dataset is made available as Linked Data. The Linked Data generation process includes data acquisition, cleaning, transformation, annotation, publishing, augmentation and interlinking the annotated data as well as quality assessment of the interlinked datasets. The dataset is published under the Norwegian License for Open Government Data (NLOD) and serves as a reference point for applications using data on central government real estates, such as generation of the SoE report, searching properties suitable for asylum reception centres, risk assessment for state-owned buildings or a public building application for visitors.
  • Download paper

B. M. von Zernichow and D. Roman. Usability of Visual Data Profiling in Data Cleaning and Transformation. To appear in the proceedings of ODBASE 2017 – The 16th International Conference on Ontologies, DataBases, and Applications of Semantics, Springer, 24-25 October 2017, Rhodes, Greece.

  • Download paper

D. Roman, D. Sukhobok, N. Nikolov, B. Elvesæter and A. Pultier. The InfraRisk Ontology: Enabling Semantic Interoperability for Critical Infrastructures at Risk from Natural Hazards. To appear in the proceedings of ODBASE 2017 – The 16th International Conference on Ontologies, DataBases, and Applications of Semantics, Springer, 24-25 October 2017, Rhodes, Greece.

  • Abstract: Earthquakes, landslides, and other natural hazard events have severe negative socio-economic impacts. Among other consequences, those events can cause damage to infrastructure networks such as roads and railways. Novel methodologies and tools are needed to analyse the potential impacts of extreme natural hazard events and aid in the decision-making process regarding the protection of existing critical road and rail infrastructure as well as the development of new infrastructure. Enabling uniform, integrated, and reliable access to data on historical failures of critical transport infrastructure can help infrastructure managers and scientist from various related areas to better understand, prevent, and mitigate the impact of natural hazards on critical infrastructures. This paper describes the construction of the InfraRisk ontology for representing relevant information about natural hazard events and their impact on infrastructure components. Furthermore, we present a software prototype that visualizes data published using the proposed ontology.
  • Download paper

New Paper: Data Preparation as a Service Based on Apache Spark

Mahasivam N., Nikolov N., Sukhobok D., Roman D. (2017) Data Preparation as a Service Based on Apache Spark. In: De Paoli F., Schulte S., Broch Johnsen E. (eds) Service-Oriented and Cloud Computing. ESOCC 2017. Lecture Notes in Computer Science, vol 10465. Springer, Cham

  • Abstract: Data preparation is the process of collecting, cleaning and consolidating raw datasets into cleaned data of certain quality. It is an important aspect in almost every data analysis process, and yet it remains tedious and time-consuming. The complexity of the process is further increased by the recent tendency to derive knowledge from very large datasets. Existing data preparation tools provide limited capabilities to effectively process such large volumes of data. On the other hand, frameworks and software libraries that do address the requirements of big data, require expert knowledge in various technical areas. In this paper, we propose a dynamic, service-based, scalable data preparation approach that aims to solve the challenges in data preparation on a large scale, while retaining the accessibility and flexibility provided by data preparation tools. Furthermore, we describe its implementation and integration with an existing framework for data preparation – Grafterizer. Our solution is based on Apache Spark, and exposes application programming interfaces (APIs) to integrate with external tools. Finally, we present experimental results that demonstrate the improvements to the scalability of Grafterizer.
  • Download paper