New release of the proDataMarket marketplace!

We are glad to announce the second release of the proDataMarket marketplace!

What is the proDataMarket marketplace?

The proDataMarket marketplace is a virtual space that connects providers of open and proprietary real-estate and related contextual data with consumers of this data. On one hand, the marketplace aims at making it easier for data providers to publish, distribute and eventually reach out to potential consumers of their data. On the other hand, it helps data consumers discover and easily access data published at the Marketplace.

The marketplace can be accessed through its landing page at http://prodatamarket.eu.

Landing page of the proDataMarket marketplace

Conceptually, the marketplace has two areas: the Consumer site (dedicated to data consumers) and the Producer site (dedicated to data producers).

Consumer site

Services available to data consumers have been deployed at the Consumer Portal: https://store.prodatamarket.eu.

Landing page of the proDataMarket Consumer Marketplace Portal

New look’n’feel

The data consumer services have seen further development since their initial release in the first period of the project. The Portal has been redesigned following feedback from the business case providers. The new design includes a landing page (see the screenshot above) with easy access to, and search over, the whole catalogue of data published in the marketplace.

Interactive geospatial data exploration

The Portal’s geospatial data analytics, based on the Amerigo Data Visualisation Service, have been improved. Since the first release, the map widgets have been migrated from Leaflet to CartoDB to support fast map rendering and to provide a UI for map configuration. New data exploration capabilities were added to the maps in the form of data filtering widgets developed on top of CartoDB. To accommodate the different types of data of the business case providers, two types of filters have been implemented: discrete and continuous.
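To make the difference between the two filter types concrete, here is a minimal sketch in Python. It does not reproduce the Portal's CartoDB widgets; the class names, attribute names and sample rows are assumptions chosen for illustration. A discrete filter keeps rows whose value belongs to a selected set of categories, while a continuous filter keeps rows whose numeric value falls within a range.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

Row = Dict[str, Any]


@dataclass(frozen=True)
class DiscreteFilter:
    """Keeps rows whose attribute value is one of the selected categories."""
    attribute: str
    selected: frozenset

    def matches(self, row: Row) -> bool:
        return row.get(self.attribute) in self.selected


@dataclass(frozen=True)
class ContinuousFilter:
    """Keeps rows whose numeric attribute lies in [minimum, maximum]."""
    attribute: str
    minimum: float
    maximum: float

    def matches(self, row: Row) -> bool:
        value = row.get(self.attribute)
        return value is not None and self.minimum <= value <= self.maximum


def apply_filters(rows: List[Row], filters: list) -> List[Row]:
    """Returns the rows accepted by every filter."""
    return [row for row in rows if all(f.matches(row) for f in filters)]


# Hypothetical sample data, not taken from the marketplace.
rows = [
    {"municipality": "Torino", "building_type": "school", "area_m2": 1200.0},
    {"municipality": "Torino", "building_type": "office", "area_m2": 450.0},
    {"municipality": "Trento", "building_type": "school", "area_m2": 800.0},
]

schools = DiscreteFilter("building_type", frozenset({"school"}))
large = ContinuousFilter("area_m2", 1000.0, 5000.0)
result = apply_filters(rows, [schools, large])
```

Combining both filters narrows the sample down to the schools whose area falls within the chosen range.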

proDataMarket marketplace data visualisation

Access to open and proprietary data

User profile management has been added to the Portal. Users who are not signed in can browse all open data available in the marketplace, as well as samples of proprietary datasets if their owners have made them publicly available. To get access to proprietary data, users have to sign up.

Purchase of proprietary data

Finally, authorised users can now buy proprietary data in the marketplace through the payment component, a new feature of the Portal.

The purchase itself is implemented in the Purchase Management component, which takes as input instructions about which data or data subsets are sold at which price. These instructions are passed to the component via a subscription configuration file. At the moment, this file is prepared by the technical partners (SpazioDati, SINTEF and Ontotext) based on configuration options received from data producers (see data publisher instructions). In the future, data producers will be able to generate the configuration file automatically using the Data Pricing Setup component.
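The actual format of the subscription configuration file is not described in this post. Purely as an illustration of the idea (which data subsets are sold at which price), the Python sketch below parses a hypothetical JSON configuration; the field names, dataset identifiers and prices are all placeholders, not the marketplace's real schema or pricing.

```python
import json
from typing import Dict, List

# Hypothetical configuration: field names, identifiers and prices are
# placeholders, not the marketplace's actual file format.
SUBSCRIPTION_CONFIG = """
{
  "subscriptions": [
    {"dataset": "social-network-thermometer", "subset": "single-municipality", "price_eur": 10.0},
    {"dataset": "social-network-thermometer", "subset": "full", "price_eur": 100.0}
  ]
}
"""


def load_subscriptions(text: str) -> List[Dict]:
    """Parses the configuration and rejects entries without a positive price."""
    entries = json.loads(text)["subscriptions"]
    for entry in entries:
        if entry["price_eur"] <= 0:
            raise ValueError(f"paid subset needs a positive price: {entry}")
    return entries


def options_for(dataset: str, subscriptions: List[Dict]) -> List[Dict]:
    """Returns the subscription options configured for one dataset."""
    return [entry for entry in subscriptions if entry["dataset"] == dataset]


subscriptions = load_subscriptions(SUBSCRIPTION_CONFIG)
paid = options_for("social-network-thermometer", subscriptions)
free = options_for("state-owned-buildings", subscriptions)  # no entries configured
```

In this sketch a dataset without configured entries simply yields an empty list of options.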

Please note that this feature is available only for proprietary paid datasets, such as “Social Network Thermometer by Municipality”, as shown in the screenshot below. Open data (e.g., “State-owned buildings by municipality”) is public and free; hence, no subscription options are shown for it.

Proprietary dataset with subscription options

Producer site

Services available to data producers have been deployed at http://publish.prodatamarket.eu, the DataGraft portal. The latest release of the DataGraft portal was announced in a recent blog post.

The current release of the marketplace includes a tutorial for data producers that describes the data publication process, from setting up a database, cleaning the data and populating the database, to cataloguing the data and configuring its visualisation at the Consumer Portal. The tutorial is available at https://store.prodatamarket.eu/publisher_help/.

proDataMarket marketplace help for data producers

Marketplace Platform overview

The technical platform of the marketplace is composed of the tools, services and infrastructure developed to support two types of users: producers and consumers. The diagram below gives an overview of the marketplace and the services it provides for data producers and data consumers.

Overview of the marketplace platform

The proDataMarket Ontology: Enabling Semantic Interoperability of Real Property Data

Real property data (often referred to as real estate, realty, or immovable property data) represent a valuable asset that has the potential to enable innovative services when integrated with related contextual data (e.g., business data). Such services can range from providing evaluation of real estate to reporting on up-to-date information about state-owned properties. Real property data integration is a difficult task primarily due to the heterogeneity and complexity of the real property data, and the lack of generally agreed upon semantic descriptions of the concepts in this domain. The proDataMarket ontology is developed in the project as a key enabler for integration of real property data.

The design and development of the proDataMarket ontology followed techniques and design choices supported by existing methodologies, mainly the one proposed by Noy and McGuinness [1]. Requirements were extracted from a set of relevant business cases; for each business case, competency questions [2] were defined, along with core concepts and relationships. A conceptual model was then developed based on these requirements and on international standards, including ISO 19152:2012 and the European Union’s INSPIRE data specifications. For example, the Land Administration Domain Model (LADM) from ISO 19152:2012 serves as the reference model for the proDataMarket cadastral domain conceptual model. We then implemented the conceptual model using the RDFS and OWL standards. RDFS is used to model concepts, properties and simple relationships such as rdfs:subClassOf. OWL builds upon RDFS and provides a richer language for web ontology modelling; it is used to model constraints and other advanced relationships, such as the cardinality constraint needed to express the relationship between properties and buildings.
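As a rough illustration of this division of labour between RDFS and OWL, the Python sketch below generates a few Turtle axioms using plain string formatting. The class and property names (ex:Building, ex:School, ex:locatedOnProperty), the example.org namespace and the cardinality value are hypothetical placeholders; they are not terms or constraints taken from the proDataMarket ontology.

```python
# Prefixes used in the generated Turtle. The "ex" namespace is a placeholder,
# not one of the proDataMarket namespaces.
PREFIXES = {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "xsd": "http://www.w3.org/2001/XMLSchema#",
    "ex": "http://example.org/realproperty#",
}


def subclass_axiom(child: str, parent: str) -> str:
    """RDFS: a simple class hierarchy statement."""
    return f"{child} rdfs:subClassOf {parent} ."


def exact_cardinality_axiom(cls: str, prop: str, n: int) -> str:
    """OWL: every instance of cls has exactly n values for prop."""
    return (
        f"{cls} rdfs:subClassOf [\n"
        f"    a owl:Restriction ;\n"
        f"    owl:onProperty {prop} ;\n"
        f"    owl:cardinality \"{n}\"^^xsd:nonNegativeInteger\n"
        f"] ."
    )


def to_turtle(axioms: list) -> str:
    """Joins prefix declarations and axioms into one Turtle document."""
    header = "\n".join(f"@prefix {p}: <{iri}> ." for p, iri in PREFIXES.items())
    return header + "\n\n" + "\n\n".join(axioms) + "\n"


turtle = to_turtle([
    "ex:Building a owl:Class .",
    subclass_axiom("ex:School", "ex:Building"),                         # RDFS part
    exact_cardinality_axiom("ex:Building", "ex:locatedOnProperty", 1),  # OWL part
])
print(turtle)
```

The subclass statement needs nothing beyond RDFS, whereas the "exactly one" constraint requires an owl:Restriction, which is why the two vocabularies are combined.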

The proDataMarket ontology can be accessed at http://vocabs.datagraft.net/proDataMarket/. The ontology has been divided into several sub-ontologies (see Table below), reflecting the cross-domain nature of the requirements. This modular approach also helped to handle the complexity of the model and made it easier to maintain. In the current version, there are 11 sub-ontologies with 43 native classes and 43 native properties.

Table: Composition of the proDataMarket ontology

Domain/module | Namespace prefix | URL | Classes | Properties | Business cases
Common | prodm-com | http://vocabs.datagraft.net/proDataMarket/0.1/Common# | 4 | 4 | ALL
Cadaster | prodm-cad | http://vocabs.datagraft.net/proDataMarket/0.1/Cadastre# | 6 | 16 | SoE, RVAS, NNAS, SIM
State of Estate Report | prodm-soe | http://vocabs.datagraft.net/proDataMarket/0.1/SoE# | 4 | 2 | SoE, RVAS
Business Entity | - | (reuses existing vocabularies; no new classes or properties) | 0 | 0 | SoE, RVAS
Building Accessibility | - | (reuses existing vocabularies; no new classes or properties) | 0 | 0 | SoE
Natural Hazard | prodm-nh | http://vocabs.datagraft.net/proDataMarket/0.1/NaturalHazard# | 1 | 0 | RVAS
Land Parcel Identification System (LPIS) | prodm-lpis | http://vocabs.datagraft.net/proDataMarket/0.1/LPIS# | 1 | 7 | CAPAS
Sentinel data | prodm-sen | http://vocabs.datagraft.net/proDataMarket/0.1/Sentinel# | 1 | 1 | CAPAS
Landscape Elements (LiDAR data) | prodm-lid | http://vocabs.datagraft.net/proDataMarket/0.1/Lidar# | 3 | 0 | CAPAS
Assessment | prodm-asm | http://vocabs.datagraft.net/proDataMarket/0.1/Assessment# | 3 | 3 | CAPAS
CensusTract | prodm-ct | http://vocabs.datagraft.net/proDataMarket/0.1/CensusTract# | 1 | 0 | CST, CCRS
Urban Infrastructure | prodm-ui | http://vocabs.datagraft.net/proDataMarket/0.1/UrbanInfrastructure# | 17 | 10 | SIM
Protected Sites | prodm-ps | http://vocabs.datagraft.net/proDataMarket/0.1/ProtectedSite# | 2 | 0 | CAPAS
Total | | | 43 | 43 |

More than 30 datasets have been published through the DataGraft platform [3] [4] using the proDataMarket ontology as a central reference model. All seven business cases use the proDataMarket ontology in data publishing.

More details on the proDataMarket vocabulary can be found in the paper “The proDataMarket Ontology for Publishing and Integrating Cross-domain Real Property Data”, which was accepted for publication in the scientific journal Territorio Italia Land Administration, Cadastre and Real Estate [5].

References

  • [1] Noy, N. F., and D. L. McGuinness. “Ontology Development 101: A Guide to Creating Your First Ontology.” 2001.
  • [2] Grüninger, M., and M. S. Fox. “Methodology for the Design and Evaluation of Ontologies.” 1995.
  • [3] Roman, D., et al. “DataGraft: One-Stop-Shop for Open Data Management.” Semantic Web, preprint, pp. 1-19, 2017. DOI: 10.3233/SW-170263.
  • [4] Roman, D., et al. “DataGraft: Simplifying Open Data Publishing.” ESWC (Satellite Events) 2016, pp. 101-106.
  • [5] Shi, L., N. Nikolov, D. Sukhobok, T. Tarasova, and D. Roman. “The proDataMarket Ontology for Publishing and Integrating Cross-domain Real Property Data.” To appear in Territorio Italia Land Administration, Cadastre and Real Estate, n. 2/2017.

Integrating multisectoral datasets: from satellites to real estate scoring model

During a project meeting in Sofia on September 21, 2016, Cerved teamed up with TRAGSA to brainstorm ideas for reusing the TRAGSA methods for processing satellite imagery to analyse green areas in urbanized cities.

Fundamentals of TRAGSA processing

A common feature of vegetation spectra is the high contrast between the red band and the near-infrared (NIR) region. The optical instrument carried by the Sentinel-2 satellites samples 13 spectral bands, including high-resolution bands in the red and red-edge region (bands 4, 5 and 6) as well as bands in the NIR (8 and 8A). Refer to this blog post for more details about processing Sentinel-2 data.
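One widely used way to exploit this red/NIR contrast is the Normalized Difference Vegetation Index (NDVI), computed per pixel as (NIR - red) / (NIR + red). The post does not state which index or thresholds the TRAGSA methodology uses, so the Python sketch below is only an illustration of the principle; the reflectance values and the 0.4 threshold are made-up assumptions.

```python
from typing import List

Grid = List[List[float]]


def ndvi(red: float, nir: float) -> float:
    """Normalized Difference Vegetation Index for one pixel."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def vegetation_mask(red_band: Grid, nir_band: Grid, threshold: float = 0.4) -> List[List[int]]:
    """Marks a pixel 1 when its NDVI exceeds the (illustrative) threshold, else 0."""
    return [
        [1 if ndvi(r, n) > threshold else 0 for r, n in zip(red_row, nir_row)]
        for red_row, nir_row in zip(red_band, nir_band)
    ]


# Toy 2x2 reflectance grids standing in for the Sentinel-2 red (band 4)
# and NIR (band 8) rasters; the values are invented for the example.
red = [[0.05, 0.30],
       [0.04, 0.25]]
nir = [[0.50, 0.32],
       [0.45, 0.20]]

mask = vegetation_mask(red, nir)
```

Pixels with low red and high NIR reflectance (the left column here) end up flagged as vegetation, while pixels with similar red and NIR values do not.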

Using the TRAGSA methodology, it is possible to isolate and enhance the vegetation signal and thus locate green areas within urban areas. Green areas are an important input to Cerved’s innovative real estate evaluation model (which is being developed within one of Cerved’s business cases in the project, as introduced in this blog post). Cerved uses open data to generate the green-area indicators defined for the model: green area coverage and distance to the nearest wood. The operations that Cerved performs to compute these indicators are similar to those that TRAGSA performs on satellite data, such as clustering green areas into larger areas and isolating trees and groups of trees. This motivated us to experiment with satellite data and TRAGSA’s methodology, to see whether we could use a more complete, structured and up-to-date source of green-area information as input to our real estate evaluation model.

Experiment

For the experiment we chose Turin, a highly urbanized Italian city that nevertheless pays particular attention to green areas.

The steps that we followed:

  • extraction of the city boundaries of Turin in GeoJSON format, by SPAZIODATI
  • selection of good-quality imagery for Turin from the Sentinel data repository, by TRAGSA
  • processing of the S2 imagery to obtain a vector layer that indicates the presence or absence of a green area in each pixel (1/0), by TRAGSA
  • display of the green areas of the tiles (see the screenshot below) in the prototype Amerigo visualisation service, under development by SPAZIODATI
  • data processing and aggregation of the tiles into census cell areas, in order to develop green-area indicators for each census cell, by CERVED
  • integration and testing of the score dedicated to green areas within the CCRS (Cerved Cadastral Report Service) business model, by CERVED

[Screenshot: green areas of Turin displayed in the prototype Amerigo visualisation service]

The result of this experiment was surprising: the new score identifies green areas (not only public ones) with far greater detail and accuracy than the other scores, which were developed on public, open green-area datasets.

Cerved and SpazioDati at Data Driven Innovation 2016

Cerved and SpazioDati participated in the first edition of Data Driven Innovation, held in 2016, with a presentation and a stand about the preliminary results of their collaborative work in the proDataMarket project.

Cerved & SpazioDati present the first prototype for proDataMarket @DataDrivenInnovation 2016

Data Driven Innovation is an open summit about big data hosted by Roma Tre University and organized by Codemotion. During the two days of the summit, many visitors had the chance to see the first results of the Cerved and SpazioDati work in proDataMarket: the Cerved Scouting Terrain Service (CST), an interactive map showing territory scores and socio-demographic scores for Bologna, such as the social disease index, the economic disease index, the socio-demographic score and many other territory scores.

CST, the second business case of Cerved: Employees of the working population in Bologna

CST is the second business case that Cerved is developing within the proDataMarket project. The goal of this service is to provide target users with a tool to search and view property and territory information on a map. To achieve this, Cerved is developing value-added geo-marketing indicators, analyses and visualisations.

Authors: Claudio Castelli & Diego Sanvito

The proDataMarket marketplace as a tool for connecting real-estate data publishers and prospective data consumers

The main objective of the proDataMarket project is to create a data marketplace for open and proprietary real-estate and related contextual data.

A marketplace is a place where data producers meet prospective data consumers. In addition to basic features for making data accessible and discoverable, a marketplace can provide further tools to help data producers “advertise” their data and engage better with potential data consumers. Among such tools are those that help data producers explain the type of their data and its attributes, and demonstrate its value. In this post we discuss how these tools are being realised in the proDataMarket marketplace.

Driving example

Let’s consider a national statistical office, for example the Italian National Institute of Statistics (ISTAT). ISTAT wants to disseminate one of its datasets: a dataset of census cells that cover the Italian region of Piemonte. This dataset subdivides the region of Piemonte into census sections according to ISTAT’s 2011 National Census. A census section is the smallest geographic unit for which the statistical variables of a population census are collected.

ISTAT is interested in explaining to prospective data consumers that the data can be useful when one needs to:

  • determine inter-municipal boundaries
  • describe different areas of a city in terms of some geographically-bound characteristics

Marketplace: initial steps

Figure 1 illustrates the initial steps that ISTAT performs at the marketplace to present its data.

Figure 1: The data producer prepares, describes and publishes its data at the marketplace, to make it accessible and discoverable.

ISTAT prepares its data for publication, then describes and catalogues it. Now a prospective data consumer can discover and explore the dataset of census cells of the Piemonte region. Although ISTAT has made the data accessible and discoverable, data consumers still have to figure out for themselves what type of data it is, what it contains and what it is useful for.

Marketplace: explaining the data types

To explain the type of the data, ISTAT creates and attaches visualisations to its data, as shown in Fig. 2.

Figure 2: The data producer creates visualisations to explain the type of the data.

In addition to preparing, describing and publishing the Piemonte census sections dataset, ISTAT can create a map of all the census cells of the Piemonte region. This gives prospective data consumers an illustrative example of the data: when exploring the dataset, the data consumer can immediately see that the data contains polygons, each of which represents the geographic area of a census section.

Now that the type of the data is clearer, ISTAT can go further and explain various attributes of the data.

Marketplace: explaining attributes of the data 

Figure 3 illustrates the steps that ISTAT performs at the marketplace to give data consumers a glimpse of the data attributes.

Figure 3: The data producer queries the data to explain data attributes.

As mentioned above, the dataset of the driving example contains the geometries of census cells. Every cell is attached to a certain municipality. This information becomes useful if one wants to represent single municipalities on a map. For example, to represent the city of Turin, ISTAT can prepare a subset of the census cells by filtering on the municipality attribute of each cell. Similarly, other attributes of the data can be highlighted.
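The filtering described here can be pictured with a small Python sketch over a GeoJSON-style feature collection. The property names ("cell_id", "municipality") and the toy polygons are assumptions made for the example and do not reflect the actual schema of the ISTAT dataset or the marketplace's query interface.

```python
from typing import Any, Dict

FeatureCollection = Dict[str, Any]


def subset_by_municipality(collection: FeatureCollection, municipality: str) -> FeatureCollection:
    """Returns a new feature collection with only the cells of one municipality."""
    features = [
        feature
        for feature in collection["features"]
        if feature["properties"].get("municipality") == municipality
    ]
    return {"type": "FeatureCollection", "features": features}


def cell(cell_id: str, municipality: str, ring: list) -> Dict[str, Any]:
    """Builds one census cell as a GeoJSON Feature (hypothetical property names)."""
    return {
        "type": "Feature",
        "properties": {"cell_id": cell_id, "municipality": municipality},
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }


# Toy squares standing in for real census cell polygons.
square_a = [[7.68, 45.07], [7.69, 45.07], [7.69, 45.08], [7.68, 45.08], [7.68, 45.07]]
square_b = [[7.60, 45.03], [7.61, 45.03], [7.61, 45.04], [7.60, 45.04], [7.60, 45.03]]

census_cells = {
    "type": "FeatureCollection",
    "features": [
        cell("c1", "Torino", square_a),
        cell("c2", "Torino", square_a),
        cell("c3", "Moncalieri", square_b),
    ],
}

turin_cells = subset_by_municipality(census_cells, "Torino")
```

The resulting subset is itself a valid feature collection, so it can be handed to the same map visualisation as the full dataset.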

Marketplace: putting data into context to explain its value

With the help of the marketplace, ISTAT can prepare, describe and visualise as many subsets of the data as it wants. Finally, to showcase the value of the data to data consumers, ISTAT can put the census cells into context, as illustrated in Fig. 4.

Figure 4: The data producer augments its data with data from other sources, to show the “value in context”.

This last approach is realised through the Augmentation Service, which supports querying a co-located data source using several functions to produce a new dataset. Currently, the Augmentation Service uses data from OpenStreetMap to provide context. For example, ISTAT can use the service to extract the number of bus stops near each census cell, the distance to the closest train station, or the length of pedestrian paths in each census cell. Once the new augmented dataset is prepared, ISTAT can proceed with visualisations. For example, it can create a coloured map showing the density of nearby bus stops in Turin.
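To give a feeling for what such augmentation functions compute, here is a simplified Python sketch. It is not the Augmentation Service's API: each census cell is reduced to its centroid, distances are great-circle (haversine) distances, and the coordinates, the 300 m radius and all names are assumptions made for illustration.

```python
import math
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in degrees
EARTH_RADIUS_M = 6_371_000.0


def haversine_m(a: Point, b: Point) -> float:
    """Great-circle distance between two points, in metres."""
    phi1, phi2 = math.radians(a[0]), math.radians(b[0])
    dphi = phi2 - phi1
    dlam = math.radians(b[1] - a[1])
    h = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def count_within(center: Point, points: List[Point], radius_m: float) -> int:
    """Number of points no farther than radius_m from the center."""
    return sum(1 for p in points if haversine_m(center, p) <= radius_m)


def nearest_distance_m(center: Point, points: List[Point]) -> Optional[float]:
    """Distance to the closest point, or None when there are no points."""
    return min((haversine_m(center, p) for p in points), default=None)


def augment(cells: List[Dict], bus_stops: List[Point], stations: List[Point],
            radius_m: float = 300.0) -> List[Dict]:
    """Adds two context attributes to every cell (cells are reduced to centroids)."""
    return [
        {
            **c,
            "bus_stops_nearby": count_within(c["centroid"], bus_stops, radius_m),
            "nearest_station_m": nearest_distance_m(c["centroid"], stations),
        }
        for c in cells
    ]


# Invented coordinates in the Turin area, for illustration only.
cells = [{"cell_id": "c1", "centroid": (45.0700, 7.6800)}]
bus_stops = [(45.0705, 7.6800), (45.0700, 7.6810), (45.0800, 7.6800)]
stations = [(45.0620, 7.6785)]

augmented = augment(cells, bus_stops, stations)
```

Each augmented cell keeps its original attributes and gains the new context attributes, which can then drive a coloured map such as the bus stop density example.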

SpazioDati in proDataMarket project

SpazioDati, founded in 2012 in Trento, is a Big Data & Semantic Web company. Its main product, Dandelion API, enables users to analyze and enrich their content by connecting it with a Knowledge Graph. SpazioDati’s Graph holds millions of facts, which are collected by combining Open Data sources with proprietary, high-quality data provided by partners.

From text to actionable data
SpazioDati’s services extract meaning from unstructured text, thus enabling further operations on the contextual information. We support a wide range of use cases, such as enriching existing databases, building smart search engines and recommender systems on document collections, adding location knowledge to web apps, automatically tagging products on e-commerce sites, using data to create infographics and marketing research, and many more!

Role in proDataMarket project
SpazioDati will act as the technical coordinator of proDataMarket. We will support other partners helping them to deliver better products. We will provide a data platform where various third party data providers and data consumers will be able to interact in a novel and modern manner. proDataMarket will enable efficient and effective processes for opening up and reuse of multilingual property data and allow for innovative new ways to consume property-related data. The platform will generate wider use of those data, and enrich them through transfer of innovative analytic solutions and services.

The infrastructure we are building will give end users access to reliable, up-to-date and consistent property-related data through a simple API, allowing them to build better products.

Find out more about SpazioDati.