The proDataMarket Ontology: Enabling Semantic Interoperability of Real Property Data

Real property data (often referred to as real estate, realty, or immovable property data) are a valuable asset that can enable innovative services when integrated with related contextual data (e.g., business data). Such services range from evaluating real estate to reporting up-to-date information about state-owned properties. Integrating real property data is difficult, primarily because the data are heterogeneous and complex and because there are no generally agreed-upon semantic descriptions of the concepts in this domain. The proDataMarket ontology was developed in the project as a key enabler for the integration of real property data.

The design and development of the proDataMarket ontology followed techniques and design choices supported by existing methodologies, mainly the one proposed by Noy and McGuinness [1]. Requirements were extracted from a set of relevant business cases, and for each business case competency questions [2] were defined, along with core concepts and relationships. A conceptual model was then developed from these requirements and from international standards, including ISO 19152:2012 and the European Union’s INSPIRE data specifications. For example, the LADM conceptual model from ISO 19152:2012 served as the reference model for the proDataMarket cadastral domain conceptual model. The conceptual model was then implemented using the RDFS and OWL linked data standards. RDFS is used to model concepts, properties and simple relationships such as rdfs:subClassOf. OWL builds on RDFS and provides a richer language for web ontology modelling; it is used to model constraints and other advanced relationships, such as the cardinality constraint needed to express the relationship between properties and buildings.
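
As a rough illustration of the two modelling layers described above, the following sketch stores a handful of triples as plain Python tuples, follows rdfs:subClassOf links transitively, and checks a minimum-cardinality constraint. All ex: class and property names are hypothetical and are not taken from the proDataMarket ontology. The check is also a simplification: it treats the constraint as closed-world validation, whereas OWL works under the open-world assumption, and a real implementation would use an RDF library and a reasoner.

```python
# Toy triple store: (subject, predicate, object).
# Every ex: name below is hypothetical, chosen only for illustration.
TRIPLES = {
    ("ex:Building", "rdfs:subClassOf", "ex:Construction"),
    ("ex:Construction", "rdfs:subClassOf", "ex:SpatialObject"),
    ("ex:b1", "rdf:type", "ex:Building"),
    ("ex:b2", "rdf:type", "ex:Building"),
    ("ex:b1", "ex:locatedOn", "ex:p1"),
}


def superclasses(cls, triples):
    """Return every class reachable from cls via rdfs:subClassOf."""
    found, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in found:
                found.add(o)
                stack.append(o)
    return found


def min_cardinality_violations(cls, prop, minimum, triples):
    """List instances of cls that have fewer than `minimum` values for prop."""
    instances = sorted(s for s, p, o in triples if p == "rdf:type" and o == cls)
    violations = []
    for inst in instances:
        count = sum(1 for s, p, o in triples if s == inst and p == prop)
        if count < minimum:
            violations.append(inst)
    return violations
```

Here `superclasses("ex:Building", TRIPLES)` walks the RDFS hierarchy, while `min_cardinality_violations("ex:Building", "ex:locatedOn", 1, TRIPLES)` flags buildings that are not linked to any property.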

The proDataMarket ontology can be accessed at http://vocabs.datagraft.net/proDataMarket/. The ontology is divided into several sub-ontologies (see the table below), reflecting the cross-domain nature of the requirements. This modular approach also helps to manage the complexity of the model and makes it easier to maintain. In the current version, there are 11 sub-ontologies with their own namespaces, which together define 43 native classes and 43 native properties; two further modules only reuse existing vocabularies.

Table: Composition of the proDataMarket ontology

Domain/module | Namespace prefix | URL | Classes | Properties | Business cases
Common | prodm-com | http://vocabs.datagraft.net/proDataMarket/0.1/Common# | 4 | 4 | ALL
Cadaster | prodm-cad | http://vocabs.datagraft.net/proDataMarket/0.1/Cadastre# | 6 | 16 | SoE, RVAS, NNAS, SIM
State of Estate Report | prodm-soe | http://vocabs.datagraft.net/proDataMarket/0.1/SoE# | 4 | 2 | SoE, RVAS
Business Entity | (none) | Reuses existing vocabularies; no new classes or properties | 0 | 0 | SoE, RVAS
Building Accessibility | (none) | Reuses existing vocabularies; no new classes or properties | 0 | 0 | SoE
Natural Hazard | prodm-nh | http://vocabs.datagraft.net/proDataMarket/0.1/NaturalHazard# | 1 | 0 | RVAS
Land Parcel Identification System (LPIS) | prodm-lpis | http://vocabs.datagraft.net/proDataMarket/0.1/LPIS# | 1 | 7 | CAPAS
Sentinel data | prodm-sen | http://vocabs.datagraft.net/proDataMarket/0.1/Sentinel# | 1 | 1 | CAPAS
Landscape Elements (LiDAR data) | prodm-lid | http://vocabs.datagraft.net/proDataMarket/0.1/Lidar# | 3 | 0 | CAPAS
Assessment | prodm-asm | http://vocabs.datagraft.net/proDataMarket/0.1/Assessment# | 3 | 3 | CAPAS
CensusTract | prodm-ct | http://vocabs.datagraft.net/proDataMarket/0.1/CensusTract# | 1 | 0 | CST, CCRS
Urban Infrastructure | prodm-ui | http://vocabs.datagraft.net/proDataMarket/0.1/UrbanInfrastructure# | 17 | 10 | SIM
Protected Sites | prodm-ps | http://vocabs.datagraft.net/proDataMarket/0.1/ProtectedSite# | 2 | 0 | CAPAS
Total | | | 43 | 43 |
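
The table above can also be read as a prefix map. The sketch below copies the prefixes, namespace URLs and counts from the table, checks that the per-module counts add up to the stated totals, and expands a prefixed name into a full IRI. The `expand` helper and the local name used with it are illustrative assumptions, not part of the ontology.

```python
# (prefix, namespace URL, classes, properties), copied from the table above.
# The two modules that only reuse existing vocabularies have no prefix.
BASE = "http://vocabs.datagraft.net/proDataMarket/0.1/"
MODULES = [
    ("prodm-com", BASE + "Common#", 4, 4),
    ("prodm-cad", BASE + "Cadastre#", 6, 16),
    ("prodm-soe", BASE + "SoE#", 4, 2),
    (None, None, 0, 0),  # Business Entity
    (None, None, 0, 0),  # Building Accessibility
    ("prodm-nh", BASE + "NaturalHazard#", 1, 0),
    ("prodm-lpis", BASE + "LPIS#", 1, 7),
    ("prodm-sen", BASE + "Sentinel#", 1, 1),
    ("prodm-lid", BASE + "Lidar#", 3, 0),
    ("prodm-asm", BASE + "Assessment#", 3, 3),
    ("prodm-ct", BASE + "CensusTract#", 1, 0),
    ("prodm-ui", BASE + "UrbanInfrastructure#", 17, 10),
    ("prodm-ps", BASE + "ProtectedSite#", 2, 0),
]

# Prefix map for the modules that define their own namespace.
PREFIXES = {prefix: url for prefix, url, _, _ in MODULES if prefix}

total_classes = sum(m[2] for m in MODULES)
total_properties = sum(m[3] for m in MODULES)


def expand(prefixed_name):
    """Expand a prefixed name such as 'prodm-cad:Example' to a full IRI."""
    prefix, _, local = prefixed_name.partition(":")
    return PREFIXES[prefix] + local
```

For example, `expand("prodm-cad:Example")` joins the Cadastre namespace with the hypothetical local name `Example`.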

More than 30 datasets have been published through the DataGraft platform [3] [4] using the proDataMarket ontology as a central reference model. All seven business cases use the proDataMarket ontology in data publishing.

More details on the proDataMarket vocabulary can be found in the paper “The proDataMarket Ontology for Publishing and Integrating Cross-domain Real Property Data”, which was accepted for publication in the scientific journal “Territorio Italia Land Administration, Cadastre and Real Estate” [5].

References

  • [1] Noy, Natalya F., and Deborah L. McGuinness. “Ontology development 101: A guide to creating your first ontology.” (2001).
  • [2] Grüninger, Michael, and Mark S. Fox. “Methodology for the Design and Evaluation of Ontologies.” (1995).
  • [3] Roman, D., et al. “DataGraft: One-Stop-Shop for Open Data Management.” Semantic Web, preprint, pp. 1-19 (2017). DOI: 10.3233/SW-170263.
  • [4] Roman, D., et al. “DataGraft: Simplifying Open Data Publishing.” ESWC (Satellite Events), pp. 101-106 (2016).
  • [5] Shi, L., N. Nikolov, D. Sukhobok, T. Tarasova, and D. Roman. “The proDataMarket Ontology for Publishing and Integrating Cross-domain Real Property Data.” To appear in the journal “Territorio Italia Land Administration, Cadastre and Real Estate”, n.2/2017.

Satellite images applied to property data

The Sentinels are a fleet of satellites designed specifically to deliver the wealth of data and imagery that are central to the European Commission’s Copernicus programme. This unique environmental monitoring programme is making a step change in the way we manage our environment, understand and tackle the effects of climate change and safeguard everyday lives. Sentinel-2 carries an innovative wide-swath, high-resolution multispectral imager with 13 spectral bands for a new perspective of our land and vegetation. The combination of high resolution, novel spectral capabilities, a swath width of 290 km and frequent revisit times is generating unprecedented views of Earth. Sentinel-2 provides information for agricultural and forestry practices and helps manage food security. Satellite images will be used to determine various crop and plant indexes. Examples of such indexes include:

  • Normalised Difference Vegetation Index (NDVI)
  • Normalised Difference Snow and Ice Index (NDSI)
  • Enhanced vegetation index (EVI)

This is particularly important for effective crop production prediction and for applications related to Earth’s vegetation.
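
To make the indexes above more concrete, here is a minimal sketch of how NDVI and EVI are commonly computed from surface reflectance values. The formulas are the standard ones; mapping them to Sentinel-2 bands B8 (near infrared), B4 (red) and B2 (blue) is a common choice, and the function names are illustrative rather than part of any proDataMarket tool.

```python
def normalised_difference(a, b):
    """Return (a - b) / (a + b), or None when the denominator is zero."""
    total = a + b
    if total == 0:
        return None
    return (a - b) / total


def ndvi(nir, red):
    """NDVI from near-infrared and red reflectance (e.g. Sentinel-2 B8, B4)."""
    return normalised_difference(nir, red)


def evi(nir, red, blue):
    """EVI with the commonly used coefficients G=2.5, C1=6, C2=7.5, L=1."""
    denominator = nir + 6.0 * red - 7.5 * blue + 1.0
    if denominator == 0:
        return None
    return 2.5 * (nir - red) / denominator
```

Dense, healthy vegetation reflects strongly in the near infrared and absorbs red light, so its NDVI approaches 1, while bare soil or water gives values near or below zero. In practice these functions would be applied per pixel over whole raster bands rather than to single values.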

Figure: Sentinel use example

Sentinel-2 is the first optical Earth observation mission of its kind to include three bands in the ‘red edge’, which provide key information on the state of vegetation. In the previous image from 6 July 2015 acquired near Toulouse, France, the satellite’s multispectral instrument was able to discriminate between two types of crops: sunflower (in orange) and maize (in yellow).
These new and advanced datasets will be used in the CAPAS business case to improve and enrich the information already obtained from LIDAR datasets. With LIDAR it is possible to obtain accurate surface maps, but the data are not updated very frequently. The Sentinel constellation, on the other hand, has a very high revisit frequency (five days) and offers information about crop types and their evolution. Merging these datasets therefore helps answer several questions regarding CAP parameters:

  • Is a specific parcel cultivated?
  • What kind of crop is growing in a plot?
  • Has the number of trees of a copse changed? When?
  • What is the ratio between Ecological Focus Areas (EFAs) and productive areas in a given place?
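
As a small worked example of the last question, the sketch below computes the ratio between EFA area and productive area from a list of parcels. The record layout (`kind`, `area_ha`) and the sample values are hypothetical; in the CAPAS business case the classification of each parcel would come from the merged LIDAR and Sentinel data.

```python
def efa_ratio(parcels):
    """Ratio of total EFA area to total productive area, or None if undefined."""
    efa = sum(p["area_ha"] for p in parcels if p["kind"] == "efa")
    productive = sum(p["area_ha"] for p in parcels if p["kind"] == "productive")
    if productive == 0:
        return None
    return efa / productive


# Hypothetical sample data for one place.
PARCELS = [
    {"id": "A", "kind": "productive", "area_ha": 8.0},
    {"id": "B", "kind": "productive", "area_ha": 2.0},
    {"id": "C", "kind": "efa", "area_ha": 0.5},
]

ratio = efa_ratio(PARCELS)
```

With these sample values, 0.5 ha of EFA against 10 ha of productive land gives a ratio of 0.05.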

Processing this kind of information can be very complex and laborious. It depends on the selected indexes, the chosen bands and the geographical area, and it is further complicated by the high volumes of data. However, the final results will offer a very detailed and accurate overview of land cover changes, environmental monitoring, crop monitoring and food security, as well as detailed vegetation and forest monitoring parameters such as leaf area index, chlorophyll concentration or carbon mass estimations. All this information is directly related to the principles of the Common Agricultural Policy and the new European “Greening” policies.

Note: Some details about the characteristics and features of these instruments are available here.

Ontotext in the proDataMarket Project

Ontotext is an SME founded in 2000 in Sofia, Bulgaria. For more than a decade Ontotext has successfully delivered Semantic Technology products and solutions that improve data integration, data management and search within enterprises. Key Ontotext products include GraphDB, one of the leading enterprise RDF graph databases, and the Self-Service Semantic Suite (S4), a platform for on-demand smart applications and data management. Ontotext also delivers semantic data management solutions to organizations in various verticals: media and publishing, healthcare and life sciences, museums and digital libraries.

Ontotext’s vision for smart data management is based on using ontologies and vocabularies for modelling data, analyzing free-flowing text content and extracting structured information and facts. The RDF graph data model provides an agile way to manage and query heterogeneous data, and powerful semantic search can be implemented on top of the graph data. This way, business users can ask more complex questions and find more precise answers than with traditional enterprise full-text search approaches.

Ontotext is one of the technology partners in the proDataMarket project. Ontotext’s responsibilities include delivering a scalable data management infrastructure that will allow data stored in various legacy data sources to be transformed into RDF graphs with proper metadata mappings to popular ontologies and vocabularies. A scalable RDF database-as-a-service running in the cloud will be one of the key components of the proDataMarket infrastructure, and it will enable quick deployment of new data services on top of third-party datasets. This way, data publishers will not need to deal with the overhead of provisioning and maintaining access to their data, while developers will get easy, instant and reliable access to valuable property-related data via simple RESTful APIs.

Istat and Statistics in proDataMarket project

Istat, the Italian National Institute of Statistics, is a public research organization and the main producer of official statistics in Italy. It was created in 1926 under the name “Central Institute of Statistics”. The idea was to have a single, independent body producing social and economic data useful for citizens, policy makers and the country as a whole. With more than 2000 employees, one main office in Rome and 18 regional offices, Istat continuously produces high-quality data about social and economic phenomena. Istat is also the main organization within the National Statistical System, a national network that connects public administrations, local and regional bodies, Chambers of Commerce and the statistical departments of other local organizations. Istat is also a member of the European Statistical Network, an international community led by Eurostat that works not only to provide comparable statistics at the European level but also to promote common projects for data harmonization, data sharing and statistical data transmission.

What we do

Statistics means “science of the state”. By means of statistics, governments all over the world measure economic and social development as well as citizens’ wellbeing: living conditions, health, work, education, environment and social relationships. All official statistics are produced using administrative archives, sample surveys and censuses. The census is the most important and complex survey carried out by Istat. The last population and housing census was carried out in October 2011 and involved more than 24 million households and more than 14 million buildings.

Role in proDataMarket project

Istat produces statistics not only to support decision making by the government and local administrations, but also to provide high-quality data for universities, researchers, professionals, journalists and businesses. Awareness of the value of data, and of the opportunities data can offer to citizens and to the market in general, is the main reason Istat joined and supports the proDataMarket project. In this project, Istat acts not only as a data provider but also supports the project on data quality and on methodological aspects of how data are used in the expected business cases. More precisely, Istat will support business cases related to the real estate market. The goal is to make data more accurate and more suitable for generating new services or improving existing data-driven services, while at the same time making this market more transparent and useful for citizens and businesses.