The proDataMarket Ontology: Enabling Semantic Interoperability of Real Property Data

Real property data (often referred to as real estate, realty, or immovable property data) represent a valuable asset that has the potential to enable innovative services when integrated with related contextual data (e.g., business data). Such services can range from providing evaluations of real estate to reporting up-to-date information about state-owned properties. Real property data integration is a difficult task, primarily due to the heterogeneity and complexity of real property data and the lack of generally agreed-upon semantic descriptions of the concepts in this domain. The proDataMarket ontology was developed in the proDataMarket project as a key enabler for the integration of real property data.

The design and development of the proDataMarket ontology followed techniques and design choices from existing methodologies, mainly the one proposed by Noy and McGuinness [1]. Requirements were extracted from a set of relevant business cases; for each business case, competency questions [2] were defined, along with core concepts and relationships. A conceptual model was then developed from these requirements and from international standards, including ISO 19152:2012 and the European Union’s INSPIRE data specifications. For example, the Land Administration Domain Model (LADM) from ISO 19152:2012 served as the reference model for the proDataMarket cadastral domain conceptual model. The conceptual model was then implemented using the RDFS and OWL linked data standards. RDFS is used to model concepts, properties, and simple relationships such as rdfs:subClassOf. OWL builds on RDFS and provides a richer language for web ontology modelling; it is used to model constraints and other advanced relationships, such as the cardinality constraint needed to express the relationship between properties and buildings.
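To make the division of labour between RDFS and OWL concrete, the following is a minimal sketch that builds a handful of triples and prints them as N-Triples. The `http://example.org/realproperty#` namespace and the terms `Building`, `ResidentialBuilding`, `Property`, and `isLocatedOn` are hypothetical and are not taken from the proDataMarket ontology. The cardinality value of exactly one is likewise an assumption chosen for illustration.

```python
# Standard W3C namespaces.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
XSD = "http://www.w3.org/2001/XMLSchema#"
# Hypothetical namespace and term names, for illustration only.
EX = "http://example.org/realproperty#"


def iri(namespace, local_name):
    """Return an IRI in N-Triples angle-bracket notation."""
    return f"<{namespace}{local_name}>"


def cardinality_restriction(node, on_property, n):
    """Triples describing an owl:Restriction with exact cardinality n."""
    literal = f'"{n}"^^{iri(XSD, "nonNegativeInteger")}'
    return [
        (node, iri(RDF, "type"), iri(OWL, "Restriction")),
        (node, iri(OWL, "onProperty"), on_property),
        (node, iri(OWL, "cardinality"), literal),
    ]


triples = [
    # Concepts and a simple hierarchy: RDFS-level modelling.
    (iri(EX, "Building"), iri(RDF, "type"), iri(OWL, "Class")),
    (iri(EX, "Property"), iri(RDF, "type"), iri(OWL, "Class")),
    (iri(EX, "ResidentialBuilding"), iri(RDFS, "subClassOf"), iri(EX, "Building")),
    (iri(EX, "isLocatedOn"), iri(RDF, "type"), iri(OWL, "ObjectProperty")),
    (iri(EX, "isLocatedOn"), iri(RDFS, "domain"), iri(EX, "Building")),
    (iri(EX, "isLocatedOn"), iri(RDFS, "range"), iri(EX, "Property")),
    # Constraint: OWL-level modelling. Building is a subclass of the
    # anonymous restriction class identified by the blank node _:r1.
    (iri(EX, "Building"), iri(RDFS, "subClassOf"), "_:r1"),
]
# Assumed constraint: every building is located on exactly one property.
triples += cardinality_restriction("_:r1", iri(EX, "isLocatedOn"), 1)


def to_ntriples(statements):
    """Serialize (subject, predicate, object) tuples as N-Triples lines."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in statements)


print(to_ntriples(triples))
```

The first six statements use only class, property, domain, range, and subclass declarations. The last four show the extra vocabulary OWL contributes: a restriction class that a reasoner can use to check cardinality.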

The proDataMarket ontology can be accessed at http://vocabs.datagraft.net/proDataMarket/. The ontology is divided into several sub-ontologies (see the table below), reflecting the cross-domain nature of the requirements. This modular approach also helps to manage the complexity of the model and makes it easier to maintain. In the current version, there are 11 sub-ontologies with 43 native classes and 43 native properties.

Table: Composition of the proDataMarket ontology

| Domain/module | Namespace prefix | URL | Classes | Properties | Business cases |
|---|---|---|---|---|---|
| Common | prodm-com | http://vocabs.datagraft.net/proDataMarket/0.1/Common# | 4 | 4 | ALL |
| Cadaster | prodm-cad | http://vocabs.datagraft.net/proDataMarket/0.1/Cadastre# | 6 | 16 | SoE, RVAS, NNAS, SIM |
| State of Estate Report | prodm-soe | http://vocabs.datagraft.net/proDataMarket/0.1/SoE# | 4 | 2 | SoE, RVAS |
| Business Entity | - | Reuses existing vocabularies; no new classes or properties | 0 | 0 | SoE, RVAS |
| Building Accessibility | - | Reuses existing vocabularies; no new classes or properties | 0 | 0 | SoE |
| Natural Hazard | prodm-nh | http://vocabs.datagraft.net/proDataMarket/0.1/NaturalHazard# | 1 | 0 | RVAS |
| Land Parcel Identification System (LPIS) | prodm-lpis | http://vocabs.datagraft.net/proDataMarket/0.1/LPIS# | 1 | 7 | CAPAS |
| Sentinel data | prodm-sen | http://vocabs.datagraft.net/proDataMarket/0.1/Sentinel# | 1 | 1 | CAPAS |
| Landscape Elements (LiDAR data) | prodm-lid | http://vocabs.datagraft.net/proDataMarket/0.1/Lidar# | 3 | 0 | CAPAS |
| Assessment | prodm-asm | http://vocabs.datagraft.net/proDataMarket/0.1/Assessment# | 3 | 3 | CAPAS |
| CensusTract | prodm-ct | http://vocabs.datagraft.net/proDataMarket/0.1/CensusTract# | 1 | 0 | CST, CCRS |
| Urban Infrastructure | prodm-ui | http://vocabs.datagraft.net/proDataMarket/0.1/UrbanInfrastructure# | 17 | 10 | SIM |
| Protected Sites | prodm-ps | http://vocabs.datagraft.net/proDataMarket/0.1/ProtectedSite# | 2 | 0 | CAPAS |
| Total | | | 43 | 43 | |
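The namespace prefixes in the table can be used to abbreviate full term IRIs. The sketch below records the eleven namespaces and their class and property counts as listed above, checks that the counts add up to the stated totals, and expands a prefixed name into a full IRI. The local name `Building` in the usage example is hypothetical and serves only to show the expansion. The helper `expand` is likewise an illustrative function, not part of any proDataMarket tooling.

```python
BASE = "http://vocabs.datagraft.net/proDataMarket/0.1/"

# (prefix, namespace fragment, native classes, native properties),
# copied from the composition table.
MODULES = [
    ("prodm-com", "Common#", 4, 4),
    ("prodm-cad", "Cadastre#", 6, 16),
    ("prodm-soe", "SoE#", 4, 2),
    ("prodm-nh", "NaturalHazard#", 1, 0),
    ("prodm-lpis", "LPIS#", 1, 7),
    ("prodm-sen", "Sentinel#", 1, 1),
    ("prodm-lid", "Lidar#", 3, 0),
    ("prodm-asm", "Assessment#", 3, 3),
    ("prodm-ct", "CensusTract#", 1, 0),
    ("prodm-ui", "UrbanInfrastructure#", 17, 10),
    ("prodm-ps", "ProtectedSite#", 2, 0),
]

PREFIXES = {prefix: BASE + fragment for prefix, fragment, _, _ in MODULES}
TOTAL_CLASSES = sum(classes for _, _, classes, _ in MODULES)
TOTAL_PROPERTIES = sum(properties for _, _, _, properties in MODULES)


def expand(prefixed_name):
    """Expand 'prefix:local' into a full IRI using the table's namespaces."""
    prefix, separator, local_name = prefixed_name.partition(":")
    if not separator or prefix not in PREFIXES:
        raise KeyError(f"unknown or missing prefix in {prefixed_name!r}")
    return PREFIXES[prefix] + local_name


print(len(PREFIXES), TOTAL_CLASSES, TOTAL_PROPERTIES)
# 'Building' is a hypothetical local name used only for illustration.
print(expand("prodm-cad:Building"))
```

The two modules that only reuse existing vocabularies are omitted because they define no namespace of their own and contribute nothing to the totals.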

More than 30 datasets have been published through the DataGraft platform [3] [4] using the proDataMarket ontology as a central reference model. All seven business cases use the proDataMarket ontology in data publishing.

More details on the proDataMarket vocabulary can be found in the paper “The proDataMarket Ontology for Publishing and Integrating Cross-domain Real Property Data”, which was accepted for publication in the scientific journal Territorio Italia: Land Administration, Cadastre and Real Estate [5].

References

  • [1] Noy, Natalya F., and Deborah L. McGuinness. “Ontology development 101: A guide to creating your first ontology.” (2001).
  • [2] Grüninger, Michael, and Mark S. Fox. “Methodology for the Design and Evaluation of Ontologies.” (1995).
  • [3] Roman, D., et al. “DataGraft: One-Stop-Shop for Open Data Management.” Semantic Web, preprint, pp. 1-19 (2017). DOI: 10.3233/SW-170263.
  • [4] Roman, D., et al. “DataGraft: Simplifying Open Data Publishing.” ESWC (Satellite Events) 2016, pp. 101-106.
  • [5] Shi, L., Nikolov, N., Sukhobok, D., Tarasova, T., and Roman, D. “The proDataMarket Ontology for Publishing and Integrating Cross-domain Real Property Data.” To appear in Territorio Italia: Land Administration, Cadastre and Real Estate, n. 2/2017.

SINTEF: Project leader and technology provider in proDataMarket

SINTEF is Scandinavia’s largest independent research organization. SINTEF is multidisciplinary, with international top-level expertise in a wide range of technological and scientific disciplines, including areas such as ICT, medicine, and the social sciences. SINTEF’s company vision is “technology for a better society”, and it is an important aspect of SINTEF’s societal role to contribute to the creation of more jobs. SINTEF acts as an incubator, commercialising technologies through the establishment of new companies. SINTEF is represented in proDataMarket by Information and Communication Technology (SINTEF ICT) through the department for Networked Systems and Services (NSS).

Role in the project: SINTEF is the project leader of proDataMarket and also serves as a technology provider. SINTEF focuses on the technical infrastructure of the proDataMarket platform for data management, in particular data publishing and access, helping organizations adopt cost-effective solutions for (linked open) data management. Our goal is to promote standardisation through mechanisms for defining the structure and semantics of data, and to improve interoperability and transparency among data publishers and consumers by leveraging the linked data format. Technically, we are constructing a software framework consisting of a frontend and a set of platform services that support reusable data cleaning and reconfiguration based on pluggable static, dynamic, or streaming input in various formats (e.g., relational databases, CSV files, and WMS/WFS services). Outputs will be published on the proDataMarket platform and made available to end users and other publishers through a secured set of platform services such as SPARQL query endpoints and RESTful APIs. The framework is meant to automate the highly laborious process of data retrieval and aggregation and thereby significantly reduce the manual effort involved.

In proDataMarket, SINTEF reuses and extends its data reconfiguration solutions from the DaPaaS project. In particular, we plan to further develop the Grafterizer tool for data cleaning and linked data mapping of tabular inputs.
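The general idea of cleaning tabular input and mapping it to linked data can be sketched in a few lines. This is not Grafterizer's API or its mapping language; it is a plain Python illustration under stated assumptions. The column names `parcel_id` and `area_m2`, the `http://example.org/` namespaces, and the terms `Parcel` and `area` are all hypothetical.

```python
import csv
import io

RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
XSD_DECIMAL = "<http://www.w3.org/2001/XMLSchema#decimal>"
# Hypothetical namespaces and terms, for illustration only.
DATA = "http://example.org/data/parcel/"
VOCAB = "http://example.org/vocab#"

# Hypothetical raw input with stray whitespace and a missing value.
RAW_CSV = "parcel_id,area_m2\n 101 ,250.5\n102,\n103,1200\n"


def clean(row):
    """Cleaning step: trim surrounding whitespace from every cell."""
    return {column: value.strip() for column, value in row.items()}


def row_to_triples(row):
    """Mapping step: turn one cleaned row into N-Triples style tuples."""
    subject = f"<{DATA}{row['parcel_id']}>"
    yield (subject, RDF_TYPE, f"<{VOCAB}Parcel>")
    area = row["area_m2"]
    if area:  # skip the property when the cell is empty
        yield (subject, f"<{VOCAB}area>", f'"{area}"^^{XSD_DECIMAL}')


rows = [clean(row) for row in csv.DictReader(io.StringIO(RAW_CSV))]
triples = [triple for row in rows for triple in row_to_triples(row)]

for s, p, o in triples:
    print(f"{s} {p} {o} .")
```

Keeping the cleaning and mapping steps as separate functions reflects the reuse goal described above: the same cleaning step can be applied to different inputs before a dataset-specific mapping.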