The datasets and challenges in integration
The State of Estate (SoE) business case focuses on generating an up-to-date, dynamic, high-quality report on State-owned properties and buildings in Norway. It collects and integrates the datasets listed below. The datasets originate from heterogeneous sources and vary in quality. The following scenarios cause challenges in the integration process.
Although Matrikkel data from the Norwegian mapping authority is one of the most authoritative sources of property data, not all of the information is up to date. Sometimes this is caused by delays in administrative procedures in the municipalities, sometimes owners do not report changes to the municipalities because reporting is costly, and sometimes there are typos or other manual updating errors. Buildings smaller than 15 square meters are not required to be registered in the Matrikkel.
Statsbygg’s property data
Statsbygg’s property data has been updated since the last report. However, the Matrikkel building number is not correctly registered for all buildings, and the address information is not necessarily up to date either. The dataset may also contain typos and other manual updating errors.
Business Entity register
The Business Entity register dataset comes from another authoritative national source and contains information about the ministries and their subordinate organizations. However, not all subordinate organizations of the ministries are registered as sub-organizations in the Business Entity register. The missing organizations need to be added manually to the dataset as extra business entities.
State-owned properties Report 2013-2014 (SoEReport2013)
SoEReport2013 is a report from 2013, so it includes some properties or buildings that may have been sold, rebuilt or demolished in the last few years. The old report also includes properties and buildings whose ownership within the government has not been reported, which we need to handle in the new report. For example, several properties were registered as owned by Statsbygg in the old report; however, they are registered as owned by the King in the Matrikkel database, which means that Statsbygg has been taking care of the King’s property without reporting to the municipalities that the ownership has changed.
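The ownership mismatch described above can be detected mechanically once both sources share a property key. The following is a minimal sketch, not the project's actual implementation: the `OwnershipRecord` type, the `property_id` key and the sample values are all hypothetical and only illustrate the comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OwnershipRecord:
    property_id: str  # hypothetical shared key, not a real Matrikkel field name
    owner: str


def find_owner_conflicts(old_report, matrikkel):
    """List properties whose registered owner differs between the two sources."""
    matrikkel_owner = {rec.property_id: rec.owner for rec in matrikkel}
    conflicts = []
    for rec in old_report:
        current = matrikkel_owner.get(rec.property_id)
        # Skip properties missing from Matrikkel; compare names case-insensitively.
        if current is not None and current.casefold() != rec.owner.casefold():
            conflicts.append((rec.property_id, rec.owner, current))
    return conflicts


# Made-up sample data for illustration only.
old_report = [
    OwnershipRecord("P-001", "Statsbygg"),
    OwnershipRecord("P-002", "Statsbygg"),
]
matrikkel = [
    OwnershipRecord("P-001", "The King"),
    OwnershipRecord("P-002", "STATSBYGG"),
]
conflicts = find_owner_conflicts(old_report, matrikkel)
print(conflicts)
```

Each conflict is a candidate for manual review rather than an automatic correction, since either source may hold the outdated value.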
The Matrikkel building number has not been registered for all buildings in the ByggForAlle dataset, and some of the key information may contain typos or manual updating errors, or be out of date.
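Because the Matrikkel building number is the natural link between the datasets, records without one cannot be matched automatically. A simple pre-check can separate them for manual handling. This is a sketch under assumptions: the `building_number` field name and the dictionary record layout are hypothetical, and the check only detects missing values, not wrong ones.

```python
def split_by_building_number(records, key="building_number"):
    """Split records into those with a building number and those without."""
    linkable, needs_review = [], []
    for rec in records:
        # Treat absent, None, empty and whitespace-only values as missing.
        value = str(rec.get(key) or "").strip()
        if value:
            linkable.append(rec)
        else:
            needs_review.append(rec)
    return linkable, needs_review


# Made-up sample records for illustration only.
sample = [
    {"name": "Building A", "building_number": "1001"},
    {"name": "Building B", "building_number": ""},
    {"name": "Building C"},
    {"name": "Building D", "building_number": "  "},
]
linkable, needs_review = split_by_building_number(sample)
print(len(linkable), "linkable,", len(needs_review), "need manual review")
```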
The data workflow
To meet the challenges in the data integration, we have developed a data workflow, shown in the diagram below. It illustrates the process of importing the datasets, controlling their quality and integrating them, and finally generating the result dataset. The roles involved and their activities are modelled as swimlanes. The original and generated datasets are modelled as data objects in the diagram, such as SoEReport2013, BusinessEntityRegister and NewOrgList_Comfirmed. The quality control process can combine automated checks with manual work organized as human tasks, and it handles the integration exceptions.
Three roles are involved in this process.
- The SystemAdmin is a technical role and its main tasks are dataset import and integration.
- The SystemManager is a functional role whose main tasks are quality control and generating the SoE report, including organizing and communicating with the other organizations involved.
- The PropertyResponsible is a role in each involved organization; its main tasks are to prepare data, perform quality control and submit the organization's own property list and building list.
The activity boxes are explained below:
- ImportOldReportWithOrgList: The SystemAdmin starts by checking whether the SoE report from 2013 has been imported. If not, the SystemAdmin imports the report, which also includes the old organization list.
- ImportMinistrySub_Brreg: Then the SystemAdmin imports the organization list of the Ministries and subordinate organizations from the Business Entity Register.
- MergeOrgListBrreg_SoEReport2013: The two organization lists are merged.
- EditComfirmOrgList: The SystemManager is signalled to start editing and updating the list; the result is the confirmed OrgList.
- ImportOwnedPropertyBuildingFromMatrikkelBasedOnOrglist_Comfirmed: Based on the confirmed OrgList, the owned properties and buildings from the Cadastre database (Matrikkel) are imported by the SystemAdmin.
- PrepareExportForOwned: The PropertyResponsible prepares a property list in the agreed format.
- ImportOwnedFromOrg: If some of the organizations, such as Statsbygg, have their own database or list of owned properties and buildings, these lists are imported as necessary.
- ImportByggForAlleData: Then the ByggForAlle data is imported.
- MergeAllDatasets: Afterwards, the SystemAdmin merges the data from Matrikkel and the Business Entity Register (OrgList_comfirmed), the SoE report 2013, property data from organizations such as Statsbygg, and ByggForAlle.
- QualityControlMergedList: The SystemManager will then start the quality control cycle of the merged list.
- EditAndConfirmOwnedList: The PropertyResponsible in each organization gets the task of editing and confirming the organization's property and building list.
- ApproveAndFinalizeNewSoEReport: The SystemManager will do the final quality control before approving and finalizing the new SoE Report.
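The MergeOrgListBrreg_SoEReport2013 step, followed by the manual confirmation of the organization list, could look roughly like the sketch below. It is an illustration under assumptions, not the workflow's real code: the `org_number` and `name` fields, the source labels and the sample organizations are hypothetical. The merged entries keep track of which source they came from, so organizations that are missing from the Business Entity Register can be handed to the SystemManager for confirmation.

```python
def merge_org_lists(brreg_orgs, old_report_orgs):
    """Merge two organization lists, recording the sources of each entry."""
    merged = {}
    labelled = (("brreg", brreg_orgs), ("soe_report_2013", old_report_orgs))
    for source, orgs in labelled:
        for org in orgs:
            entry = merged.setdefault(
                org["org_number"],
                {"org_number": org["org_number"], "name": org["name"], "sources": []},
            )
            if source not in entry["sources"]:
                entry["sources"].append(source)
    return list(merged.values())


def missing_from_register(merged):
    """Entries known only from the old report need manual confirmation."""
    return [entry for entry in merged if "brreg" not in entry["sources"]]


# Made-up sample data for illustration only.
brreg_orgs = [
    {"org_number": "000000001", "name": "Ministry A"},
    {"org_number": "000000002", "name": "Agency B"},
]
old_report_orgs = [
    {"org_number": "000000002", "name": "Agency B"},
    {"org_number": "000000003", "name": "Institute C"},
]
merged = merge_org_lists(brreg_orgs, old_report_orgs)
to_confirm = missing_from_register(merged)
print([entry["name"] for entry in to_confirm])
```

Keeping the source labels instead of silently deduplicating makes the later quality control steps easier, because each entry shows where it was first registered.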
Expected results and an example
Below is one of the expected results of data quality control and integration in the “MergeAllDatasets” step. The maps below show examples of properties that are in SoEReport2013 but not in the list based on the Matrikkel_Brreg integration, and properties that are in the Matrikkel_Brreg integration but not in SoEReport2013. After identifying the mismatches in this way, users can go on to clean the datasets and correct the wrong registrations in the source systems.
[Map legend: BRREG_Matrikkel integrated dataset; Old SoE Report; parcel labels “, NORSK INST.FOR SKOG OG LANDSKAP, NORSK INSTITUTT FOR SKOG OG LANDSKAP”, “,BIOFORSK, TOLLEFSRUD MARI METTE” and “STATENS VEGVESEN, ,STATENS VEGVESEN”; land parcels filled with solid color]
The figure below shows the area inside Campus Ås. Some land parcels owned or leased by NMBU and Statens vegvesen according to Matrikkel are not included in the old SoE report; those land parcels are marked with a crosshatch pattern. On the other hand, some land parcels from the old SoE report are not included in the list based on BRREG and Matrikkel, such as the hatched land parcels labelled “, NORSK INST.FOR SKOG OG LANDSKAP, NORSK INSTITUTT FOR SKOG OG LANDSKAP” or “,BIOFORSK, TOLLEFSRUD MARI METTE”. Both the simple-hatch and the crosshatch properties in the map need to be quality checked and confirmed in the “QualityControlMergedList” step and thereafter in “EditAndConfirmOwnedList”.
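The two kinds of mismatch discussed above correspond to the two sides of a set difference between the parcel lists. The sketch below shows the idea with plain Python sets; the parcel identifiers and the function name are hypothetical, and a real comparison would first need the identifiers in both datasets to be normalized to the same format.

```python
def compare_parcel_lists(integrated_ids, old_report_ids):
    """Classify parcels by which of the two lists they appear in."""
    integrated, old = set(integrated_ids), set(old_report_ids)
    return {
        # Candidates for the crosshatch pattern: missing from the old report.
        "only_in_integrated": sorted(integrated - old),
        # Candidates for the simple hatch pattern: missing from BRREG/Matrikkel.
        "only_in_old_report": sorted(old - integrated),
        "in_both": sorted(integrated & old),
    }


# Made-up parcel identifiers for illustration only.
integrated_ids = ["parcel-1", "parcel-2", "parcel-3"]
old_report_ids = ["parcel-2", "parcel-3", "parcel-4"]
result = compare_parcel_lists(integrated_ids, old_report_ids)
print(result)
```

The two "only in" lists are the inputs to the “QualityControlMergedList” and “EditAndConfirmOwnedList” steps, where a person decides which source is wrong.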