New proDataMarket paper: Combining Sentinel-2 and LiDAR data for objective and automated identification of agricultural parcel features

Combining Sentinel-2 and LiDAR data for objective and automated identification of agricultural parcel features by Jesús Estrada, Héctor Sanchez, Lorena Hernanz, María José Checa and Dumitru Roman

This new proDataMarket paper explains how a comprehensive strategy combining remote sensing and field data can be helpful for more effective agriculture management. Satellite data are suitable for monitoring large areas over time, while LiDAR provides specific and accurate data on height and relief. Both types of data can be used for calibration and validation purposes, avoiding field visits and saving useful resources. In this paper we propose a process for objective and automated identification of agricultural parcel features based on processing and combining Sentinel-2 data (to sense different types of irrigation patterns) and LiDAR data (to detect landscape elements). The proposed process was validated in several use cases in Spain, yielding high accuracy rates in the identification of parcel features. An important application example of the work reported in this paper is the European Union (EU) Common Agriculture Policy (CAP) funds assignment service, which would significantly benefit from a more objective and automated process for identification of agricultural parcel features, potentially allowing the EU to save significant amounts of money every year.

Although some issues regarding the generation and improvement of agricultural property datasets were already explained in our previous blog entry (Data workflow in CAPAS), this paper highlights the current results of generation and usage of this new information.

Irrigation patterns map, obtained using Sentinel-2 Process 

The main result of this analysis is that external, and usually underused, data sources offer a powerful and accurate tool for generating new contrast and validation data for the information used by the Spanish CAP Payment Agency, in order to provide a better service to landowners and farmers. In conclusion, Sentinel-2 series and LiDAR can help to detect areas that are not eligible for grant assignment, support cross-checks, and serve as a tool for choosing field samples.

The document is available Here.

Data Workflow in CAPAS

Description of the data workflow processes

TRAGSA, as a business case provider in the project, is developing the CAPAS service, which aims at publishing and integrating multi-sectorial data from several sources into an existing data-intensive service, targeting better Common Agriculture Policy (CAP) funds assignments to farmers and land owners. The goal is to leverage the data integration facilities offered by proDataMarket to better define the features of parcels and subplots that determine funds assignments.

CAPAS is working on an improvement of the efficiency and competitiveness of the existing Spanish CAP (Common Agriculture Policy) service by integrating more datasets, underused at the beginning of the proDataMarket project. To use them as a powerful tool, it was necessary to create and develop new data processing algorithms. Therefore, CAPAS is not only an end-user application. Indeed, it involves data collection, data modelling and data processing techniques.

The CAPAS Business Case is oriented towards the replacement of human-generated (subjective) data with more objective data that can be collected and integrated from different cross-sectorial sources in an automated way.

At least two external datasets (LIDAR and Copernicus SENTINEL2) are being used to improve the Spanish agricultural cadastre database. The economic value generated by this process and its relation to CAP funds assignment will be evaluated during the next year, in the final phase of the project.

Managing LIDAR data

LIDAR files are a collection of points stored as x, y, z which represent longitude, latitude, and elevation, respectively. This data is hard to process for non-specialists. To use them as a powerful tool to define objectively the parameters of agricultural use of parcels and the presence of landscape elements, a new data processing and treatment algorithm has been created.

This algorithm classifies and groups the cloud of points in order to simplify the huge amount of data. The clouds of points are topologically processed to obtain connected areas as polygons or to maintain them as single points. As a result, LIDAR datasets are transformed into new raster and vector files, which are more widely used data types and easier to work with. The overlaps and intersections of the new datasets produced (such as landscape elements) will define the CAP parameters for a specific subplot or parcel.

Managing Satellite data

The Sentinels are a fleet of satellites designed specifically to deliver the wealth of data and imagery that are fundamental to the European Commission’s Copernicus program. The use of satellite images in CAPAS has already been explained in this blog entry.

Description of the source datasets and result dataset

The main source datasets of Business Case CAPAS and main processes used to obtain output datasets are explained below:

LIDAR files

LIDAR files are available in two different formats: .las and .laz. The LAS file format is a public file format commonly used to exchange 3-dimensional point cloud data between data users; LAS is simply an abbreviation of LASER. Because LAS files are very large, the LAZ format provides a compressed version of the LAS format.

Although developed primarily for exchange of LIDAR point cloud data, LAS format supports the exchange of any 3-dimensional x,y,z tuples. This format maintains information specific to the LIDAR nature of the data while not being overly complex.
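As an illustration of how simple the format is to inspect, the following minimal sketch reads a few fields from the public header block of an uncompressed .las file using only the Python standard library. The function name `read_las_header` is hypothetical and is not part of the CAPAS tooling; the byte offsets follow the public LAS header layout (signature at byte 0, version at bytes 24 and 25, point format and point count from byte 104). It does not handle compressed .laz files.

```python
import struct


def read_las_header(data: bytes) -> dict:
    """Read a few public header fields from the first bytes of a .las file.

    Hypothetical helper for illustration only; it does not decompress .laz.
    """
    if len(data) < 111:
        raise ValueError("not enough bytes for a LAS public header")
    if data[0:4] != b"LASF":
        raise ValueError("missing 'LASF' file signature")
    # Version major and minor are single bytes at offsets 24 and 25.
    major, minor = struct.unpack_from("<BB", data, 24)
    # Point data format (1 byte), record length (2 bytes) and the legacy
    # number of point records (4 bytes) start at offset 104, little-endian.
    point_format, record_length, point_count = struct.unpack_from("<BHI", data, 104)
    return {
        "version": f"{major}.{minor}",
        "point_format": point_format,
        "record_length": record_length,
        # In LAS 1.4 files this legacy counter may be zero.
        "point_count": point_count,
    }
```

A typical call would pass the first bytes of a file, for example `read_las_header(open(path, "rb").read(227))`, where `path` points to any .las file.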

Technical description of LIDAR format

In the context of the ProDataMarket Project, LAS files used in the CAPAS business case will just be a collection of points (latitude, longitude, elevation).

Spanish LIDAR information is freely and openly available at http://centrodedescargas.cnig.es/CentroDescargas/buscadorCatalogo.do?codFamilia=LIDAR

SENTINEL files

The information to be used in CAPAS business case is the Image Data (JPEG2000) provided by Copernicus at Sentinels Scientific Data Hub (https://scihub.copernicus.eu/). The description of JPEG2000[1] format is beyond the aim of this blog entry but some general ideas will be described.

Sentinel data are freely and openly available at:

https://sentinel.esa.int/web/sentinel/sentinel-data-access/access-to-sentinel-data

More information and general factsheet at: https://earth.esa.int/documents/247904/1848117/Sentinel-2_Data_Products_and_Access.

SIGPAC Database

SigPAC database is a complex information system that covers the whole Spanish geography and all agricultural activities and others related to Biodiversity and nature conservation.

In regards to SigPAC database, the main datasets produced or modified by CAPAS are:

  • Landscape Elements
  • Parcels and Subplots

The level of accessibility of the SigPAC database varies between Autonomous Communities. For example, it is open and freely available in Castile and León at http://www.datosabiertos.jcyl.es/web/jcyl/set/es/cartografia/SIGPAC/1284225645888

Data workflow process for CAPAS

The following data workflow, as shown in the diagram below, illustrates the evolution of the different datasets, their transformations and their integration to generate the final result datasets.

CAPAS Workflow

LIDAR processing

The Grouping process gathers the LIDAR points using the following rules:

  • Errors, noise and overlaps are not taken into account (Classifications 1, 4, 7 and 12). As a consequence, more than 50% of points are removed from the process.
  • Soil, water and buildings have their own groups
  • Classification 19 is considered as short trees
  • Classification 20 is considered as medium trees
  • Classifications 21 and 22 are grouped as tall trees
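The grouping rules above can be sketched in a few lines of Python. This is only an illustration, not the CAPAS implementation: the classification codes for soil (2), buildings (6) and water (9) are assumptions taken from the standard LAS class table, since the text does not state them, and the function and group names are hypothetical.

```python
from collections import defaultdict

# Codes discarded as errors, noise and overlaps (from the rules above).
DISCARDED = {1, 4, 7, 12}

# Codes 19 to 22 come from the rules above; 2, 6 and 9 are ASSUMED
# (standard LAS codes for ground, building and water).
GROUPS = {
    2: "soil",
    6: "buildings",
    9: "water",
    19: "short_trees",
    20: "medium_trees",
    21: "tall_trees",
    22: "tall_trees",
}


def group_points(points):
    """Group (x, y, z, classification) tuples by landscape category."""
    groups = defaultdict(list)
    for x, y, z, classification in points:
        if classification in DISCARDED:
            continue  # errors, noise and overlaps are dropped
        name = GROUPS.get(classification)
        if name is None:
            continue  # codes not covered by the rules are skipped here
        groups[name].append((x, y, z))
    return dict(groups)
```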

The result of this process is still a LAS file. The following image shows how LIDAR points (green points) have been processed and classified (Green points as trees, red points as soil, orange and yellow as bushes).

lidar-1

The next steps, such as Rasterization or Vectorization, involve topological rules in order to group the points to generate squares (raster) that would be processed to obtain the final vector shapefile.
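A rasterization step of this kind could look roughly like the sketch below, which bins grouped points into square cells and labels each cell with its most frequent category. The cell size, the majority rule and the function name `rasterize` are assumptions made for illustration; the actual topological rules used in CAPAS are not described in this post.

```python
from collections import Counter, defaultdict


def rasterize(points, cell_size):
    """Assign each (x, y, label) point to a square cell and keep the
    most frequent label per cell.

    Returns a dict mapping (row, col) to a label.
    """
    cells = defaultdict(Counter)
    for x, y, label in points:
        col = int(x // cell_size)  # floor division also handles negatives
        row = int(y // cell_size)
        cells[(row, col)][label] += 1
    return {cell: counts.most_common(1)[0][0] for cell, counts in cells.items()}
```

Cells that contain no points are simply absent from the result, which keeps the raster sparse.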

The following image shows how LIDAR points have been grouped to create topologically connected surfaces. In the image below, yellow areas are Soil, orange are Bushes, green are Trees. Grey areas and blue surfaces (not present in this image) are Buildings and Water, respectively.

lidar-2

Once the tree class has been defined in raster format from LiDAR data, it is refined using Sentinel data, which is more up to date. The NDVI product identifies the pixels with an NDVI value over 0.5, and the RGB product is then used to check which of those pixels represent vegetation areas.
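As a rough sketch of this refinement, the code below computes NDVI from near-infrared and red reflectance using the standard formula (NIR - Red) / (NIR + Red) and keeps only the LiDAR tree cells whose NDVI exceeds 0.5. It assumes both rasters are already aligned on the same grid and represented as nested lists; the function names are hypothetical, and the visual check against the RGB product is not modelled.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero on empty pixels
    return (nir - red) / total


def refine_tree_mask(tree_mask, nir_band, red_band, threshold=0.5):
    """Keep a LiDAR tree cell only if its Sentinel NDVI is above threshold.

    All three inputs are lists of rows with identical shape (assumption).
    """
    return [
        [bool(t) and ndvi(n, r) > threshold for t, n, r in zip(t_row, n_row, r_row)]
        for t_row, n_row, r_row in zip(tree_mask, nir_band, red_band)
    ]
```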

Finally, trees auxiliary layer refined by Sentinel is processed to obtain different configurations:

  • Isolated trees
  • Copses

The final result of the process is a vector ESRI shapefile, where the copses layer is a polygon feature type and the isolated trees layer is a point feature type. All of them have a direct correspondence with the landscape elements.
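One simple way to separate the two configurations is to look for connected groups of tree cells in the raster: very small groups become isolated trees (points) and larger ones become copses (polygons). The sketch below uses 4-neighbour connectivity and a size threshold of one cell; both choices, and the function names, are assumptions for illustration rather than the rules applied in CAPAS.

```python
def connected_components(cells):
    """Split a set of (row, col) cells into 4-connected components."""
    remaining = set(cells)
    components = []
    while remaining:
        start = remaining.pop()
        component = {start}
        stack = [start]
        while stack:
            row, col = stack.pop()
            for nb in ((row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)):
                if nb in remaining:
                    remaining.remove(nb)
                    component.add(nb)
                    stack.append(nb)
        components.append(component)
    return components


def split_trees(cells, max_isolated_cells=1):
    """Return (isolated_trees, copses) as two lists of cell sets.

    max_isolated_cells is a hypothetical threshold, not a CAPAS parameter.
    """
    isolated, copses = [], []
    for component in connected_components(cells):
        if len(component) <= max_isolated_cells:
            isolated.append(component)
        else:
            copses.append(component)
    return isolated, copses
```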

The overlaps between the detected landscape elements, the currently protected sites of the Natura 2000 network and the Land Parcel Identification System allow an accurate ecological value report to be produced for Spanish crop areas.

The LiDAR algorithm provides more detailed information because the landscape value helps to identify which subplot has more value per parcel, with the following benefits:

  • Farmers will obtain an economic benefit through fund assignments to maintain these tree formations, and
  • the ecosystem and its species will be preserved.

ecological-value

This Ecological value report has been developed regarding the following queries:

  • Query 1: Surface of Sites of Community Importance (LIC) / subplot area. Score between 0 and 1.
  • Query 2: Surface of Special Protected Areas for Birds (ZEPA) / subplot area. Score between 0 and 1.
  • Query 3: Protected Sites Value = Sum of query 1 + query 2. Score between 0 and 2.
  • Query 4: Number of isolated trees / subplot area. Score between 0 and 1.
  • Query 5: Surface of copses area / subplot area. Score between 0 and 1.
  • Query 6: Landscape Elements Value = Sum of query 4 + query 5. Score between 0 and 2.
  • Query 7: Ecological Value = Sum of query 3 + query 6.
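The scoring behind these queries can be sketched as follows. This is an illustrative calculation only: it assumes that the Landscape Elements Value (Query 6) adds the two landscape-element ratios from Queries 4 and 5, and that each ratio is clamped to the stated 0 to 1 range. The function and parameter names are hypothetical.

```python
def clamp01(value):
    """Limit a ratio to the 0..1 score range given for the queries."""
    return max(0.0, min(1.0, value))


def ecological_value(subplot_area, lic_area, zepa_area, isolated_trees, copses_area):
    """Compute the seven query scores for one subplot."""
    if subplot_area <= 0:
        raise ValueError("subplot_area must be positive")
    q1 = clamp01(lic_area / subplot_area)        # Query 1: LIC ratio
    q2 = clamp01(zepa_area / subplot_area)       # Query 2: ZEPA ratio
    q3 = q1 + q2                                 # Query 3: protected sites value
    q4 = clamp01(isolated_trees / subplot_area)  # Query 4: isolated trees ratio
    q5 = clamp01(copses_area / subplot_area)     # Query 5: copses ratio
    q6 = q4 + q5                                 # Query 6: ASSUMED to be q4 + q5
    return {
        "protected_sites_value": q3,
        "landscape_elements_value": q6,
        "ecological_value": q3 + q6,             # Query 7
    }
```

For example, a 1000 m² subplot with 500 m² inside a LIC, 250 m² inside a ZEPA, 10 isolated trees and 100 m² of copses would have a protected sites value of 0.75 and a landscape elements value of about 0.11.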

Sentinel Products generation

In the first place, Sentinel 2 (S2) imagery has to be downloaded from the ESA server. In the automatic download process developed, selection parameters were incorporated in order to download only the imagery that satisfies our quality criteria. Two kinds of products are generated from S2 imagery.

  • Simple products: Those which have been generated with one-date imagery. By an automatic process, TRAGSA is generating RGB products for supporting photo interpretation. Another simple product generated is the Normalized Difference Vegetation Index (NDVI) which is widely used for vegetation monitoring.
  • Complex products: Those which are generated with imagery from different dates. The following four thematic layers are going to be created.
    • Permanent grassland: This layer will be useful to distinguish photosynthetically active vegetation from non-active (unproductive or bare soil) areas. It will therefore help to monitor the maintenance of existing permanent grassland, which is an agricultural practice beneficial for the climate and the environment (REGULATION (EU) No 1307/2013).
    • Herbaceous and woody crops: By using decision algorithms, different crops can be identified. The results will be displayed in two different layers, one for herbaceous crops and another for woody crops.
    • Change detection layer: This layer will highlight areas where changes have happened. The layer will be focused on forests and grassland areas in order to detect dramatic changes, such as those caused by logging or forest fires, as well as to detect more subtle changes associated with AIS (Alien Invasive Species), diseases and reforestation.
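To give an idea of how a multi-date product can be derived, the sketch below flags pixels whose NDVI differs strongly between two acquisition dates. This simple difference rule, the threshold of 0.3 and the function name are assumptions for illustration; the post does not specify the method used for the change detection layer.

```python
def detect_changes(ndvi_before, ndvi_after, threshold=0.3):
    """Flag pixels whose NDVI changed by at least `threshold` between dates.

    Both inputs are lists of rows with identical shape (assumption).
    The threshold value is hypothetical.
    """
    return [
        [abs(after - before) >= threshold for before, after in zip(b_row, a_row)]
        for b_row, a_row in zip(ndvi_before, ndvi_after)
    ]
```

A large NDVI drop would be typical of logging or fire, while slower trends over several dates would be needed to capture the subtler changes mentioned above.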

So far, only one of the twin S2 satellites (Sentinel 2A) has been launched. When the second satellite (Sentinel 2B) is in orbit, the revisit time at the equator will be 5 days, which results in 2-3 days at mid latitudes. This high revisit frequency will allow quicker updating of the SigPAC database in comparison with current updates, which are based on low precision data (LANDSAT and SPOT5 satellites) or orthophoto flights generated by each Autonomous Community.

Final Result

As stated previously, Common Agriculture Policy funds Assignments Service (CAPAS) is a set of tools that improves the existing Common Agriculture Policy service (CAP), in order to innovatively manage and upgrade the CAP database provided by Spanish Administration to farmers and land owners. It is important to note that this CAP database is one of the main pillars of the CAP funds calculation systems. As mentioned earlier, the improvement process is based on the leverage of new cross-sectorial data sources from different fields and geographical areas, and the result datasets will be also available at the proDataMarket marketplace.

To use these new datasets as a powerful tool to define objectively the parameters of agricultural use of parcels, presence of landscape elements or temporal evolution of crops, the explained data processing and treatment algorithms have been, at the moment, partially developed.

As a summary, the usage of LIDAR files modifies some Parcel and Subplots features, and SENTINEL images will improve the definition of Parcel and Subplots land use and its temporal evolution.

The new datasets produced by CAPAS using those external sources will be RDFized and incorporated into the proDataMarket platform. Therefore, Spanish rural property data, improved using new and underexploited datasets, will be accessible through the proDataMarket platform, providing users with advanced visualization and querying features.

[1] JPEG 2000 (JP2) is an image compression standard and coding system. It was created by the Joint Photographic Experts Group committee in 2000

proDataMarket at the European Data Forum 2016

On 29 and 30 June proDataMarket participated in the European Data Forum (EDF) 2016, organized by Amsterdam Data Science and Technical University of Eindhoven under the auspices of the Dutch presidency of the European Union.

Evoluon

The conference, held in the Evoluon, a conference center and former museum of science and technology (Eindhoven, NL), was attended by Commissioner Günther Oettinger, the Rector of the University of Tilburg, and the CEOs of Philips, Siemens and TomTom. The event brought together more than 600 attendees from across Europe and multiple technology sectors.

General View

proDataMarket presented a descriptive poster of the project, explaining its development and the conclusions reached so far in the different business cases and in the central data-marketplace infrastructure. The poster also showed how proDataMarket aims to disrupt the Property Data (PD) market and demonstrate innovation across sectors where Property Data is relevant, by integrating a technical framework for effective data publishing and consumption and by showcasing data-driven business products.

poster

Besides the main event, the IQmulus project organized a workshop addressing Geospatial, Mathematical and Linked Big data. This event addressed aspects of big data where geolocation, geospatial or mathematical structures have a central role. In this side-event, the project coordinator, Dr. Dumitru Roman, also explained the whole project and its Business Cases.

Satellite images applied to property data

The Sentinels are a fleet of satellites designed specifically to deliver the wealth of data and imagery that are central to the European Commission’s Copernicus programme. This unique environmental monitoring programme is making a step change in the way we manage our environment, understand and tackle the effects of climate change and safeguard everyday lives. Sentinel-2 carries an innovative wide swath high-resolution multispectral imager with 13 spectral bands for a new perspective of our land and vegetation. The combination of high resolution, novel spectral capabilities, a swath width of 290 km and frequent revisit times is generating unprecedented views of Earth. Sentinel-2 is providing information for agricultural and forestry practices and for helping manage food security. Satellite images will be used to determine various crop and plant indexes. Some examples of these indexes are:

  • Normalised Difference Vegetation Index (NDVI)
  • Normalised Difference Snow and Ice Index (NDSI)
  • Enhanced vegetation index (EVI)

This is particularly important for effective crops production prediction and applications related to Earth’s vegetation.
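For readers unfamiliar with these indexes, the sketch below shows their usual textbook definitions applied to single reflectance values. The formulas are the commonly published ones (NDVI and NDSI as normalized differences, EVI with the standard coefficients 2.5, 6, 7.5 and 1); the function names are hypothetical, and which Sentinel-2 bands feed each input is left to the user.

```python
def normalized_difference(a, b):
    """Generic normalized difference (a - b) / (a + b)."""
    total = a + b
    if total == 0:
        return 0.0  # avoid division by zero on empty pixels
    return (a - b) / total


def ndvi(nir, red):
    """Normalised Difference Vegetation Index."""
    return normalized_difference(nir, red)


def ndsi(green, swir):
    """Normalised Difference Snow and Ice Index."""
    return normalized_difference(green, swir)


def evi(nir, red, blue):
    """Enhanced Vegetation Index with the standard coefficients."""
    denominator = nir + 6.0 * red - 7.5 * blue + 1.0
    if denominator == 0:
        return 0.0
    return 2.5 * (nir - red) / denominator
```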

Sentinel use example

Sentinel-2 is the first optical Earth observation mission of its kind to include three bands in the ‘red edge’, which provide key information on the state of vegetation. In the previous image from 6 July 2015 acquired near Toulouse, France, the satellite’s multispectral instrument was able to discriminate between two types of crops: sunflower (in orange) and maize (in yellow).

These new and advanced datasets will be used in the CAPAS Business Case to improve and enrich the information already obtained using LIDAR datasets (What is LIDAR?). Using LIDAR, it is possible to obtain accurate surface maps; however, the data are not updated very frequently. The Sentinel constellation, on the other hand, has a very high revisit frequency (five days) and offers information about the kinds of crops and their evolution. In conclusion, using and merging these different datasets answers several questions regarding CAP parameters:

  • Is a specific parcel cultivated?
  • What kind of crop is growing in a plot?
  • Has the number of trees of a copse changed? When?
  • What is the ratio between Ecological Focus Areas (EFAs) and productive areas in a given place?

Processing this kind of information can be very complex and laborious. It depends on the selected indexes, the chosen bands and the geographical area. Furthermore, the processing is complicated by the high volumes of data. However, the final results will offer a very detailed and accurate overview of land cover changes, environmental monitoring, crop monitoring, food security and detailed vegetation and forest monitoring parameters such as leaf area index, chlorophyll concentration or carbon mass estimations. All this information is directly related to Common Agricultural Policy principles and the new European “Greening” policies.

Note: Some details about the characteristics and features of these instruments are available here.

TRAGSA Group in proDataMarket project

The Tragsa Group forms part of the group of companies of the State-owned holding company Sociedad Estatal de Participaciones Industriales (SEPI).

It comprises Empresa de Transformación Agraria, S.A. (Tragsa), the parent company founded in 1977 for the performance of rural development works and services, environmental conservation and emergency relief operations; its first subsidiary, Tecnologías y Servicios Agrarios, S.A (Tragsatec), established in 1990 for carrying out consulting and engineering projects; and Colonización y Transformación Agraria, S.A (CYTASA), incorporated in Paraguay in November 1978. More recently, in 2013, the company Tragsa Brasil Desarrollo de Proyectos Agrarios, LTDA was created.

Its 37 years of experience working for the public authorities in the service of society have placed this business group at the forefront of the different sectors in which it operates, from the provision of agricultural, forestry, livestock, and rural development services, to the conservation and protection of the environment.

The company’s broad national reach, with branches in all the provinces of Spain’s 17 Autonomous Communities, allows it to respond independently, quickly and effectively to any urgent requirement of the central, regional or local government.

What we do

The Tragsa Group provides comprehensive solutions to the needs of public administrations as regards environmental issues, rural development and management of natural resources, with proven responsiveness.

Thanks to its extensive national and international experience, it can meet the requests of its clients, providing those unique factors that identify the Group with quality, such as its large, highly qualified staff, its steadfast commitment to innovation in R&D&I, its commitment to the environment and a lasting respect toward society.

Its activity in the domestic market, which accounts for 96% of its volume of business, has been mainly focused in recent years on environmental activities (30%), rural infrastructure (17%), information technologies (15%) and irrigation, water management and technology and agro-processing facilities and rural equipment (13%).

The Tragsa Group’s priority is respect for and commitment to the environment, while minimizing the environmental impact of our activities, establishing alternative measures that are respectful of the environment, and contributing by the very nature of our activity to the preservation and conservation of biodiversity.

Role in proDataMarket project

Due to the nature of TRAGSA as a state-owned company, one of the company’s main commitments is not only keeping the information, but also returning it to its final owner (the Public Administration) enlarged and improved. Furthermore, TRAGSA services must offer rural society growth opportunities and development choices. Therefore, proDataMarket, aligned with our strategic goals, also offers TRAGSA an opportunity to extend the company’s knowledge of Big Data and its reuse. TRAGSA participates as a proDataMarket data supplier, providing the project with several data resources from Spain. The data sources are varied and relate to several fields, such as cadastre, agricultural parcels, land cover, land use and environmental information, among others.

TRAGSA, as business case provider in the project, will develop the CAPAS service, whose focus is to publish and integrate multi-sectorial data from several sources into an existing data-intensive service targeting better and fairer Common Agriculture Policy (CAP) funds assignments to farmers and land owners. The goal is to leverage the data integration facilities offered by proDataMarket to better define the funds assignments.