Home › 
Blog › 
The best (Python) tools for remote sensing

The best (Python) tools for remote sensing

Raster Data
Satellite Data
Remote Sensing
Vector Data
Computer Vision
Sentinel 2

Written by Emilius Richter

Published on November 3rd, 2021

Introduction

An estimated 906 of Earth observation satellites are currently in orbit, providing science and industry many terabytes of data every day. The satellites operate with both radar as well as optical sensors and cover different spectral ranges with varying spectral, spatial and temporal resolution. Due to this broad spectrum of geospatial data, it is possible to find new applications for remote sensing methods in many industrial and governmental institutions. On our website, you can find some projects in which we have successfully used satellite data, and possible use cases of remote sensing methods for various industries.

Well-known satellite systems and programs include Sentinel-1 (radar) and Sentinel-2 (optical) from ESA, Landsat (optical) from NASA, TerraSAR-X and TanDEM-X (both radar) from DLR, and PlanetScope (optical) from Planet.

There are basically two types of geospatial data: raster data and vector data.

Raster data

Raster data are a grid of regularly spaced pixels, where each pixel is associated with a geographic location, and are represented as a matrix. The pixel values depend on the type of information that is stored, e.g., brightness values for digital images or temperature values for thermal images. The size of the pixels also determines the spatial resolution of the raster. Geospatial raster data are thus used to represent satellite imagery. Raster images usually contain several bands or channels, e.g. a red, green and blue channel. In satellite data, there are also often infrared and/or ultraviolet bands.

Vector data

Vector data represent geographic features on the earth's surface, such as cities, country borders, roads, bodies of water, property rights, etc.. Such features are represented by one or more connected vertices, where a vertex defines a position in space by x-, y- and z-values. A single vertex is a point, multiple connected vertices are a line, and multiple (>3) connected and closed vertices are called polygon. The x-, y- and z-values are always related to the corresponding coordinate reference system (CRS) that is stored in vector files as meta information. The most common file formats for vector data are GeoJSON, KML and SHAPEFILE.

In order to process and analyze these data, various tools are required. In the following, I would like to present the tools with which we at dida have had the best experience and which are regularly used in our remote sensing projects. I present the tools one by one, grouped into the following sections:

  • Requesting satellite data

  • Processing raster data

  • Processing vector data

  • Providing geospatial data

  • Processing meteorological satellite data

Requesting satellite data

EOBrowser

EOBrowser is a web application from sentinelhub, which can be used to display various satellite data, such as Sentinel-1, Sentinel-2, Landsat, MODIS etc.. The desired region for which the data should be displayed can be selected via the search bar, by uploading KML or GeoJSON files or by drawing polygons. The "Advanced search" button can be used to adjust certain parameters that refine the search. The search provides a link to download the corresponding raw data as well as different visualizations.

For Sentinel-2 data, for example, the RGB image can be displayed, but also the visualized Normalized Difference Water Index (NDWI) or Normalized Difference Vegetation Index (NDWI).

Sentinelsat

Sentinelsat is a Python package that can be used to request sentinel data from the Copernicus Open Access Hub. The implementation is very simple and the request is based on parameters like region (e.g. in form of a GeoJSON file), date, platform (e.g. Sentinel-2), product type (e.g. L1C), maximum cloud cover, etc.

However, sentinelsat does not provide tools to process the downloaded raw data and convert it into suitable formats. This can be implemented relatively easily with the help of rasterio.

Sentinelhub

The Python package sentinelhub is the Python interface to use the sentinelhub services. The methods and functions allow to download and process the different supported satellite data (same as EOBrowser). For example, there are methods to display images with a cloud mask or mosaics of the least cloudy images. The many functions are described in detail in both the Python and the API documentation in an understandable way. Sentinelhub also offers to provide satellite data and metainformation via the OGC services WMS, WMTS, WCS and WFS.

Unfortunately, many features of sentinelhub are not available for free. Depending on the plan some features cannot be used and there are limits on data requests.

Processing raster data

Rasterio

Rasterio is a Python package that can be used to process geospatial raster data, such as GeoTIFFs. For this purpose many modules and functions are available, for example to process raw data from satellites, to read raster data, to retrieve their geographic metadata, to transform their coordinates, to crop the images, to merge multiple images and to save the data in other formats. This large number of functions and the ease of implementation make Rasterio a standard tool in the analysis of satellite data.

The code example below shows how to use Rasterio to read a Sentinel-2 scene and display and save it as an RGB image:

import numpy as np
import rasterio
from rasterio import plot

file = 'S2BMSIL2A20210903T144719N0301R139T19LDG20210903T184347.tif'
# opening the GeoTIFF file and reading the R, G and B bands
with rasterio.open(file) as src:
	b, g, r = src.read(1), src.read(2), src.read(3)
	rgb = np.array([r, g, b])
	meta = src.meta.copy()

# updating the metadata to 3 bands
meta.update({'count': 3})

# saving the data as new GeoTIFF
with rasterio.open(file[:-4] + '_rgb.tif', 'w', **meta) as dest:
	dest.write(rgb)

# adjusting the RGB image for creating a nice plot
rgb[rgb > 3000] = 3000
img = ((rgb / rgb.max()) * 255).astype(np.uint8)
plot.show(img)

The output should look like this:

Pyproj

The Python library pyproj is a Python interface for PROJ, a software library to transform geographic coordinates from one coordinate reference system (CRS) to another. Both geodetic transformations and cartographic projections are supported. Pyproj is a very simple tool and essential in remote sensing projects to unify coordinate reference systems that may arise from different data sources used.

The following example shows how to perform a simple coordinate transformation of the boundaries of the Sentinel-2 image above:

import rasterio
import pyproj

file = 'S2B_MSIL2A_20210903T144719_N0301_R139_T19LDG_20210903T184347.tif'
# reading the GeoTIFF file and returning the bounds and the coordinate reference
# system (CRS) which is EPS:32719 for Sentinel-2
with rasterio.open(file) as src:
    bounds = src.bounds
    crs_from = src.crs

# defining the CRS to be transformed into
epsg_id = 4326
crs_to = pyproj.CRS.from_epsg(epsg_id)

# defining the transformer which transforms from one CRS to another
proj = pyproj.Transformer.from_crs(crs_from=crs_from, crs_to=crs_to)
bounds_4326 = proj.transform_bounds(*bounds)

print('bounds in EPSG 32719:', bounds)
print('bounds in EPSG 4326:', bounds_4326)

The output would be:

bounds in EPSG 32719: BoundingBox(left=399960.0, bottom=8590240.0, right=509760.0, top=8700040.0)
bounds in EPSG 4326: (-12.752446747147742, -69.92157939546101, -11.7580241630838, -68.91008576435853)

pyroSAR

With the Python library pyroSAR it is possible to process Synthetic Aperture Radar (SAR) satellite data, such as Sentinel-1, TerraSAR-X, TanDEM-X or ALOS-2/PALSAR-2. Methods that access SNAP and GAMMA Remote Sensing tools and functions developed specifically for Sentinel-1 are available for this purpose.

Processing vector data

Shapely

Shapely is a Python library for analyzing and manipulating planar geospatial features. The different geographic objects are represented by individual classes. A point is defined by the Point class, a curve by the LineString and LinearRing classes, and a surface by the Polygon class. A collection of objects are implemented by the MultiPoint, MultiLineString and MultiPolygon classes. These classes are the standard for processing and manipulating vector data and are also used by other software libraries and Python packages.

Shapely has a variety of built-in methods that determine both the geometric properties of an object such as area, boundary, and length, but also the relationships to other objects such as whether one object contains another, whether one object intersects with another, whether one object touches another, and much more. Shapely itself does not have functions to read vector data, for which other libraries such as python-geojson, fiona or geopandas (see below) are needed. However, the transformation of the geospatial features into shapely objects is simple.

In the code example below, a few of the standard methods presented above are demonstrated using features from a GeoJSON file:

import shapely.geometry
import geojson

file = 'natural_areas.geojson'
# reading GeoJSON file with geojson package (see below)
with open('natural_areas.geojson') as f:
    natural_areas = geojson.load(f)

natural_areas_shapely = []
# converting every polygon into a shapely polygon object
for polygon in natural_areas['features']:
    p = shapely.geometry.asShape(polygon['geometry'])
    # since the GeoJSON were downloaded from another source,
    # errors in the polygon can occur (e.g. it crosses itself)
    if p.is_valid:
      natural_areas_shapely.append(p)
    else:
      natural_areas_shapely.append(p.buffer(0))

print('geom_type:', natural_areas_shapely[1].geom_type)
print('area in km^2:', natural_areas_shapely[1].area)
print('bounds:', natural_areas_shapely[1].bounds)
print('object contains point:', natural_areas_shapely[1].contains(shapely.geometry.Point(-76.86, -6.81)))
print('object contains another object:', natural_areas_shapely[1].intersects(natural_areas_shapely[2]))

The output would be:

geom_type: Polygon
area in km^2: 0.010795427253500077
bounds: (-76.952331, -6.932597, -76.794716, -6.783479)
object contains point: True
object contains another object: False

Python-geojson

GeoJSON is a standard format to store geospatial features like points, lines or polygons and their coordinates. The Python package of the same name provides functions to read this data, process it and save it as GeoJSON. Because of its ease of use, geojson is a very useful tool when working with vector data that is available as GeoJSON.

The code example above demonstrates how to read a GeoJSON file with the geojson package.

Geojson.io

Geojson.io is a geobrowser that allows you to display geospatial features on a map. You can choose between maps from OpenStreetMap, Mapbox or satellite images from Maxar. The satellite images come with a high spatial resolution, but are often not quite up to date. Geojson.io offers also the possibility to upload and display GeoJSON files and to draw points, lines and polygons. The features can be saved as GeoJSON, Shapefile, KML, WKT and other formats. This makes geojson.io a very useful tool to create and visualize geographic vector data without any programming knowledge.


Geopandas

Geopandas is an extension of pandas, a standard library for data analysis and manipulation. The data types and methods known from pandas are extended by geographic and geometric objects and methods based on shapely. Geopandas also has functions to read and store vector data. These functions depend on fiona (see below).

In geopandas, shapely objects can be grouped in a GeoSeries, to which one can apply the various geometric methods but also classical operations for data analysis and manipulation. In addition, geopandas has practical tools for geocoding.

Fiona

With fiona, all raster and vector file formats supported by GDAL can be read and data can also be saved in this format. Beyond that fiona has a few basic functions for geometric analysis, e.g. the determination of boundaries of an object, and for coordinate transformations, which depend on shapely and pyproj.

Providing geospatial data

QGIS

QGIS is a free geographic information system for the desktop. It is a software that allows to display, edit, create, analyze and publish geographic information (raster and vector data).

Since QGIS is equipped with many functions and features, the full potential of the software is revealed mainly to people who work with it a lot. But the display of satellite data is easy to implement and helps a lot in the exploration of geospatial data.

GeoServer

GeoServer is a free server that can be used to publish geographic data. It uses the Open Geospatial Consortium standard protocols Web Map Service (WMS), Web Feature Service (WFS), Web Coverage Service (WCS), Web Processing Service (WPC) and Web Map Tile Service (WMTS).

Geospatial raster and vector data can be published using the above protocols and subsequently used in Web GIS applications (e.g. via Leaflet, Mapbox or OpenLayers) or in GIS software (e.g. QGIS or ArcGIS). This gives you the possibility to provide your geographic data interactively without long loading times and large data transfers.

Processing meteorological satellite data

Wetterdienst

Wetterdienst is a Python package that can be used to retrieve meteorological data from the German Meteorological Service (DWD). These include historical weather data, forecasts based on weather models and radar data (RADOLAN and RADVOR among others). The temporal resolution of the simple radar data (such as temperature, atmospheric precipitation, etc.) ranges from monthly to annual. RADOLAN data are provided with minute to daily temporal resolution.

Unfortunately, there are currently no built-in functions to create time series of RADOLAN data, which are commonly used for the analysis of weather data.

Wradlib

Wradlib is a Python library that can be used to process various weather radar data, including RADOLAN, DX and BUFR. Especially helpful is that the weather data can be converted to suitable raster formats, e.g. GeoTIFF.

Conclusion

Due to the large number of earth observation satellites, remote sensing methods find new application areas in many fields and provide valuable information about the earth's surface and its composition. dida has successfully demonstrated in several projects how remote sensing methods in combination with machine learning can provide new insights that are difficult to achieve by classical non-machine learning manual methods. However, to be able to process and analyze different satellite data, different tools are needed. Some of them, which we use regularly at dida, have been explained and demonstrated in this article. I hope this has provided a technical foundation to get started with your next remote sensing project.

About the author

Emilius Richter

During his studies of physics (FU Berlin) Emil developed his passion for machine learning, biophysics, and medical engineering. He worked as a software developer for an MRI research group, where he could apply his skills in deep learning to medical breathing scans. At dida he supports the Sales team in technical questions and in the acquisition of new ML projects.

Get quarterly AI news

Receive news about Machine Learning and news around dida.

Successfully signed up.

Valid email address required.

Email already signed up.

Something went wrong. Please try again.

By clicking "Sign up" you agree to our privacy policy.

dida Logo