Computer Vision Case Study

Convective clouds detection

Introduction

Our client is Deutscher Wetterdienst (DWD). Part of their responsibility is to prepare weather reports for pilots in a format called METAR. We were tasked with creating a model that detects convective clouds in satellite imagery, complementing other forms of detection.

Starting Point

Cumulonimbus (CB) clouds cause significant downdrafts, which are dangerous for aircraft taking off and landing. It is therefore important for our client, Deutscher Wetterdienst (DWD), to reliably detect them, as well as Towering Cumulus (TCU) clouds, which can evolve into Cumulonimbus clouds.

The data we use for this project is from a geostationary satellite called Meteosat Second Generation. This gives us an image of Germany from a fixed viewpoint every 5 or 15 minutes. Our goal will be to classify each pixel in the image into CB, TCU or neither.

Pilots receive a weather report called a METAR describing conditions at the destination airport. Currently this is produced manually from observations by trained meteorologists on the ground, but the DWD is working to automate as much as possible to free up the meteorologists’ time for other tasks. We focus on the section of this report that concerns convective clouds.

For an automated system to be reliable and trusted, it must have multiple layers of redundancy, so its models must be based on differing data sources. Algorithms using radar and lightning data already exist, and our goal here was to provide an independent model, based on Meteosat Second Generation (MSG) data, to support these. The machine learning component of the system is therefore a semantic segmentation of the satellite data into three classes: CB, TCU or neither.
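To make the three-class segmentation concrete, here is a minimal sketch of how per-pixel class scores can be turned into a label map. The class index order, the function name and the toy scores are assumptions for illustration; the source does not describe how the model output is decoded.

```python
import numpy as np

# Assumed index order for the three classes (not stated in the source).
CLASS_NAMES = ("neither", "TCU", "CB")


def scores_to_label_map(scores):
    """Convert per-class scores of shape (num_classes, H, W) into an
    (H, W) map holding the index of the highest-scoring class per pixel."""
    return np.argmax(scores, axis=0)


# Toy 2x2 image with one score plane per class.
scores = np.array([
    [[0.8, 0.1], [0.2, 0.6]],  # neither
    [[0.1, 0.7], [0.3, 0.3]],  # TCU
    [[0.1, 0.2], [0.5, 0.1]],  # CB
])
label_map = scores_to_label_map(scores)
print(label_map)
```

Each pixel is assigned exactly one of the three classes, which is what distinguishes semantic segmentation from classifying the image as a whole.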

Challenges

The most important component of any machine learning solution is a large, high-quality training dataset, and this project was no exception. The task is extremely difficult even for humans, so labelling presents a challenge in itself.

Additionally, the input data is in an unusual format that requires extensive preprocessing using domain-specific knowledge before machine learning can be applied.

Labelling here was a significant challenge. The signs of convective clouds are extremely subtle and it is difficult for even a trained human to recognise them in the data.

There were also a number of technical challenges regarding the input data. The satellite orbits above the equator and images the entire Earth disk, so Germany is viewed obliquely, at an angle of ~20° above normal. As a result, the image must be geometrically transformed to give the correct perspective.

The data also comprises 12 channels: 3 visual channels, which are all differing wavelengths of red rather than RGB, and 9 infrared channels. One of the visual channels, the High Resolution Visual (HRV) channel, has three times the resolution of the others. This presents a choice: downsample the HRV channel or upsample the others. We chose to upsample the other channels so that the fine structure in the HRV channel is preserved.
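The upsampling choice can be sketched as follows. This is a minimal illustration that assumes nearest-neighbour interpolation and a scale factor of exactly 3; the source does not state which interpolation method was used, and the function names and array sizes are hypothetical.

```python
import numpy as np

HRV_SCALE = 3  # the HRV channel has three times the resolution of the others


def upsample_nearest(channel, factor=HRV_SCALE):
    """Repeat every pixel `factor` times along both image axes."""
    return np.repeat(np.repeat(channel, factor, axis=0), factor, axis=1)


def stack_channels(hrv, low_res_channels):
    """Upsample the low-resolution channels to the HRV grid and stack
    everything into a single (channels, H, W) array."""
    upsampled = [upsample_nearest(c) for c in low_res_channels]
    for c in upsampled:
        if c.shape != hrv.shape:
            raise ValueError(f"shape mismatch: {c.shape} vs {hrv.shape}")
    return np.stack([hrv, *upsampled], axis=0)


# Dummy data: 11 low-resolution channels plus one HRV channel.
low_res = [np.full((4, 5), i, dtype=np.float32) for i in range(11)]
hrv = np.zeros((12, 15), dtype=np.float32)
stacked = stack_channels(hrv, low_res)
print(stacked.shape)
```

A smoother interpolation, for example bilinear resizing with OpenCV's `cv2.resize`, would be a drop-in alternative for `upsample_nearest`.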

Solution

To address the labelling challenge, we manually labelled a dataset, using external radar data to help inform the decision-making process. Our labels were then reviewed by a trained meteorologist to ensure that they were correct.

A neural network model was trained using the satellite data as the input and our manual labels as the targets. Its performance was measured on a held-out dataset that was not used at all during training. On this dataset we obtained 98% accuracy.
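As an illustration of how such a figure can be computed, the sketch below measures pixel accuracy on a toy prediction and adds per-class intersection over union (IoU). The IoU metric is an addition for illustration only: the source reports accuracy and does not say which other metrics, if any, were used. If most pixels belong to the "neither" class, per-class scores are a useful companion to overall accuracy.

```python
import numpy as np


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted class matches the label."""
    return float(np.mean(pred == target))


def per_class_iou(pred, target, num_classes=3):
    """Intersection over union for each class; NaN if a class is absent
    from both the prediction and the label."""
    ious = []
    for c in range(num_classes):
        p, t = pred == c, target == c
        union = np.logical_or(p, t).sum()
        inter = np.logical_and(p, t).sum()
        ious.append(float(inter / union) if union else float("nan"))
    return ious


# Toy label maps, assuming 0 = neither, 1 = TCU, 2 = CB.
target = np.array([[0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 2, 2], [0, 0, 2, 2]])
pred = np.array([[0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 2, 2], [0, 0, 1, 2]])

accuracy = pixel_accuracy(pred, target)
ious = per_class_iou(pred, target)
print(accuracy, ious)
```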

Technologies used: Python, PyTorch, SatPy, OpenCV, Numpy

The labelling process was conducted using a composite image of the HRV channel together with one of the infrared channels, which made the relevant clouds stand out more. However, even this is not sufficient to make a judgement in all cases, so external data in the form of radar and ground observations was also consulted in order to label correctly. Additionally, a trained meteorologist reviewed all labels to ensure that they were correct.
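A composite of this kind might be built roughly as below. This is a sketch under assumptions: the channel-to-colour mapping (HRV on red and green, inverted infrared on blue), the value ranges and the function names are all hypothetical, since the source does not say which infrared channel or scaling was used.

```python
import numpy as np


def normalise(channel, lo, hi):
    """Linearly rescale values from [lo, hi] to [0, 1], clipping outliers."""
    return np.clip((channel - lo) / (hi - lo), 0.0, 1.0)


def hrv_ir_composite(hrv_reflectance, ir_brightness_temp,
                     ir_range=(203.0, 323.0)):
    """Build an (H, W, 3) false-colour image.

    Assumed inputs: HRV reflectance in percent and infrared brightness
    temperature in kelvin, both already on the same pixel grid.
    """
    hrv = normalise(hrv_reflectance, 0.0, 100.0)
    # Invert the infrared channel so that cold (high) cloud tops are bright.
    ir = 1.0 - normalise(ir_brightness_temp, *ir_range)
    return np.stack([hrv, hrv, ir], axis=-1)


hrv = np.array([[0.0, 50.0], [100.0, 120.0]])
ir = np.array([[323.0, 263.0], [203.0, 190.0]])
composite = hrv_ir_composite(hrv, ir)
print(composite.shape)
```

In such an image, bright and cold pixels appear white, which helps thick, high cloud tops stand out from low cloud and the ground.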

The model itself was a U-Net implemented in the PyTorch framework and trained using the Adam optimiser with a one-cycle learning rate schedule.
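The training setup can be sketched in PyTorch as follows. The network below is a deliberately tiny, single-level U-Net, and its width, the learning rates, the batch shape and the random data are placeholder assumptions; the real architecture and hyperparameters are not given in the source.

```python
import torch
from torch import nn


def conv_block(in_ch, out_ch):
    """Two 3x3 convolutions with ReLU, the basic U-Net building block."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
    )


class TinyUNet(nn.Module):
    """One downsampling step, one upsampling step and a skip connection."""

    def __init__(self, in_channels=12, num_classes=3, width=8):
        super().__init__()
        self.enc = conv_block(in_channels, width)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = conv_block(width, 2 * width)
        self.up = nn.ConvTranspose2d(2 * width, width, kernel_size=2, stride=2)
        self.dec = conv_block(2 * width, width)
        self.head = nn.Conv2d(width, num_classes, kernel_size=1)

    def forward(self, x):
        skip = self.enc(x)
        x = self.bottleneck(self.pool(skip))
        x = self.up(x)
        # Concatenate encoder features with the upsampled decoder features.
        x = self.dec(torch.cat([skip, x], dim=1))
        return self.head(x)


def train_demo(steps=10):
    """Run a few optimisation steps on random data; returns the logits
    shape and the learning rate used at each step."""
    torch.manual_seed(0)
    model = TinyUNet()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    scheduler = torch.optim.lr_scheduler.OneCycleLR(
        optimizer, max_lr=1e-2, total_steps=steps
    )
    loss_fn = nn.CrossEntropyLoss()
    images = torch.randn(2, 12, 32, 32)        # stand-in for satellite tiles
    labels = torch.randint(0, 3, (2, 32, 32))  # stand-in for manual labels
    lrs = []
    for _ in range(steps):
        optimizer.zero_grad()
        logits = model(images)
        loss = loss_fn(logits, labels)
        loss.backward()
        optimizer.step()
        lrs.append(scheduler.get_last_lr()[0])
        scheduler.step()  # one-cycle schedules are stepped once per batch
    return logits.shape, lrs
```

Calling `train_demo()` returns the output shape `(2, 3, 32, 32)`, one score plane per class, together with a learning rate curve that rises to `max_lr` and then anneals towards zero.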

Extensive data augmentation was also utilised to get the most from our dataset.
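The source does not list which augmentations were used. As a minimal sketch, assuming random flips and 90° rotations, the key point for segmentation is that the image and its label mask must receive exactly the same geometric transform:

```python
import numpy as np


def augment(image, mask, rng):
    """Apply the same random flip and rotation to an image of shape
    (C, H, W) and its label mask of shape (H, W).

    Assumes square tiles, so that 90 degree rotations keep the shape.
    """
    if rng.random() < 0.5:
        # Horizontal flip along the width axis.
        image = image[:, :, ::-1]
        mask = mask[:, ::-1]
    k = int(rng.integers(0, 4))  # number of 90 degree rotations
    image = np.rot90(image, k, axes=(1, 2))
    mask = np.rot90(mask, k, axes=(0, 1))
    return np.ascontiguousarray(image), np.ascontiguousarray(mask)


rng = np.random.default_rng(0)
mask = np.arange(16).reshape(4, 4) % 3
# Channel 0 copies the mask so that alignment is easy to check afterwards.
image = np.stack([mask.astype(np.float32), np.ones((4, 4), np.float32)])
aug_image, aug_mask = augment(image, mask, rng)
print(np.array_equal(aug_image[0], aug_mask))
```

Whether flips and rotations are appropriate for geographically oriented imagery is a judgement call; they are used here only because they are simple to demonstrate.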

Product

Here are some example outputs from the test set, with CB clouds labelled in blue and TCU in green. Both examples show an area of Central Europe focused on Germany.

Example 1: 15:00 28/06/2012. The algorithm has picked out the large Cumulonimbus clouds in the centre and south while avoiding mislabelling the large non-convective region in the north-east. It has also separated the Towering Cumulus clouds from the similarly coloured Alps nearby.

Example 2: 29/06/2012. Here the algorithm has correctly identified the convective clouds within the frontal system while avoiding mislabelling the entire region (a common failure case of pixel-wise algorithms).
