In industrial production, an efficient quality control system that detects and assesses damage during the production process is essential to avoid defective outputs and production line outages. Since damage assessment is tiring and time-consuming work for humans, automating this control system can support a more efficient production process.
Inspecting production components requires processing image and video data. This can be done with a machine learning / Computer Vision model that detects and identifies damage in real time by recognizing patterns and objects in the imagery.
However, training such a model requires a large amount of training data covering all possible types of damage and defects, and such data may not exist. Moreover, the training data should ideally follow a distribution similar to the overall distribution of the parts seen in production.
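Both requirements can be checked roughly before any model is trained. The following is a minimal sketch, not a prescribed procedure: the defect class names, the `min_count` cut-off, and the use of total variation distance as the similarity measure are all assumptions chosen for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List


def class_shares(labels: Iterable[str]) -> Dict[str, float]:
    """Relative frequency of each defect class in a list of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}


def underrepresented(labels: Iterable[str],
                     expected_classes: Iterable[str],
                     min_count: int = 50) -> List[str]:
    """Expected defect classes with fewer than `min_count` training examples."""
    counts = Counter(labels)
    return sorted(c for c in expected_classes if counts[c] < min_count)


def distribution_gap(train: Iterable[str], production: Iterable[str]) -> float:
    """Total variation distance between two label distributions.

    0.0 means identical class shares, 1.0 means no overlap at all.
    """
    p, q = class_shares(train), class_shares(production)
    classes = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in classes)


# Hypothetical label lists; real labels would come from the annotation tool.
train_labels = ["scratch"] * 120 + ["dent"] * 80 + ["crack"] * 5
expected = ["scratch", "dent", "crack", "corrosion"]

gaps = underrepresented(train_labels, expected)
gap = distribution_gap(["scratch", "scratch", "dent", "dent"],
                       ["scratch", "dent", "dent", "dent"])
```

Here `gaps` lists the classes that still need more labeled examples, and `gap` gives a single number for how far the training labels are from a sample of production labels. A check like this only compares class frequencies; it says nothing about differences in lighting, camera angle, or surface texture.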
Traditional Computer Vision approaches such as thresholding are limited in their performance and in their transferability to new production lines, but their underlying functions are relatively transparent. Machine learning approaches might achieve near-human-level performance and can be transferred to similar outputs, e.g. with transfer learning. However, deep neural networks are "black boxes", and their results are less transparent than those of traditional Computer Vision approaches.
Detecting and identifying damage and defects in image and video data can be addressed by implementing an image segmentation and classification model. For such Computer Vision tasks, specialized Convolutional Neural Networks (CNNs) such as U-Net or Mask R-CNN are used. These approaches require a labeled training data set. To label images, tools dedicated to Computer Vision applications such as Labelbox, CVAT or VGG Image Annotator (open source) might be used.
A Generative Adversarial Network (GAN), used in addition to the CNN, can reduce the required amount of training data and facilitate transfer to other similar problems, especially when the number of outputs or the availability of training data is very limited.
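For comparison with the learned models, the thresholding approach mentioned earlier can serve as a transparent baseline. The sketch below is illustrative only and uses plain Python lists instead of an image library. The threshold of 60, the 5 % tolerance, and the assumption that defects appear darker than the intact surface are hypothetical and would have to be tuned for each production line, which is exactly the transferability limitation described above.

```python
from typing import List

Image = List[List[int]]  # grayscale image, pixel values 0..255


def threshold_mask(image: Image, threshold: int) -> List[List[bool]]:
    """Mark every pixel darker than `threshold` as a defect candidate."""
    return [[pixel < threshold for pixel in row] for row in image]


def defect_ratio(mask: List[List[bool]]) -> float:
    """Fraction of pixels flagged as defect candidates."""
    total = sum(len(row) for row in mask)
    flagged = sum(sum(row) for row in mask)
    return flagged / total if total else 0.0


def is_defective(image: Image,
                 threshold: int = 60,
                 max_ratio: float = 0.05) -> bool:
    """Reject a part if more than `max_ratio` of its pixels are flagged."""
    return defect_ratio(threshold_mask(image, threshold)) > max_ratio


# Synthetic 4x4 part: bright surface (200) with a dark scratch (20).
part = [
    [200, 200, 200, 200],
    [200, 20, 20, 200],
    [200, 200, 200, 200],
    [200, 200, 200, 200],
]
clean_part = [[200] * 4 for _ in range(4)]

scratched = is_defective(part)
clean = is_defective(clean_part)
```

Every decision in this baseline can be traced to two numbers, which makes it easy to audit, but a change in lighting or material is enough to invalidate both of them.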