REPORT OF THE MARKET SURVEY OF MARITIME ELECTRO-OPTICAL SENSORS AND AI ASSISTANCE FOR NAUTICAL CREW

Schmid, Helmut; Portier, Martin; Herrmann, Hans

Figure 3: Starting from maritime images (panel “Input image”), a distinction is made between semantic segmentation (panel “Segmentation mask”) and object detection (panel “Object detection”) (Chen et al., 2021).
object detection and semantic segmentation (Figure 3). Both tasks are active research areas
in computer vision. Whereas object detection locates and classifies objects within an image by
means of bounding boxes, semantic segmentation divides an image into multiple regions on a
pixel-by-pixel basis. These tasks have been dominated by machine learning for several years now.
Two methods, or more specifically two classes of architectures, have emerged: Convolutional
Neural Networks (CNNs) and Vision Transformers (Koch et al., 2024).
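The difference between the two tasks shows up most clearly in the shape of their outputs. The following is a minimal sketch only: the `Detection` class, the class ids `SEA` and `VESSEL`, and the 4 x 6 toy image are illustrative assumptions and do not come from the report or from any particular library.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One object-detection result: a class label plus an axis-aligned box.

    Coordinates are inclusive pixel indices (an assumption of this sketch).
    """
    label: str
    x_min: int
    y_min: int
    x_max: int
    y_max: int


# Hypothetical class ids for a toy 4 x 6 image containing one vessel.
SEA, VESSEL = 0, 1

# Object detection output: a list of labelled bounding boxes.
detections = [Detection("vessel", x_min=2, y_min=1, x_max=4, y_max=2)]

# Semantic segmentation output: one class id per pixel (rows x columns).
mask = [
    [SEA, SEA, SEA,    SEA,    SEA,    SEA],
    [SEA, SEA, VESSEL, VESSEL, SEA,    SEA],
    [SEA, SEA, VESSEL, VESSEL, VESSEL, SEA],
    [SEA, SEA, SEA,    SEA,    SEA,    SEA],
]

# The mask marks exactly the vessel pixels, while the box also covers
# background pixels that fall inside its rectangle.
vessel_pixels = sum(row.count(VESSEL) for row in mask)
box = detections[0]
box_pixels = (box.x_max - box.x_min + 1) * (box.y_max - box.y_min + 1)
print(vessel_pixels, box_pixels)
```

In this toy case the bounding box covers one more pixel than the mask, which illustrates why segmentation is the finer-grained of the two representations.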
Performance metric 
The ISO 16273 standard currently defines the required accuracy of a night vision system in terms
of a human observer being able to detect a specified target nine out of ten times under voyage
conditions. Performance metrics are used to evaluate how well an object detection model performs.
Metrics commonly used to compare the performance of different object detection models include
“Mean Average Precision” (mAP) and “Mean Intersection over Union” (mIoU). These are explained in
the following section.
Object detection - “Mean Average Precision” (mAP)
In order to estimate the mAP, all detections must be classified into three categories:
True positive (TP): Correctly detected objects 
False positive (FP): Incorrectly indicated detections (i.e. cases in which the model detects
objects that are not actually present or misclassifies objects)
False negative (FN): Omitted objects (i.e. objects that were not recognised by the model even 
though they actually exist). 
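The three categories above can be sketched in code. The report does not specify how a detection is matched to a real object, so this example assumes a common convention: a prediction counts as a true positive if its label agrees with a not-yet-matched ground-truth object and their boxes overlap with an intersection over union (IoU) of at least 0.5. The function names, the threshold and the example boxes are hypothetical. Full mAP evaluation would also sort predictions by confidence score, which is omitted here.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def count_outcomes(predictions, ground_truth, iou_threshold=0.5):
    """Return (TP, FP, FN) for lists of (label, box) pairs.

    Assumed matching rule: each ground-truth object can be matched at most
    once; a misclassified or spurious prediction counts as a false positive,
    and every ground-truth object left unmatched counts as a false negative.
    """
    unmatched = list(ground_truth)
    tp = fp = 0
    for label, box in predictions:
        match = next(
            (g for g in unmatched
             if g[0] == label and iou(box, g[1]) >= iou_threshold),
            None,
        )
        if match is None:
            fp += 1
        else:
            unmatched.remove(match)
            tp += 1
    return tp, fp, len(unmatched)


# Hypothetical scene: three real vessels, three predictions.
ground_truth = [
    ("vessel", (0, 0, 10, 10)),
    ("vessel", (20, 0, 30, 10)),
    ("vessel", (40, 0, 50, 10)),
]
predictions = [
    ("vessel", (1, 0, 11, 10)),    # overlaps the first vessel
    ("vessel", (20, 1, 30, 11)),   # overlaps the second vessel
    ("vessel", (60, 0, 70, 10)),   # nothing there
]
tp, fp, fn = count_outcomes(predictions, ground_truth)
print(tp, fp, fn)
```

In this made-up scene two predictions match real vessels, one prediction is spurious, and one vessel is missed.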
These three cases are illustrated in Figure 4. Precision, as defined in ISO TS 4213, indicates how
often the model predicts an object correctly. In Figure 4, three objects have been detected, but
only two were correctly identified as vessels; this results in a precision (p) of 2/3, or about
67%, for this model.
$$p = \frac{TP}{TP + FP}$$
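As a quick check of the arithmetic, the Figure 4 case (three detections, two of them correct) can be reproduced in a few lines. The function name `precision` is illustrative only and is not taken from ISO TS 4213.

```python
def precision(tp: int, fp: int) -> float:
    """Precision p = TP / (TP + FP); undefined when nothing was detected."""
    if tp + fp == 0:
        raise ValueError("precision is undefined when TP + FP is zero")
    return tp / (tp + fp)


# Figure 4 case as described in the text: 3 detections, 2 correct.
p = precision(tp=2, fp=1)
print(f"{p:.1%}")  # 66.7%
```

Note that false negatives do not enter this formula: precision only measures how trustworthy the reported detections are, not how many real objects were missed.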


Digitale Bibliothek im BSH
