Figure 4: Example image with three detections in the context of boat detection, shown as bounding boxes. Two boats were correctly
detected (TP, green), and one was incorrectly detected (FP, red). Two boats were overlooked (FN, blue). The recall for this case is
therefore two out of four (50%) and the precision is two out of three (about 67%). (Koch et al., 2024)
Recall (r), as defined in ISO TS 4213, indicates what share of the objects that should have been detected
were actually detected. In Figure 4, there are four vessels. Because the model identified only two of them,
the recall is 50%.
$$ r = \frac{TP}{TP + FN} $$
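The recall definition can be checked against the counts in Figure 4. The following is a minimal sketch, not part of the cited sources: the function name `precision_recall` is hypothetical, and returning 0.0 when a denominator is zero is an assumed convention.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) computed from detection counts."""
    # Assumed convention: report 0.0 when a denominator would be zero.
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    return precision, recall


# Counts from Figure 4: two correct detections (TP), one false
# detection (FP), and two overlooked boats (FN).
precision, recall = precision_recall(tp=2, fp=1, fn=2)
print(f"precision={precision:.2f}, recall={recall:.2f}")
# prints: precision=0.67, recall=0.50
```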
Another important performance metric for object detectors is Intersection over Union (IoU), which
compares the ground truth bounding boxes with the predicted bounding boxes. The result is a value
between 0 and 1, and a threshold on this value can be used to determine whether a prediction is a TP or
an FP.
$$ IoU = \frac{\text{Area of Intersection}}{\text{Area of Union}} $$
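For axis-aligned bounding boxes, the IoU and the resulting TP/FP decision can be sketched as follows. This is an illustrative example only: the box format `(x_min, y_min, x_max, y_max)`, the function name `iou`, the example coordinates, and the threshold of 0.5 are assumptions and are not taken from the cited sources.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle (if there is any overlap).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes yield an empty intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


IOU_THRESHOLD = 0.5  # assumed value; the threshold is application-specific

ground_truth = (0, 0, 10, 10)  # hypothetical ground truth box
prediction = (5, 0, 15, 10)    # hypothetical predicted box
score = iou(ground_truth, prediction)
label = "TP" if score >= IOU_THRESHOLD else "FP"
print(f"IoU={score:.2f} -> {label}")
# prints: IoU=0.33 -> FP
```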
The mAP metric combines precision and recall by calculating the precision-recall curve and averaging
the precision values over different recall thresholds. It is used in maritime research and industry
(Messaoud et al., 2024) (Wang et al., 2024). The precision-recall curve shows how the two values depend
on each other, and the mAP ultimately corresponds to the area under this curve. The Area Under
Precision-Recall Curve (AUPRC) is listed as one possible performance metric in ISO/IEC Technical
Specification 4213, 6.3.7 (ISO, 2022).
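The area under the precision-recall curve can be approximated by ranking detections by confidence and summing precision over each increase in recall. The sketch below shows one simple, non-interpolated variant; benchmark implementations often use interpolated variants that give different numbers. The function names, the input format, and the confidence scores are hypothetical and only reuse the TP/FP counts of Figure 4.

```python
def average_precision(detections, num_ground_truth):
    """Approximate the area under the precision-recall curve for one class.

    detections: list of (confidence, is_true_positive) tuples.
    num_ground_truth: number of annotated objects of this class.
    """
    if num_ground_truth == 0:
        return 0.0
    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    tp = fp = 0
    prev_recall = 0.0
    ap = 0.0
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)
        recall = tp / num_ground_truth
        # Add the strip of the curve gained by this recall step.
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap


def mean_average_precision(per_class):
    """Average the per-class AP; per_class maps name -> (detections, num_gt)."""
    aps = [average_precision(dets, n) for dets, n in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0


# Figure 4 counts (2 TP, 1 FP, 4 ground truth boats) with made-up scores.
boat_detections = [(0.9, True), (0.8, False), (0.7, True)]
print(f"AP={average_precision(boat_detections, 4):.3f}")
# prints: AP=0.417
```

With a single class, the mAP equals that class's AP; with several classes, `mean_average_precision` averages the per-class values.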
Semantic segmentation - “mean intersection over union” (mIoU)
Metrics such as IoU are commonly used to measure task performance in various use cases
such as segmentation, object detection, and object tracking. Variations such as Probabilistic
Intersection over Union (ProbIoU), Kalman Filtering Intersection over Union (KFIoU), and Rotating
Intersection over Union (RIoU) are discussed in maritime research (Gao et al., 2024). In this study,
mIoU is the metric used to evaluate the performance of a model for semantic segmentation (Huang et
al., 2024) (Koch et al., 2024).
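For semantic segmentation, the IoU is computed per class over pixels and then averaged. The following is a minimal sketch under stated assumptions: label maps are given as nested lists of integer class indices, the function name `mean_iou` is hypothetical, and classes that occur in neither the prediction nor the ground truth are skipped when averaging (conventions differ between benchmarks).

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes for two label maps of identical shape.

    pred, target: 2D lists of integer class labels in [0, num_classes).
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel belongs to both masks of class p.
                intersection[p] += 1
                union[p] += 1
            else:
                # Pixel belongs to the predicted mask of class p
                # and to the ground truth mask of class t.
                union[p] += 1
                union[t] += 1
    # Assumed convention: ignore classes absent from both maps.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical 2x2 example with class 0 (water) and class 1 (vessel).
target = [[0, 0],
          [1, 1]]
pred = [[0, 1],
        [1, 1]]
print(f"mIoU={mean_iou(pred, target, num_classes=2):.3f}")
# prints: mIoU=0.583
```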
Prasad et al. (2016) discuss that, in a scene with moving vessels and no stationary cues, line features
may enable registration only if the vessels in the scene are not rotating.
Thus, for a general maritime scenario, registration of frames is still a challenge. Strictly speaking, the
best possible way of dealing with this scenario is to use the motion and gyro sensors of the ship.