Precision and Recall
Understanding AI for maritime data applications
In the previous section, we mentioned that the evaluation phase produces a score. This score, called the F1 score, is based on Precision and Recall.
Let's use an ML model that detects fishing to help understand these terms.
Precision - How accurate the model is at identifying actual fishing behavior, based on the number of “True Positives” and “False Positives.” Here, “Positives” means the model thinks it has found what it's looking for: a positive detection. Some of these detections are correct (true); others are not (false).
True positives - The model provided a correct output.
- For example: A human confirms, “Yes, this does look like fishing.”
False positives - The model provided a wrong output.
- For example: A model may detect fishing, but a human knows better, “No, it looks like the vessel is slowing down to enter port; it's not fishing.”
High precision indicates few false positives. Low precision means many false positives are mixed in with the true positives.
Recall - How well the model finds all “true positives” (e.g., vessels fishing during a certain period) compared to the number of False Negatives.
True negatives - The model correctly did not detect anything.
- For example: The model did not detect fishing behavior for a vessel that was simply transiting.
False negatives - The model missed an instance.
- For example: A vessel was fishing, but the model did not find it.
High recall indicates all or most instances are identified by a model. Low recall indicates many instances are missed (false negatives).
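In formulas, precision and recall (and the F1 score that combines them) come directly from these counts. The counts below are invented purely for illustration; this is a minimal sketch, not Skylight's actual evaluation code.

```python
# Hypothetical counts from a human review of a model's fishing detections.
tp = 40  # true positives: model flagged fishing, reviewer agreed
fp = 10  # false positives: model flagged fishing, but the vessel was not fishing
fn = 20  # false negatives: vessel was fishing, but the model missed it

precision = tp / (tp + fp)  # share of detections that were correct
recall = tp / (tp + fn)     # share of actual fishing events that were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two

print(f"precision = {precision:.2f}")  # precision = 0.80
print(f"recall    = {recall:.2f}")     # recall    = 0.67
print(f"F1        = {f1:.2f}")         # F1        = 0.73
```

Note that the F1 score is a harmonic mean, so it stays low unless both precision and recall are reasonably high; a model cannot compensate for terrible recall with perfect precision.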
Let's consider precision and recall with another example: using machine learning to detect vessels in an image.
Precision - How accurate the model is at finding vessels in an image
True positives - The model provided a correct output.
- For example: A human confirms, “Yes, this does look like a vessel.”
False positives - The model provided a wrong output.
- For example: The model detects a vessel, but a human knows better, “No, that's a big wave, not a vessel.”
Recall - How well the model captures all “true positives” (e.g., vessels within an image) compared to False Negatives
True negatives - The model correctly did not detect anything
- For example: No vessel was visible in the image and the model correctly did not think it found one.
False negatives - The model missed an instance.
- For example: The model did not detect any vessels in the image, but in fact at least one is visible.
Ideally, a model has exceptionally high recall and precision, but this is not always possible, especially with imperfect data like AIS or satellite imagery. Skylight strives to ensure a high level of precision and recall, but some fine-tuning of the two is always required. This is partly due to the data itself: gaps in AIS transmission, vessel movements that resemble other types of behaviors (e.g., dredging versus trawling), image resolution, weather, and several other factors.
For maritime surveillance, Skylight tends to favor high recall. In other words, we want to make sure to show you possible instances of a particular vessel's behavior (higher recall) even if we make a few mistakes (lower precision).
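This trade-off can be sketched with a detection threshold: lowering the threshold catches more real events (higher recall) at the cost of more false alarms (lower precision). The confidence scores and labels below are synthetic, purely for illustration.

```python
# Synthetic model confidence scores and ground-truth labels (1 = actually fishing).
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels = [1,    1,    0,    1,    1,    0,    1,    0]

def precision_recall(threshold):
    """Precision and recall if we flag every score at or above the threshold."""
    tp = sum(s >= threshold and y == 1 for s, y in zip(scores, labels))
    fp = sum(s >= threshold and y == 0 for s, y in zip(scores, labels))
    fn = sum(s < threshold and y == 1 for s, y in zip(scores, labels))
    return tp / (tp + fp), tp / (tp + fn)

# A strict threshold: no false alarms, but many missed fishing events.
p, r = precision_recall(0.85)  # precision = 1.00, recall = 0.40
# A loose threshold: every fishing event caught, with a few false alarms.
p, r = precision_recall(0.25)  # precision ≈ 0.71, recall = 1.00
```

Favoring high recall corresponds to the looser threshold: a few detections turn out to be waves or port approaches, but far fewer real events are missed.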