
Precision and Recall

Understanding AI for maritime data applications

In the previous section, we mentioned that the evaluation phase is scored. This score, called the F1 score, is based on precision and recall.

Let's use an ML model that detects fishing to help explain these terms.

Precision - How accurate the model is at identifying actual fishing behavior, based on the number of “True Positives” and “False Positives.” In this sense, “Positives” means the model thinks it has found what it's looking for: a positive detection. Some of these detections are correct (true); others are not (false).

True positives - The model made a detection, and the detection was correct.

  • For example: A human confirms, “Yes, this does look like fishing.”

False positives - The model made a detection, but the detection was wrong.

  • For example: The model detects fishing, but a human knows better: “No, it looks like the vessel is slowing down to enter port; it’s not fishing.”

High precision means few of the model's detections are false positives. Low precision means a larger share of its detections are false positives.

Recall - How well the model finds all actual instances (e.g., vessels fishing during a certain period), based on the number of “True Positives” and “False Negatives.”

True negatives - The model correctly did not detect anything.

  • For example: The model did not detect fishing behavior for a vessel that was transiting.

False negatives - The model missed an instance.

  • For example: A vessel was fishing, but the model did not detect it.

High recall indicates all or most instances are identified by a model. Low recall indicates many instances are missed (false negatives).
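To make the definitions above concrete, here is a minimal Python sketch of how precision, recall, and the F1 score can be computed from counts of true positives, false positives, and false negatives. The counts are hypothetical and chosen only for illustration; they are not Skylight results. The F1 formula shown is the standard one (the harmonic mean of precision and recall).

```python
def precision(tp: int, fp: int) -> float:
    """Share of the model's detections that were correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """Share of actual instances that the model found."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts for a fishing detection model (illustration only):
tp = 80  # detections a human confirmed as fishing
fp = 20  # detections that were not fishing (e.g., slowing to enter port)
fn = 40  # fishing events the model missed

p = precision(tp, fp)
r = recall(tp, fn)
f1 = f1_score(p, r)

print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Note that true negatives do not appear in any of these three formulas.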

 

Let’s consider precision and recall with another example: using machine learning to detect vessels in an image.

Precision - How accurate the model is at finding vessels in an image

True positives - The model made a detection, and the detection was correct.

  • For example: A human confirms, “Yes, this does look like a vessel.”

False positives - The model made a detection, but the detection was wrong.

  • For example: The model detects a vessel, but a human knows better: “No, that’s a big wave, not a vessel.”

Recall - How well the model captures all actual instances (e.g., vessels within an image), based on the number of “True Positives” and “False Negatives.”

True negatives - The model correctly did not detect anything.

  • For example: No vessel was visible in the image and the model correctly did not think it found one.

False negatives - The model missed an instance.

  • For example: The model did not detect any vessels in the image, but in fact at least one is visible.
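The four outcomes above can be sketched as a small Python tally. This is a simplified illustration that treats each image as a single yes/no case ("is a vessel visible?"). The reviewed list is hypothetical data, and the function name is ours, not part of any Skylight API.

```python
from collections import Counter


def outcome(vessel_present: bool, model_detected: bool) -> str:
    """Classify one reviewed image into TP, FP, FN, or TN."""
    if model_detected and vessel_present:
        return "TP"  # model found a vessel, and a human confirms it
    if model_detected and not vessel_present:
        return "FP"  # model found a "vessel" that was really a big wave
    if not model_detected and vessel_present:
        return "FN"  # a vessel was visible, but the model missed it
    return "TN"      # no vessel visible, and the model detected nothing


# Hypothetical human-reviewed images: (vessel_present, model_detected)
reviewed = [
    (True, True),
    (True, True),
    (False, True),
    (True, False),
    (False, False),
    (False, False),
]

counts = Counter(outcome(actual, detected) for actual, detected in reviewed)
print(dict(counts))
```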

 

Ideally, a model has exceptionally high recall and precision, but this is not always possible, especially with imperfect data like AIS or satellite imagery. Skylight strives to ensure a high level of precision and recall, but some fine-tuning of the balance between them is always required. This is due in part to the data: gaps in AIS signal transmission, vessel movements that resemble other types of behavior (for example, dredging and trawling), image resolution, weather, and several other factors.

For maritime surveillance, Skylight tends to favor high recall. In other words, we want to make sure to show you possible instances of a particular vessel's behavior (higher recall), even if we make a few mistakes (lower precision).
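One common way this trade-off shows up is through a confidence threshold: a model assigns each candidate a score and only reports those above a cutoff. The Python sketch below assumes such a threshold exists and uses made-up scores. It is not a description of how Skylight's models are actually tuned. It only illustrates why lowering the cutoff tends to raise recall at the cost of precision.

```python
# Hypothetical candidates: (model confidence score, actually the behavior?)
scored = [
    (0.95, True),
    (0.90, True),
    (0.80, False),
    (0.70, True),
    (0.55, False),
    (0.40, True),
    (0.30, False),
    (0.10, False),
]


def precision_recall(candidates, threshold):
    """Precision and recall when only scores >= threshold are reported."""
    tp = sum(1 for score, actual in candidates if score >= threshold and actual)
    fp = sum(1 for score, actual in candidates if score >= threshold and not actual)
    fn = sum(1 for score, actual in candidates if score < threshold and actual)
    p = tp / (tp + fp) if (tp + fp) else 0.0
    r = tp / (tp + fn) if (tp + fn) else 0.0
    return p, r


# A strict cutoff reports fewer events; a lenient one reports more.
strict_p, strict_r = precision_recall(scored, threshold=0.85)
lenient_p, lenient_r = precision_recall(scored, threshold=0.35)

print(f"strict:  precision={strict_p:.2f} recall={strict_r:.2f}")
print(f"lenient: precision={lenient_p:.2f} recall={lenient_r:.2f}")
```

With these made-up numbers, the lenient cutoff finds every actual instance but also reports more false positives, which mirrors the preference for higher recall described above.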