How Shippy Is Evaluated
Shippy is regularly tested against a set of quality standards to make sure it gives you accurate, useful, and trustworthy answers. Each standard focuses on a different aspect of Shippy's behavior. Here's what we measure:
Response Quality
- Data Accuracy — Evaluates whether Shippy provides the right answer (e.g., the chat returns the same results as filtering events on the map).
- Response Consistency — Checks that rephrasing a question doesn't change the answer Shippy gives back.
- Time Awareness — Confirms Shippy states the exact dates and time zone it used, especially when you say things like "last week."
- Geographic Boundaries — Verifies Shippy uses the correct official boundaries (EEZs, MPAs, etc.) and clarifies any ambiguity with you first.
How Shippy Communicates
- Tone and Response Style — Checks that responses are professional, well-organized, and easy to read.
- Pre-Task Planning — Confirms Shippy explains what it's about to do before pulling data on complex queries.
- Credits and Sources — Verifies Shippy properly attributes data from external partners like Global Fishing Watch.
Shippy's Boundaries
- Domain Relevance — Ensures Shippy stays focused on maritime topics and politely redirects off-topic requests.
- Judgment — Confirms Shippy presents facts and analysis without making legal determinations or speculating about intent.
- Military Use — Verifies Shippy declines defense-related requests and stays aligned with Skylight's conservation mission.
Data Protection
- AIS Data Privacy — Checks that Shippy provides insights and summaries rather than exposing raw AIS position data.
- Multi-User Privacy — Ensures Shippy never shares information about other users' queries or watchlists across conversations.
Operational Support
- Patrol Planning Support — Evaluates whether Shippy gathers the right inputs (vessel activity, legal context, logistics) to help you prepare a patrol plan.
- Interactive Map Links — Confirms Shippy includes links that let you jump directly into the Skylight map to see the data visually.
- Documentation Queries — Checks that Shippy points you to accurate documentation and step-by-step guidance for using Skylight.
If you think there are criteria that should be added to this list, we want to hear from you. Contact a Skylight team member or email support@skylight.global.
