Are LLMs Ready for COLREGs? A Study in Autonomous Maritime Vessel Navigation
Abstract
Autonomous Surface Vehicles (ASVs) operating in shared waterways must comply with the International Regulations for Preventing Collisions at Sea (COLREGs). While traditional rule-based systems often struggle with ambiguity and scalability, Large Language Models (LLMs) offer a promising alternative for flexible decision-making. This paper investigates whether LLMs are ready to support COLREG-compliant navigation by evaluating ChatGPT, DeepSeek, and Claude across five multi-vessel scenarios. To probe their reasoning capabilities, we vary prompt richness across three levels (minimal, moderate, and detailed) designed to simulate real-world variability in available information and communication. Each model is assessed over ten trials per condition using three performance metrics: collision rate, distance traveled, and compliance of actions with COLREGs. Experimental results indicate that while all three models exhibit limitations in producing COLREG-consistent decisions, ChatGPT and Claude generally avoid collisions, whereas DeepSeek is sensitive to prompt clarity. Our findings highlight both the potential and current limitations of LLMs in supporting safe, rule-based autonomous maritime navigation, and underscore the critical role of prompt engineering in maritime AI systems.
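The evaluation design described in the abstract can be pictured as a small experiment grid. The Python sketch below is illustrative only and is not the authors' code. It assumes that a "condition" means one (model, scenario, prompt level) combination and that compliance is recorded as a yes/no flag per trial; the abstract confirms neither. The scenario names, the `TrialResult` record, and the `conditions` and `summarize` helpers are hypothetical.

```python
from dataclasses import dataclass
from itertools import product
from statistics import mean

# Factors named in the abstract.
MODELS = ("ChatGPT", "DeepSeek", "Claude")
PROMPT_LEVELS = ("minimal", "moderate", "detailed")
# Placeholder names: the abstract says only that there are five scenarios.
SCENARIOS = tuple(f"scenario_{i}" for i in range(1, 6))
TRIALS_PER_CONDITION = 10


@dataclass
class TrialResult:
    """Hypothetical record of one trial's outcome."""
    collided: bool            # did a collision occur during the trial?
    distance_traveled: float  # path length; units are not given in the abstract
    compliant: bool           # assumed binary COLREG-compliance flag


def conditions():
    """Enumerate every (model, scenario, prompt level) combination."""
    return list(product(MODELS, SCENARIOS, PROMPT_LEVELS))


def summarize(trials):
    """Aggregate the three metrics over the trials of one condition."""
    return {
        "collision_rate": mean(1.0 if t.collided else 0.0 for t in trials),
        "mean_distance": mean(t.distance_traveled for t in trials),
        "compliance_rate": mean(1.0 if t.compliant else 0.0 for t in trials),
    }
```

Under this reading of "condition", the grid has 3 x 5 x 3 = 45 conditions and 450 trials in total; a different definition of "condition" would change both counts.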
Department(s)
Computer Science
Document Type
Conference Proceeding
DOI
10.23919/OCEANS59106.2025.11245090
Keywords
ASV, COLREGs, LLM, prompt engineering
Publication Date
1-1-2025
Recommended Citation
Silwal, Arjun and Dubey, Rahul, "Are LLMs Ready for COLREGs? A Study in Autonomous Maritime Vessel Navigation" (2025). Faculty Scholarship. 193.
https://bearworks.missouristate.edu/articles00/193
Journal Title
Oceans Conference Record IEEE