Building Our Autonomy Stack
How we built our computer vision pipeline, state machine, and PID controllers for autonomous underwater navigation.
ROS 2 node graph showing the autonomy stack architecture
Architecture
Our autonomy stack is built on ROS 2 Humble and runs on an NVIDIA Jetson. The main components are:
- Perception — YOLOv8-based object detection for gate, buoys, and bins
- Localization — Extended Kalman Filter fusing IMU, depth, and DVL data
- Planning — Behavior tree-based mission planner
- Control — Cascaded PID controllers for 6-DOF motion
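To illustrate what "cascaded" means in the control layer, here is a minimal single-axis sketch for depth: an outer loop turns depth error into a heave-velocity setpoint, and an inner loop turns velocity error into a thrust command. The class `PID`, the function `depth_cascade_step`, the clamping scheme, and the sign convention (depth and thrust positive downward) are assumptions made for this example, not details of the actual stack.

```python
from dataclasses import dataclass


@dataclass
class PID:
    kp: float
    ki: float
    kd: float
    out_limit: float            # symmetric clamp applied to the output
    integral: float = 0.0
    prev_error: float = 0.0
    initialized: bool = False

    def update(self, error: float, dt: float) -> float:
        # No derivative term on the first call, since there is no previous error yet.
        derivative = (error - self.prev_error) / dt if self.initialized else 0.0
        self.prev_error = error
        self.initialized = True

        candidate = self.integral + error * dt
        output = self.kp * error + self.ki * candidate + self.kd * derivative
        clamped = max(-self.out_limit, min(self.out_limit, output))

        # Simple anti-windup: keep the new integral only when the output is not saturated.
        if output == clamped:
            self.integral = candidate
        return clamped


def depth_cascade_step(depth_sp, depth, heave_vel, outer, inner, dt):
    """One control tick: depth error -> velocity setpoint -> thrust command."""
    vel_sp = outer.update(depth_sp - depth, dt)      # outer (slow) loop
    thrust = inner.update(vel_sp - heave_vel, dt)    # inner (fast) loop
    return vel_sp, thrust
```

The outer loop's `out_limit` doubles as a speed limit for the axis, which is one common reason to prefer a cascade over a single position loop. A full 6-DOF controller would repeat this pattern per axis and then map the six commands onto the thrusters.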
YOLOv8 detection output showing gate and buoy detection underwater
Computer Vision
We trained custom YOLOv8 models on our own dataset of 5,000+ annotated underwater images. Detection runs at 30 FPS on the Jetson with TensorRT optimization.
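Detections only become useful once they are turned into something the planner can act on. The sketch below shows one plausible post-processing step: pick the most confident detection of a given class and convert its bounding-box center into a horizontal bearing using a pinhole camera model. The `Detection` type, the function names, the confidence threshold, and the field-of-view value are all hypothetical; the detector's real output format is not shown here.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class Detection:
    label: str
    confidence: float
    bbox: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


def best_detection(detections: Sequence[Detection], label: str,
                   min_conf: float) -> Optional[Detection]:
    """Return the most confident detection of `label`, or None if none pass the threshold."""
    candidates = [d for d in detections
                  if d.label == label and d.confidence >= min_conf]
    return max(candidates, key=lambda d: d.confidence, default=None)


def bbox_to_bearing(bbox, image_width: int, hfov_deg: float) -> float:
    """Horizontal angle to the box center in radians, positive to the right."""
    x_min, _, x_max, _ = bbox
    cx = 0.5 * (x_min + x_max)
    # Pinhole model: focal length in pixels derived from the horizontal field of view.
    focal_px = (image_width / 2.0) / math.tan(math.radians(hfov_deg) / 2.0)
    return math.atan2(cx - image_width / 2.0, focal_px)
```

A bearing like this could feed the yaw setpoint directly. The field of view passed in should be the value measured in water, which may differ from the camera's in-air specification.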
Gazebo simulation environment with the vehicle navigating through a gate
Simulation
Before any pool time, we validate everything in a Gazebo simulation environment with realistic underwater physics and sensor noise models.
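As a rough idea of what a sensor noise model involves, the sketch below corrupts a true depth value with white Gaussian noise plus a slowly drifting bias. It is a standalone illustration, not Gazebo's API; the class name `NoisyDepthSensor` and the default noise magnitudes are assumptions chosen for the example.

```python
import random


class NoisyDepthSensor:
    """Adds white noise and a random-walk bias to a ground-truth depth (meters)."""

    def __init__(self, noise_std=0.01, bias_walk_std=0.001, seed=None):
        self.rng = random.Random(seed)      # private RNG so runs are repeatable
        self.noise_std = noise_std          # per-sample white noise, in meters
        self.bias_walk_std = bias_walk_std  # bias drift, in meters per sqrt(second)
        self.bias = 0.0

    def measure(self, true_depth: float, dt: float) -> float:
        # Random walk: the bias variance grows linearly with elapsed time.
        self.bias += self.rng.gauss(0.0, self.bias_walk_std * dt ** 0.5)
        return true_depth + self.bias + self.rng.gauss(0.0, self.noise_std)
```

Seeding the generator makes a simulated run reproducible, which helps when comparing controller or filter changes against the same noise sequence.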
Results
In simulation, our vehicle can reliably:
- Navigate through the gate (95% success rate)
- Hit the correct buoy (88% success rate)
- Complete the bins task (75% success rate)
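For a sense of how these per-task rates compound, the snippet below multiplies them to estimate the chance of completing all three tasks in one run. This assumes the tasks succeed or fail independently and are all attempted in the same run, which the figures above do not establish, so treat the result as a back-of-the-envelope estimate rather than a measured number.

```python
import math

# Per-task success rates reported in simulation.
rates = {"gate": 0.95, "buoy": 0.88, "bins": 0.75}

# Assumption: independent tasks, so the joint success rate is the product.
full_run = math.prod(rates.values())
print(f"Estimated full-run success: {full_run:.1%}")  # prints "Estimated full-run success: 62.7%"
```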