A ROS2-based autonomous UAV perception stack: detection, tracking, identity, and pose running concurrently on a Jetson Orin Nano, with about 11ms camera-to-setpoint latency inside a 16.7ms-per-frame budget.
~11ms
end-to-end latency
62fps
sustained throughput
5
models concurrent
Challenge — A multi-model perception stack (detection, tracking, identity, pose) had to drive a drone's flight controller within a strict 16.7ms-per-frame budget on a single Jetson Orin Nano.
Solution — Profiled and tuned a five-model ROS2 pipeline, swapped SORT in for DeepSORT after measuring actual cost, and deployed through both ONNX Runtime and TensorRT FP16 with no changes to calling nodes.
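One way to get the "no changes to calling nodes" property is to put both runtimes behind a shared inference interface. The sketch below is an illustration under assumptions, not the project's code: `InferenceBackend`, `OnnxBackendStub`, `TensorRTBackendStub`, and `run_detection` are hypothetical names, and the stubs return a placeholder value instead of calling ONNX Runtime or TensorRT.

```python
from typing import List, Protocol, Sequence


class InferenceBackend(Protocol):
    """Hypothetical contract every runtime wrapper is assumed to satisfy."""

    def infer(self, frame: Sequence[float]) -> List[float]:
        ...


class OnnxBackendStub:
    """Stand-in for an ONNX Runtime wrapper (no real runtime is loaded)."""

    name = "onnxruntime"

    def infer(self, frame: Sequence[float]) -> List[float]:
        # Placeholder computation: mean of the inputs.
        return [sum(frame) / len(frame)]


class TensorRTBackendStub:
    """Stand-in for a TensorRT FP16 wrapper (no real engine is loaded)."""

    name = "tensorrt-fp16"

    def infer(self, frame: Sequence[float]) -> List[float]:
        # Same placeholder computation, so the two stubs are interchangeable.
        return [sum(frame) / len(frame)]


def run_detection(backend: InferenceBackend, frame: Sequence[float]) -> List[float]:
    """Caller code: depends only on the interface, never on a concrete runtime."""
    return backend.infer(frame)
```

Because `run_detection` only sees `infer`, swapping `OnnxBackendStub()` for `TensorRTBackendStub()` is a one-line change at construction time and the caller stays untouched.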
A MobileNetV2-SSD object detector built from scratch in TensorFlow, including the data pipeline, training loop, and spot-instance training infrastructure, deployed for real-time inference on Jetson edge hardware.
77%
mAP@0.5 on VOC
60+
FPS on Jetson
~8x
parameter reduction vs standard convs
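The ~8x figure is consistent with the usual parameter arithmetic for depthwise separable convolutions, which MobileNet-style backbones rely on. A minimal sketch, assuming a 3x3 kernel, no bias terms, and a hypothetical 64-to-64-channel layer chosen only for illustration:

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise k x k conv followed by a 1x1 pointwise conv."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1x1 conv that mixes channels
    return depthwise + pointwise


def reduction_factor(k: int, c_in: int, c_out: int) -> float:
    """How many times fewer weights the separable form needs."""
    return standard_conv_params(k, c_in, c_out) / separable_conv_params(k, c_in, c_out)


# Illustrative layer size (an assumption, not taken from the project).
ratio = reduction_factor(3, 64, 64)
```

For a 3x3 kernel the ratio approaches 9 as the output channel count grows, so values around 8 are typical for mid-sized layers.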
Challenge — Real-time object detection (60 FPS+) was required on a Jetson Orin Nano, ruling out two-stage detectors and transformer-based architectures that are too slow or too memory-hungry for the edge.
Solution — Implemented a single-stage MobileNetV2-SSD detector with a custom data pipeline, IoU-based target assignment, mixed-precision training, and FP16/quantized deployment, trained on AWS spot instances with automatic recovery from preemption.
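IoU-based target assignment can be sketched in plain Python as follows. This follows the matching scheme described in the SSD paper (threshold matching plus a forced best match per ground-truth box). The 0.5 threshold, the box format `(x0, y0, x1, y1)`, and the function names are assumptions for illustration, not details confirmed by the project.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_targets(anchors, gt_boxes, pos_thresh=0.5):
    """Return, per anchor, the matched ground-truth index or -1 for background."""
    matches = [-1] * len(anchors)

    # Pass 1: each anchor takes its best ground truth if the IoU clears the threshold.
    for ai, anchor in enumerate(anchors):
        best_iou, best_gt = 0.0, -1
        for gi, gt in enumerate(gt_boxes):
            overlap = iou(anchor, gt)
            if overlap > best_iou:
                best_iou, best_gt = overlap, gi
        if best_iou >= pos_thresh:
            matches[ai] = best_gt

    # Pass 2: every ground truth claims its single best anchor, so no object
    # is left without a positive sample even when all overlaps are below threshold.
    for gi, gt in enumerate(gt_boxes):
        best_ai = max(range(len(anchors)), key=lambda ai: iou(anchors[ai], gt))
        if iou(anchors[best_ai], gt) > 0:
            matches[best_ai] = gi

    return matches
```

In the usage below, the third anchor overlaps the second ground-truth box at just under 0.5 IoU and is still matched by the second pass:

```python
anchors = [(0, 0, 10, 10), (5, 5, 15, 15), (20, 20, 30, 30)]
gt_boxes = [(0, 0, 10, 10), (21, 21, 33, 33)]
print(assign_targets(anchors, gt_boxes))  # [0, -1, 1]
```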
A from-scratch TensorFlow implementation of Faster R-CNN, covering the shared backbone, region proposal network, RoI pooling, and the joint classification and regression head.
Two-stage
detector
Shared
backbone features
RPN
+ RoI pooling
Challenge — Region proposal and classification are typically treated as separate problems, doubling computation and making proposal quality dependent on a separate, unshared feature extractor.
Solution — Unified proposal generation and detection into a single trainable system by sharing convolutional features between the Region Proposal Network and the classification/regression head.
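The RoI pooling step, which turns a variable-sized proposal into a fixed-size grid for the classification/regression head, can be illustrated on a single-channel feature map. This is a simplified pure-Python sketch, not the project's TensorFlow implementation: it assumes the RoI is already expressed in integer feature-map cells as half-open `(y0, x0, y1, x1)` bounds, and `roi_max_pool` is a hypothetical name.

```python
def roi_max_pool(feature_map, roi, out_h, out_w):
    """Max-pool the cells inside `roi` into an out_h x out_w grid.

    feature_map: list of rows (single channel).
    roi: (y0, x0, y1, x1), half-open, in feature-map cells, at least 1x1.
    """
    y0, x0, y1, x1 = roi
    roi_h, roi_w = y1 - y0, x1 - x0
    pooled = []
    for i in range(out_h):
        # Floor the bin start and ceil the bin end so no bin is ever empty.
        ys = y0 + (i * roi_h) // out_h
        ye = y0 + -(-((i + 1) * roi_h) // out_h)
        row = []
        for j in range(out_w):
            xs = x0 + (j * roi_w) // out_w
            xe = x0 + -(-((j + 1) * roi_w) // out_w)
            row.append(max(feature_map[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        pooled.append(row)
    return pooled


# 4x4 map with values 0..15, pooled over the whole map into a 2x2 grid.
fmap = [[r * 4 + c for c in range(4)] for r in range(4)]
print(roi_max_pool(fmap, (0, 0, 4, 4), 2, 2))  # [[5, 7], [13, 15]]
```

Because the feature map is computed once by the shared backbone, each proposal only pays for this cheap pooling step, not for a separate forward pass.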