Multispectral RGB-LWIR Fusion with YOLO for Autonomous Object Detection in Low-Cost Mobile Robotics under Variable Lighting Conditions

Authors

  • Shreyas Jain, Department of Geography and Geoinformation Science, George Mason University, Fairfax, VA
  • James Gallagher, Department of Geography and Geoinformation Science, George Mason University, Fairfax, VA
  • Tyler Treat, Department of Geography and Geoinformation Science, George Mason University, Fairfax, VA
  • Edward Oughton, Department of Geography and Geoinformation Science, George Mason University, Fairfax, VA

Abstract

Multispectral imaging (MSI), combining RGB and long-wave infrared (LWIR) data, has shown promise for improving object detection in challenging and dynamic lighting conditions, such as low-light or high-glare outdoor environments. RGB-based computer vision is widely used but often fails in these situations, whereas thermal imaging, though robust in poor lighting, sacrifices fine detail. However, practical, lightweight systems for evaluating how fused MSI data performs in edge-computing scenarios, particularly on mobile robotics platforms, remain scarce. This project addresses that gap by assessing the effectiveness of RGB, LWIR, and fused imagery for human detection using the YOLOv5, YOLOv8, and YOLOv11 architectures. A mecanum-wheeled mobile robot equipped with RGB and LWIR cameras captures multispectral data under varying lighting conditions. Images are sent to a FastAPI web application, which performs spatial image registration, fuses the RGB and LWIR frames with the SeAFusion algorithm, and runs lightweight custom YOLO models to return information on detected humans. Using these outputs, the robot follows a person with PID control and ultrasonic ranging. We trained nine YOLO models (three architectures across three sensor modes) on a single human class and evaluated performance quantitatively through mAP and precision/recall metrics, as well as qualitatively through real-world tracking trials. Results indicate that multispectral fusion improves detection robustness: the fused model achieved a 4.9% higher mAP@0.5 and a 6.3% higher precision than thermal-only inputs. This research demonstrates the potential of low-cost MSI fusion and processing for real-time applications in robotics and edge AI.
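
The abstract describes a server-side pipeline of registration, fusion, and YOLO inference. The sketch below is a minimal, hypothetical illustration of such a FastAPI endpoint, assuming a precomputed homography for spatial registration, a simple weighted blend standing in for the SeAFusion network, and an Ultralytics YOLO model; the endpoint path, weight file, and helper names are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the kind of detection endpoint described in the abstract.
# Paths, weight files, and the homography are illustrative assumptions; the
# SeAFusion network is replaced here by a simple weighted blend as a stand-in.
import cv2
import numpy as np
from fastapi import FastAPI, File, UploadFile
from ultralytics import YOLO

app = FastAPI()
model = YOLO("yolo_fused_person.pt")          # assumed custom single-class weights
H_LWIR_TO_RGB = np.eye(3, dtype=np.float64)   # assumed precomputed registration homography


def decode(data: bytes) -> np.ndarray:
    """Decode uploaded bytes into a BGR image."""
    return cv2.imdecode(np.frombuffer(data, np.uint8), cv2.IMREAD_COLOR)


def fuse(rgb: np.ndarray, lwir: np.ndarray) -> np.ndarray:
    """Stand-in for SeAFusion: warp LWIR into the RGB frame, then blend."""
    h, w = rgb.shape[:2]
    lwir_reg = cv2.warpPerspective(lwir, H_LWIR_TO_RGB, (w, h))
    return cv2.addWeighted(rgb, 0.6, lwir_reg, 0.4, 0.0)


@app.post("/detect")
async def detect(rgb: UploadFile = File(...), lwir: UploadFile = File(...)):
    """Register, fuse, and run YOLO; return person boxes for the robot's controller."""
    fused = fuse(decode(await rgb.read()), decode(await lwir.read()))
    result = model(fused, verbose=False)[0]
    boxes = [
        {"xyxy": b.xyxy[0].tolist(), "conf": float(b.conf[0])}
        for b in result.boxes
    ]
    return {"detections": boxes}
```

On the robot side, the horizontal offset of the highest-confidence box from the image center could then serve as the error term for the PID steering loop mentioned in the abstract, with the ultrasonic range reading governing following distance.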

Published

2025-09-25

Section

College of Science: Department of Geography and Geoinformation Science