Evaluating Efficiency for Integrating Large Language, Small Language and Computer Vision Models into a Data Pipeline for an Autonomous Mobile Rover

Authors

  • Ishanth Thota, George C. Marshall High School, Falls Church, VA
  • James Gallagher, Department of Geography and Geoinformation Science, George Mason University, Fairfax, VA
  • Tyler Treat, Department of Geography and Geoinformation Science, George Mason University, Fairfax, VA
  • Edward Oughton, Department of Geography and Geoinformation Science, George Mason University, Fairfax, VA

Abstract

An Application Programming Interface (API) connects programs so that they can communicate with one another through endpoints. Common Python frameworks for building APIs include FastAPI, designed for high-performance, low-latency APIs; Flask, a lightweight and flexible microframework; and Bottle, an even smaller microframework with no external dependencies. It is important to understand how these frameworks perform when serving Large Language Models (LLMs), Small Language Models (SLMs), and Computer Vision (CV) models, because autonomous mobile rovers such as the one tested in this project depend on high-efficiency APIs. To create the data pipeline, a GUI was developed using Streamlit to serve as the front end for the APIs. The app called each API with the POST method, and the APIs, which bridged an SLM, an LLM, and two CV models to the app, returned either waypoints for the rover or obstacles detected from the rover’s camera. We used Apache JMeter to stress test the APIs: 100 virtual users were introduced over 20 seconds, with the test performed 10 times. For the SLM, we found response times of 2228.5 ms for BottleAPI, 145591.9 ms for FastAPI, and 156639.9 ms for FlaskAPI; for the YOLO model, we found response times of 1533.6 ms for BottleAPI, 15861.6 ms for FastAPI, and 12359.7 ms for FlaskAPI. However, BottleAPI was unable to handle many requests in either case. We therefore conclude that FastAPI appeared to be the most efficient for collecting waypoints via language models, and FlaskAPI the most efficient for collecting obstacles via computer vision models.
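
As a rough illustration of the load profile described above, the following is a minimal sketch, not the JMeter test plan used in the study. It assumes the JMeter ramp-up convention, in which virtual users start at evenly spaced intervals across the ramp-up period, so 100 users over 20 seconds means one new user every 0.2 seconds. The names start_offsets, run_load_test, and fake_post are hypothetical, and fake_post merely sleeps in place of a real POST request to a model endpoint.

```python
import statistics
import threading
import time
from typing import Callable, List


def start_offsets(users: int, ramp_seconds: float) -> List[float]:
    """Start delay (seconds) for each virtual user, spread evenly over the ramp-up."""
    step = ramp_seconds / users
    return [i * step for i in range(users)]


def run_load_test(
    send_request: Callable[[], None], users: int, ramp_seconds: float
) -> List[float]:
    """Run one thread per virtual user; return latencies (ms) of successful requests."""
    latencies_ms: List[float] = []
    lock = threading.Lock()

    def virtual_user(delay: float) -> None:
        time.sleep(delay)  # wait for this user's slot in the ramp-up
        started = time.perf_counter()
        try:
            send_request()
        except Exception:
            return  # failed requests are left out of the timing results
        elapsed_ms = (time.perf_counter() - started) * 1000.0
        with lock:
            latencies_ms.append(elapsed_ms)

    threads = [
        threading.Thread(target=virtual_user, args=(delay,))
        for delay in start_offsets(users, ramp_seconds)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return latencies_ms


def fake_post() -> None:
    """Hypothetical stand-in for a POST request to a model endpoint."""
    time.sleep(0.01)


if __name__ == "__main__":
    # Same profile as the abstract: 100 virtual users introduced over 20 seconds.
    results = run_load_test(fake_post, users=100, ramp_seconds=20.0)
    print(f"{len(results)} succeeded, mean {statistics.mean(results):.1f} ms")
```

Replacing fake_post with a function that sends a real POST request would time an actual endpoint. Comparing the number of returned latencies against the number of users gives a simple count of failed requests, which is relevant to the observation that one framework could not handle many requests.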

Published

2025-09-25

Section

College of Science: Department of Geography and Geoinformation Science