Optimizing UAV Performance In Adverse Flying Conditions Using Reinforcement Learning

Authors

  • Jaivi Chandola, Institute for Digital InnovAtion, George Mason University, Fairfax, VA
  • Akaash Sachdeva, Institute for Digital InnovAtion, George Mason University, Fairfax, VA
  • James Ngo, Institute for Digital InnovAtion, George Mason University, Fairfax, VA
  • Jacob Farmer, Institute for Digital InnovAtion, George Mason University, Fairfax, VA
  • Saifullah Mahmood, Institute for Digital InnovAtion, George Mason University, Fairfax, VA
  • Sehaj Gill, Institute for Digital InnovAtion, George Mason University, Fairfax, VA
  • Afrah Nazeen, Institute for Digital InnovAtion, George Mason University, Fairfax, VA
  • Kamaljeet Sanghera, Institute for Digital InnovAtion, George Mason University, Fairfax, VA

DOI:

https://doi.org/10.13021/jssr2025.5311

Abstract

In recent years, reinforcement learning (RL) has emerged as a promising method for making UAVs fully autonomous: agents are placed in simulated environments and improve their performance through trial and error. However, RL-trained UAVs are prone to issues when adapting to uncurated environments that differ from their training conditions. These limitations can undermine reliability and safety in critical scenarios, especially in real-world settings where conditions can change. Using the PyFlyt library to simulate the environment, sensor noise was introduced to emulate unfamiliar and harsh environments and to observe the resulting change in UAV performance. Flight trajectories, stability, and navigation success rates were analyzed across varying noise conditions to assess the model's adaptability and response to novel conditions and to inform training of an RL model with these variables in mind. Preliminary results indicate the limitations of contemporary models. At a baseline noise level of 0.1, the simulation yielded a mean reward of 6.497 with a standard deviation (σ) of 106.838; at 0.3, the mean dropped to -64.552 (σ = 119.743); and at 0.5, it fell further to -154.132 (σ = 137.479). The increasingly negative rewards and rising standard deviations indicate poor model guidance and growing instability in UAV flight paths. The simulation data underscores the importance of testing learning algorithms in imperfect conditions and exposes the shortcomings of many current RL models when applied to unpredictable UAV settings. The experiment builds on a framework for developing RL models that maintain UAV simulation performance in more realistic scenarios, such as disaster relief and surveillance, by conditioning such models in more disruptive, noise-intensive environments.
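The noise-injection and evaluation setup described in the abstract can be sketched as a Gymnasium observation wrapper around a PyFlyt quadrotor environment. This is a minimal illustration under stated assumptions, not the authors' exact code: the environment ID ("PyFlyt/QuadX-Hover-v2"), the zero-mean Gaussian noise model, the episode count, and the random-action stand-in for a trained policy are all assumptions of the sketch.

import gymnasium as gym
import numpy as np
import PyFlyt.gym_envs  # registers the PyFlyt environments with Gymnasium


class SensorNoiseWrapper(gym.ObservationWrapper):
    """Adds zero-mean Gaussian noise to each observation to emulate degraded sensors."""

    def __init__(self, env, noise_std):
        super().__init__(env)
        self.noise_std = noise_std

    def observation(self, obs):
        # Perturb the raw observation; the noise level (e.g. 0.1, 0.3, 0.5) controls severity.
        # A flat Box observation space is assumed here.
        return obs + np.random.normal(0.0, self.noise_std, size=obs.shape).astype(obs.dtype)


def evaluate(noise_std, episodes=50):
    """Runs the noisy environment for several episodes and reports mean/std of episode reward."""
    # Environment ID and version are assumptions; any PyFlyt quadrotor task could be substituted.
    env = SensorNoiseWrapper(gym.make("PyFlyt/QuadX-Hover-v2"), noise_std)
    returns = []
    for _ in range(episodes):
        obs, _ = env.reset()
        done, total = False, 0.0
        while not done:
            # Placeholder policy: a trained RL policy (e.g. PPO) would choose the action here.
            action = env.action_space.sample()
            obs, reward, terminated, truncated, _ = env.step(action)
            total += reward
            done = terminated or truncated
        returns.append(total)
    env.close()
    return float(np.mean(returns)), float(np.std(returns))


if __name__ == "__main__":
    for noise in (0.1, 0.3, 0.5):  # the three noise levels reported in the abstract
        mean_r, std_r = evaluate(noise)
        print(f"noise={noise}: mean reward={mean_r:.3f}, sigma={std_r:.3f}")

In this sketch the same wrapper is reused at each noise level so that only the sensor-noise severity changes between runs, mirroring how the abstract compares mean reward and standard deviation across the 0.1, 0.3, and 0.5 conditions.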

Published

2025-09-25

Section

Institute for Digital Innovation