Reinforcement Learning Based Channel Hopping: Implementation on Nano Drones


  • Ziqiang Zhou, Aspiring Scientists’ Summer Internship Program Intern
  • Dr. Kai Zeng, Aspiring Scientists’ Summer Internship Program Primary Mentor



Unmanned aerial vehicles (UAVs) are becoming increasingly prevalent and now play a prominent role in numerous fields, including forest fire monitoring, aerial photography, and product delivery. However, due to the nature of wireless communication, the controller-to-UAV link is vulnerable to jamming attacks and other interference. Multi-armed bandit (MAB) algorithms have been proven effective at coping with such attacks, and they are well suited to the problem of opportunistic channel access when there is no prior information or assumption about each channel's occupancy and quality. Nevertheless, there has been little research on implementing MAB-based channel hopping on nano drones. This study therefore explores practical implementations of three MAB algorithms (UCB1, EXP3, and epsilon-greedy) on Crazyflie 2.1 nano drones. In a game with K independent arms (i.e., K wireless channels), an MAB algorithm chooses one arm to play (i.e., selects one channel to access) in each round and receives a reward. The algorithm balances exploration and exploitation to maximize the sum of rewards (i.e., the number of successfully delivered packets). To test the effectiveness of each MAB algorithm under jamming attacks, we conducted extensive experiments using Crazyradio and Crazyflie 2.1. The packet delivery ratio (PDR) in the presence of a stationary jammer was measured for each algorithm to evaluate its effectiveness. In a lab environment, our experiments showed that UCB1 performed best, with a PDR of 95.3%, and EXP3 performed worst, with a PDR of 82.8%.
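The channel-selection game described above can be illustrated with a minimal sketch of UCB1 in Python. This is not the code used in the study: it runs against simulated channels rather than Crazyradio hardware, and the function names (ucb1_select, simulate), the per-channel success probabilities, and the 0/1 reward per packet are assumptions made for illustration. The index used is the standard UCB1 rule, the empirical mean reward plus sqrt(2 ln t / n).

```python
import math
import random


def ucb1_select(counts, rewards, t):
    """Return the index of the channel to use in round t (t >= 1).

    counts[i]  -- number of times channel i has been used so far
    rewards[i] -- sum of rewards (delivered packets) seen on channel i
    """
    # Try every channel once before applying the UCB1 index.
    for i, n in enumerate(counts):
        if n == 0:
            return i
    # UCB1 index: empirical mean plus an exploration bonus that shrinks
    # as a channel is sampled more often.
    scores = [
        rewards[i] / counts[i] + math.sqrt(2.0 * math.log(t) / counts[i])
        for i in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def simulate(success_prob, rounds, seed=0):
    """Simulate channel hopping over hypothetical channels.

    success_prob[i] is an assumed probability that a packet sent on
    channel i is delivered (0.0 models a fully jammed channel).
    Returns (packet delivery ratio, per-channel usage counts).
    """
    rng = random.Random(seed)
    k = len(success_prob)
    counts = [0] * k
    rewards = [0.0] * k
    delivered = 0
    for t in range(1, rounds + 1):
        ch = ucb1_select(counts, rewards, t)
        ok = rng.random() < success_prob[ch]  # reward 1 if delivered, else 0
        counts[ch] += 1
        rewards[ch] += 1.0 if ok else 0.0
        delivered += int(ok)
    return delivered / rounds, counts


if __name__ == "__main__":
    # Three hypothetical channels; a stationary jammer blocks the first and last.
    pdr, counts = simulate([0.0, 1.0, 0.0], rounds=1000)
    print(f"PDR = {pdr:.3f}, channel usage = {counts}")
```

In this sketch a stationary jammer is modeled simply as a channel whose success probability is zero, so the algorithm quickly concentrates on the clear channel while still probing the jammed ones occasionally. EXP3 and epsilon-greedy would replace only the selection function; the surrounding loop and the PDR computation stay the same.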





College of Engineering and Computing: Department of Electrical and Computer Engineering