Using Reinforcement Learning to Herd a Robotic Swarm to a Target Distribution

Zahi Kakish, Karthik Elamvazhuthi, Spring Berman

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper, we present a reinforcement learning approach to designing a control policy for a “leader” agent that herds a swarm of “follower” agents, via repulsive interactions, as quickly as possible to a target probability distribution over a strongly connected graph. The leader control policy is a function of the swarm distribution, which evolves over time according to a mean-field model in the form of an ordinary difference equation. The dependence of the policy on agent populations at each graph vertex, rather than on individual agent activity, simplifies the observations required by the leader and enables the control strategy to scale with the number of agents. Two Temporal-Difference learning algorithms, SARSA and Q-Learning, are used to generate the leader control policy based on the follower agent distribution and the leader’s location on the graph. A simulation environment corresponding to a grid graph with 4 vertices was used to train and validate the control policies for follower agent populations ranging from 10 to 1000. Finally, the control policies trained on 100 simulated agents were used to successfully redistribute a physical swarm of 10 small robots to a target distribution among 4 spatial regions.
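The two Temporal-Difference updates named in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the 2x2 grid adjacency, the state discretization, the learning parameters, and in particular the form of `mean_field_step` (a fixed fraction `k` of the followers at the leader's vertex moving in equal shares to neighboring vertices) are all assumptions made here for illustration.

```python
from collections import defaultdict

# Assumed 2x2 grid graph with vertices 0..3 (hypothetical labeling).
NEIGHBORS = {0: (1, 2), 1: (0, 3), 2: (0, 3), 3: (1, 2)}


def mean_field_step(x, leader, k=0.1):
    """One step of an assumed difference equation for the follower fractions.

    A fraction k of the followers at the leader's vertex is repelled and
    split equally among the neighboring vertices. Illustrative only.
    """
    x = list(x)
    outflow = k * x[leader]
    x[leader] -= outflow
    for v in NEIGHBORS[leader]:
        x[v] += outflow / len(NEIGHBORS[leader])
    return tuple(x)


def discretize(fractions, bins=5):
    """Map per-vertex follower fractions to a coarse, hashable state."""
    return tuple(min(int(f * bins), bins - 1) for f in fractions)


def q_learning_update(Q, s, a, r, s_next, actions_next, alpha=0.1, gamma=0.9):
    """Off-policy TD update: bootstrap from the greedy next action."""
    best_next = max(Q[(s_next, b)] for b in actions_next)
    Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy TD update: bootstrap from the action actually taken next."""
    Q[(s, a)] += alpha * (r + gamma * Q[(s_next, a_next)] - Q[(s, a)])


# Q-table keyed by (state, action); unseen entries default to 0.0.
Q = defaultdict(float)
```

In this sketch a state would combine `discretize(x)` with the leader's vertex, matching the abstract's statement that the policy depends on the follower distribution and the leader's location rather than on individual agents. The only difference between the two updates is the bootstrap term: Q-Learning uses the maximum over next actions, while SARSA uses the value of the next action the policy actually selects.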

Original language: English (US)
Title of host publication: Distributed Autonomous Robotic Systems - 15th International Symposium, 2022
Editors: Fumitoshi Matsuno, Shun-ichi Azuma, Masahito Yamamoto
Publisher: Springer Nature
Pages: 401-414
Number of pages: 14
ISBN (Print): 9783030927899
State: Published - 2022
Event: 15th International Symposium on Distributed Autonomous Robotic Systems, DARS 2021 and 4th International Symposium on Swarm Behavior and Bio-Inspired Robotics, SWARM 2021 - Virtual Online
Duration: Jun 1 2021 to Jun 4 2021

Publication series

Name: Springer Proceedings in Advanced Robotics
Volume: 22 SPAR
ISSN (Print): 2511-1256
ISSN (Electronic): 2511-1264

Conference

Conference: 15th International Symposium on Distributed Autonomous Robotic Systems, DARS 2021 and 4th International Symposium on Swarm Behavior and Bio-Inspired Robotics, SWARM 2021
City: Virtual Online
Period: 6/1/21 to 6/4/21

Keywords

  • Graph theory
  • Mean-field model
  • Reinforcement learning
  • Swarm robotics

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Electrical and Electronic Engineering
  • Mechanical Engineering
  • Engineering (miscellaneous)
  • Artificial Intelligence
  • Computer Science Applications
  • Applied Mathematics
