Reinforcement learning for guiding a swarm of robots using Python
10/15, 13:50–14:20 (Asia/Tokyo), pyconjp_3
Language: English

Swarm intelligence is applied in robotics to guide robots in unstructured environments. Applications in which swarm robots must maintain a particular formation range from firefighting to underwater exploration. We will see how swarm robots can learn to acquire a target formation using a reinforcement learning algorithm implemented in Python with SciPy, NumPy, and Matplotlib.


Description:

Swarm intelligence in robotics is inspired by the interactions of social insects and mammals that occur naturally in the environment. I have been actively involved in research on reinforcement learning for multi-agent systems, and I am thrilled to explain how learning can be used to guide a swarm of robots using Python, and how visualization with Matplotlib helps in understanding the robustness of the learning framework.

Program:

• Self introduction (1 minute)

• Motivation (1 minute): Reinforcement learning has been widely used for behavior generation in robotics and provides an appealing approach for robots to learn new tasks. Probabilistic, distributed control of large-scale swarms, in which the swarm adopts a static target shape, is sought in many applications, including extinguishing fires with drones.

• Introduction to reinforcement learning (4 minutes)

        o   Model-based reinforcement learning (2 minutes)

        o   Model-free reinforcement learning (2 minutes)

• Distributed control algorithm for the swarm (2 minutes)

        o   Lagrangian framework (1 minute)
        o   Eulerian framework (1 minute)

• Solution Approach (11 minutes)

        o   Computing the Hellinger distance using Python (show the code) (1 minute)
        o   Generating the transition matrix using the closed-form solution with NumPy (1 minute)
        o   Static framework without local movements, computing the cost function using the Hellinger distance (2 minutes)
        o   Options framework (2 minutes)
        o   Local restriction to agent movements (2 minutes)
        o   Readjustment of the swarm configuration when the number of agents decreases (3 minutes)

• Result Presentation (5 minutes)

        o   Visualization and processing of the target distribution from an image using PIL (2 minutes)
        o   Error plot between the target and the acquired distribution using Matplotlib (1 minute)
        o   Real-time visualization of automatic agent reconfiguration to take the target shape when the number of agents is halved (code) (2 minutes)

• Summary (2 minutes)
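
As a taste of the Hellinger distance step listed in the program, here is a minimal sketch using NumPy. It is not the code shown in the talk: the function name `hellinger_distance`, the four-bin example, and the choice to normalize raw agent counts into fractions are all assumptions made for illustration. It uses the standard definition for discrete distributions, H(p, q) = sqrt(0.5 * sum((sqrt(p_i) - sqrt(q_i))^2)), which is 0 for identical distributions and 1 for distributions with no overlap.

```python
import numpy as np


def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions on the same bins.

    Hypothetical helper for illustration; inputs may be probabilities or
    raw non-negative counts, since both are normalized to sum to 1.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if p.shape != q.shape:
        raise ValueError("p and q must have the same shape")
    if np.any(p < 0) or np.any(q < 0):
        raise ValueError("p and q must be non-negative")
    if p.sum() == 0 or q.sum() == 0:
        raise ValueError("p and q must each have a positive total")
    # Normalize so that each array sums to 1 (e.g. agent counts -> fractions).
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))


# Assumed example: 4 bins, target shape vs. current swarm occupancy.
target = [0.4, 0.4, 0.1, 0.1]
agent_counts = [10, 10, 40, 40]  # number of agents currently in each bin
error = hellinger_distance(target, agent_counts)
print(f"Hellinger distance to target: {error:.3f}")
```

Because the distance is bounded between 0 and 1, it can serve as a cost that is comparable across swarm sizes, which is convenient when the number of agents changes during a run.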

Raihan Seraj is a PhD student at McGill University. He is affiliated with Mila - Quebec Artificial Intelligence Institute. His research focuses on reinforcement learning for multi-agent systems and on designing learning algorithms for human-in-the-loop systems. He completed his Master's in Electrical and Computer Engineering at McGill University. Besides being an avid tech enthusiast, Raihan enjoys trying out different cuisines.