Multi-Agent Reinforcement Learning with Selective State-Space Models

Jemma Daniel | Ruan John de Kock | Louay Ben Nessir | Sasha Abramowitz | Omayma Mahjoub | Wiem Khlifi | Juan Claude Formanek | Arnu Pretorius


ABSTRACT

Transformer-based architectures have achieved strong performance in multi-agent reinforcement learning (MARL). A notable example is the Multi-Agent Transformer (MAT), which achieves state-of-the-art performance in many cooperative tasks. However, MAT's reliance on attention, whose cost grows quadratically with sequence length, limits its scalability to large agent populations. In contrast, State-Space Models (SSMs) such as Mamba offer improved computational efficiency, but their potential in MARL remains unexplored. We introduce Multi-Agent Mamba (MAM), which replaces the attention mechanisms in MAT with causal, bidirectional, and cross-attentional Mamba blocks. Experiments show that MAM matches MAT's performance while improving computational efficiency, suggesting that SSMs can replace attention-based architectures in MARL for better scalability.1

1  All experimental data and code are available at: https://sites.google.com/view/multi-agent-mamba.
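
To make the complexity claim in the abstract concrete, the sketch below shows a minimal, illustrative selective state-space scan in the style of Mamba: it processes a length-T token sequence with a recurrent state update in O(T) time, whereas self-attention over the same sequence costs O(T^2). All shapes, projections, and parameter choices here are simplified assumptions for illustration only, not the architecture used in MAM.

```python
import numpy as np

def selective_ssm_scan(x, A, B_proj, C_proj):
    """Illustrative selective (input-dependent) SSM scan over a sequence.

    x      : (T, D) token sequence
    A      : (N,)   diagonal state-decay parameters
    B_proj : (N, D) projection producing input-dependent B_t
    C_proj : (N, D) projection producing input-dependent C_t
    Returns a (T, D) output, computed in a single O(T) pass.
    """
    T, D = x.shape
    N = A.shape[0]
    h = np.zeros((N, D))      # hidden state, one per output channel
    y = np.zeros_like(x)
    for t in range(T):
        # "Selective": B_t and C_t depend on the current input token.
        B_t = B_proj @ x[t]                              # (N,)
        C_t = C_proj @ x[t]                              # (N,)
        # Diagonal recurrence: h <- A * h + outer(B_t, x_t)
        h = A[:, None] * h + B_t[:, None] * x[t][None, :]
        y[t] = C_t @ h                                   # read-out, (D,)
    return y

rng = np.random.default_rng(0)
T, D, N = 5, 4, 8
x = rng.normal(size=(T, D))
A = np.exp(-rng.uniform(0.1, 1.0, size=N))   # decays in (0, 1) for stability
B_proj = rng.normal(size=(N, D)) * 0.1
C_proj = rng.normal(size=(N, D)) * 0.1
y = selective_ssm_scan(x, A, B_proj, C_proj)
print(y.shape)  # (5, 4)
```

Because the state h is a fixed-size summary of the history, the per-step cost is independent of sequence length; this is the property that lets SSM blocks scale to longer agent sequences than quadratic attention.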