R2D-RL: A RoboCup 2D Soccer Environment for Multi-Agent Reinforcement Learning

arXiv cs.AI·Haobin Qin, Baofeng Zhang, Hidehisa Akiyama, Keisuke Fujii

6/18/2026

Quick Answer

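R2D-RL is a reinforcement learning environment that connects RoboCup 2D Soccer Simulation (RCSS2D) to Python-based multi-agent reinforcement learning (MARL) workflows, so that policies can be trained without working directly against RCSS2D's competition-oriented server-client architecture.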

Quick Take

It features configurable opponents, hybrid action spaces, and parallel execution, and it provides benchmarks for 11-vs-11 matches and front-goal scenarios.
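To make "hybrid action space" concrete: a hybrid (parameterized) action pairs a discrete action type with continuous parameters, and an action mask rules out types that are invalid in the current state. The sketch below is only an illustration under stated assumptions, not R2D-RL's actual interface. The action names follow the standard RCSS2D player commands (dash, turn, kick), while `ACTION_TYPES`, `masked_softmax`, `sample_hybrid_action`, and the parameter layout are hypothetical names chosen for this example.

```python
import math
import random

# Hypothetical action set for illustration; not taken from R2D-RL.
ACTION_TYPES = ["dash", "turn", "kick"]


def masked_softmax(logits, mask):
    """Softmax over logits where masked-out entries get probability 0."""
    valid = [l for l, m in zip(logits, mask) if m]
    if not valid:
        raise ValueError("at least one action must be allowed by the mask")
    peak = max(valid)  # subtract the max for numerical stability
    exps = [math.exp(l - peak) if m else 0.0 for l, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def sample_hybrid_action(logits, params, mask, rng):
    """Sample a discrete action type, then attach its continuous parameters."""
    probs = masked_softmax(logits, mask)
    idx = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return ACTION_TYPES[idx], params[idx]


# Example: the ball is out of reach, so "kick" is masked out.
logits = [0.2, -0.1, 3.0]
params = [
    {"power": 80.0, "direction": 0.0},   # dash
    {"moment": 30.0},                    # turn
    {"power": 100.0, "direction": 15.0}, # kick
]
mask = [True, True, False]
action = sample_hybrid_action(logits, params, mask, random.Random(0))
```

Masking before normalization means that a policy which strongly prefers an invalid action (here, the large logit for kick) can never select it.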

Key Points

R2D-RL connects RCSS2D and HELIOS clients via shared-memory communication.
Supports full-field and scenario-based training with configurable opponents.
Includes hybrid parameterized action spaces and action masks.
Offers expected possession value (EPV)-based reward shaping to address sparse rewards.
Provides benchmarks for 11-vs-11 matches and front-goal scenarios.
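The summary does not say how R2D-RL turns EPV into a reward. One common way to shape a sparse reward with a state-value estimate is potential-based shaping, where the shaped reward is r + γ·Φ(s') − Φ(s). The sketch below uses that technique as an assumption, with EPV as the potential Φ. `toy_epv` is a hypothetical stand-in for a real EPV model, and the 52.5 m half-length comes from the standard 105 m RCSS2D pitch.

```python
def toy_epv(ball_x: float, half_length: float = 52.5) -> float:
    """Hypothetical EPV stand-in: maps the ball's x-position to [-1, 1].

    A real EPV model would estimate the value of a possession from the
    full game state; this placeholder only exists to make the sketch run.
    """
    return max(-1.0, min(1.0, ball_x / half_length))


def shaped_reward(env_reward: float, epv_prev: float, epv_next: float,
                  gamma: float = 0.99, done: bool = False) -> float:
    """Potential-based shaping (an assumption, not necessarily the paper's method).

    The terminal state's potential is treated as zero so the shaping terms
    telescope over an episode.
    """
    next_potential = 0.0 if done else epv_next
    return env_reward + gamma * next_potential - epv_prev


# Example: no goal scored, but the ball moved from midfield toward the
# opponent's goal, so the shaped reward is positive.
r = shaped_reward(0.0, toy_epv(0.0), toy_epv(26.25))
```

With this form, moving the ball into a higher-value position produces a dense positive signal even when the environment reward (for example, a goal) is zero on that step.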

Source Excerpt

Robot soccer is a challenging testbed for reinforcement learning because it combines partial observability, cooperative and adversarial interaction, sparse rewards, and long-horizon tactical behavior. RoboCup 2D Soccer Simulation (RCSS2D) provides a mature robot-soccer platform, but its competition-oriented server-client architecture is difficult to use directly with modern Python-based MARL workflows. We introduce R2D-RL, a reinforcement learning environment that connects RCSS2D and

Read the full article on arxiv.org
