Auto-FL-Research: Agentic Search for Federated Learning Algorithms
Quick Answer
Auto-FL-Research (AFR) is a constrained coding-agent workflow for searching over federated learning algorithm recipes. Five-seed repeat evaluations support gains on four of five FLamby tasks and five of six LEAF profiles.
Quick Take
AFR lets coding agents propose and implement federated learning algorithm variants while task profiles fix the mutation surface, compute budget, communication contract, and final evaluation. Repeat evaluations support gains on four of five FLamby tasks and five of six LEAF profiles, and same-budget controls separate FL-recipe changes from improvements that fixed-surface scalar tuning alone recovers.
Key Points
- Agents propose and implement candidate training algorithms, including server aggregation rules, client update schedules, local objectives, and registered model variants.
- Five-seed repeat evaluations support gains on four of the five FLamby tasks and five of the six LEAF profiles.
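- The repeat evaluations also expose seed-sensitive and search-selected failure cases.
- Same-budget controls show that some gains come from FL-recipe changes, while others are recovered by fixed-surface scalar controls or fail under repeat or held-out evaluation.
- Each campaign records candidate scores, runtime, edited files, artifacts, and failure status.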
Article Content
From source RSS / original summary
arXiv:2607.01366v1 Announce Type: new
Abstract: Federated learning (FL) research often depends on many small but consequential algorithmic choices: optimizer variants, server aggregation rules, local training schedules, normalization, regularization, and model architecture. These choices are expensive to explore manually and difficult to compare fairly when candidate changes can also alter the FL training or evaluation path.
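To make "server aggregation rule" concrete, here is a minimal sketch of sample-weighted averaging in the style of FedAvg, a standard baseline. It is an illustration only: the abstract does not say which aggregation rules AFR starts from, and the function name `fedavg` and the flat-list model representation are assumptions made for brevity.

```python
def fedavg(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local sample counts.

    client_weights: list of equal-length lists of floats (one per client).
    client_sizes:   list of local sample counts (one per client).
    Both the name and the flat-list representation are illustrative assumptions.
    """
    if len(client_weights) != len(client_sizes) or not client_weights:
        raise ValueError("need one sample count per client and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_weights[0])
    aggregated = [0.0] * dim
    for weights, n_samples in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same parameter shape")
        share = n_samples / total  # this client's fraction of all samples
        for i, value in enumerate(weights):
            aggregated[i] += share * value
    return aggregated
```

Swapping the `share` term for a different weighting (uniform, clipped, or loss-based, for example) is the kind of small change to an aggregation rule that is tedious to explore by hand.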
In this work, we present Auto-FL-Research (AFR), a constrained coding-agent workflow for FL algorithmic recipe search. Agents may propose and implement candidate training algorithms, including server aggregation rules, client update schedules, local objectives, and registered model variants, while task profiles fix the mutation surface, compute budget, communication contract, and final model evaluation. Each campaign records candidate scores, runtime, edited files, artifacts, and failure status.
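The bookkeeping described above can be pictured as two small record types. This is a minimal sketch, not AFR's actual schema: the class names `TaskProfile` and `CandidateRecord`, every field name, the units, and the `best_candidate` helper are hypothetical, chosen only to mirror the items the abstract lists. It also assumes that a higher score is better.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class TaskProfile:
    """Hypothetical container for what a task profile holds fixed."""
    name: str
    mutable_files: tuple[str, ...]  # mutation surface: files an agent may edit
    max_rounds: int                 # compute budget (assumed unit: FL rounds)
    max_bytes_per_round: int        # communication contract (assumed unit: bytes)


@dataclass
class CandidateRecord:
    """Hypothetical log entry for one candidate in a campaign."""
    candidate_id: str
    score: Optional[float]          # None when the run failed before scoring
    runtime_s: float
    edited_files: list[str] = field(default_factory=list)
    artifacts: list[str] = field(default_factory=list)
    failed: bool = False

    def within_surface(self, profile: TaskProfile) -> bool:
        """True if every edited file lies inside the allowed mutation surface."""
        return all(path in profile.mutable_files for path in self.edited_files)


def best_candidate(records, profile):
    """Return the highest-scoring successful, in-surface candidate, or None."""
    valid = [
        r for r in records
        if not r.failed and r.score is not None and r.within_surface(profile)
    ]
    return max(valid, key=lambda r: r.score, default=None)
```

Keeping failed runs and out-of-surface edits in the log, rather than discarding them, is what makes it possible to audit afterwards why a candidate was or was not selected.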
We evaluate AFR on five healthcare cross-silo FLamby tasks and on grouped-client profiles for the five fixed LEAF datasets plus the LEAF synthetic task. Five-seed repeat evaluations support gains on four FLamby tasks and five of six LEAF profiles, while also exposing seed-sensitive and search-selected failure cases. Same-budget controls show that several gains correspond to FL-recipe changes, whereas other improvements are recovered by fixed-surface scalar controls or fail under repeat or held-out evaluation.
These mixed outcomes are part of the contribution: they show how agent-generated candidates can be separated into repeated FL mechanisms, fixed-surface tuning effects, and selected single-run artifacts.
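The three-way separation can be sketched as a simple decision rule over per-seed scores. The rule below is an assumption made for illustration, not the paper's criterion: the function name `classify_candidate`, the `min_wins` threshold, the `tol` tolerance, and the use of mean gains are all hypothetical, and scores are assumed to be higher-is-better.

```python
from statistics import mean


def classify_candidate(baseline, candidate, scalar_control, min_wins=4, tol=0.0):
    """Label a candidate from per-seed scores (same seeds, same order).

    baseline:       scores of the unmodified recipe.
    candidate:      scores of the agent-generated recipe.
    scalar_control: scores of a same-budget run that only tunes scalars on
                    the fixed surface (for example, a learning rate).
    The thresholds are illustrative assumptions, not values from the paper.
    """
    if not (len(baseline) == len(candidate) == len(scalar_control)):
        raise ValueError("all score lists must cover the same seeds")

    cand_gain = [c - b for c, b in zip(candidate, baseline)]
    ctrl_gain = [s - b for s, b in zip(scalar_control, baseline)]
    wins = sum(g > 0 for g in cand_gain)

    # The gain does not survive repetition across seeds.
    if mean(cand_gain) <= 0 or wins < min_wins:
        return "selected single-run artifact"
    # Scalar tuning alone recovers the gain, so no new mechanism is needed.
    if mean(ctrl_gain) >= mean(cand_gain) - tol:
        return "fixed-surface tuning effect"
    return "repeated FL mechanism"
```

For example, a candidate that beats the baseline on all five seeds while the scalar control barely moves would be labeled a repeated FL mechanism under this rule, whereas a candidate that wins on only one seed would be labeled a selected single-run artifact.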