ProfiLLM: Utility-Aligned Agentic User Profiling for Industrial Ride-Hailing Dispatch
Quick Answer
ProfiLLM uses LLM-generated user profiles to improve industrial ride-hailing dispatch, achieving up to 6.14% relative AUC improvement in outcome prediction and up to 4.35% GMV gain in dispatching simulation.
Quick Take
Deployed on DiDi's production dispatcher, ProfiLLM tackles two obstacles to LLM-based profiling: platform logs that far exceed any LLM's context window, and long-tail users with too few interactions for per-user profiles. It does so by mining reusable global knowledge with a tool-equipped LLM agent and by selecting profiles according to their measured downstream utility.
Key Points
- ProfiLLM uses LLMs to create adaptive user profiles for ride-hailing dispatch.
- Achieved up to a 6.14% relative AUC improvement in outcome prediction.
- Generated up to a 4.35% GMV gain in dispatching simulation.
- Deployed on DiDi's production dispatcher; a 14-day online A/B test showed +0.47% GMV, +0.33% Completion Rate, and -0.82% Cancel-Before-Accept rate.
- Addresses sparse data for long-tail users and logs that far exceed LLM context windows.
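The AUC figure is a relative improvement, which is easy to misread as an absolute one. The sketch below assumes the usual definition, (new - baseline) / baseline. The function name and both AUC values are hypothetical and chosen only for illustration; the source does not report baseline AUC values.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Return the change of `new` over `baseline` as a fraction of the baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline


# Hypothetical AUC values for illustration only (not taken from the paper).
BASELINE_AUC = 0.80
NEW_AUC = 0.84

# Absolute gain is measured in AUC points; relative gain is a share of the baseline.
absolute_gain = NEW_AUC - BASELINE_AUC
relative_gain = relative_improvement(BASELINE_AUC, NEW_AUC)
```

With these illustrative numbers, a gain of about 0.04 AUC points corresponds to a relative improvement of about 5%, so a relative figure always looks larger than the absolute change it describes.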
Article Content
From source RSS / original summary
arXiv:2606.18803v1 Announce Type: new
Abstract: Bringing Large Language Models (LLMs) into industrial ride-hailing dispatch as semantic feature extractors over platform-scale behavioral logs is a compelling but under-explored data systems problem. Production matching pipelines remain dominated by structured numerical features, yet decisive behavioral signals (e.g., a driver's habitual aversion to certain regions) are inherently contextual and naturally expressible as LLM-generated user profiles.
However, scaling such profiling to a live, millisecond-latency dispatcher faces three intertwined constraints rarely addressed together: on a platform with millions of daily orders, logs exceed any LLM's context window by orders of magnitude; most users are long-tail, with too few interactions for per-user profiling; and surface-fluent profiles do not necessarily improve downstream prediction utility.
We present ProfiLLM, an agentic LLM data pipeline that operationalizes utility-aligned user profiling for production matching systems through two modules. (1) Tool-Augmented Global Knowledge Mining equips an LLM agent with 27 analytical tools to mine platform-scale data, producing reusable global knowledge, adaptive user clustering rules, and region-level supply-demand priors.
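One way to picture the tool-augmented setup in module (1) is a registry of analytical functions that an agent calls by name, so that only compact aggregates, never raw logs, enter the LLM's context. The following is a minimal sketch under stated assumptions, not the paper's implementation: the tool names (`cancel_rate_by_region`, `orders_per_user`), the log schema, and the `run_tool` dispatcher are all hypothetical, and the LLM's tool-selection step is left out entirely.

```python
from collections import Counter
from typing import Any, Callable, Dict, List

# Hypothetical log schema: one dict per order (not taken from the paper).
Log = Dict[str, Any]

# Registry mapping tool names to functions an agent could call.
TOOLS: Dict[str, Callable[..., Any]] = {}


def tool(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Register a function under its own name."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def cancel_rate_by_region(logs: List[Log]) -> Dict[str, float]:
    """Aggregate raw orders into one cancel rate per region."""
    totals: Counter = Counter()
    cancels: Counter = Counter()
    for row in logs:
        totals[row["region"]] += 1
        cancels[row["region"]] += int(row["cancelled"])
    return {region: cancels[region] / totals[region] for region in totals}


@tool
def orders_per_user(logs: List[Log]) -> Dict[str, int]:
    """Count orders per user, which exposes the long tail of sparse users."""
    return dict(Counter(row["user_id"] for row in logs))


def run_tool(name: str, logs: List[Log], **kwargs: Any) -> Any:
    """Dispatch a tool call by name, as an agent's tool-use step might."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](logs, **kwargs)


# Tiny illustrative dataset (fabricated for the example only).
LOGS: List[Log] = [
    {"user_id": "u1", "region": "north", "cancelled": False},
    {"user_id": "u1", "region": "north", "cancelled": True},
    {"user_id": "u2", "region": "south", "cancelled": False},
    {"user_id": "u3", "region": "south", "cancelled": False},
]

region_summary = run_tool("cancel_rate_by_region", LOGS)
user_summary = run_tool("orders_per_user", LOGS)
```

In this sketch the summaries stay small however many orders are scanned, which is the property that would let an agent reason over platform-scale data despite a fixed context window.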
(2) Utility-Aligned Profile Exploration generates multiple candidate profiles per cluster, evaluates them via a lightweight downstream utility proxy, iteratively refines the best candidates, and constructs preference pairs for DPO fine-tuning. Deployed on DiDi's production dispatcher, ProfiLLM achieves up to +6.14% relative AUC improvement in outcome prediction, up to +4.35% GMV gain in dispatching simulation, and consistent improvements in a 14-day online A/B test including +0.47% GMV, +0.33% Completion Rate, and -0.82% Cancel-Before-Accept rate.
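The candidate-scoring and pair-construction steps of module (2) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' method: `build_preference_pairs`, the `min_gap` threshold, the cluster names, the profile texts, and the scores are all hypothetical. The utility proxy is a lookup table standing in for a real downstream predictor, and the iterative refinement step is omitted. The prompt/chosen/rejected layout is one commonly used for preference data.

```python
from typing import Callable, Dict, List, Tuple


def build_preference_pairs(
    candidates: Dict[str, List[str]],
    utility: Callable[[str, str], float],
    min_gap: float = 0.0,
) -> List[Dict[str, str]]:
    """Pair the highest- and lowest-utility profile of each cluster.

    A pair is emitted only when the utility gap exceeds `min_gap`, so ties
    and near-ties do not become training signal.
    """
    pairs: List[Dict[str, str]] = []
    for cluster, profiles in candidates.items():
        if len(profiles) < 2:
            continue  # a single candidate cannot form a preference pair
        # Score each candidate once, then rank from best to worst.
        scored: List[Tuple[float, str]] = [(utility(cluster, p), p) for p in profiles]
        scored.sort(key=lambda item: item[0], reverse=True)
        (best_score, best), (worst_score, worst) = scored[0], scored[-1]
        if best_score - worst_score > min_gap:
            pairs.append({"prompt": cluster, "chosen": best, "rejected": worst})
    return pairs


# Hypothetical proxy scores (for example, a validation AUC per profile).
TOY_SCORES: Dict[Tuple[str, str], float] = {
    ("long_tail_drivers", "avoids region A at night"): 0.71,
    ("long_tail_drivers", "prefers short trips"): 0.64,
    ("long_tail_drivers", "active driver"): 0.52,
    ("frequent_riders", "commutes on weekdays"): 0.60,
    ("frequent_riders", "uses the app"): 0.60,
}


def toy_utility(cluster: str, profile: str) -> float:
    """Stand-in for the lightweight downstream utility proxy."""
    return TOY_SCORES[(cluster, profile)]


CANDIDATES: Dict[str, List[str]] = {
    "long_tail_drivers": [
        "prefers short trips",
        "avoids region A at night",
        "active driver",
    ],
    "frequent_riders": ["commutes on weekdays", "uses the app"],
}

pairs = build_preference_pairs(CANDIDATES, toy_utility)
```

Here the `frequent_riders` cluster yields no pair because its two candidates score the same, which illustrates the point that a fluent profile is only preferred when it measurably helps the downstream task.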