Dual Hierarchical Dialogue Policy Learning for Legal Inquisitive Conversational Agents

arXiv cs.CL·Xubo Lin, Zezhii Deng, Shihao Wang, Grace Hui Yang, Yang Deng

5/15/2026

Quick Answer

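The paper introduces Inquisitive Conversational Agents (ICAs), dialogue agents that proactively extract information, and develops one tailored to U.S. Supreme Court oral arguments.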

Quick Take

The paper introduces Inquisitive Conversational Agents (ICAs) tailored to U.S. Supreme Court oral arguments. It uses a Dual Hierarchical Reinforcement Learning framework in which two cooperating agents, each with its own policy, coordinate strategic dialogue management and fine-grained utterance generation. On a U.S. Supreme Court dataset, the authors report that the method outperforms various baselines across multiple metrics, and they present it as a first step toward proactive dialogue systems for high-stakes legal applications.

Key Points

Introduces Inquisitive Conversational Agents (ICAs) for proactive information extraction.
Utilizes a Dual Hierarchical Reinforcement Learning framework with two cooperating agents.
Emulates judicial questioning patterns to uncover crucial legal information.
Outperforms various baselines on a U.S. Supreme Court dataset across multiple metrics.
Represents an important first step toward broader high-stakes, domain-specific conversational applications.

Article Excerpt

From source RSS / original summary

arXiv:2605.14057v1 (Announce Type: new). Abstract: Most existing dialogue systems are user-driven, primarily designed to fulfill user requests. However, in many critical real-world scenarios, a conversational agent must proactively extract information to achieve its own objectives rather than merely respond. To address this gap, we introduce Inquisitive Conversational Agents (ICAs) and develop an ICA specifically tailored to U.S. Supreme Court oral arguments.

We propose a Dual Hierarchical Reinforcement Learning framework featuring two cooperating RL agents, each with its own policy, to coordinate strategic dialogue management and fine-grained utterance generation. By learning when and how to ask probing questions, the agent emulates judicial questioning patterns and systematically uncovers crucial information to fulfill its legal objectives. Evaluations on a U.S. Supreme Court dataset show that our method outperforms various baselines across multiple metrics.

It represents an important first step toward broader high-stakes, domain-specific applications.
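
The abstract describes two cooperating RL agents, one for strategic dialogue management and one for utterance generation. The toy sketch below shows what such a two-level split could look like. It is not the authors' implementation. The strategy names, question templates, tabular epsilon-greedy policies, and the reward function are all hypothetical stand-ins chosen for illustration, since this summary does not specify the paper's action spaces, models, or rewards.

```python
import random

# Hypothetical high-level dialogue strategies and low-level question templates.
# None of these names come from the paper.
STRATEGIES = ["probe_facts", "test_precedent", "pose_hypothetical"]
TEMPLATES = {
    "probe_facts": ["What in the record supports that?",
                    "Where was that argued below?"],
    "test_precedent": ["How do you distinguish our prior cases?",
                       "Which precedent controls here?"],
    "pose_hypothetical": ["Suppose the facts were reversed. Same result?",
                          "What is your limiting principle?"],
}


class TabularPolicy:
    """Epsilon-greedy value table, a simple stand-in for a learned RL policy."""

    def __init__(self, epsilon=0.2, lr=0.5):
        self.epsilon, self.lr, self.q = epsilon, lr, {}

    def act(self, state, actions, rng, greedy=False):
        # Explore with probability epsilon unless greedy mode is requested.
        if not greedy and rng.random() < self.epsilon:
            return rng.choice(actions)
        return max(actions, key=lambda a: self.q.get((state, a), 0.0))

    def update(self, state, action, reward):
        # Move the stored value toward the observed reward.
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.lr * (reward - old)


def toy_reward(strategy, template_idx):
    """Made-up reward: precedent questions worded with template 1 score best."""
    strategic = 1.0 if strategy == "test_precedent" else 0.0
    wording = 1.0 if template_idx == 1 else 0.0
    return strategic, wording


def train(episodes=2000, seed=0):
    rng = random.Random(seed)
    manager, generator = TabularPolicy(), TabularPolicy()
    for _ in range(episodes):
        # High-level agent decides which kind of question to ask.
        strategy = manager.act("turn", STRATEGIES, rng)
        # Low-level agent, conditioned on that strategy, decides the wording.
        idx = generator.act(strategy, [0, 1], rng)
        r_hi, r_lo = toy_reward(strategy, idx)
        manager.update("turn", strategy, r_hi)
        generator.update(strategy, idx, r_lo)
    return manager, generator


def ask(manager, generator):
    """Greedy rollout of both policies for a single turn."""
    strategy = manager.act("turn", STRATEGIES, None, greedy=True)
    idx = generator.act(strategy, [0, 1], None, greedy=True)
    return strategy, TEMPLATES[strategy][idx]


if __name__ == "__main__":
    print(ask(*train()))
```

Each agent keeps its own value table and receives its own reward signal here, which is one simple way to give the two levels separate policies. A real system would presumably replace the templates with a language model and the toy reward with a measure of how much legally relevant information a question elicits.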
