PEC-Home: Interpretation of Progressively Elliptical Commands in Smart Homes
Quick Answer
PEC-Home introduces a dataset for interpreting progressively elliptical commands in smart homes, addressing referential and intention ambiguities.
Quick Take
Experiments with LLMs such as GPT-4o show that current home assistants struggle with elliptical commands: execution accuracy stays below the level achieved with complete commands.
Key Points
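- PEC-Home is the first simulated home dataset designed specifically for interpreting progressively elliptical commands in smart homes.
- Elliptical commands create referential ambiguity because multiple users hold different environmental expectations.
- User preferences that evolve over time or change with the environment create intention ambiguity, complicating command interpretation.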
- Experiments show LLMs like GPT-4o struggle with elliptical command execution.
- Execution accuracy remains below that of complete commands, even with dialogue history.
Article Content
From source RSS / original summary
arXiv:2606.18636v1 Announce Type: new
Abstract: Recent advancements in Large Language Models (LLMs) have empowered home assistants with natural language interaction capabilities. However, current assistants overlook the progressive omission that occurs in human dialogue as shared context accumulates, leading to more elliptical expressions for efficient communication.
Thus, current assistants still struggle to interpret such elliptical expressions accurately, which limits their effectiveness in real-world applications. In practical smart home scenarios, assistants face two major challenges caused by elliptical commands: (1) referential ambiguity caused by different environmental expectations among multiple users; and (2) intention ambiguity resulting from user preferences that evolve over time or change with the environment.
To address these challenges, we introduce PEC-Home, the first simulated home dataset specifically designed for interpreting progressively elliptical commands in smart homes. Extensive experiments on various LLMs, including GPT-4o, show that existing home assistants struggle to execute user-intended operations based solely on elliptical commands. Even when equipped with tools for storing and retrieving user dialogue history, execution accuracy remains below that achieved with complete commands.
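To make the idea of storing and retrieving dialogue history concrete, here is a minimal Python sketch of how an assistant might fill in the omitted parts of an elliptical command from a user's earlier complete commands. It is an illustration only, not the method or data format used in PEC-Home: the `Command`, `DialogueHistory`, and `resolve` names, the slot layout, and the "most recent matching turn" rule are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Command:
    """Hypothetical slot-based command; None marks a slot the user omitted."""
    user: str
    device: Optional[str] = None
    action: Optional[str] = None
    value: Optional[str] = None


@dataclass
class DialogueHistory:
    """Hypothetical store of past commands, shared by all household users."""
    turns: List[Command] = field(default_factory=list)

    def store(self, cmd: Command) -> None:
        self.turns.append(cmd)

    def retrieve(self, user: str, device: Optional[str]) -> Optional[Command]:
        # Walk backwards to find the most recent turn by the same user.
        # If the new command names a device, the past turn must match it.
        for past in reversed(self.turns):
            if past.user == user and device in (None, past.device):
                return past
        return None


def resolve(cmd: Command, history: DialogueHistory) -> Command:
    """Fill omitted slots from history; leave them as None if nothing matches."""
    past = history.retrieve(cmd.user, cmd.device)
    if past is None:
        return cmd
    return Command(
        user=cmd.user,
        device=cmd.device or past.device,
        action=cmd.action or past.action,
        value=cmd.value or past.value,
    )


history = DialogueHistory()
history.store(Command("alice", "bedroom lamp", "set_brightness", "30%"))
history.store(Command("bob", "living room lamp", "set_brightness", "80%"))

# Alice later says only "brightness", omitting the device and the value.
resolved = resolve(Command("alice", action="set_brightness"), history)
print(resolved)
```

Keying retrieval on the speaker is what lets the same short utterance resolve to different devices for different users, which is the referential ambiguity the abstract describes. A fixed "most recent turn" rule cannot capture preferences that drift over time or depend on the environment, so it does not address intention ambiguity.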