LatticeBridge: Rare-Event Sequential Inference for Faithful Structured Sequence Synthesis
Quick Answer
LatticeBridge introduces a novel approach to structured sequence generation, leveraging a compact prefix language model and a twisted sequential Monte Carlo decoder.
Quick Take
LatticeBridge is an approach to structured sequence generation that combines a compact prefix language model, instance-compiled surface automata, and a twisted sequential Monte Carlo (SMC) decoder. On 2,610 attainable validation tasks from CommonGen, E2E NLG, and WikiBio, it improves exact anchor satisfaction and mean anchor coverage over greedy, beam-filtered, and best-of-k ancestral baselines that share the same proposal model.
Key Points
- LatticeBridge combines a compact prefix language model, instance-compiled surface automata, and a twisted sequential Monte Carlo decoder.
- Improves exact anchor satisfaction and mean anchor coverage over greedy, beam-filtered, and best-of-k ancestral baselines under a shared proposal model.
- Evaluated on 2,610 attainable validation tasks across the CommonGen, E2E NLG, and WikiBio benchmarks.
- Reports metrics including required-anchor coverage and source-intrusion diagnostics.
- Treats the need to satisfy several input-derived constraints in a single output as a rare-event sequential inference problem.
Article Content
From source RSS / original summary (arXiv:2606.11203v1).
Abstract: Structured sequence generation often requires a model to satisfy several input-derived constraints in a single output. Standard decoding methods may assign high probability to fluent continuations while placing low mass on continuations that realize all required anchors jointly. We study this regime as a rare-event sequential inference problem.
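To see why joint satisfaction can behave like a rare event, consider a back-of-the-envelope calculation. It is only an illustration: the independence assumption and the numbers p = 0.5, k = 5, and n = 16 are hypothetical and do not come from the paper.

```latex
% Assumption: each of k anchors is realized independently with probability p.
P(\text{all } k \text{ anchors}) = p^{k}, \qquad p = 0.5,\ k = 5 \;\Rightarrow\; 0.5^{5} \approx 0.031

% Chance that at least one of n independent samples realizes all anchors:
P(\text{at least one hit in } n) = 1 - \left(1 - p^{k}\right)^{n}, \qquad n = 16 \;\Rightarrow\; 1 - 0.96875^{16} \approx 0.40
```

Under these assumed numbers, even 16 independent samples miss the joint constraint more often than not, which is the kind of regime where guided sequential sampling becomes attractive.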
LatticeBridge combines a compact prefix language model, instance-compiled surface automata, and a twisted sequential Monte Carlo (SMC) decoder with resampling, multilevel splitting, and a source-support proposal term derived from instance-provided phrases. The constraint representation is compiled from each input instance and does not rely on manually curated lexical classes.
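A toy sketch of a twisted SMC decoder with resampling may help make the idea concrete. It is not the paper's implementation: the uniform proposal stands in for the prefix language model, the twist `exp(lam * anchors_covered)` and the names `covered`, `smc_decode`, and `lam` are hypothetical choices, and the surface automata, multilevel splitting, and source-support proposal term are all omitted.

```python
import math
import random


def covered(prefix, anchors):
    """Count how many required anchors already appear in the prefix."""
    return sum(1 for a in anchors if a in prefix)


def smc_decode(vocab, anchors, length=6, n_particles=64, lam=2.0, seed=0):
    """Toy twisted SMC: extend particles token by token, reweight by a
    twist that rewards anchor coverage, and resample when weights degenerate."""
    rng = random.Random(seed)
    particles = [[] for _ in range(n_particles)]
    log_w = [0.0] * n_particles

    for _ in range(length):
        for i, p in enumerate(particles):
            before = covered(p, anchors)
            # Assumption: uniform proposal as a stand-in for the language model.
            p.append(rng.choice(vocab))
            after = covered(p, anchors)
            # Incremental weight = twist ratio psi(new) / psi(old), in log space.
            log_w[i] += lam * (after - before)

        # Normalize weights stably and compute the effective sample size.
        m = max(log_w)
        w = [math.exp(x - m) for x in log_w]
        total = sum(w)
        probs = [x / total for x in w]
        ess = 1.0 / sum(q * q for q in probs)

        # Multinomial resampling when fewer than half the particles are effective.
        if ess < n_particles / 2:
            idx = rng.choices(range(n_particles), weights=probs, k=n_particles)
            particles = [list(particles[j]) for j in idx]
            log_w = [0.0] * n_particles

    return particles, log_w


if __name__ == "__main__":
    vocab = ["a", "b", "c", "d", "e", "f", "g", "h"]
    anchors = {"a", "b"}
    particles, _ = smc_decode(vocab, anchors)
    full = sum(1 for p in particles if covered(p, anchors) == len(anchors))
    print(f"{full} of {len(particles)} particles cover every anchor")
```

The twist concentrates particles on prefixes that have already realized anchors, so fewer samples are spent on continuations that cannot satisfy the constraints.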
On 2,610 attainable validation tasks spanning CommonGen, E2E NLG, and WikiBio, the particle decoder improves exact anchor satisfaction and mean anchor coverage over greedy, beam-filtered, and best-of-k ancestral baselines under a shared proposal model. Since exact anchor satisfaction alone does not rule out unsupported attribute substitutions, the evaluation reports required-anchor coverage, source coverage, source-intrusion diagnostics, overlap, runtime, and particle statistics jointly.
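The two headline metrics can be sketched as follows. The definitions here are an assumption for illustration (case-insensitive substring matching, coverage as the fraction of anchors found, exact satisfaction as the share of outputs with full coverage); the summary above does not specify the paper's exact matching rules, and `anchor_coverage` and `summarize` are hypothetical names.

```python
def anchor_coverage(output, anchors):
    """Fraction of required anchors found in the output.
    Assumption: case-insensitive substring matching."""
    text = output.lower()
    hits = sum(1 for a in anchors if a.lower() in text)
    return hits / len(anchors)


def summarize(outputs, anchor_sets):
    """Return (exact anchor satisfaction, mean anchor coverage) over a dataset."""
    covs = [anchor_coverage(o, a) for o, a in zip(outputs, anchor_sets)]
    exact = sum(1 for c in covs if c == 1.0) / len(covs)
    mean_cov = sum(covs) / len(covs)
    return exact, mean_cov


if __name__ == "__main__":
    outputs = ["The dog catches a frisbee in the park", "A dog runs"]
    anchor_sets = [["dog", "frisbee", "park"], ["dog", "frisbee"]]
    print(summarize(outputs, anchor_sets))  # (0.5, 0.75)
```

As the abstract notes, exact satisfaction alone does not rule out unsupported attribute substitutions, so a score like this would be read alongside source-coverage and source-intrusion diagnostics rather than on its own.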
The benchmark characterizes the faithfulness-overlap-latency frontier under a fixed proposal model.