Synthetic Contrastive Reasoning for Multi-Table Q&A | AI Deep Signal

Synthetic Contrastive Reasoning for Multi-Table Q&A

arXiv cs.AI·Ankit Pratap Singh, Xin Su, Phillip Howard

6/6/2026

Quick Answer

This paper introduces a synthetic contrastive reasoning-trace dataset for multi-table Q&A and uses it to fine-tune models such as Qwen3-14B and Mistral-8B with Contrastive Preference Optimization (CPO).

Quick Take

CPO outperformed traditional supervised fine-tuning by 9.7%-16.3% on average, with improvements of up to 21 percentage points on MMQA, which demonstrates the effectiveness of heterogeneous trace generation.

Key Points

Synthetic dataset enhances multi-table Q&A with reasoning supervision.
CPO fine-tuning improved model performance by 9.7%-16.3% on average over supervised fine-tuning.
Improvements of up to 21 percentage points were observed on MMQA.
Heterogeneous trace generators strengthen contrastive signals effectively.
Evaluations confirm generated pairs are coherent and meaningful.
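
The pairing idea behind the key points above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Trace` class, the generator names, the example question, and the rule "a trace is positive if its final answer matches the gold answer" are all hypothetical, and the paper's actual validation procedure is not described in this summary.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Trace:
    generator: str  # hypothetical name of the model that produced the trace
    reasoning: str  # step-by-step reasoning text
    answer: str     # final answer the trace arrives at


def build_preference_pairs(question, gold_answer, traces):
    """Pair every trace that reaches the gold answer with every trace that does not.

    Assumption: answer matching stands in for whatever validation the paper uses.
    """
    def normalize(text):
        return text.strip().lower()

    gold = normalize(gold_answer)
    positives = [t for t in traces if normalize(t.answer) == gold]
    negatives = [t for t in traces if normalize(t.answer) != gold]
    return [
        {"prompt": question, "chosen": p.reasoning, "rejected": n.reasoning}
        for p, n in product(positives, negatives)
    ]


# Hypothetical traces from three different generators for one question.
traces = [
    Trace("generator_a", "Join orders to customers on customer_id, then count rows.", "42"),
    Trace("generator_b", "Count rows in orders without joining.", "57"),
    Trace("generator_c", "Join on customer_id and count matching rows.", " 42 "),
]
pairs = build_preference_pairs("How many orders have a known customer?", "42", traces)
print(len(pairs))
```

Drawing the rejected traces from several different generators, as in this toy setup, is one way to get negatives that fail in different ways, which is the intuition behind the "heterogeneous" point above.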

Source Excerpt

arXiv:2606.05382v1 Announce Type: new Abstract: Multi-table question answering requires models to retrieve relevant evidence, link schemas, and perform compositional reasoning across relational tables. Existing multi-table Q&A resources typically provide questions and final answers but lack reasoning supervision that explains how answers are derived.

To address this gap, we construct a synthetic contrastive reasoning-trace dataset for MMQA by generating validated positive traces and plausible negative traces with heterogeneous LLMs. We then use the resulting preference pairs to fine-tune open-weight LLMs with Contrastive Preference Optimization (CPO). …
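
For readers unfamiliar with CPO, the per-pair objective can be sketched as follows. This is a minimal illustration of the commonly cited CPO formulation (a reference-free preference term plus a negative log-likelihood term on the preferred response), not the authors' training code. The function names, the `beta` default, and the example log-probabilities are assumptions, and the excerpt does not state which exact variant or hyperparameters the paper uses.

```python
import math


def log_sigmoid(z):
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def cpo_loss(logp_chosen, logp_rejected, beta=0.1):
    """CPO loss for one preference pair.

    logp_chosen / logp_rejected are the policy's summed log-probabilities of
    the preferred and dispreferred traces given the same prompt.
    beta is an assumed scaling hyperparameter.
    """
    # Preference term: push the chosen trace above the rejected one.
    prefer = -log_sigmoid(beta * (logp_chosen - logp_rejected))
    # NLL term: keep the policy likely to produce the chosen trace.
    nll = -logp_chosen
    return prefer + nll


# Hypothetical log-probabilities: the loss drops as the rejected trace
# becomes less likely relative to the chosen one.
print(cpo_loss(-5.0, -5.0))
print(cpo_loss(-5.0, -9.0))
```

Unlike DPO, this formulation needs no frozen reference model, so only the policy being fine-tuned has to be evaluated on each pair.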

More from arXiv cs.AI

arXiv cs.AI·Sumit Verma, Pritam Prasun, Pritish Kumar

1d ago

RAIL Guard: Closing the Evaluation-to-Remediation Gap in Responsible AI for LLM Agents

AI Summary

RAIL Guard introduces a closed-loop AI pipeline for large language models (LLMs) that evaluates outputs across eight dimensions and iteratively remediates failures, achieving 96.9% convergence compared to 49.1% for traditional block-and-retry methods. The system reduces unsafe agent executions by 33% without impacting task completion and is available as open-source SDKs.

#LLM #Agent #Open Source #Policy
