VeriGeo: Controllable Geometry Question Generation with Numerical and Analytical Verification
Quick Answer
VeriGeo is a novel geometry problem generation framework that enhances controllability and reliability through executable reasoning traces.
Quick Take
VeriGeo uses a three-stage verification pipeline to check numerical consistency, analytical realizability, and global consistency. Supervised fine-tuning on 8.7k VeriGeo-generated examples achieves the best reported GeoQA performance among end-to-end multimodal LLM-based solvers. Across five LLM backbones, raw generations frequently fail the checks, and VeriGeo repairs a substantial fraction of the invalid attempts.
Key Points
- VeriGeo generates geometry problems with user-defined constraints and verifiable solutions.
- The framework includes a three-stage pipeline for checking consistency and repairing failures.
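- Supervised fine-tuning on 8.7k VeriGeo-generated examples achieves the best reported GeoQA performance among end-to-end multimodal LLM-based solvers, with strong results on PGPS9K and MathVista-GPS.
- VeriGeo repairs a substantial fraction of invalid problem generations across five LLM backbones.
- The verified synthetic data improves multimodal geometry reasoning, with AI-assisted education as a motivating application.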
Article Content
From source RSS / original summary
arXiv:2606.14176v1 Announce Type: new
Abstract: Geometry problem generation is useful for AI-assisted education and multimodal mathematical reasoning, but reliable synthesis remains difficult because the problem statement, diagram, constraints, and solution should be mutually consistent.
Existing methods often trade off controllability and reliability: seed-based rewriting is flexible but weakly verifiable, whereas diagram-first construction improves validity but is less suited to arbitrary user-specified constraints. We introduce VeriGeo, a controllable geometry generation framework grounded in executable reasoning traces. Given user constraints such as target concepts and difficulty, an Author agent generates a problem and diagram, and a Solver agent produces a proof-aligned solution.
Both agents use a shared action sequence that connects natural language, diagrams, geometric constraints, and proof steps into a verifiable representation. A three-stage pipeline checks numerical consistency, analytical realizability, and global consistency, using verification-guided reflection to repair recoverable failures and reject unrecoverable ones. Across five LLM backbones, raw generations frequently fail these checks, while VeriGeo repairs a substantial fraction of the invalid attempts.
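The verify-then-repair loop described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the `Trace` structure, the three stand-in checks (stated lengths versus diagram coordinates, the triangle inequality, and a Heron's-formula recomputation of the answer), and the `toy_repair` function, which stands in for an LLM revision step.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Trace:
    """Hypothetical trace for a triangle problem (not the paper's format)."""
    points: Dict[str, Point]   # diagram coordinates
    stated: Dict[str, float]   # side lengths in the problem text, e.g. {"AB": 3.0}
    answer: float              # solver's claimed area of triangle ABC


def measured(trace: Trace, seg: str) -> float:
    (x1, y1), (x2, y2) = trace.points[seg[0]], trace.points[seg[1]]
    return math.hypot(x2 - x1, y2 - y1)


def heron(trace: Trace) -> float:
    a, b, c = (trace.stated[s] for s in ("AB", "BC", "CA"))
    s = (a + b + c) / 2
    return math.sqrt(max(s * (s - a) * (s - b) * (s - c), 0.0))


def check_numerical(trace: Trace) -> List[str]:
    """Stand-in for stage 1: stated lengths must match the diagram."""
    return [
        f"numerical: {seg} stated {val}, diagram gives {measured(trace, seg):.3f}"
        for seg, val in trace.stated.items()
        if not math.isclose(measured(trace, seg), val, abs_tol=1e-6)
    ]


def check_realizable(trace: Trace) -> List[str]:
    """Stand-in for stage 2: stated sides must form a non-degenerate triangle."""
    a, b, c = sorted(trace.stated[s] for s in ("AB", "BC", "CA"))
    return [] if a + b > c else ["realizability: triangle inequality violated"]


def check_global(trace: Trace) -> List[str]:
    """Stand-in for stage 3: the answer must agree with a recomputation."""
    area = heron(trace)
    if math.isclose(area, trace.answer, abs_tol=1e-6):
        return []
    return [f"global: answer {trace.answer}, recomputed {area:.3f}"]


def verify(trace: Trace) -> List[str]:
    """Run the stages in order and report the first stage that fails."""
    for stage in (check_numerical, check_realizable, check_global):
        errors = stage(trace)
        if errors:
            return errors
    return []


Repair = Callable[[Trace, List[str]], Optional[Trace]]


def verify_with_reflection(trace: Trace, repair: Repair,
                           max_rounds: int = 3) -> Optional[Trace]:
    """Return a verified trace, or None if the attempt is rejected."""
    errors = verify(trace)
    for _ in range(max_rounds):
        if not errors:
            return trace
        revised = repair(trace, errors)   # feed the errors back to the reviser
        if revised is None:               # reviser gives up: unrecoverable
            return None
        trace = revised
        errors = verify(trace)
    return trace if not errors else None


def toy_repair(trace: Trace, errors: List[str]) -> Optional[Trace]:
    """Toy reviser: it can only fix a wrong final answer."""
    if errors[0].startswith("global"):
        return Trace(trace.points, trace.stated, heron(trace))
    return None


pts = {"A": (0.0, 0.0), "B": (3.0, 0.0), "C": (3.0, 4.0)}
sides = {"AB": 3.0, "BC": 4.0, "CA": 5.0}

repaired = verify_with_reflection(Trace(pts, sides, 12.0), toy_repair)
rejected = verify_with_reflection(Trace(pts, {**sides, "AB": 10.0}, 6.0), toy_repair)
```

In this sketch the first attempt has a wrong area, so the global check fails, the toy reviser recomputes the answer, and the trace then passes. The second attempt states a side length that contradicts the diagram, which the toy reviser cannot fix, so the attempt is rejected. A real system would presumably replace `toy_repair` with a model call that receives the error messages as feedback.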
Supervised fine-tuning on 8.7k examples generated by VeriGeo achieves the best reported GeoQA performance among end-to-end multimodal LLM-based solvers, and obtains strong results on PGPS9K and MathVista-GPS, demonstrating the effectiveness of verified synthetic data for improving multimodal geometry reasoning.