PhyDrawGen: Physically Grounded Diagram Generation from Natural Language

arXiv cs.AI·Nafiul Haque, Syed Nazmus Sakib, Shifat E Arman

6/1/2026

Quick Answer

PhyDrawGen is a neuro-symbolic pipeline that generates physics diagrams from natural-language problem text while enforcing physical laws.

Quick Take

On a benchmark of 1,449 problems spanning mechanics, optics, and electromagnetism, PhyDrawGen significantly outperforms GPT-5-image, Gemini 2.5 Flash, and Gemini 3 Pro, and its physical accuracy holds up even on unusual-object problems.

Key Points

Decouples semantic understanding from physical constraint satisfaction.
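Uses a large language model to extract a typed scene graph from the problem text.
Uses a deterministic solver to convert the scene graph into a Planar Straight-Line Graph (PSLG) that encodes force balance, optical paths, and field topologies as exact geometric primitives.
Runs a visually grounded propose-verify loop with a fine-tuned Qwen-VL model to iteratively correct constraint violations.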
Demonstrates robust performance on unusual-object problems.
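
To make the scene-graph-to-PSLG step concrete, here is a minimal Python sketch. It is an illustration only, not the paper's implementation: the names `SceneNode`, `ForceEdge`, `PSLG`, and `scene_to_pslg`, along with the `scale` parameter, are hypothetical, and the sketch assumes a scene whose nodes are point-like bodies and whose typed edges are forces. Each force becomes one straight segment whose direction and length are computed exactly rather than drawn freehand.

```python
from dataclasses import dataclass, field
import math


@dataclass(frozen=True)
class SceneNode:
    """A typed node of the (hypothetical) scene graph: a body at a position."""
    name: str
    kind: str   # e.g. "block"
    x: float
    y: float


@dataclass(frozen=True)
class ForceEdge:
    """A typed edge: a force acting on the node called `target`."""
    target: str
    label: str
    magnitude: float   # newtons
    angle_deg: float   # counter-clockwise from the +x axis


@dataclass
class PSLG:
    """Vertices plus straight segments given as pairs of vertex indices."""
    vertices: list[tuple[float, float]] = field(default_factory=list)
    segments: list[tuple[int, int]] = field(default_factory=list)

    def add_vertex(self, x: float, y: float) -> int:
        self.vertices.append((x, y))
        return len(self.vertices) - 1


def scene_to_pslg(nodes, forces, scale=0.1):
    """Place each body as a vertex and each force as a segment from body to arrow tip.

    `scale` (an assumed parameter) maps newtons to drawing units.
    """
    pslg = PSLG()
    index = {n.name: pslg.add_vertex(n.x, n.y) for n in nodes}
    for f in forces:
        start = index[f.target]
        x0, y0 = pslg.vertices[start]
        theta = math.radians(f.angle_deg)
        tip = pslg.add_vertex(
            x0 + scale * f.magnitude * math.cos(theta),
            y0 + scale * f.magnitude * math.sin(theta),
        )
        pslg.segments.append((start, tip))
    return pslg


# Usage: a block at rest with its weight and the normal force.
nodes = [SceneNode("block", "block", 0.0, 0.0)]
forces = [
    ForceEdge("block", "weight", 10.0, 270.0),
    ForceEdge("block", "normal", 10.0, 90.0),
]
pslg = scene_to_pslg(nodes, forces)
print(pslg.segments)
```

The sketch does not check that segments meet only at shared endpoints, which a real PSLG requires; a fuller version would validate that before rendering.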

Article Excerpt

From source RSS / original summary

arXiv:2605.30512v1 Announce Type: new

Abstract: Generating physics diagrams from text requires strict adherence to physical laws. While current generative models produce visually plausible outputs, they systematically hallucinate force vectors, ignore conservation laws, and violate geometric constraints. We present PhyDrawGen, a neuro-symbolic pipeline that decouples semantic scene understanding from physical constraint satisfaction. First, a large language model extracts a typed scene graph from the problem text. A deterministic solver then converts this graph into a Planar Straight-Line Graph (PSLG), encoding force balance, optical paths, and field topologies as exact geometric primitives. Finally, a fine-tuned Qwen-VL model implements a visually grounded propose-verify loop to iteratively correct any constraint violations. Evaluated on a benchmark of 1,449 problems spanning mechanics, optics, and electromagnetism, PhyDrawGen significantly outperforms GPT-5-image, Gemini 2.5 Flash, and Gemini 3 Pro, demonstrating robust physical accuracy even on unusual-object problems.
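
The propose-verify idea from the abstract can be sketched in a few lines of Python. This is a toy stand-in under stated assumptions, not the paper's method: in PhyDrawGen the loop is driven by a fine-tuned Qwen-VL model working on the rendered diagram, whereas here a hypothetical `verify` function checks static force balance numerically and a hypothetical `propose_fix` adds the single force that cancels the residual. Forces are assumed to be `(magnitude, angle_deg)` pairs acting on one body at rest.

```python
import math


def net_force(forces):
    """Sum (magnitude, angle_deg) force pairs into Cartesian components."""
    fx = sum(m * math.cos(math.radians(a)) for m, a in forces)
    fy = sum(m * math.sin(math.radians(a)) for m, a in forces)
    return fx, fy


def verify(forces, tol=1e-6):
    """Constraint check (assumed): a body at rest must have zero net force."""
    fx, fy = net_force(forces)
    return math.hypot(fx, fy) <= tol


def propose_fix(forces):
    """Toy proposer: append the force that cancels the current residual."""
    fx, fy = net_force(forces)
    magnitude = math.hypot(fx, fy)
    angle = math.degrees(math.atan2(-fy, -fx)) % 360.0
    return forces + [(magnitude, angle)]


def propose_verify(forces, max_rounds=3):
    """Alternate verification and correction until the constraint holds."""
    for rounds in range(max_rounds):
        if verify(forces):
            return forces, rounds
        forces = propose_fix(forces)
    if verify(forces):
        return forces, max_rounds
    raise RuntimeError("constraints still violated after max_rounds")


# Usage: a draft diagram that shows the weight but omits the normal force.
draft = [(10.0, 270.0)]
fixed, rounds = propose_verify(draft)
print(rounds, fixed)
```

Keeping the check separate from the proposer mirrors the decoupling the abstract describes: the component that suggests a change never gets to decide whether the physics is satisfied.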
