Scaling Accessible Mathematics on arXiv: HTML Conversion and MathML 4
Quick Take
arXiv is improving its HTML Papers offering with better conversion fidelity, initial MathML 4 Intent annotations, and an in-progress Rust port of LaTeXML that reduces compute costs.
Key Points
- Community-driven improvements to HTML fidelity and service health, with roughly half of 6,000 user reports resolved.
- Corpus-scale conversion work targets 90% error-free HTML (currently 75%).
- Initial MathML 4 Intent annotations for accessible speech output.
Abstract: We report on the ongoing development of arXiv's HTML Papers offering, available on every new TeX/LaTeX submission since its initial release in 2023.
The main highlights from 2025 and early 2026 are:
(i) community-driven improvements to HTML fidelity and service health, with roughly half of 6,000 user reports resolved;
(ii) corpus-scale conversion work aimed at 90% error-free HTML (currently 75%);
(iii) initial MathML 4 Intent annotations for accessible speech output;
(iv) an in-progress Rust port of LaTeXML, reducing compute costs and enabling faster previews on submission.
The arXiv HTML Papers project remains experimental, but is gradually maturing as we better understand the needs of arXiv's readers and the technical opportunities presented by new standards and by advances in programming languages and AI.
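To make highlight (iii) concrete, here is a minimal sketch of what an intent-annotated formula might look like, built with Python's standard library. It assumes the `intent` and `arg` attributes as described in the MathML 4 working draft; the concept name `power` and the helper `power_with_intent` are illustrative choices, not taken from the paper or from arXiv's conversion pipeline.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def power_with_intent(base: str, exponent: str) -> str:
    """Build <msup> markup for base^exponent with an illustrative intent annotation.

    Hypothetical helper for illustration only; not part of LaTeXML or arXiv tooling.
    """
    math = ET.Element("math", {"xmlns": MATHML_NS})
    # `intent` names the concept to be spoken; `$base` and `$exp` refer to
    # the children carrying the matching `arg` attributes.
    msup = ET.SubElement(math, "msup", {"intent": "power($base,$exp)"})
    mi = ET.SubElement(msup, "mi", {"arg": "base"})
    mi.text = base
    mn = ET.SubElement(msup, "mn", {"arg": "exp"})
    mn.text = exponent
    return ET.tostring(math, encoding="unicode")


markup = power_with_intent("x", "2")
print(markup)
```

An assistive technology that understands the annotation could then speak the expression as a power rather than guessing from the superscript layout alone.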
| Comments: | 6 pages, ICMS 2026 |
| Subjects: | Computation and Language (cs.CL); Digital Libraries (cs.DL) |
| MSC classes: | 68U15 (Primary) 68V25, 68U35 (Secondary) |
| ACM classes: | I.7.2; H.3.7 |
| Cite as: | arXiv:2605.16562 [cs.CL] (or arXiv:2605.16562v1 [cs.CL] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2605.16562 (arXiv-issued DOI via DataCite, pending registration) |
Submission history
From: Deyan Ginev
[v1] Fri, 15 May 2026 19:04:45 UTC (25 KB)
— Originally published at arxiv.org