RelGT-AC: A Relational Graph Transformer for Autocomplete Tasks in Relational Databases

6/3/2026

Quick Answer

The paper introduces RelGT-AC, a Relational Graph Transformer that targets autocomplete tasks in relational databases by adding a column masking strategy, a unified task head, and a TF-IDF text encoder.

Quick Take

RelGT-AC extends the RelGT architecture to autocomplete tasks, where the goal is to predict an existing column value from relational context. Across 7 tasks on three RelBench v2 datasets (rel-trial, rel-f1, rel-stack), it outperforms the GraphSAGE baseline on all 3 regression autocomplete tasks and gains up to 10 AUROC points on text-heavy eligibility tasks through its TF-IDF encoder.

Key Points

Masks the target column during subgraph encoding to prevent trivial solutions.
Supports binary classification, multiclass classification, and regression autocomplete tasks in a single model through a unified task head.
Detects free-text columns automatically and encodes them with TF-IDF, recovering lexical signal that categorical encoders discard.
Outperforms the GraphSAGE baseline on all 3 regression autocomplete tasks.
Gains up to 10 AUROC points on text-heavy eligibility tasks via the TF-IDF encoder.
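
As a rough illustration of the TF-IDF point, the sketch below weights the terms of a small free-text column using only the Python standard library. Everything specific in it is an assumption made for illustration: the `looks_like_free_text` heuristic and its threshold, the whitespace tokenizer, the smoothed IDF variant, and the example rows. The summary does not say how RelGT-AC detects text columns or which TF-IDF formulation it uses.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase whitespace tokenization; a real encoder would likely do more.
    return text.lower().split()


def looks_like_free_text(values, min_avg_tokens=3.0):
    # Hypothetical heuristic: call a column "free text" when its values
    # average at least `min_avg_tokens` whitespace-separated tokens.
    if not values:
        return False
    avg_tokens = sum(len(tokenize(v)) for v in values) / len(values)
    return avg_tokens >= min_avg_tokens


def tfidf_encode(values):
    # Returns (sorted vocabulary, one sparse {term: weight} dict per row).
    docs = [tokenize(v) for v in values]
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    # One common smoothed IDF variant: log((1 + N) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    rows = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty cells
        rows.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return sorted(idf), rows


# Made-up example columns (not taken from any RelBench dataset).
criteria = [
    "adults with type 2 diabetes",
    "adults with chronic kidney disease",
    "children with asthma",
]
status = ["open", "closed", "open"]

print(looks_like_free_text(criteria))  # True: several tokens per value
print(looks_like_free_text(status))    # False: single-token categories

vocab, rows = tfidf_encode(criteria)
# Rare terms such as "diabetes" get more weight than "with", which
# appears in every row; a categorical encoder would see three unrelated IDs.
print(rows[0]["diabetes"] > rows[0]["with"])
```

A categorical encoder would treat each of the three criteria strings as an opaque category, whereas the TF-IDF rows share weight on overlapping terms such as "adults", which is the kind of lexical signal the key point refers to.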

Paper Resources

Read Paper (arxiv.org) | View PDF (arxiv.org)

Article Content

From source RSS / original summary

arXiv:2606.03040v1, Announce Type: new

Abstract: Relational databases underpin modern enterprise, scientific, and healthcare systems, yet predictive machine learning on such data remains challenging due to their multi-table, heterogeneous, and temporal structure. Relational Deep Learning (RDL) addresses this by representing databases as heterogeneous graphs and applying graph neural networks (GNNs) directly.

RelBench v2 recently introduced autocomplete tasks -- a practically motivated task type where the goal is to predict an existing column value from relational context, analogous to an intelligent form-filling assistant.

We propose RelGT-AC (Relational Graph Transformer for Autocomplete), extending the RelGT architecture with three targeted contributions: (1) a column masking strategy that prevents trivial solutions by masking the target column during subgraph encoding; (2) a unified task head supporting binary classification, multiclass classification, and regression autocomplete tasks within a single model; and (3) a TF-IDF text encoder that automatically detects and encodes free-text columns, recovering strong lexical signal that categorical encoders discard.
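
The first two contributions can be pictured with a minimal sketch, written here in plain Python rather than a deep learning framework. It is not the authors' implementation: the row layout, the table and column names (`studies`, `enrollment`), the `MASK` sentinel, the choice to mask the column on every sampled row of the target table, and the hand-set weights are all hypothetical. It only shows the general idea of hiding the target value before encoding and of routing one shared embedding through a task-dependent output.

```python
import math

MASK = None  # hypothetical sentinel standing in for a hidden cell


def mask_target_column(rows, target_table, target_col):
    # Return copies of `rows` with `target_col` hidden on every row of
    # `target_table`, so an encoder never sees the value it must predict.
    masked = []
    for row in rows:
        row = dict(row)  # shallow copy; the input list is left untouched
        if row["table"] == target_table and target_col in row:
            row[target_col] = MASK
        masked.append(row)
    return masked


def linear(weights, bias, x):
    # `weights` holds one list of coefficients per output unit.
    return [sum(w * xi for w, xi in zip(ws, x)) + b for ws, b in zip(weights, bias)]


def task_head(embedding, task_type, weights, bias):
    # One entry point for all three task types; only the output
    # activation (and the number of output units) differs.
    logits = linear(weights, bias, embedding)
    if task_type == "binary":
        return 1.0 / (1.0 + math.exp(-logits[0]))  # sigmoid probability
    if task_type == "multiclass":
        top = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(z - top) for z in logits]
        total = sum(exps)
        return [e / total for e in exps]  # softmax distribution
    if task_type == "regression":
        return logits[0]  # raw scalar prediction
    raise ValueError(f"unknown task type: {task_type}")


# Hypothetical sampled subgraph: one target row plus one neighbor row.
rows = [
    {"table": "studies", "id": 1, "phase": "2", "enrollment": 120},
    {"table": "sponsors", "id": 7, "name": "Acme"},
]
masked = mask_target_column(rows, "studies", "enrollment")

# Stand-in for the encoder output on the masked subgraph.
embedding = [0.5, -1.0]
p_binary = task_head(embedding, "binary", [[2.0, 1.0]], [0.0])
p_multi = task_head(embedding, "multiclass",
                    [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 0.0])
y_reg = task_head(embedding, "regression", [[2.0, 1.0]], [3.0])
print(masked[0]["enrollment"], p_binary, y_reg)
```

Without the masking step, a model asked to fill in `enrollment` could read the answer directly from the input row, which is the trivial solution the abstract refers to.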

Across 7 tasks spanning 3 RelBench v2 datasets (rel-trial, rel-f1, rel-stack), RelGT-AC outperforms the GraphSAGE baseline on all 3 regression autocomplete tasks and achieves up to +10 AUROC points on text-heavy eligibility tasks via the TF-IDF encoder.
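
To make the size of a 10-point AUROC gain concrete, the sketch below computes AUROC as the fraction of (positive, negative) pairs that a model ranks correctly. It assumes that "points" means AUROC expressed on a 0 to 100 scale, which the summary does not state explicitly. The labels and scores are toy numbers chosen so the arithmetic is easy to follow; they are not results from the paper.

```python
def auroc(labels, scores):
    # AUROC as the probability that a random positive is scored above a
    # random negative, counting ties as half a win.
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: 2 positives and 5 negatives, so 10 pairs in total.
labels = [1, 1, 0, 0, 0, 0, 0]
baseline_scores = [0.9, 0.40, 0.8, 0.7, 0.6, 0.3, 0.2]  # 7 of 10 pairs correct
improved_scores = [0.9, 0.65, 0.8, 0.7, 0.6, 0.3, 0.2]  # 8 of 10 pairs correct

baseline_auc = auroc(labels, baseline_scores)
improved_auc = auroc(labels, improved_scores)
gain_points = 100.0 * (improved_auc - baseline_auc)
print(round(baseline_auc, 3), round(improved_auc, 3), round(gain_points, 1))
```

Under this reading, a 10-point gain means that one additional pair in every ten positive/negative pairs ends up in the right order.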
