LifeSentence: Language models can encode human life course trajectories from longitudinal panel data

arXiv cs.CL·Samuel Liu, Muchen Xi, William Yeoh, Joshua J. Jackson

6/11/2026

Quick Answer

LifeSentence instruction-tunes a pretrained 24-billion-parameter language model on longitudinal panel data, achieving a threefold improvement over the best baselines in joint event-and-timing prediction.

Quick Take

Trained on approximately 65,000 individuals from the German Socio-Economic Panel, LifeSentence outperforms classical and deep learning baselines across all task families. Without explicit supervision, it recovers documented patterns of social stratification, including the education premium, the gender wage gap and the motherhood penalty. Its natural-language interface enables new research queries, such as connecting an early-life history to a specified late-life endpoint.

Key Points

LifeSentence uses structured natural-language records for life event representation.
Achieves 91.2% Kendall's tau when reconstructing chronological order from timestamp-stripped event sets.
Outperforms classical and deep learning baselines across all task families.
Trained on approximately 65,000 individuals, roughly 45 times fewer than prior transformer-based approaches.
Enables counterfactual exploration of human biographies through a natural-language interface.


Article Content

From source RSS / original summary

arXiv:2606.11220v1. Abstract: Forecasting human life outcomes is important to gain insights into how individuals attain long and healthy lives. Conventional statistical approaches yield limited accuracy, potentially due to discarding the sequential structure of the life course. Modern methods such as transformer architectures require large-scale training data that most longitudinal panel studies lack.

Here we introduce LifeSentence, a model for life-course reasoning that bridges large language models with longitudinal panel data. By representing each life event as a structured natural-language record and instruction-tuning a pretrained 24-billion-parameter language model across an 18-task evaluation taxonomy spanning prediction, robustness and reasoning, LifeSentence supplements panel data with distributional knowledge already encoded during pretraining.
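To make the idea of a "structured natural-language record" concrete, here is a minimal Python sketch of how a sequence of life events might be serialized into text for an instruction-tuned model. The field names (`year`, `age`, `domain`, `description`), the record layout and the prompt template are all illustrative assumptions; the abstract does not specify the actual format LifeSentence uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LifeEvent:
    """One life event. All field names are hypothetical, not from the paper."""
    year: int          # calendar year of the event
    age: int           # respondent's age at the event
    domain: str        # e.g. "education", "employment", "family"
    description: str   # short free-text label for the event


def event_to_record(event: LifeEvent) -> str:
    """Render one event as a single structured text line (assumed layout)."""
    return (
        f"[{event.year}] age={event.age} | "
        f"domain={event.domain} | event={event.description}"
    )


def life_to_prompt(events: list[LifeEvent], question: str) -> str:
    """Order events chronologically and wrap them in an instruction-style prompt."""
    ordered = sorted(events, key=lambda e: (e.year, e.age))
    lines = [event_to_record(e) for e in ordered]
    return (
        "Life history:\n"
        + "\n".join(lines)
        + f"\n\nQuestion: {question}\nAnswer:"
    )


if __name__ == "__main__":
    history = [
        LifeEvent(2005, 24, "education", "completed university degree"),
        LifeEvent(1999, 18, "education", "finished secondary school"),
        LifeEvent(2006, 25, "employment", "started first full-time job"),
    ]
    print(life_to_prompt(history, "What is the most likely next life event?"))
```

The point of a text representation like this is that the same pretrained model can consume any panel variable without a custom input encoding; the specific delimiters chosen here carry no significance.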

Trained on approximately 65,000 individuals from the German Socio-Economic Panel - roughly 45 times fewer than prior transformer-based approaches - LifeSentence outperforms classical and deep learning baselines across all task families, achieving a threefold improvement in joint event-and-timing prediction from best baselines and 91.2% Kendall's tau when reconstructing chronological order from timestamp-stripped event sets.
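For readers unfamiliar with the ordering metric, the following Python sketch computes Kendall's tau between a true chronological order and a model's reconstructed order. It implements the standard tau-a for rankings without ties; the function name and the example events are hypothetical, and the paper's exact evaluation protocol is not described in the abstract. Tau ranges from -1 (fully reversed) to 1 (identical order), so the reported 91.2% presumably corresponds to a value of about 0.912 on that scale, which is an assumption about how the figure is expressed.

```python
from itertools import combinations


def kendall_tau(true_order: list[str], predicted_order: list[str]) -> float:
    """Kendall's tau-a between two orderings of the same distinct items.

    tau = (concordant pairs - discordant pairs) / total pairs
    """
    n = len(true_order)
    if n < 2:
        raise ValueError("need at least two events to compare orderings")
    if len(set(true_order)) != n or len(predicted_order) != n:
        raise ValueError("orderings must contain the same distinct events")
    if set(true_order) != set(predicted_order):
        raise ValueError("orderings must contain the same distinct events")

    # Position of each event in the predicted sequence.
    pred_rank = {event: i for i, event in enumerate(predicted_order)}

    concordant = 0
    discordant = 0
    # combinations() yields (a, b) with a before b in the true order.
    for a, b in combinations(true_order, 2):
        if pred_rank[a] < pred_rank[b]:
            concordant += 1
        else:
            discordant += 1

    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


if __name__ == "__main__":
    truth = ["school", "job", "marriage", "child"]
    guess = ["school", "marriage", "job", "child"]  # one swapped pair
    print(round(kendall_tau(truth, guess), 4))
```

In this example one of the six event pairs is out of order, giving (5 - 1) / 6. A library routine such as `scipy.stats.kendalltau` would serve the same purpose when ties need to be handled.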

Without explicit supervision, the model recovers documented patterns of social stratification, including the education premium, the gender wage gap and the motherhood penalty, from discrete event sequences alone. A natural-language interface further enables qualitatively new research queries, such as connecting an early-life history to a specified late-life endpoint, establishing LifeSentence as both a predictive tool and a probe for counterfactual exploration of human biographies.

