GazeBehavior Annotation Toolkit (GBAT): AI-powered toolkit for automatic annotation of egocentric eye-tracking and video data of child-caregiver interaction

arXiv cs.CV·Iba Baig, Kevin Li, Yanbin Xu, Seiji Cattelain, Marie Hallo, Hayato Ono, Sho Tsuji, Ming Bo Cai

5/25/2026·en

Quick Take

The GazeBehavior Annotation Toolkit (GBAT) uses deep learning to speed up the annotation of egocentric eye-tracking and video data from child-caregiver interactions. It covers three preprocessing and feature-extraction steps: post-hoc video synchronization, semi-automatic gaze-target annotation, and categorization of poses and hand actions. The toolkit is intended to support large-scale and longitudinal studies of attentional dynamics and naturalistic behavior in early childhood development.

Key Points

GBAT automates post-hoc synchronization of multiple video recordings.
It offers semi-automatic annotation of gaze target categories.
The toolkit categorizes participants' poses and hand actions.
GBAT enhances scalability for longitudinal studies in child development.
It targets the time cost of manual annotation, which the abstract describes as time-consuming; the summary gives no quantitative speed-up.
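
To make the idea of post-hoc synchronization concrete, here is a minimal sketch that estimates the sample offset between two recordings by cross-correlating a shared 1-D signal (for example, an audio loudness envelope). This is an illustrative assumption, not a description of how GBAT works: the summary does not say which signal or method the toolkit uses, and the function name `estimate_offset` and its `max_lag` parameter are hypothetical.

```python
from typing import Sequence


def estimate_offset(ref: Sequence[float], other: Sequence[float], max_lag: int) -> int:
    """Return the lag d (in samples) that best aligns other[n + d] with ref[n].

    A positive result means `other` starts later than `ref` by d samples.
    Hypothetical helper for illustration; not part of GBAT.
    """
    def centered(x: Sequence[float]) -> list:
        # Remove the mean so a constant offset does not dominate the score.
        m = sum(x) / len(x)
        return [v - m for v in x]

    r, o = centered(ref), centered(other)
    best_lag, best_score = 0, float("-inf")
    # Brute-force cross-correlation over a bounded window of candidate lags.
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, rv in enumerate(r):
            m = n + lag
            if 0 <= m < len(o):
                score += rv * o[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

Dividing the returned lag by the signal's sampling rate gives the offset in seconds, which could then be used to trim or shift one video relative to the other. The brute-force loop is quadratic, so a real pipeline would more likely use an FFT-based correlation.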

Article Excerpt

From source RSS / original summary

arXiv:2605.22962v1. Abstract: Video recordings of child-caregiver interactions enable investigation of attentional dynamics during naturalistic behavior. Such multimodal recording also allows researchers to examine how attention interacts with action and language use in real time. However, manual annotation of such data is time-consuming.

Here, we introduce GazeBehavior Annotation Toolkit, a deep-learning-based toolkit designed to facilitate three key processes in data preprocessing and feature extraction: post-hoc synchronization across multiple videos, semi-automatic annotation of gaze target categories, and categorization of participants' poses and hand actions. This toolkit improves the efficiency and scalability of feature extraction from human egocentric eye-tracking and video data.

Such improvement is critical in supporting large-scale and longitudinal investigations of attentional dynamics and naturalistic behavior in human early development.
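
As a rough illustration of what semi-automatic annotation of gaze target categories could involve, the sketch below assigns a gaze point to the labeled bounding box it falls in and flags ambiguous frames for manual review. Everything here is an assumption made for illustration: the abstract does not describe GBAT's method, and the `Box` class, the `gaze_target` function, the labels, and the "smallest box wins" rule are hypothetical. The boxes are assumed to come from some object or face detector that is not shown.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    label: str
    x0: float
    y0: float
    x1: float
    y1: float

    def area(self) -> float:
        return (self.x1 - self.x0) * (self.y1 - self.y0)

    def contains(self, x: float, y: float) -> bool:
        return self.x0 <= x <= self.x1 and self.y0 <= y <= self.y1


def gaze_target(gaze: Tuple[float, float], boxes: List[Box]) -> Tuple[Optional[str], bool]:
    """Return (label, needs_review) for one video frame.

    label is None when the gaze point falls in no box. needs_review is True
    when boxes with different labels overlap at the gaze point, so a human
    annotator should confirm the automatic choice.
    """
    gx, gy = gaze
    hits = [b for b in boxes if b.contains(gx, gy)]
    if not hits:
        return None, False
    # Assumed tie-break: prefer the smallest box containing the gaze point.
    best = min(hits, key=Box.area)
    ambiguous = len({b.label for b in hits}) > 1
    return best.label, ambiguous


# Example frame with two overlapping detections (made-up coordinates).
frame_boxes = [
    Box("caregiver_face", 100, 50, 200, 150),
    Box("toy", 150, 100, 400, 300),
]
```

Routing only the flagged frames to a human is one way an annotation step can be "semi-automatic": unambiguous frames are labeled automatically, and manual effort is spent where detections overlap.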
