Charting the Growth of Social-Physical HRI (spHRI): A Systematic Review Pipeline Augmented by Small Language Models
Quick Answer
A systematic review of social-physical human-robot interaction (spHRI) finds that small language models (SLMs) can speed up title and abstract screening. A combined SLM ensemble identified 39 relevant papers that human reviewers had missed, which supports scalable review practices.
Key Points
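- SLMs with fewer than 1.5B parameters ran locally and screened papers orders of magnitude faster than human reviewers, although no SLM matched the reviewers' performance.
- The combined SLM ensemble identified 39 relevant papers that reviewers missed, 10.29% of the final relevant dataset.
- Results indicate SLMs can augment, rather than replace, expert reviewers, making large-scale literature reviews more accessible and sustainable.
- Fragmented terminology and inconsistent methodologies in spHRI complicate systematic synthesis across robotics, human-computer interaction, human-robot interaction, and haptics.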
- The study highlights the potential of SLMs in enhancing large-scale review practices.
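The two figures in the key points (39 papers, 10.29%) allow a rough back-calculation of how large the final relevant dataset was. The sketch below is only that arithmetic; the abstract does not state the total, the percentage is rounded, and the function name `implied_total` is a hypothetical helper introduced here for illustration.

```python
def implied_total(missed: int, share_percent: float) -> int:
    """Estimate the dataset size implied by a count and its rounded percentage share."""
    # missed / total = share_percent / 100  =>  total = missed / (share_percent / 100)
    return round(missed / (share_percent / 100))


missed_by_reviewers = 39   # papers found only by the SLM ensemble (from the abstract)
share_percent = 10.29      # their share of the final relevant dataset (from the abstract)

# Estimated size of the final relevant dataset; an inference, not a reported figure.
total = implied_total(missed_by_reviewers, share_percent)

# Consistency check: recompute the share from the estimated total.
recomputed_share = round(missed_by_reviewers / total * 100, 2)

print(total, recomputed_share)
```

Under these assumptions the estimate lands at roughly 379 relevant papers, which reproduces the reported 10.29% when rounded to two decimals.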
Article Excerpt
From source RSS / original summary: arXiv:2606.26382v1. Announce Type: new.
Abstract: Social-physical human-robot interaction (spHRI) has grown rapidly across robotics, human-computer interaction, human-robot interaction, and haptics. Yet, fragmented terminology and inconsistent methodologies make systematic synthesis difficult. To support scalable review practices, we evaluated the extent to which small language models (SLMs; < 1.5B parameters) can assist with title and abstract screening for a large spHRI systematic review.
While no SLMs matched human reviewers' performance, the models operated locally and screened papers orders of magnitude faster. The combined SLM ensemble identified 39 papers reviewers missed, representing 10.29% of the final relevant dataset. These results demonstrate that SLMs can augment, rather than replace, expert reviewers and make large-scale literature reviews accessible and sustainable.
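One way to picture how a combined ensemble could surface papers that reviewers missed is a simple union vote followed by a set difference against the human decisions. The abstract does not say how the authors combined their models, so this is a minimal sketch under that assumption. The keyword functions stand in for real SLM calls, and every name here (`ensemble_include`, `candidates_for_recheck`, the sample abstracts) is hypothetical.

```python
from typing import Callable, Dict, List, Set

# A screener maps a title/abstract string to an include (True) or exclude (False) decision.
Screener = Callable[[str], bool]


def ensemble_include(abstracts: Dict[str, str], screeners: List[Screener]) -> Set[str]:
    """Union vote (assumed rule): include a paper if any screener includes it."""
    return {pid for pid, text in abstracts.items() if any(s(text) for s in screeners)}


def candidates_for_recheck(ensemble: Set[str], human: Set[str]) -> Set[str]:
    """Papers the ensemble flagged that human reviewers did not include."""
    return ensemble - human


# Stand-ins for SLM screeners; a real pipeline would prompt a local model instead.
def mentions_touch(text: str) -> bool:
    return "touch" in text.lower()


def mentions_hug(text: str) -> bool:
    return "hug" in text.lower()


# Toy data, invented for illustration only.
abstracts = {
    "p1": "A robot that hugs users to provide comfort.",
    "p2": "Touch-based social cues in robot handshakes.",
    "p3": "Path planning for warehouse drones.",
}
human_included = {"p2"}

flagged = ensemble_include(abstracts, [mentions_touch, mentions_hug])
recheck = candidates_for_recheck(flagged, human_included)
print(sorted(flagged), sorted(recheck))
```

In this arrangement the models only nominate papers for a second look, and the final inclusion decision stays with the expert reviewers, which matches the abstract's framing of SLMs as augmenting rather than replacing them.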