Listening to the Workforce: Measuring Construction Worker Safety Attitudes from Social Media Discourse Using LLMs
Quick Take
The study developed the Construction Safety Attitude Framework (CSAF) and operationalized it through a large language model (LLM) classifier that closely reproduced expert human coding of Reddit data (Cohen's kappa = 0.90 on r/Construction). Applied at scale, CSAF can distinguish safety attitudes across topics and track how they change over time, providing a basis for targeted safety interventions.
Key Points
- CSAF integrates eight dimensions to characterize safety attitudes in construction.
- LLM classifier achieved Cohen's kappa of 0.90 on r/Construction data.
- Classifier retained its accuracy on 400 r/Roofing contributions (kappa = 0.89), then was applied to 10,346 r/Roofing contributions in a proof-of-value case study.
- Study enables tracking of safety attitude shifts over time.
- Provides a basis for targeted interventions to improve worker safety.
Article Content
From source RSS / original summary
arXiv:2606.04450v1
Abstract: Worker safety attitudes are key determinants of whether protective practices are applied or bypassed on construction sites. Yet measuring them at scale has remained out of reach. Safety attitudes are multidimensional, vary across topics, and surface most candidly in workers' own conversations.
This study created and validated the Construction Safety Attitude Framework (CSAF), which integrates two components: a theory-grounded structure that characterizes safety attitudes along eight dimensions, and an operational codebook for measuring them in worker naturalistic discourse. Applying CSAF to 250 posts and comments from the r/Construction community on Reddit, trained coders reached strong agreement (Krippendorff's α = 0.85).
Pairwise lift and conditional probability confirmed that the eight dimensions are related yet distinct. To apply the framework across large volumes of discourse, CSAF was operationalized through a large language model (LLM) classifier. On 450 r/Construction contributions, the classifier reproduced expert human coding (Cohen's κ = 0.90, precision = 0.98, recall = 0.98), and on 400 contributions from r/Roofing it retained that accuracy after transfer to a different trade community (κ = 0.89, precision = 0.98, recall = 0.97).
A proof-of-value case study then applied the validated classifier to 10,346 contributions from r/Roofing, demonstrating that CSAF can distinguish multidimensional attitudes by safety topic, track how they shift over time, and trace the reasoning behind unfavorable ones.
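For readers unfamiliar with the agreement statistics reported above, here is a minimal Python sketch of how Cohen's kappa, precision, and recall can be computed when a classifier's labels are compared against human coding. It is not the study's evaluation code. The label names and toy data are invented for illustration, and the sketch assumes one label per contribution, which may not match how CSAF codes its eight dimensions.

```python
from collections import Counter


def cohens_kappa(human, model):
    """Chance-corrected agreement between two equal-length label sequences."""
    if len(human) != len(model) or not human:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(human)
    # Observed agreement: share of items where both raters chose the same label.
    observed = sum(h == m for h, m in zip(human, model)) / n
    # Expected agreement if the raters labelled independently at their own base rates.
    human_counts = Counter(human)
    model_counts = Counter(model)
    expected = sum(human_counts[c] * model_counts[c] for c in human_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


def precision_recall(human, model, positive):
    """Precision and recall of the model for one label, with human coding as reference."""
    pairs = list(zip(human, model))
    tp = sum(h == positive and m == positive for h, m in pairs)
    fp = sum(h != positive and m == positive for h, m in pairs)
    fn = sum(h == positive and m != positive for h, m in pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical labels and data, not taken from the study.
    human = ["fav", "fav", "unfav", "unfav", "fav", "unfav", "fav", "fav"]
    model = ["fav", "fav", "unfav", "fav", "fav", "unfav", "fav", "unfav"]
    print("kappa:", round(cohens_kappa(human, model), 3))
    print("precision, recall:", precision_recall(human, model, "unfav"))
```

Kappa discounts the agreement two raters would reach by chance, which is why it is usually reported alongside precision and recall rather than in place of them.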
The study therefore provides a theoretically grounded, empirically vetted instrument for examining safety attitudes, offering a basis for targeted interventions that address the attitudes underlying unsafe practices.
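The abstract's check that the eight dimensions are "related yet distinct" relies on pairwise lift and conditional probability. The sketch below shows one common way to compute both from multi-label codings. It assumes each contribution is represented as the set of dimensions coded as present. The dimension names are placeholders, since the abstract does not name them, and the data are invented.

```python
from itertools import combinations


def pairwise_lift(items, dimensions):
    """Lift and conditional probabilities for every pair of dimensions.

    items: list of sets, each holding the dimensions coded as present
    in one contribution (an assumed representation, not the study's).
    """
    n = len(items)
    if n == 0:
        raise ValueError("need at least one coded contribution")
    # Marginal probability that each dimension appears in a contribution.
    marginal = {d: sum(d in item for item in items) / n for d in dimensions}
    results = {}
    for a, b in combinations(dimensions, 2):
        if marginal[a] == 0 or marginal[b] == 0:
            continue  # lift is undefined when a dimension never occurs
        joint = sum(a in item and b in item for item in items) / n
        results[(a, b)] = {
            # lift > 1: the pair co-occurs more often than independence predicts
            "lift": joint / (marginal[a] * marginal[b]),
            "p_b_given_a": joint / marginal[a],
            "p_a_given_b": joint / marginal[b],
        }
    return results


if __name__ == "__main__":
    # Placeholder dimension names and invented codings.
    coded = [{"dim_a", "dim_b"}, {"dim_a"}, {"dim_b"}, {"dim_a", "dim_b"}, {"dim_c"}]
    for pair, stats in pairwise_lift(coded, ["dim_a", "dim_b", "dim_c"]).items():
        print(pair, {k: round(v, 2) for k, v in stats.items()})
```

On one reading, a lift that departs from 1 indicates two dimensions are related, while conditional probabilities well below 1 indicate that neither simply implies the other. The thresholds the authors used are not given in the abstract.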