Show HN: Tiny 1B param model that beats GPT-3.5 on JSON extraction · DeepSignal
Indie 1B Llama-3 derivative trained on synthetic data beats GPT-3.5 on JSON extraction at 80 tok/s on a single 4090.
Key Points
- 200K synthetic JSON-extraction training examples.
- Beats GPT-3.5 on a 10-task held-out benchmark.
- Runs at 80 tok/s on a 4090; open-sourced.
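A benchmark like the one described typically scores an extraction by parsing the model's output and comparing it to a gold-standard object. A minimal sketch of that scoring rule, assuming strict equality after parsing (the project's actual metric is not specified in the post):

```python
import json

def score_extraction(model_output: str, gold: dict) -> bool:
    """Return True if the model's output parses as JSON and exactly
    matches the gold object; malformed JSON counts as a failure."""
    try:
        return json.loads(model_output) == gold
    except json.JSONDecodeError:
        return False
```

Stricter or looser variants (key-level F1, type-coerced comparison) are common; exact match is the simplest baseline.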
Cursor reaches $500M ARR run-rate
AI Summary
Cursor has hit a $500M ARR run-rate, doubling in five months with 40% from enterprise.
Show HN: Pico — open-source on-device LLM router for AI coding agents
AI Summary
Pico routes coding-agent requests between local and remote LLMs, cutting cost 62% with a marginal accuracy drop.
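The core idea of a local/remote router can be sketched in a few lines. Everything below is hypothetical: the cost figures, the threshold, and the complexity heuristic are illustrative assumptions, not Pico's actual policy.

```python
# Illustrative costs only; not Pico's numbers.
LOCAL_COST_PER_MTOK = 0.05   # assumed on-device serving cost
REMOTE_COST_PER_MTOK = 3.00  # assumed hosted-API cost

def estimate_complexity(prompt: str) -> float:
    """Crude proxy: longer prompts with more code fences are 'harder'."""
    return min(1.0, len(prompt) / 4000 + prompt.count("```") * 0.1)

def route(prompt: str, threshold: float = 0.5) -> str:
    """Send easy requests to the local model, hard ones to the remote one."""
    return "local" if estimate_complexity(prompt) < threshold else "remote"
```

The reported 62% cost cut falls out of this shape of policy whenever most traffic lands below the threshold, since local tokens are far cheaper than remote ones.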
What's the actual cost of running a 70B Llama on AWS?
AI Summary
70B Llama 3.1 on AWS g5.48xlarge with vLLM costs $0.31/M tokens at 50% utilisation, $0.18 at 80%.
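The quoted figures follow the standard fixed-price-instance arithmetic: the instance bills by the hour regardless of load, so cost per token scales inversely with utilisation (note 0.31 × 0.5 ≈ 0.18 × 0.8). A sketch of that calculation; the hourly price and throughput below are illustrative placeholders, not the article's measured values:

```python
def cost_per_million_tokens(hourly_price_usd: float,
                            peak_tokens_per_sec: float,
                            utilisation: float) -> float:
    """Serving cost per 1M generated tokens on a fixed-price instance.

    The instance costs the same per hour whether busy or idle, so
    effective throughput is peak throughput scaled by utilisation."""
    tokens_per_hour = peak_tokens_per_sec * 3600 * utilisation
    return hourly_price_usd / tokens_per_hour * 1e6

# Hypothetical inputs to show the shape of the result:
print(cost_per_million_tokens(16.0, 10_000, 0.5))  # halving utilisation doubles $/Mtok
```

This is why the article's 50%-vs-80% utilisation gap alone moves the price from $0.31 to $0.18 per million tokens.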
Invisible Orchestrators Suppress Protective Behavior and Dissociate Power-Holders: Safety Risks in Multi-Agent LLM Systems
AI Summary
Invisible orchestrators in multi-agent LLM systems suppress protective behaviour and dissociate power-holders from their decisions, creating safety risks.
arXiv cs.CL · Chengzhi Liu, Yichen Guo, Yepeng Liu, Yuzhe Yang, Qianqi Yan, Xuandong Zhao, Wenyue Hua, Sheng Liu, Sharon Li, Yuheng Bu, Xin Eric Wang · 2d ago
Auditing Agent Harness Safety
AI Summary
The HarnessAudit framework evaluates the safety of LLM agent execution harnesses, revealing risks in multi-agent systems.
arXiv cs.CL · Mokshit Surana, Archit Rathod, Akshaj Satishkumar · 2d ago
Measuring and Mitigating Toxicity in Large Language Models: A Comprehensive Replication Study
AI Summary
This replication study evaluates DExperts for mitigating toxicity in LLMs, assessing both its safety gains and its latency costs.
Score: 67 (medium; ≥75 high · 50–74 medium · <50 low)
Why Featured
Small specialised models continue to eat the boring-but-high-volume LLM workloads — a recurring signal worth watching.