Meta open-sources Llama 4 Vision — outperforms GPT-4o on chart QA
Quick Answer
Meta has open-sourced Llama 4 Vision, a vision-language model that utilizes a Mixture-of-Experts architecture.
Quick Take
Meta has open-sourced Llama 4 Vision, a vision-language model that utilizes a Mixture-of-Experts architecture. It outperforms GPT-4o on the ChartQA benchmark, scoring 87.4 compared to GPT-4o's 85.7. The model's weights, training recipe, and a 30B inference-optimized checkpoint are now available under a permissive license.
Key Points
- Llama 4 Vision uses a Mixture-of-Experts architecture.
- Scores 87.4 on ChartQA, ahead of GPT-4o's 85.7.
- Weights and training recipe are available for developers to utilize.
- Includes a 30B inference-optimized checkpoint for efficient deployment.
- Released under a permissive license to encourage further research.
Article Excerpt
From source RSS / original summary: Meta released Llama 4 Vision, a vision-language model with a Mixture-of-Experts architecture, under a permissive license. On ChartQA, it scores 87.4 vs. GPT-4o's 85.7. Weights, training recipe, and a 30B inference-optimised checkpoint are available.