Llama 3: Meta democratizes frontier models with open weights

What "open weights" means

Llama 3 isn't "open source" in the strict sense: the training code and data are not fully open. But the weights (the model's actual parameters) can be downloaded and used. That allows running locally, fine-tuning on your own data, and doing research without depending on APIs.

License: Meta's "Llama 3 Community License" allows commercial use, with one restriction for very large companies. Those with over 700M monthly active users must request a separate license from Meta, a clause that in practice affects only the largest tech platforms, such as Google.

The three sizes and their use cases

Llama 3 8B: for local deployment, edge devices, applications that need privacy, fine-tuning on consumer hardware (single GPU).

Llama 3 70B: the production sweet spot. Beats GPT-3.5 on most benchmarks. Runs on H100 servers and costs about $0.65 per million tokens from cloud hosting providers.

Llama 3 405B: the frontier model (released with the Llama 3.1 update). Beats the original GPT-4 on MMLU, GSM8K, and HumanEval. Requires multi-GPU infrastructure but can be self-hosted by enterprises.
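
The hardware notes above follow from simple arithmetic: weight memory is roughly the parameter count times the bytes per parameter. Below is a minimal Python sketch of that estimate. It assumes nominal parameter counts (8, 70, and 405 billion) and an 80 GB H100-class GPU, and it ignores KV cache, activations, and framework overhead, so real requirements are higher. The function names are illustrative and do not come from any library.

```python
import math

# Bytes used to store one parameter at common precisions.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions: float, precision: str = "fp16") -> float:
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes).

    Billions of parameters times bytes per parameter gives gigabytes directly.
    """
    return params_billions * BYTES_PER_PARAM[precision]


def gpus_needed(params_billions: float, precision: str = "fp16",
                gpu_memory_gb: float = 80.0) -> int:
    """Minimum number of GPUs needed to hold the weights (assumed 80 GB per GPU)."""
    return math.ceil(weight_memory_gb(params_billions, precision) / gpu_memory_gb)


if __name__ == "__main__":
    # Nominal sizes from the article; actual parameter counts differ slightly.
    for name, size in [("8B", 8), ("70B", 70), ("405B", 405)]:
        print(name, weight_memory_gb(size), "GB in fp16,", gpus_needed(size), "GPU(s)")
```

Under these assumptions, the 8B weights take about 16 GB in fp16, the 70B weights about 140 GB (two 80 GB GPUs), and the 405B weights about 810 GB (eleven GPUs). This is why the largest model needs multi-GPU infrastructure, and why quantization matters for the smaller ones.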

Performance vs proprietary

MMLU (Llama 3 405B): 82%

HumanEval (405B): 88%

Inference cost (70B via cloud): $0.65/M tokens

Llama 3 405B beats the original GPT-4 (March 2023) on most public benchmarks. It loses to GPT-4o, Claude 3.5 Sonnet, and Gemini 1.5 Pro in some categories but remains competitive in many. Its price-performance ratio is hard to beat.

Meta's strategy: why "give away" Llama

Meta isn't naive; there's a clear strategy in opening Llama. (1) Disrupting competitors: if companies use Llama instead of paying OpenAI or Anthropic, those competitors lose revenue. (2) Talent attraction: top researchers want to publish, and Meta lets them. (3) Standards control: if Llama becomes the industry default, Meta influences its direction. (4) Internal use: Meta uses Llama internally for Facebook, Instagram, and WhatsApp, and the open release keeps the model relevant.

Where Llama is the right answer

Compliance requirements: banking, healthcare, and government organizations that can't send data to the OpenAI or Anthropic APIs.

Massive volume with cost constraints: high-volume applications where per-token cost matters.

Fine-tuning on private data: when the use case requires a specialized domain that base models don't cover.

Edge / on-device: mobile apps, IoT, isolated locations.
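
To make the per-token cost argument concrete, here is a rough Python sketch of a monthly bill at the $0.65 per million tokens quoted above for the 70B model. The traffic figures (requests per day, tokens per request) are hypothetical inputs chosen for illustration, and the function name is not from any library.

```python
# Price quoted in this article for Llama 3 70B via cloud hosting.
PRICE_PER_MILLION_TOKENS_USD = 0.65


def monthly_cost_usd(requests_per_day: int, tokens_per_request: int,
                     price_per_million: float = PRICE_PER_MILLION_TOKENS_USD,
                     days: int = 30) -> float:
    """Estimate monthly spend from traffic volume and a flat per-token price."""
    total_tokens = requests_per_day * tokens_per_request * days
    return total_tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    # Hypothetical workload: 100,000 requests a day at 1,000 tokens each.
    print(f"${monthly_cost_usd(100_000, 1_000):,.2f} per month")
```

With that hypothetical workload (3 billion tokens a month), the estimate comes to roughly $1,950. Swapping in another provider's price through the `price_per_million` argument gives a like-for-like comparison.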

Conclusion

Llama 3 doesn't replace GPT-4o or Claude 3.5 in capability, but it offers a different value proposition: control, privacy, and cost predictability. For many enterprise use cases, the choice between Llama and a proprietary model depends more on operational requirements than on raw model performance.