L2

Open SourceText Generationby Meta AI

Llama 2

Llama 2 is Meta's landmark open large language model, released in 7B, 13B and 70B sizes with chat-tuned variants. Its capable quality and broadly permissive licence made it the foundation of the modern open-LLM ecosystem.

chatbotlanguage-modelllmmeta-aiopen-source-aiself-hosted-ai

View on GitHub

Quick facts

LicenseLlama 2 Community

Params70B

TuningRLHF Chat

ByMeta

No ratings yet — be the first

Params

7B/13B/70B

open weights

Tuning

RLHF chat

+ base

Context

tokens

What is Llama 2?

Llama 2 is Meta's landmark open large language model, whose 2023 release reshaped the AI landscape. Available in 7B, 13B and 70B parameter sizes — each with a base model and a chat-tuned Llama-2-Chat variant — it combined strong, near-leading quality with a broadly permissive licence that allowed commercial use. That combination was transformative: Llama 2 became the foundation for thousands of fine-tuned models and applications, effectively launching the modern open-LLM ecosystem that everything since has built upon.

How it was built

Llama 2 is a decoder-only transformer trained on roughly 2 trillion tokens of public text, with the chat variants further refined through supervised fine-tuning and reinforcement learning from human feedback (RLHF) for helpfulness and safety. Meta invested heavily in alignment and red-teaming, publishing detailed work on safety. The result was a family that delivered competitive performance — the 70B in particular rivalled much larger or closed models of its day — across a 4K-token context.

What it is good at

Llama 2 is a strong general-purpose model: conversational chat, question answering, summarisation, reasoning, writing and some coding, with the Chat variants tuned for assistant behaviour. Its real superpower, though, is as a foundation to build on — its open weights and permissive licence made it the base for countless domain models, fine-tunes and products, and it runs across a wide range of hardware thanks to the 7B/13B/70B size ladder.

Licensing & access

Llama 2 is released under the Llama 2 Community License — permissive for the vast majority of users and commercial use, with a special-case clause for services exceeding 700M monthly active users (review the terms). Weights are on Hugging Face (with a quick access request), run locally via Ollama and Transformers, and are offered by many inference providers. The 7B runs on consumer GPUs; the 70B needs multi-GPU or quantisation.

Practical considerations

By today's standards Llama 2 is superseded by Llama 3 and other newer models on reasoning, knowledge and especially context length (its 4K window is short by current norms). Use the Chat variant for assistants and the base for fine-tuning, and verify outputs as it can hallucinate. For new projects, Llama 3.x or another current model is usually the better default — but Llama 2 remains historically pivotal and still widely deployed.

How it compares

Llama 2 competed with Falcon, MPT and BLOOM at release, generally leading on quality while offering a workable commercial licence; Vicuna and many others were fine-tuned from it. Mixtral and later models pushed efficiency further with mixture-of-experts. Llama 2's defining role is as the catalyst of the open-LLM movement — the base that proved open models could be both capable and commercially usable, paving the way for its successors.

Getting started

Pull Llama 2 from Hugging Face (after the access request) or via Ollama for instant local chat — start with Llama-2-7b-chat to prototype, or 70B-chat for more quality. Use the Chat variant for assistants and the base for fine-tuning, run quantised builds to fit your GPU, and for the strongest results today consider benchmarking against Llama 3.x or other newer open models.

Model variants

Llama 2 7B Chat

Chat

Single-GPU

Llama 2 13B Chat

13B

Chat

Balanced

Llama 2 70B Chat

70B

ChatLargest

Highest quality

Capabilities

💬

Conversational chat

Chat variants tuned with RLHF for helpful assistant behaviour.

🧩

Foundation model

Open weights made it the base for thousands of fine-tunes and products.

📐

Size ladder

7B, 13B and 70B fit hardware from consumer GPUs to multi-GPU servers.

🔓

Commercial licence

Permissive terms allow most commercial use.

Pros & Cons

Pros6

Strong quality across 7B/13B/70B sizes
Chat variants tuned with RLHF
Permissive licence allowing commercial use
The foundation of the open-LLM ecosystem
Runs widely (Ollama, Transformers, providers)
Vast community and fine-tune ecosystem

Cons4

Superseded by Llama 3 and newer models
Short 4K context window
70B needs multi-GPU or quantisation
Can hallucinate — verify outputs

Inspiration

Llama 2 use cases & project ideas

Chat assistant

Use the Chat variant.

Fine-tuning base

Build a domain model.

Text generation

Write and summarise.

RAG app

Answer over documents.

FAQ

Frequently asked questions

What is Llama 2?+

Meta's landmark open LLM in 7B, 13B and 70B sizes with chat variants, whose quality and permissive licence built the open-LLM ecosystem.

Can I use it commercially?+

What sizes and variants are there?+

What is its context length?+

Should I use Llama 2 or Llama 3?+

More to explore

Learn more

From our blog

Tutorials