Meta·💬 Text Generation

Llama 3.3 70B

Function CallingWeb Searchfp8private

🧠 Try in Intelligence →

Try on Venice.ai ↗

Quick reference

Llama 3.3 70B — TLDR

🧠 70B instruction-tuned, text-only model from Meta
📏 128,000-token context window
🌐 Multilingual across eight officially supported languages
🔧 Function-calling and tool use, plus Venice web-search capability
⚡ FP8 quantization for efficient deployment at this scale
📚 Pretrained on roughly 15 trillion tokens of public data
💬 Tuned for multilingual dialogue and assistant-style chat
🔒 Released under Meta's Llama 3.3 Community License

💰 Pricing

$0.700 / $2.80

per 1M · input / output

📏 Context

128K tokens

📅 On Venice since

Apr 6, 2025

424 days ago

Provider

About this model

Llama 3.3 70B is Meta's text-only, instruction-tuned large language model in the Llama family, optimized for multilingual dialogue, assistant-style chat, and tool use. It was pretrained on approximately 15 trillion tokens from publicly available sources, with fine-tuning drawing on public instruction datasets plus large volumes of synthetically generated examples. Here it ships as an FP8 quantization with a 128,000-token context window, and adds function-calling and web-search capabilities.

Within the same family, it sits well above the compact Llama 3.2 3B, a much smaller model aimed at lightweight, on-device use, and it precedes the community fine-tune Hermes 3 Llama 3.1 405b, a far larger 405B variant built on a different base.

Compared with the earlier Llama 3.1 generation, Llama 3.3 70B is positioned by Meta's model card as a continued iteration of the 70B line, retaining the same 128,000-token context length and supporting tool use and code-interpreter style workflows. It is multilingual, with eight officially supported languages.

Note that Llama 3.3 is offered only as an instruction-tuned model; there is no separately released pretrained checkpoint in this revision. Released under Meta's Llama 3.3 Community License, it remains one of the most widely downloaded open-weight Llama checkpoints on Hugging Face.

🤗View model card on HuggingFace ↗View source on GitHub ↗

Sources

llama-3.3-70b-instruct Model by Metabuild.nvidia.com ↗

Introducing Llama 3.1: Our most capable models to dateai.meta.com ↗

meta-llama/Llama-3.3-70B-Instruct · Hugging Facehuggingface.co ↗

This About section is AI-generated from public sources (Claude Opus 4.8), with no human editing. It may contain inaccuracies — verify critical details against the sources listed above.

Research & Papers

Primary reference paper for this model family, sourced from the HuggingFace model card.

arXiv2204.05149Apr 2022

The Carbon Footprint of Machine Learning Training Will Plateau, Then Shrink(2022)

David Patterson, Joseph Gonzalez, Urs Hölzle et al.

Machine Learning (ML) workloads have rapidly grown in importance, but raised concerns about their carbon footprint. Four best practices can reduce ML training energy by up to 100x and CO2 emissions up to 1000x. By following best practices, overall ML energy use (across research,…

Data sources: Venice API · HuggingFace · Wikipedia · arXiv — enrichment updated 1d ago