Llama 3.3 Nemotron Super 49B V1.5

NVIDIA

What is Llama 3.3 Nemotron Super 49B V1.5?

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s post-trained for agentic workflows (RAG, tool calling) via SFT across math, code, science, and...

Specifications

Developer: NVIDIA
Context window: 131,072 tokens (128K)
Max output: 16,384 tokens (16K)
Input modalities: text
Output modalities: text
Input price: $0.10 per 1M tokens
Output price: $0.40 per 1M tokens
Knowledge cutoff: 2024-03-31
Supported parameters: frequency_penalty, include_reasoning, logit_bias, max_tokens, min_p, presence_penalty, reasoning, repetition_penalty, response_format, seed, stop, temperature, tool_choice, tools, top_k, top_p
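The pricing and parameter list above can be turned into a quick pre-flight check. The following is a minimal sketch, not part of any official SDK: the helper names estimate_cost and unsupported_parameters are hypothetical, the prices and parameter names are copied from the specifications above, and the token counts in the usage lines are made-up inputs.

```python
# Prices copied from the specifications above (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.10
OUTPUT_PRICE_PER_M = 0.40

# Parameter names copied from the "Supported parameters" line above.
SUPPORTED_PARAMETERS = frozenset({
    "frequency_penalty", "include_reasoning", "logit_bias", "max_tokens",
    "min_p", "presence_penalty", "reasoning", "repetition_penalty",
    "response_format", "seed", "stop", "temperature", "tool_choice",
    "tools", "top_k", "top_p",
})


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


def unsupported_parameters(params: dict) -> list[str]:
    """Return the optional request parameters that are not in the supported list.

    Assumption: `params` holds only optional sampling/tool settings, not
    required request fields such as the model name or the messages.
    """
    return sorted(name for name in params if name not in SUPPORTED_PARAMETERS)


# Usage with illustrative numbers: 100,000 input tokens and 10,000 output tokens.
cost = estimate_cost(100_000, 10_000)
print(f"{cost:.4f}")  # 0.0140

# "n" is not in the supported list, so it is reported.
print(unsupported_parameters({"temperature": 0.6, "top_p": 0.95, "n": 2}))  # ['n']
```

Because output tokens cost four times as much as input tokens at these rates, capping max_tokens is usually the more effective lever for controlling spend on long reasoning outputs.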

Use Llama 3.3 Nemotron Super 49B V1.5 with PromptLayer

PromptLayer lets teams manage, evaluate, and observe prompts that run on Llama 3.3 Nemotron Super 49B V1.5 alongside every other model in their stack. Version prompts, run evals across models, and ship safe rollouts from the same dashboard.
