Quick facts

Model ID nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B

Source Baseten

Context Window 202,800 tokens

Pricing $0.60 input / $2.40 output per 1M tokens

Capabilities tool calling, reasoning, structured output, temperature control, open weights

Model overview

Nemotron Ultra is an AI model available through Baseten with a 202,800-token context window and text input support.

Published pricing is $0.60 input and $2.40 output per 1M tokens.
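To make the per-token arithmetic concrete, here is a minimal sketch that estimates the cost of a single request from the listed rates. The function name `estimate_cost_usd` and the example token counts are illustrative assumptions, not part of any Baseten or NVIDIA API, and actual billing may differ from this simple linear estimate.

```python
# Listed rates in USD per 1M tokens, taken from the pricing above.
INPUT_USD_PER_MTOK = 0.60
OUTPUT_USD_PER_MTOK = 2.40


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumes cost scales linearly with token counts at the listed rates.
    """
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    # Example: 100,000 input tokens and 5,000 output tokens.
    print(f"${estimate_cost_usd(100_000, 5_000):.4f}")  # prints $0.0720
```

At these rates an output token costs four times as much as an input token, so long generations tend to dominate the bill even when the prompt is large.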

Model ID nvidia/NVIDIA-Nemotron-3-Ultra-550B-A55B

Provider Baseten

Family nemotron

Status -

Knowledge Cutoff -

Release Date 2026-06-04

Input Modalities text

Output Modalities text

Context Window 202,800 tokens

Input Limit -

Output Limit 202,800 tokens

Tool Calling Yes

Reasoning Yes

Structured Output Yes

Temperature Control Yes

Open Weights Yes

Input Cost / 1M tokens $0.60

Output Cost / 1M tokens $2.40

Reasoning Cost / 1M tokens -

Cache Read Cost / 1M tokens $0.12

Cache Write Cost / 1M tokens -
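The limits and cache pricing in the table above can be combined into a small pre-flight check. The sketch below rests on two assumptions the listing does not confirm: that prompt and output tokens share the 202,800-token context window, and that cached prompt tokens are billed at the cache read rate instead of the regular input rate. The helper names `fits_context` and `input_cost_usd` are hypothetical.

```python
# Values copied from the table above.
CONTEXT_WINDOW = 202_800          # tokens
OUTPUT_LIMIT = 202_800            # tokens
INPUT_USD_PER_MTOK = 0.60         # USD per 1M input tokens
CACHE_READ_USD_PER_MTOK = 0.12    # USD per 1M cached input tokens


def fits_context(prompt_tokens: int, max_output_tokens: int) -> bool:
    """Check whether a request stays within the listed limits.

    Assumption: prompt and output tokens count against the same window.
    """
    if max_output_tokens > OUTPUT_LIMIT:
        return False
    return prompt_tokens + max_output_tokens <= CONTEXT_WINDOW


def input_cost_usd(prompt_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the input-side cost when part of the prompt is a cache hit.

    Assumption: cached tokens are billed at the cache read rate only.
    """
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    fresh_tokens = prompt_tokens - cached_tokens
    total = fresh_tokens * INPUT_USD_PER_MTOK + cached_tokens * CACHE_READ_USD_PER_MTOK
    return total / 1_000_000


if __name__ == "__main__":
    print(fits_context(200_000, 2_800))   # exactly fills the window
    print(fits_context(200_000, 2_801))   # one token over
    print(f"${input_cost_usd(100_000, cached_tokens=80_000):.4f}")
```

Under these assumptions, a 100,000-token prompt with 80,000 cached tokens costs about $0.0216 on the input side, compared with $0.06 when nothing is cached.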