Llama Embed Nemotron 8B by Nvidia: Specs, Pricing & Capabilities

Model ID nvidia/llama-embed-nemotron-8b

Provider Nvidia

Family llama

Status -

Knowledge Cutoff 2025-03

Release Date 2025-03-18

Input Modalities text

Output Modalities text

Context Window 32768

Input Limit -

Output Limit 2048

Tool Calling No

Reasoning No

Structured Output -

Temperature Control No

Open Weights No

Input Cost / 1M tokens -

Output Cost / 1M tokens -

Reasoning Cost / 1M tokens -

Cache Read Cost / 1M tokens -

Cache Write Cost / 1M tokens -