Quick facts

Model ID: llama-4-maverick-17b-128e-instruct-fp8
Source: Azure Cognitive Services
Context Window: 128,000 tokens
Pricing: $0.25 input / $1.00 output per 1M tokens
Capabilities: tool calling, temperature control, open weights

Model overview

Llama 4 Maverick 17B 128E Instruct FP8 is a Llama-family AI model served through Azure Cognitive Services. It has a 128,000-token context window, accepts text and image input, and produces text output.

Published pricing is $0.25 input and $1.00 output per 1M tokens.
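To make the per-token arithmetic concrete, here is a minimal sketch that estimates the cost of a single request from the published rates. The function name `estimate_cost_usd` and the constant names are hypothetical, chosen for illustration. The sketch assumes billing is strictly linear in token count, with no minimums, rounding, or caching discounts, which this page does not confirm.

```python
from decimal import Decimal

# Published rates from this page, in USD per 1M tokens.
INPUT_USD_PER_MILLION = Decimal("0.25")
OUTPUT_USD_PER_MILLION = Decimal("1.00")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request in USD (hypothetical helper).

    Assumes cost scales linearly with token count; actual billing
    rules may differ.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = INPUT_USD_PER_MILLION * input_tokens / TOKENS_PER_MILLION
    output_cost = OUTPUT_USD_PER_MILLION * output_tokens / TOKENS_PER_MILLION
    return input_cost + output_cost
```

Under these assumptions, a request with 10,000 input tokens and 2,000 output tokens would come to $0.0025 + $0.0020 = $0.0045. `Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding noise.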

Model ID: llama-4-maverick-17b-128e-instruct-fp8
Provider: Azure Cognitive Services
Family: llama
Status: -
Knowledge Cutoff: 2024-08
Release Date: 2025-04-05
Input Modalities: text, image
Output Modalities: text
Context Window: 128,000 tokens
Input Limit: -
Output Limit: 8,192 tokens
Tool Calling: Yes
Reasoning: No
Structured Output: -
Temperature Control: Yes
Open Weights: Yes
Input Cost / 1M tokens: $0.25
Output Cost / 1M tokens: $1.00
Reasoning Cost / 1M tokens: -
Cache Read Cost / 1M tokens: -
Cache Write Cost / 1M tokens: -
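
The context window and output limit listed above can be combined into a simple pre-flight check before sending a request. The following is a rough sketch, not a documented API: the helper `max_output_tokens` is hypothetical, and it assumes that prompt and output tokens share the 128,000-token context window. The page does not state this, and it leaves Input Limit unspecified.

```python
CONTEXT_WINDOW = 128_000  # tokens, from the Context Window field
OUTPUT_LIMIT = 8_192      # tokens, from the Output Limit field


def max_output_tokens(prompt_tokens: int, requested: int = OUTPUT_LIMIT) -> int:
    """Return the largest output budget that fits (hypothetical helper).

    Assumption: prompt and output tokens share the context window.
    """
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    remaining = CONTEXT_WINDOW - prompt_tokens
    if remaining <= 0:
        raise ValueError("prompt leaves no room for output in the context window")
    # Cap by the caller's request, the model's output limit, and remaining space.
    return min(requested, OUTPUT_LIMIT, remaining)
```

For example, under this assumption a 125,000-token prompt leaves room for at most 3,000 output tokens, while a 1,000-token prompt is capped by the 8,192-token output limit. Token counts depend on the model's tokenizer, so `prompt_tokens` should come from a tokenizer-accurate count rather than a character estimate.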