whichllm — Browse and compare AI model specs and pricing

Llama 4 Maverick 17B 128E Instruct FP8

Model ID: llama-4-maverick-17b-128e-instruct-fp8
Provider: Azure Cognitive Services
Family: llama
Status: -
Knowledge Cutoff: 2024-08
Release Date: 2025-04-05
Input Modalities: text, image
Output Modalities: text
Context Window: 128,000 tokens
Input Limit: -
Output Limit: 8,192 tokens
Tool Calling: Yes
Reasoning: No
Structured Output: -
Temperature Control: Yes
Open Weights: Yes
Input Cost / 1M tokens: $0.25
Output Cost / 1M tokens: $1.00
Reasoning Cost / 1M tokens: -
Cache Read Cost / 1M tokens: -
Cache Write Cost / 1M tokens: -
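
The per-token prices and limits listed above can be turned into a rough per-request estimate. The following is a minimal sketch, not part of the listing: the helper names `estimate_cost` and `fits_limits` are hypothetical, and because the Input Limit is not given, the check assumes that input and output tokens share the 128,000-token context window.

```python
from decimal import Decimal

# Values copied from the spec listing.
INPUT_COST_PER_1M = Decimal("0.25")   # USD per 1M input tokens
OUTPUT_COST_PER_1M = Decimal("1.00")  # USD per 1M output tokens
CONTEXT_WINDOW = 128_000              # tokens
OUTPUT_LIMIT = 8_192                  # tokens


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: USD cost of one request at the listed rates."""
    total = input_tokens * INPUT_COST_PER_1M + output_tokens * OUTPUT_COST_PER_1M
    return total / Decimal(1_000_000)


def fits_limits(input_tokens: int, output_tokens: int) -> bool:
    """Hypothetical helper.

    Assumption: the context window is shared by input and output tokens,
    since the listing does not state a separate input limit.
    """
    within_output = output_tokens <= OUTPUT_LIMIT
    within_context = input_tokens + output_tokens <= CONTEXT_WINDOW
    return within_output and within_context


if __name__ == "__main__":
    # 100,000 input tokens and 8,000 output tokens.
    print(estimate_cost(100_000, 8_000))
    print(fits_limits(100_000, 8_000))
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift when many requests are summed. Cache and reasoning costs are left out because the listing gives no values for them.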