Models
Every chat lets you pick from eight models served via NVIDIA NIM. Three are free; the other five unlock on Pro.
Free models
- GPT-OSS 120B (default) — balanced & fast, ~131K context window, up to 120,000 output tokens
- GPT-OSS 20B — lightweight GPT-OSS variant, up to 120,000 output tokens
- Mistral Small 4 — fastest response
Pro models
- Mistral Large 3 — largest, highest quality
- Llama 4 Maverick — fast & accurate, up to 16,384 output tokens
- GLM 5.1 — strong reasoning
- Qwen 3.5 122B — balanced Qwen
- Kimi K2.6 — long-context reasoning
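
The tiers and limits listed above can be modeled as a small lookup table. The following is only a minimal sketch, not the product's actual implementation: the names `Model`, `CATALOG`, `available_models`, and `clamp_output_tokens` are hypothetical, the entries use display names rather than real NVIDIA NIM model identifiers, and it assumes that Pro accounts keep access to the free models. Output limits are recorded only where the list states one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Model:
    name: str                                 # display name as shown in the list
    tier: str                                 # "free" or "pro"
    max_output_tokens: Optional[int] = None   # None where the list states no limit


# Hypothetical catalog mirroring the list; not real NIM model identifiers.
CATALOG = [
    Model("GPT-OSS 120B", "free", 120_000),
    Model("GPT-OSS 20B", "free", 120_000),
    Model("Mistral Small 4", "free"),
    Model("Mistral Large 3", "pro"),
    Model("Llama 4 Maverick", "pro", 16_384),
    Model("GLM 5.1", "pro"),
    Model("Qwen 3.5 122B", "pro"),
    Model("Kimi K2.6", "pro"),
]

DEFAULT_MODEL = "GPT-OSS 120B"


def available_models(plan: str) -> List[str]:
    """Return the model names selectable on a plan ("free" or "pro").

    Assumption: a Pro plan includes every free model as well.
    """
    return [m.name for m in CATALOG if plan == "pro" or m.tier == "free"]


def clamp_output_tokens(name: str, requested: int) -> int:
    """Cap a requested output length at the model's stated limit, if any."""
    model = next(m for m in CATALOG if m.name == name)
    if model.max_output_tokens is None:
        return requested  # no limit stated in the list, so leave it unchanged
    return min(requested, model.max_output_tokens)
```

For example, `available_models("free")` yields the three free models, and `clamp_output_tokens("Llama 4 Maverick", 50_000)` caps the request at that model's 16,384-token limit.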