Back
AI Models

Choosing the Right AI Model

How VULK routes across 16+ AI models — Gemini 3.1 Pro (default), Claude Opus 4.7, GPT-5.4, Grok 4.20, Llama, Mistral, DeepSeek — and how to override per-generation or BYOK with zero markup.

Choosing the Right AI Model

VULK ships 16+ AI models out of the box. The router auto-picks the best one based on file type and difficulty (Smart Model Routing — 87 % cost savings vs naive single-model), but you can override per generation. Bring Your Own Key (BYOK) is supported for 9 providers with zero markup on the underlying API calls.

Default

Gemini 3.1 Pro is the default for all new generations. Validated against an 11-gate quality probe in April 2026 — currently the only model in our bench that passes all 11 gates on first try. 1M token context window.

Provider Models (managed)

VULK includes credits for these models on every paid plan. The cost column shows credit multiplier vs the 1× baseline (Gemini 3.1 Pro).

ModelProviderContextStrengthCost
Gemini 3.1 Pro (default)Google1MMultimodal, full-stack reasoning, design fidelity
Gemini 3.1 FlashGoogle1MFast iteration, simple components0.25×
Gemini 3.1 Flash ImageGoogle1MInline image generation0.5×
Claude Opus 4.7Anthropic1MArchitecture, complex business logic
Claude Sonnet 4.6Anthropic200kSurgical edits, careful refactors
Claude Haiku 4.5Anthropic200kQuick mutations, fast turn-around0.8×
GPT-5.4OpenAI400kStrong reasoning, mathematical logic
GPT-5.4 Codex MaxOpenAI400kPure-code optimisation
Grok 4.20xAI2MLong-context, multimodal0.4×
DeepSeek V3.2DeepSeek163kCost-efficient code generation0.13×
Mistral Codestral 2Mistral262kSpeed-optimised code0.5×
Llama 4 MaverickMeta1MOpen-weights option, long context0.3×
Qwen 3.5 CoderAlibaba256kMultilingual code0.4×
MiniMax M2.1MiniMax196kAsia-region inference0.6×
GLM 5.1Zhipu200kBudget-tier0.25×
Amazon Nova 2 ProAmazon1MAWS-region routing0.7×

Need a model that's not listed? File a request at [email protected] — we add new models as they ship.

Bring Your Own Model (BYOK) — zero markup

On Pro and above, connect your own API keys at vulk.dev/settings/api-keys. VULK routes your generations through your key with zero margin — you pay the underlying provider exactly what they charge.

ProviderRequired keyNotes
OpenAIOPENAI_API_KEYAll GPT-5 family + o-series
AnthropicANTHROPIC_API_KEYOpus 4.7, Sonnet 4.6, Haiku 4.5
Google AI StudioGEMINI_API_KEYAll Gemini 3.x models
MistralMISTRAL_API_KEYCodestral, Mistral Large
GroqGROQ_API_KEYUltra-fast Llama / Mixtral inference
DeepSeekDEEPSEEK_API_KEYV3.2 + Reasoner
Together AITOGETHER_API_KEY100+ open-weight models
FireworksFIREWORKS_API_KEYProduction-tuned open models
Ollama (self-hosted)URL of your Ollama serverFor privacy-first workflows

Any OpenAI-compatible endpoint also works — point VULK at your custom URL.

Smart Model Routing (automatic)

When you don't override, VULK picks per file:

  • Config files (package.json, tsconfig.json, .env.example) → Gemini 3.1 Flash (cheap, deterministic)
  • UI components → Gemini 3.1 Pro (design fidelity)
  • Complex business logic / auth / state-machines → Claude Opus 4.7 (premium reasoning)
  • Pure code (utility functions, hooks) → GPT-5.4 Codex Max
  • Schema / type files → Sonnet 4.6 (careful + cheap)

Result: 87 % credit savings vs running everything through the premium tier.

Recommendations by use case

Landing pages & marketing sites

  • Default: Gemini 3.1 Pro (best brand-consistent design)
  • Cheap: Gemini 3.1 Flash

Full-stack SaaS dashboards

  • Default: Gemini 3.1 Pro
  • Premium: Claude Opus 4.7 (deep architecture)

3D / WebGL / shader-heavy projects

  • Default: Gemini 3.1 Pro (excellent at GLSL + Three.js)
  • Premium: Claude Opus 4.7 (when shader is mathematically complex)

Mobile apps (Flutter / React Native)

  • Default: Gemini 3.1 Pro (multimodal, handles platform APIs)
  • Surgical edits: Sonnet 4.6

Backend / auth / billing

  • Default: Claude Opus 4.7 (highest reasoning for security-critical code)
  • Cheap: GPT-5.4 Codex Max

Voice / video-to-app input modalities

  • Required: Gemini 3.1 Pro (Whisper + multimodal vision pipeline)

Budget runs

  • Best: DeepSeek V3.2 (0.13×) or GLM 5.1 (0.25×)

Provider-chain failover

Every generation goes through a circuit breaker:

Primary → OpenRouter fallback → Anthropic fallback

If the primary provider is degraded, VULK auto-fails over without dropping the generation.

Changing models

  • Per generation: Click the model name in the chat header → pick from the dropdown.
  • Default: Settings → AI Models → set your preferred default.
  • BYOK: Settings → API Keys → add your key, then it appears in the picker labelled (BYOK).

Credit multipliers

The cost column on each model shows credit usage relative to Gemini 3.1 Pro = 1×:

  • 0.13× = uses 13 % of standard credits
  • = standard
  • = premium tier

A typical full-stack landing page costs ~50 credits on Gemini 3.1 Pro. The same page on Claude Opus 4.7 costs ~250 credits. Smart Model Routing (default) keeps the cost-quality balance optimal automatically.

MCP integration

If you're using Claude Desktop, Cursor, Windsurf, or any other MCP-compatible client, install vulk-mcp-server to drive VULK from inside your AI workflow:

npx vulk-mcp-server

See MCP Server for full setup.


Questions? [email protected]

On this page

VULK Support

Online

Hi! How can I help you today?

Popular topics

AI support • support.vulk.dev