Jatevo model catalog

Serverless model cards for the routes behind Jatevo.

Browse every playground-ready and request-access model route with provider logos, pricing markers, deployment type, and capability tags in one place.


Browse Models

Find model routes by provider, capability, or deployment style.

10 models

DeepSeek

Reasoning

DeepSeek V4 Pro

A high-capacity reasoning route for agentic workflows, coding tasks, and production chat traffic through Jatevo.

Input

$1.74

Output

$3.48

Speed

Fast

Serverless | Global | Request access

Z.ai

Code

GLM 5.2

GLM 5.2 is Jatevo's dedicated Z.ai route for long-running software, agent, and tool-use sessions.

Input

$1.40

Output

$4.40

Context

Serverless | APAC | Try model

NVIDIA

New | Reasoning

NVIDIA Nemotron 3 Ultra 550B A55B NVFP4

Nemotron 3 Ultra is a 550B hybrid MoE model from NVIDIA, optimized for demanding multi-agent AI and complex reasoning tasks.

Input

$0.60

Output

$3.60

Speed

59 Tok/s

Serverless | us-central1 | Try model

Jatevo Inference

New | Reasoning

Kimi K3

Kimi K3 provides maximum-effort reasoning and a 1,048,576-token context window through Jatevo API Master.

Input

$0.75

Output

$3.50

Context

Serverless | Global | Try model

Jatevo Inference

New | Code

Kimi K2.7 Code

Kimi K2.7 Code runs on Jatevo Inference for long-context software work, agent execution, and production chat workloads.

Input

$0.75

Output

$3.50

Focus

Agentic

Serverless | Global | Try model

Alibaba Cloud

Chat

Qwen 3.7 Max

A Qwen Max route for text-only enterprise workflows, exposed through Jatevo with scoped key enforcement.

Input

$1.25

Output

$3.75

Speed

Fast

Serverless | APAC | Try model

Cerebras

Chat

Cerebras GLM 4.7

GLM 4.7 is served on the existing Cerebras route for low-latency prompt iteration and streaming chat sessions.

Input

Low

Output

Low

Speed

Ultra-fast

Serverless | Global | Try model

Cerebras

New | Chat

Cerebras Gemma 4 31B

Gemma 4 31B runs through the existing Cerebras playground route for fast streaming chat and long completion tests.

Input

$1.00

Output

$1.50

Output cap

Fast

Serverless | Global | Try model

Spark

New | Chat

Spark Gemma 4 26B A4B

Spark Gemma 4 26B A4B is exposed through API Master as a serverless Gemma route for lightweight chat and generation workloads.

Input

Metered

Output

Metered

Route

Fast

Serverless | Global | Try model

RTX PRO 6000

New | Chat

RTX Qwen3.6 35B A3B NVFP4

Qwen3.6 35B A3B NVFP4 runs on Jatevo's RTX PRO 6000 route for high-memory chat and long-context inference through API Master.

Input

Metered

Output

Metered

Context

262K

Serverless | Dedicated GPU | Try model
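
The input and output prices on the cards above can be compared for a given workload with a few lines of Python. This is a minimal sketch, not part of the Jatevo API: the catalog does not state the billing unit, so the code assumes prices are USD per 1 million tokens, and the names `RoutePrice`, `estimate_cost`, and `cheapest` are hypothetical helpers. Routes listed as "Low" or "Metered" are left out because they have no numeric price.

```python
from dataclasses import dataclass

# ASSUMPTION: card prices are USD per 1,000,000 tokens.
# The catalog page does not state the unit; adjust if it differs.
TOKENS_PER_PRICE_UNIT = 1_000_000


@dataclass(frozen=True)
class RoutePrice:
    """Hypothetical record holding the numeric prices shown on one card."""
    name: str
    input_usd: float
    output_usd: float


# Prices copied from the cards above (numeric entries only).
PRICES = [
    RoutePrice("DeepSeek V4 Pro", 1.74, 3.48),
    RoutePrice("GLM 5.2", 1.40, 4.40),
    RoutePrice("NVIDIA Nemotron 3 Ultra 550B A55B NVFP4", 0.60, 3.60),
    RoutePrice("Kimi K3", 0.75, 3.50),
    RoutePrice("Kimi K2.7 Code", 0.75, 3.50),
    RoutePrice("Qwen 3.7 Max", 1.25, 3.75),
    RoutePrice("Cerebras Gemma 4 31B", 1.00, 1.50),
]


def estimate_cost(route: RoutePrice, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of a workload under the per-1M-token assumption."""
    total = input_tokens * route.input_usd + output_tokens * route.output_usd
    return total / TOKENS_PER_PRICE_UNIT


def cheapest(input_tokens: int, output_tokens: int) -> RoutePrice:
    """Route with the lowest estimated cost for the given token counts."""
    return min(PRICES, key=lambda r: estimate_cost(r, input_tokens, output_tokens))


if __name__ == "__main__":
    # Example workload: 200k input tokens and 50k output tokens.
    for route in PRICES:
        print(f"{route.name}: ${estimate_cost(route, 200_000, 50_000):.4f}")
    print("Cheapest:", cheapest(200_000, 50_000).name)
```

Price is only one axis of comparison; the cards also differ in region, speed, and access status, which this sketch ignores.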