Inference · Decentralized

Community-powered inference, cheaper and open.

Route requests to contributor-run GPUs across the Jatevo network. Access open models at a fraction of the cost, pay with $JTVO, and earn tokens by contributing your own idle compute.

Two inference paths. One gateway.

Pick the lane that fits your workload. You can switch anytime.

Serverless Inference

The default — fast, private, frontier models.

Default
  • Low latency, SLA-backed
  • Prompts never leave Jatevo infra
  • Frontier + open models (GPT, Claude, Gemini, GLM, Qwen…)
  • Pay per token
Use Serverless

Decentralized Inference

Community-powered — cheaper, open models, best-effort.

New
  • Cheaper or free — $JTVO stakers get daily quota
  • Open models only (GLM, Qwen, Kimi, DeepSeek…)
  • Prompts may be visible to node operators
  • Best-effort latency — contributor GPUs vary
Get early access
Feature | Serverless | Decentralized
Models | Frontier + open (GPT, Claude, Gemini…) | Open models only (GLM, Qwen, Kimi, DeepSeek…)
Speed | Low latency, SLA-backed | Best-effort: contributor GPUs vary
Privacy | Prompts never leave Jatevo infra | Node operators may see prompts
Cost | Pay per token | Cheaper or free; $JTVO stakers get quota
Uptime | 99.9%+ target | Depends on contributor availability
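
The comparison above can be read as a simple decision rule. The sketch below is illustrative only: `Workload`, `choose_lane`, and the lane labels are hypothetical names, not part of any Jatevo SDK, and the rule encodes just the trade-offs listed in the comparison (model availability, prompt privacy, SLA needs).

```python
from dataclasses import dataclass

# Model families that the comparison lists as Serverless-only (assumption:
# matching on a lowercase family name is enough for this illustration).
FRONTIER_FAMILIES = {"gpt", "claude", "gemini"}


@dataclass
class Workload:
    model_family: str        # e.g. "qwen", "claude"
    sensitive_prompts: bool  # True if node operators must not see prompts
    needs_sla: bool          # True if latency or uptime guarantees matter


def choose_lane(workload: Workload) -> str:
    """Return "serverless" or "decentralized" (hypothetical labels)."""
    # Frontier models are not offered on the decentralized network.
    if workload.model_family.lower() in FRONTIER_FAMILIES:
        return "serverless"
    # Node operators may see prompts, and latency is best-effort.
    if workload.sensitive_prompts or workload.needs_sla:
        return "serverless"
    # Open model, non-sensitive, latency-tolerant: the cheaper lane fits.
    return "decentralized"


if __name__ == "__main__":
    batch_eval = Workload("qwen", sensitive_prompts=False, needs_sla=False)
    print(choose_lane(batch_eval))
```

A bulk evaluation sweep on an open model with non-sensitive prompts lands on the decentralized lane; anything needing a frontier model, privacy, or an SLA stays on Serverless.
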
Use cases

When to choose Decentralized.

Bulk batch jobs

Run large evaluation sweeps or dataset labeling on cheaper community compute.

$JTVO-backed quota

Stake $JTVO to unlock daily request capacity on the decentralized network.

Open-model routing

Access GLM, Qwen, Kimi, DeepSeek and other open models without premium pricing.

For GPU owners

Earn $JTVO by contributing compute.

Got idle GPUs? Run the Jatevo worker agent and earn tokens for every request you serve. The network routes traffic to contributors based on model availability, latency, and stake.

Start earning →
1. Install the Jatevo worker on any CUDA GPU
2. Stake $JTVO to join the contributor pool
3. Earn tokens for each inference request served
4. Monitor uptime and earnings in the dashboard
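
To make the routing criteria above concrete (model availability, latency, and stake), here is a minimal sketch of how a router might rank contributor nodes. Everything in it is an assumption for illustration: the `Node` fields, the `pick_node` function, and the equal default weights are hypothetical, and the actual Jatevo routing logic is not described on this page.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Node:
    node_id: str
    models: frozenset   # open models this contributor currently serves
    latency_ms: float   # recently observed latency (hypothetical metric)
    stake: float        # $JTVO staked by the contributor


def pick_node(nodes: Iterable[Node], model: str,
              latency_weight: float = 1.0,
              stake_weight: float = 1.0) -> Optional[Node]:
    """Pick the best-scoring node that serves `model`, or None if none does."""
    # Model availability is a hard filter.
    candidates = [n for n in nodes if model in n.models]
    if not candidates:
        return None

    min_latency = min(n.latency_ms for n in candidates)
    max_stake = max(n.stake for n in candidates) or 1.0  # avoid dividing by 0

    def score(n: Node) -> float:
        # Fastest candidate scores 1.0 on latency; slower ones score less.
        latency_score = min_latency / n.latency_ms if n.latency_ms > 0 else 1.0
        # Largest stake scores 1.0; smaller stakes scale down linearly.
        stake_score = n.stake / max_stake
        return latency_weight * latency_score + stake_weight * stake_score

    return max(candidates, key=score)


if __name__ == "__main__":
    pool = [
        Node("a", frozenset({"qwen"}), latency_ms=100.0, stake=10.0),
        Node("b", frozenset({"qwen"}), latency_ms=50.0, stake=10.0),
        Node("c", frozenset({"glm"}), latency_ms=10.0, stake=100.0),
    ]
    chosen = pick_node(pool, "qwen")
    print(chosen.node_id if chosen else "no node available")
```

With equal stakes, the lower-latency node wins; raising `stake_weight` would favor contributors with more $JTVO staked. A real router would likely also account for load and uptime, which this sketch leaves out.
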

FAQ

How is this different from Serverless Inference?

Serverless runs on Jatevo-managed infrastructure with SLAs, and prompts never leave Jatevo infra. Decentralized routes your requests to contributor-run GPUs: cheaper, but with best-effort latency and prompts that node operators may be able to see.

Can node operators see my prompts?

Yes. On the decentralized network, contributor GPU operators can potentially view prompts. Don't send sensitive data. Use Serverless for privacy-critical workloads.

How do I contribute my GPU?

Install the Jatevo worker agent on any CUDA-compatible GPU and stake $JTVO to join the contributor pool. Your hardware then earns $JTVO tokens by serving inference requests. Join the waitlist below to get early access.

Which models are available?

Open models only: GLM, Qwen, Kimi, DeepSeek, and other community models. Frontier models (GPT, Claude, Gemini) require Serverless Inference.

Be first on the decentralized network.

Whether you want cheaper inference or you want to earn $JTVO by contributing GPUs — join the waitlist and we'll reach out as the network goes live.

No spam. We'll only email about decentralized network access.