Inference · Decentralized

Community-powered inference, cheaper and open.

Route requests to contributor-run GPUs across the Jatevo network. Access open models at a fraction of the cost, pay with $JTVO, and earn tokens by contributing your own idle compute.

Join the waitlist

Explore Serverless →

Two inference paths. One gateway.

Pick the lane that fits your workload. You can switch anytime.

Serverless Inference

The default — fast, private, frontier models.

Default

• Low latency, SLA-backed
• Prompts never leave Jatevo infra
• Frontier + open models (GPT, Claude, Gemini, GLM, Qwen…)
• Pay per token

Use Serverless

Decentralized Inference

Community-powered — cheaper, open models, best-effort.

New

• Cheaper or free — $JTVO stakers get daily quota
• Open models only (GLM, Qwen, Kimi, DeepSeek…)
• Prompts may be visible to node operators
• Best-effort latency — contributor GPUs vary

Get early access

Feature	Serverless	Decentralized
Models	Frontier + open (GPT, Claude, Gemini…)	Open models only (GLM, Qwen, Kimi, DeepSeek…)
Speed	Low latency, SLA-backed	Best-effort — contributor GPUs vary
Privacy	Prompts never leave Jatevo infra	Node operators may see prompts
Cost	Pay per token	Cheaper or free — $JTVO stakers get quota
Uptime	99.9%+ target	Depends on contributor availability
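The comparison above can be read as a simple decision rule. The sketch below is illustrative only: `choose_lane`, its parameters, and the lane names are hypothetical and not part of any Jatevo SDK, and the model-family list just mirrors the examples named on this page.

```python
# Hypothetical helper: picks a lane from the trade-offs in the comparison table.
# The family prefixes below are assumptions based on the models listed on this page.
OPEN_MODEL_FAMILIES = ("glm", "qwen", "kimi", "deepseek")


def choose_lane(model: str, sensitive: bool, needs_sla: bool) -> str:
    """Return "serverless" or "decentralized" for a given workload."""
    is_open = model.lower().startswith(OPEN_MODEL_FAMILIES)
    # Sensitive prompts, SLA requirements, and frontier models all point to Serverless.
    if sensitive or needs_sla or not is_open:
        return "serverless"
    return "decentralized"
```

For example, a non-sensitive batch job on an open model with no latency guarantee would come back as `"decentralized"`, while anything carrying private data would stay on `"serverless"`.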

Use cases

When to choose Decentralized.

Bulk batch jobs

Run large evaluation sweeps or dataset labeling on cheaper community compute.

$JTVO-backed quota

Stake $JTVO to unlock daily request capacity on the decentralized network.

Open-model routing

Access GLM, Qwen, Kimi, DeepSeek and other open models without premium pricing.

For GPU owners

Earn $JTVO by contributing compute.

Got idle GPUs? Run the Jatevo worker agent and earn tokens for every request you serve. The network routes traffic to contributors based on model availability, latency, and stake.
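As a rough illustration of routing on model availability, latency, and stake, the sketch below filters contributors by model and ranks the rest with a weighted score. The `Contributor` fields, the weights, and the scoring formula are all assumptions made for this example; the network's actual routing logic is not described on this page.

```python
from dataclasses import dataclass


@dataclass
class Contributor:
    node_id: str
    models: frozenset   # models this node currently serves
    latency_ms: float   # recent average latency
    stake: float        # staked $JTVO


def pick_contributor(nodes, model, latency_weight=1.0, stake_weight=1.0):
    """Return the best-scoring node that serves `model`, or None if none does."""
    candidates = [n for n in nodes if model in n.models]
    if not candidates:
        return None
    # Normalize stake against the largest stake among candidates (avoid dividing by zero).
    max_stake = max(n.stake for n in candidates) or 1.0

    def score(n):
        # Lower latency and higher stake both raise the score.
        latency_term = 1.0 / (1.0 + n.latency_ms / 1000.0)
        stake_term = n.stake / max_stake
        return latency_weight * latency_term + stake_weight * stake_term

    return max(candidates, key=score)
```

Under this hypothetical scoring, a node only competes for requests targeting models it serves, and raising either weight shifts the balance between responsiveness and stake.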

Start earning →

1. Install the Jatevo worker on any CUDA GPU

2. Stake $JTVO to join the contributor pool

3. Earn tokens for each inference request served

4. Monitor uptime and earnings in the dashboard

FAQ

How is this different from Serverless Inference?

Serverless runs on Jatevo-managed infrastructure, backed by SLAs, and your prompts never leave Jatevo infra. Decentralized routes your requests to contributor-run GPUs: it is cheaper, but latency varies and node operators may be able to see your prompts.

Can node operators see my prompts?

Yes. On the decentralized network, contributor GPU operators can potentially view prompts. Don't send sensitive data. Use Serverless for privacy-critical workloads.

How do I contribute my GPU?

Run the Jatevo worker agent on any CUDA-compatible GPU. Your hardware earns $JTVO tokens by serving inference requests. Join the waitlist below to get early access.

Which models are available?

Open-source models only: GLM, Qwen, Kimi, DeepSeek, and other community models. Frontier models (GPT, Claude, Gemini) require Serverless Inference.

Be first on the decentralized network.

Whether you want cheaper inference or want to earn $JTVO by contributing GPUs, join the waitlist and we'll reach out as the network goes live.

No spam. We'll only email about decentralized network access.