
Community-powered inference, cheaper and open.
Route requests to contributor-run GPUs across the Jatevo network. Access open models at a fraction of the cost, pay with $JTVO, and earn tokens by contributing your own idle compute.
Two inference paths. One gateway.
Pick the lane that fits your workload. You can switch anytime.
Serverless Inference
The default — fast, private, frontier models.
- Low latency, SLA-backed
- Prompts never leave Jatevo infra
- Frontier + open models (GPT, Claude, Gemini, GLM, Qwen…)
- Pay per token
Decentralized Inference
Community-powered — cheaper, open models, best-effort.
- Cheaper or free: $JTVO stakers get daily quota
- Open models only (GLM, Qwen, Kimi, DeepSeek…)
- Prompts may be visible to node operators
- Best-effort latency: contributor GPUs vary
| Feature | Serverless | Decentralized |
|---|---|---|
| Models | Frontier + open (GPT, Claude, Gemini…) | Open models only (GLM, Qwen, Kimi, DeepSeek…) |
| Speed | Low latency, SLA-backed | Best-effort — contributor GPUs vary |
| Privacy | Prompts never leave Jatevo infra | Node operators may see prompts |
| Cost | Pay per token | Cheaper or free — $JTVO stakers get quota |
| Uptime | 99.9%+ target | Depends on contributor availability |
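The comparison above can be read as a simple decision rule. The following is a minimal sketch, not part of any Jatevo SDK: the `Workload` fields, the `choose_lane` function, and the lowercase model-family strings are hypothetical names introduced only for illustration. It assumes that frontier models, sensitive prompts, and latency-critical traffic all belong on Serverless, as the table suggests.

```python
from dataclasses import dataclass

# Model families named on this page. The lowercase identifiers are an
# assumption for this sketch, not official model IDs.
OPEN_MODEL_FAMILIES = {"glm", "qwen", "kimi", "deepseek"}
FRONTIER_MODEL_FAMILIES = {"gpt", "claude", "gemini"}


@dataclass
class Workload:
    """Hypothetical description of a request or batch job."""
    model_family: str
    contains_sensitive_data: bool
    needs_low_latency: bool


def choose_lane(workload: Workload) -> str:
    """Return "serverless" or "decentralized" for a workload."""
    family = workload.model_family.lower()

    # Frontier models are only offered on Serverless. Unknown families
    # also fall back to Serverless, which the page describes as the default.
    if family in FRONTIER_MODEL_FAMILIES or family not in OPEN_MODEL_FAMILIES:
        return "serverless"

    # Node operators may see prompts on the decentralized network.
    if workload.contains_sensitive_data:
        return "serverless"

    # Decentralized latency is best-effort, so latency-critical work stays put.
    if workload.needs_low_latency:
        return "serverless"

    return "decentralized"
```

For example, `choose_lane(Workload("qwen", False, False))` returns `"decentralized"`, while the same workload flagged as sensitive returns `"serverless"`.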
When to choose Decentralized.
Bulk batch jobs
Run large evaluation sweeps or dataset labeling on cheaper community compute.
$JTVO-backed quota
Stake $JTVO to unlock daily request capacity on the decentralized network.
Open-model routing
Access GLM, Qwen, Kimi, DeepSeek and other open models without premium pricing.
Earn $JTVO by contributing compute.
Got idle GPUs? Run the Jatevo worker agent and earn tokens for every request you serve. The network routes traffic to contributors based on model availability, latency, and stake.
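This page does not say how model availability, latency, and stake are combined. To make the idea concrete, here is an illustrative sketch only: the `Contributor` type, the `pick_contributor` function, and the equal default weights are all assumptions, and the real routing logic may differ substantially.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Contributor:
    """Hypothetical view of a contributor node."""
    node_id: str
    models: frozenset   # model names this node can serve
    latency_ms: float   # recently observed latency
    stake: float        # amount of $JTVO staked


def pick_contributor(
    nodes: Iterable[Contributor],
    model: str,
    latency_weight: float = 0.5,  # assumed weight, not a documented value
    stake_weight: float = 0.5,    # assumed weight, not a documented value
) -> Optional[Contributor]:
    """Pick the best-scoring node that serves `model`, or None if none does."""
    # Model availability acts as a hard filter.
    candidates = [n for n in nodes if model in n.models]
    if not candidates:
        return None

    fastest = min(n.latency_ms for n in candidates)
    max_stake = max(n.stake for n in candidates)

    def score(n: Contributor) -> float:
        # Both terms are normalized to [0, 1]; higher is better.
        latency_score = fastest / n.latency_ms if n.latency_ms > 0 else 1.0
        stake_score = n.stake / max_stake if max_stake > 0 else 0.0
        return latency_weight * latency_score + stake_weight * stake_score

    return max(candidates, key=score)
```

Treating availability as a filter and blending the other two signals is one plausible design: a node that cannot serve the model is never chosen, while the weights control the trade-off between speed and stake.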
Start earning →
FAQ
How is this different from Serverless Inference?
Serverless runs on Jatevo-managed infrastructure, is SLA-backed, and keeps prompts inside Jatevo infra. Decentralized routes your requests to contributor-run GPUs: cheaper, but with best-effort latency and prompts that node operators may be able to see.
Can node operators see my prompts?
Yes. On the decentralized network, contributor GPU operators can potentially view prompts. Don't send sensitive data. Use Serverless for privacy-critical workloads.
How do I contribute my GPU?
Run the Jatevo worker agent on any CUDA-compatible GPU. Your hardware earns $JTVO tokens by serving inference requests. Join the waitlist below to get early access.
Which models are available?
Open-source models only: GLM, Qwen, Kimi, DeepSeek, and other community models. Frontier models (GPT, Claude, Gemini) require Serverless Inference.
Be first on the decentralized network.
Whether you want cheaper inference or want to earn $JTVO by contributing GPUs, join the waitlist and we'll reach out as the network goes live.
No spam. We'll only email about decentralized network access.