Together AI GPU rental pricing
Together AI · specialist neocloud · billed per hour
Together AI is a specialist neocloud that publishes per-hour prices for 3 GPU models. Its cheapest is the NVIDIA H100 SXM at $4.79/hr on-demand. It is primarily a per-token inference platform; dedicated single-tenant GPU access comes through Dedicated Inference (1x) and GPU Clusters (priced per GPU in 8-GPU HGX nodes). The catalog is current-generation NVIDIA only. Per-token inference has free trial credits; dedicated GPUs are pay-as-you-go.
Source: Together AI pricing. Data as of June 2026.
Together AI GPU prices
| GPU | VRAM | On-demand /hr | Spot /hr | Notes |
|---|---|---|---|---|
| NVIDIA H100 SXM | 80GB HBM3 | $4.79/hr | — | GPU Cluster price per GPU; Dedicated Inference (1x) $6.49/hr; reserved from $3.29/hr. |
| NVIDIA H200 | 141GB HBM3e | $5.99/hr | — | GPU Cluster price per GPU; reserved from $3.99/hr. |
| NVIDIA B200 | 192GB HBM3e | $8.19/hr | — | GPU Cluster price per GPU; Dedicated Inference (1x) $11.95/hr; reserved from $6.79/hr. |
Prices are a June 2026 snapshot. Cloud GPU prices change weekly; verify on the provider's pricing page before relying on a figure.
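The per-GPU figures in the table above can be turned into per-node rates and reserved savings with simple arithmetic. The sketch below is illustrative only: the prices are copied from the table, while the function names and the assumption that a full 8-GPU HGX node is billed at eight times the per-GPU rate are ours, not something the provider states.

```python
# USD per GPU-hour, copied from the table above (June 2026 snapshot).
PRICES = {
    "H100 SXM": {"on_demand": 4.79, "reserved_from": 3.29},
    "H200": {"on_demand": 5.99, "reserved_from": 3.99},
    "B200": {"on_demand": 8.19, "reserved_from": 6.79},
}

# Assumption: a GPU Cluster node holds 8 GPUs and is billed as 8 x the per-GPU rate.
GPUS_PER_NODE = 8


def node_hourly(gpu: str, tier: str = "on_demand") -> float:
    """Hourly price of one full 8-GPU node for the given GPU and price tier."""
    return round(PRICES[gpu][tier] * GPUS_PER_NODE, 2)


def reserved_saving_pct(gpu: str) -> float:
    """Best-case reserved discount versus on-demand, as a percentage."""
    p = PRICES[gpu]
    return round(100 * (1 - p["reserved_from"] / p["on_demand"]), 1)


if __name__ == "__main__":
    for name in PRICES:
        print(name, node_hourly(name), reserved_saving_pct(name))
```

Under these assumptions an on-demand H100 SXM node works out to $38.32/hr, and the "reserved from" price is about 31.3% below on-demand. Because the reserved figures are "from" prices, the computed saving is an upper bound rather than a quoted discount.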
Cost pros & cons
Where Together AI saves money
- Strong inference stack
- Reserved discounts on clusters
- Managed software layer
Watch-outs
- Tiny GPU catalog (H100/H200/B200)
- Higher per-hour prices than bare neoclouds
- No spot tier
Frequently asked questions
How much does Together AI charge to rent a GPU?
Together AI's cheapest published GPU is the NVIDIA H100 SXM at $4.79/hr on-demand; it lists prices for 3 GPU models. It is primarily a per-token inference platform, with dedicated single-tenant GPU access through Dedicated Inference (1x) and GPU Clusters (priced per GPU in 8-GPU HGX nodes), and it offers current-generation NVIDIA GPUs only. These figures are a June 2026 snapshot. Cloud GPU prices change weekly; verify on the provider's pricing page before relying on a figure.
Does Together AI have a free tier or credits?
Per-token inference has free trial credits; dedicated GPUs are pay-as-you-go.
How is Together AI billed?
Together AI bills dedicated GPUs per hour. It is primarily a per-token inference platform; dedicated single-tenant GPU access comes through Dedicated Inference (1x) and GPU Clusters (priced per GPU in 8-GPU HGX nodes). The catalog is current-generation NVIDIA only.
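As a rough illustration of per-hour billing, the sketch below estimates a bill as rate times GPU count times hours. The function name is hypothetical, and the model assumes plain linear billing with no minimum commitment, rounding rules, storage or network charges, none of which this page documents.

```python
def estimate_cost(rate_per_gpu_hour: float, gpus: int, hours: float) -> float:
    """Estimate a bill in USD, assuming simple linear per-GPU-hour billing."""
    if rate_per_gpu_hour < 0 or gpus < 1 or hours < 0:
        raise ValueError("rate and hours must be non-negative, and gpus at least 1")
    return round(rate_per_gpu_hour * gpus * hours, 2)


if __name__ == "__main__":
    # One 8-GPU H100 SXM node at the on-demand cluster rate for 24 hours.
    print(estimate_cost(4.79, gpus=8, hours=24))
    # One Dedicated Inference (1x) H100 running for 30 days (720 hours).
    print(estimate_cost(6.49, gpus=1, hours=720))
```

With the June 2026 snapshot prices, the first case comes to $919.68 and the second to $4,672.80. Treat both as estimates and check the provider's pricing page for the actual billing granularity.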
Is Together AI cheaper than the hyperscalers?
As a specialist neocloud, Together AI is generally far cheaper than AWS, GCP or Azure for the same GPU, though its per-hour rates run higher than those of bare neoclouds.
Source & accuracy
Figures are a dated snapshot from Together AI's pricing page (as of June 2026). Cloud GPU pricing is volatile and varies by region and discounts — verify on the vendor's page before renting. See our methodology. This is an informational comparison, not a quote.
Last updated: 2026-06-21