Question 1

How much does Together AI charge to rent a GPU?

Accepted Answer

Together AI's cheapest published GPU is the NVIDIA H100 SXM at $4.79/hr on-demand, and it lists prices for three GPU models. Together AI is primarily a per-token inference platform. Dedicated single-tenant GPU access comes through Dedicated Inference (1x GPU) and GPU Clusters (priced per GPU in 8-GPU HGX nodes). Only current-generation NVIDIA GPUs are offered. These figures are a June 2026 snapshot; cloud GPU prices change weekly, so verify on the provider's pricing page before relying on a figure.

Question 2

Does Together AI have a free tier or credits?

Accepted Answer

Per-token inference has free trial credits; dedicated GPUs are pay-as-you-go.

Question 3

How is Together AI billed?

Accepted Answer

Together AI bills dedicated GPUs per hour, while its main product, inference, is billed per token. Dedicated single-tenant GPU access comes through Dedicated Inference (1x GPU) and GPU Clusters, which are priced per GPU in 8-GPU HGX nodes.
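
As a rough illustration of hourly billing, the sketch below multiplies a per-GPU hourly rate by the GPU count and the number of hours. The rates are the on-demand GPU Cluster figures listed on this page. The function name `estimate_cost` is hypothetical, and the assumption that usage is billed in proportion to hours, with no minimum charge or rounding increment, is not confirmed by Together AI.

```python
from decimal import Decimal

# On-demand GPU Cluster rates in USD per GPU-hour, copied from this page's
# June 2026 snapshot. Verify against the provider's pricing page before use.
ON_DEMAND_PER_GPU_HOUR = {
    "H100 SXM": Decimal("4.79"),
    "H200": Decimal("5.99"),
    "B200": Decimal("8.19"),
}


def estimate_cost(gpu: str, gpu_count: int, hours: Decimal) -> Decimal:
    """Estimate cost as rate x GPUs x hours, rounded to cents.

    Assumption (not from the provider): billing is proportional to hours,
    with no minimum charge and no rounding to a billing increment.
    """
    rate = ON_DEMAND_PER_GPU_HOUR[gpu]
    return (rate * gpu_count * hours).quantize(Decimal("0.01"))


# One 8-GPU H100 SXM node running for 24 hours.
print(estimate_cost("H100 SXM", 8, Decimal("24")))  # 919.68
```

Decimal arithmetic is used so that currency values do not pick up binary floating-point error.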

Question 4

Is Together AI cheaper than the hyperscalers?

Accepted Answer

Together AI is a specialist neocloud and is generally far cheaper than AWS, GCP or Azure for the same GPU.

GPU	VRAM	On-demand /hr	Spot /hr	Notes
NVIDIA H100 SXM	80GB HBM3	$4.79/hr	—	GPU Cluster per-GPU; Dedicated Inference 1x $6.49; reserved from $3.29.
NVIDIA H200	141GB HBM3e	$5.99/hr	—	GPU Cluster per-GPU; reserved from $3.99.
NVIDIA B200	192GB HBM3e	$8.19/hr	—	GPU Cluster per-GPU; Dedicated Inference 1x $11.95; reserved from $6.79.
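
To make the table easier to compare, the following sketch computes how far the lowest reserved price sits below the on-demand price, and how much more Dedicated Inference costs than a GPU Cluster GPU. The figures are copied from the table above; the names `PRICES`, `reserved_saving_pct` and `dedicated_premium_pct` are hypothetical. Because reserved prices are quoted as "from", the saving is an upper bound and not a guaranteed discount.

```python
# (on_demand, reserved_from, dedicated_inference_1x) in USD per GPU-hour,
# copied from the table above. None marks a price the table does not list.
PRICES = {
    "H100 SXM": (4.79, 3.29, 6.49),
    "H200": (5.99, 3.99, None),
    "B200": (8.19, 6.79, 11.95),
}


def reserved_saving_pct(gpu):
    """Maximum saving of the 'reserved from' price versus on-demand, in percent."""
    on_demand, reserved, _ = PRICES[gpu]
    return round((on_demand - reserved) / on_demand * 100, 1)


def dedicated_premium_pct(gpu):
    """Premium of Dedicated Inference 1x over the on-demand cluster price, in percent."""
    on_demand, _, dedicated = PRICES[gpu]
    if dedicated is None:
        return None  # the table lists no Dedicated Inference price for this GPU
    return round((dedicated - on_demand) / on_demand * 100, 1)


for gpu in PRICES:
    print(gpu, reserved_saving_pct(gpu), dedicated_premium_pct(gpu))
# H100 SXM 31.3 35.5
# H200 33.4 None
# B200 17.1 45.9
```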

