# GPU billing

How GPU spot rentals are billed: credit holds, minimum sessions, per-GPU pricing, idle timeouts, and hold expiry for AIVory Smart Inference.

GPU rentals are billed from the same credit balance as chat completions. A credit hold is placed before any GPU is provisioned, so you always know the cost before it starts.

- Credit hold placed before provisioning, released when the rental ends
- 15-minute minimum session per rental
- Per-GPU billing: multi-GPU rentals are billed per GPU, not per VM
- Idle timeout terminates the GPU after 15 minutes of inactivity

## Credit holds

When you rent a GPU (or trigger a cold-start deployment), a credit hold is placed on your account for the minimum session cost. This ensures you have sufficient credits before any hardware is provisioned.

- Hold amounts depend on the GPU class and current spot pricing (typically $0.20-$0.50)
- Holds reduce your available balance, which is your balance minus active holds
- If your available balance is below the minimum session cost, you receive a 402 with the amount needed to top up

## Minimum session

Every GPU rental has a 15-minute minimum. Even if you terminate after 5 minutes, you are billed for 15 minutes. After the minimum, billing is per-second.
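
The hold check and the minimum-session rule can be sketched in a few lines. This is a minimal illustration, not AIVory's implementation: the function names are hypothetical, amounts are assumed to be dollar values held as `Decimal`, and elapsed time is assumed to be measured in whole seconds.

```python
from decimal import Decimal

MIN_SESSION_SECONDS = 15 * 60  # 15-minute minimum session


def available_balance(balance: Decimal, active_holds: list[Decimal]) -> Decimal:
    """Available balance = balance minus the sum of active holds."""
    return balance - sum(active_holds, Decimal("0"))


def can_place_hold(balance: Decimal, active_holds: list[Decimal], hold_amount: Decimal) -> bool:
    """True if the available balance covers a new hold (otherwise expect a 402)."""
    return available_balance(balance, active_holds) >= hold_amount


def billable_seconds(elapsed_seconds: int) -> int:
    """Rentals shorter than the minimum are billed as the full minimum;
    after that, billing follows the elapsed seconds."""
    if elapsed_seconds < 0:
        raise ValueError("elapsed_seconds must be non-negative")
    return max(elapsed_seconds, MIN_SESSION_SECONDS)
```

For example, `billable_seconds(5 * 60)` returns `900` (the 15-minute floor), while `billable_seconds(20 * 60)` returns `1200`.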

## Per-GPU pricing

Multi-GPU rentals are billed per GPU, not per VM. If you rent 2x H100, the cost is 2x the per-GPU hourly rate.

The pricing shown in `GET /v1/gpu/offers` is always the per-GPU hourly rate. Multiply by the number of GPUs and the rental duration for the total cost.
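
As a worked example of that multiplication, the sketch below estimates a rental's total cost. The function name `rental_cost` and the $2.00 rate are made up for illustration (they are not real AIVory prices), and the 15-minute minimum session is applied as a floor on the billed duration.

```python
from decimal import Decimal

SECONDS_PER_HOUR = 3600
MIN_SESSION_SECONDS = 15 * 60  # 15-minute minimum session


def rental_cost(rate_per_gpu_hour: Decimal, gpu_count: int, elapsed_seconds: int) -> Decimal:
    """Estimate total cost: per-GPU hourly rate x GPU count x billed hours."""
    if gpu_count < 1:
        raise ValueError("gpu_count must be at least 1")
    # Apply the minimum session before converting seconds to hours.
    billed_seconds = max(elapsed_seconds, MIN_SESSION_SECONDS)
    return rate_per_gpu_hour * gpu_count * Decimal(billed_seconds) / SECONDS_PER_HOUR


# Hypothetical rate: 2 GPUs at $2.00 per GPU-hour.
half_hour = rental_cost(Decimal("2.00"), 2, 30 * 60)    # billed for 30 minutes
five_minutes = rental_cost(Decimal("2.00"), 2, 5 * 60)  # billed for the 15-minute minimum
print(half_hour, five_minutes)
```

Using `Decimal` rather than `float` avoids binary rounding surprises in currency arithmetic. How the service rounds the final amount is not stated here, so the sketch leaves the result unrounded.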

## Idle timeout

If no requests are sent to a rented GPU for 15 minutes, it is automatically terminated. The remaining hold is released and returned to your available balance. This prevents runaway costs from forgotten instances.
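
To reason about when a rental would be terminated for inactivity, the following sketch walks a list of request timestamps and finds the first 15-minute gap. It rests on assumptions the text above does not confirm: the idle clock is taken to start when the rental starts and to reset on every request, and `idle_termination_time` is a hypothetical helper, not an API call.

```python
IDLE_TIMEOUT_SECONDS = 15 * 60  # 15 minutes of inactivity


def idle_termination_time(start: float, request_times: list[float]) -> float:
    """Return the time (seconds, same clock as the inputs) at which the
    idle timeout would fire for a rental that is never terminated manually."""
    last_activity = start
    for t in sorted(request_times):
        if t - last_activity > IDLE_TIMEOUT_SECONDS:
            # The gap exceeded the timeout, so the GPU was already gone
            # before this request arrived.
            break
        last_activity = t
    return last_activity + IDLE_TIMEOUT_SECONDS
```

With requests at 60 s and 600 s and nothing further until 2000 s, `idle_termination_time(0, [60, 600, 2000])` returns `1500`: 15 minutes after the last request that reached the GPU.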

## Hold expiry

Holds expire after 2 hours regardless of rental status. Unused held credits are returned to your available balance. This is a safety net: in practice, the hold is released when the rental terminates (either manually or via idle timeout).
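
Put together, a hold is released at whichever comes first: the rental terminating or the 2-hour expiry. A minimal sketch of that rule, assuming timezone-aware timestamps; `hold_release_time` is a hypothetical name:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

HOLD_TTL = timedelta(hours=2)  # holds expire 2 hours after placement


def hold_release_time(placed_at: datetime, terminated_at: Optional[datetime]) -> datetime:
    """Return when the hold is released: at termination (manual or idle
    timeout) if that happens first, otherwise at the 2-hour expiry."""
    expiry = placed_at + HOLD_TTL
    if terminated_at is None:
        return expiry  # rental still running: the safety-net expiry applies
    return min(terminated_at, expiry)


placed = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
ended = placed + timedelta(minutes=40)
print(hold_release_time(placed, ended))  # released at termination
print(hold_release_time(placed, None))   # released at expiry
```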

## Insufficient credits

If your available balance can't cover the minimum session hold, the API returns 402 `insufficient_credits`:

```json
{
  "error": {
    "message": "Insufficient credits. Your balance is $0.12. Add credits at https://app.aivory.net/billing.",
    "type": "insufficient_credits",
    "code": "insufficient_credits"
  }
}
```

Check `GET /v1/models`: models with `"availability": "cold"` will trigger the credit hold flow on first request.
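
On the client side, one way to detect this error is to check the status code and the `type` field of the body. The sketch below uses only the standard library and works on an already-received status and body, so it makes no assumptions about which HTTP client you use; the helper name `insufficient_credits_message` is hypothetical.

```python
import json
from typing import Optional


def insufficient_credits_message(status_code: int, body: str) -> Optional[str]:
    """Return the server's message for a 402 insufficient_credits response,
    or None for any other response or an unparseable body."""
    if status_code != 402:
        return None
    try:
        error = json.loads(body)["error"]
        if error["type"] != "insufficient_credits":
            return None
        return error["message"]
    except (json.JSONDecodeError, KeyError, TypeError):
        # Body was not JSON, or did not have the expected shape.
        return None


sample_body = json.dumps({
    "error": {
        "message": "Insufficient credits. Your balance is $0.12. "
                   "Add credits at https://app.aivory.net/billing.",
        "type": "insufficient_credits",
        "code": "insufficient_credits",
    }
})
print(insufficient_credits_message(402, sample_body))
```

A caller can surface the returned message to the user and pause further rental attempts until credits are added.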

## Next steps

- GPU API reference: Endpoints for browsing, renting, and managing GPUs
- General billing: Credits, caps, auto-recharge, and receipts
- GPU overview: When to use GPU rental and the cold-start flow

---

Original URL: https://aivory.net/md/docs/smart-inference/gpu/billing/index.md