# All AI Models - Live Pricing | AIVory Smart Inference

## Pages in this section:

- [DeepSeek V3.2 API - Cheapest Provider | AIVory Smart Inference](https://aivory.net/md/smart-inference/models/deepseek-v3-2/index.md) - Run DeepSeek V3.2 (671B MoE, 37B active) at the lowest live rate. 160K context, strong at code and math. From $0.14/M input tokens.

- [DeepSeek V4 Flash API - Cheapest Provider | AIVory Smart Inference](https://aivory.net/md/smart-inference/models/deepseek-v4-flash/index.md) - Run DeepSeek V4 Flash at the lowest live rate. Speed-optimized MoE for high-throughput pipelines. 128K context. Credits from $10.

- [Gemma 2 9B API - Cheapest Provider | AIVory Smart Inference](https://aivory.net/md/smart-inference/models/gemma2-9b/index.md) - Run Google Gemma 2 9B at the lowest live rate. One of the smallest models in the catalogue. 8K context. For high-volume, simple tasks. Credits from $10.

- [Gemma 4 31B API - Cheapest Provider | AIVory Smart Inference](https://aivory.net/md/smart-inference/models/gemma-4-31b/index.md) - Run Google Gemma 4 31B at the lowest live rate. 128K context, strong instruction following. Fits on a single GPU. Credits from $10.

- [Llama 3.3 70B API - Cheapest Provider | AIVory Smart Inference](https://aivory.net/md/smart-inference/models/llama-3-3-70b/index.md) - Run Meta Llama 3.3 70B at the lowest live rate. The most widely deployed open-weight model. 128K context. From $0.18/M input tokens.

- [Llama 4 Maverick API - Cheapest Provider | AIVory Smart Inference](https://aivory.net/md/smart-inference/models/llama-4-maverick/index.md) - Run Meta Llama 4 Maverick (400B MoE, 17B active) through the cheapest live provider. One API call, no infrastructure. From $0.13/M input tokens.

- [Llama 4 Scout API - Cheapest Provider | AIVory Smart Inference](https://aivory.net/md/smart-inference/models/llama-4-scout/index.md) - Run Meta Llama 4 Scout (109B MoE, 17B active) at the lowest live rate. Good balance of quality and cost. 128K context. Credits from $10.

- [Mistral Nemo 12B API - Cheapest Provider | AIVory Smart Inference](https://aivory.net/md/smart-inference/models/mistral-nemo-instruct/index.md) - Run Mistral Nemo 12B at the lowest live rate. 128K context in a 12B model. Fast, cheap, good for high-volume extraction. Credits from $10.

- [Mistral Small 24B API - Cheapest Provider | AIVory Smart Inference](https://aivory.net/md/smart-inference/models/mistral-small-24b-2501/index.md) - Run Mistral Small 24B at the lowest live rate. Strong at function calling and structured JSON output. 32K context. Credits from $10.

- [Mistral Small 3.2 24B API - Cheapest Provider | AIVory Smart Inference](https://aivory.net/md/smart-inference/models/mistral-small-3-2-24b-2506/index.md) - Run Mistral Small 3.2 24B at the lowest live rate. Vision input + function calling + 128K context in one model. Credits from $10.

- [Mixtral 8x7B API - Cheapest Provider | AIVory Smart Inference](https://aivory.net/md/smart-inference/models/mixtral-8x7b/index.md) - Run Mistral Mixtral 8x7B (46.7B MoE, 12.9B active) at the lowest live rate. Apache 2.0 license. 32K context. Credits from $10.

- [Qwen 2.5 72B API - Cheapest Provider | AIVory Smart Inference](https://aivory.net/md/smart-inference/models/qwen-2-5-72b/index.md) - Run Alibaba Qwen 2.5 72B at the lowest live rate. Top multilingual and math benchmarks. 128K context. Credits from $10.

- [Qwen 3 235B API - Cheapest Provider | AIVory Smart Inference](https://aivory.net/md/smart-inference/models/qwen-3-235b/index.md) - Run Alibaba Qwen 3 235B (MoE, 22B active) at the lowest live rate. Frontier reasoning with open weights. 128K context. Credits from $10.

- [Qwen 3.5 27B API - Cheapest Provider | AIVory Smart Inference](https://aivory.net/md/smart-inference/models/qwen-3-5-27b/index.md) - Run Alibaba Qwen 3.5 27B at the lowest live rate. Tuned for agentic coding and tool use. 128K context. Credits from $10.

- [Voxtral Mini 3B API - Cheapest Provider | AIVory Smart Inference](https://aivory.net/md/smart-inference/models/voxtral-mini-3b-2507/index.md) - Run Mistral Voxtral Mini 3B at the lowest live rate. Voice input in a 3B model. Smallest model in the catalogue. Credits from $10.

- [Voxtral Small 24B API - Cheapest Provider | AIVory Smart Inference](https://aivory.net/md/smart-inference/models/voxtral-small-24b-2507/index.md) - Run Mistral Voxtral Small 24B at the lowest live rate. Multilingual voice understanding in a 24B model. 32K context. Credits from $10.


---

Original URL: https://aivory.net/md/smart-inference/models/index.md