Tinfoil
Verifiable private inference with Intel TDX and NVIDIA H100 CC
Tinfoil provides confidential inference with hardware-attestable privacy guarantees. Every request runs inside an Intel TDX enclave or on an NVIDIA H100 Confidential Computing GPU. Users can cryptographically verify that their prompts are processed in a trusted execution environment and never exposed to the host operator.
API Features
OpenAI Compatible ✓
Streaming ✓
Function Calling ✓
Batch Inference —
Vision ✓
Embeddings ✓
Models & Pricing
| Model | Family | Context | Max Out | Input /M | Output /M | Free |
|---|---|---|---|---|---|---|
| GLM-5.1 | Unknown | 202K | — | $1.50 | $5.25 | — |
| Gemma 4 31B | Unknown | 256K | — | $0.45 | $1.00 | — |
| Kimi K2.5 | Moonshot AI | 256K | — | $1.50 | $5.25 | — |
| Qwen3-VL 30B | Qwen | 256K | — | $1.25 | $4.00 | — |
| Llama 3.3 70B | Meta Llama | 128K | — | $1.75 | $2.75 | — |
| GPT-OSS 120B | OpenAI GPT | 128K | — | $0.75 | $1.25 | — |
| GPT-OSS Safeguard 120B | OpenAI GPT | 128K | — | $0.50 | $1.00 | — |
| Nomic Embed Text | Nomic | 8K | — | $0.05 | $0.00 | — |
| Voxtral Small 24B | Mistral | 32K | — | $0.20 | $0.60 | — |
| Whisper Large V3 Turbo | OpenAI GPT | 29K | — | $0.01/request | — | — |
| Qwen3-TTS 1.7B | Qwen | 4K | — | $0.01/request | — | — |