Instructions: Open the browser console (F12 or Ctrl+Shift+J) to see the demonstration results. This page runs the tier-demo.js script, which demonstrates how the tier system works for an inference service, including:
| Tier | Description | Pricing | Rate Limits | Infrastructure |
|---|---|---|---|---|
| Basic | For small applications | $10/month minimum + $1.20 per 1M tokens (20% markup) | 500 RPM / 50K TPM | Tela Inference As Service (50K TPM ≈ 1 instance) |
| Standard | For medium applications | $100/month minimum + $1.15 per 1M tokens (15% markup) | 2K RPM / 200K TPM | Tela Inference As Service, priority (200K TPM ≈ 1-2 instances) |
| Professional | For high-volume applications | $500/month minimum + $1.10 per 1M tokens (10% markup) | 5K RPM / 1M TPM | Mixed: Tela Inference As Service + dedicated (5M TPM ≈ 2-4 H100 GPUs) |
| Enterprise | For large enterprises | $2K/month minimum + $1.05 per 1M tokens (5% markup) | 10K RPM / 5M TPM | Dedicated with Tela Inference As Service backup (10M TPM ≈ 4-8 H100 GPUs) |
| Enterprise Plus | Custom enterprise solution | $5K/month minimum + $1.03 per 1M tokens (3% markup) | 30K RPM / 10M TPM | Dedicated reserved capacity (150M TPM ≈ 12-24 H100 GPUs) |

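The pricing scheme in the table (a monthly minimum plus a per-token charge, whichever is larger) can be sketched as follows. This is a minimal illustration of the billing logic; the tier rates come from the table above, but the object and function names are assumptions, not the actual tier-demo.js API.

```javascript
// Tier rates taken from the pricing table above.
const TIERS = {
  basic:          { monthlyMinimum: 10,   perMillionTokens: 1.20 },
  standard:       { monthlyMinimum: 100,  perMillionTokens: 1.15 },
  professional:   { monthlyMinimum: 500,  perMillionTokens: 1.10 },
  enterprise:     { monthlyMinimum: 2000, perMillionTokens: 1.05 },
  enterprisePlus: { monthlyMinimum: 5000, perMillionTokens: 1.03 },
};

// Monthly bill: the usage-based charge, but never below the tier minimum.
function monthlyCost(tierName, tokensUsed) {
  const { monthlyMinimum, perMillionTokens } = TIERS[tierName];
  const usageCost = (tokensUsed / 1_000_000) * perMillionTokens;
  return Math.max(monthlyMinimum, usageCost);
}

// 5M tokens on Basic: ~$6 of usage, below the $10 minimum, so $10 is charged.
monthlyCost("basic", 5_000_000);
// 200M tokens on Standard: usage charge exceeds the $100 minimum.
monthlyCost("standard", 200_000_000);
```

The `Math.max` captures the "minimum + per-token" structure: light months are billed at the floor, heavy months at the metered rate.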
| Model | Basic | Standard | Professional | Enterprise | Enterprise Plus |
|---|---|---|---|---|---|
| Mistral-7B | ✓ | ✓ | ✓ | ✓ | ✓ |
| Llama-3.1-8B | ✓ | ✓ | ✓ | ✓ | ✓ |
| Llama-3.1-70B | ✓ | ✓ | ✓ | ✓ | ✓ |
| Llama-3.1-405B | ✗ | ✓ | ✓ | ✓ | ✓ |
| DeepSeek-R1-671B | ✗ | ✓ | ✓ | ✓ | ✓ |
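Since availability in the matrix above is monotonic (once a model is available at a tier, it stays available at every higher tier), the check can be encoded as a minimum-tier lookup. A minimal sketch, assuming illustrative tier keys and function names rather than the actual tier-demo.js API:

```javascript
// Tiers ordered from lowest to highest.
const TIER_ORDER = ["basic", "standard", "professional", "enterprise", "enterprisePlus"];

// Lowest tier at which each model becomes available (from the matrix above).
const MIN_TIER = {
  "Mistral-7B": "basic",
  "Llama-3.1-8B": "basic",
  "Llama-3.1-70B": "basic",
  "Llama-3.1-405B": "standard",
  "DeepSeek-R1-671B": "standard",
};

// A model is available when the requested tier ranks at or above its minimum tier.
function isModelAvailable(model, tier) {
  const min = MIN_TIER[model];
  if (min === undefined) return false; // unknown model
  return TIER_ORDER.indexOf(tier) >= TIER_ORDER.indexOf(min);
}

isModelAvailable("Llama-3.1-405B", "basic");      // false: 405B requires Standard or above
isModelAvailable("DeepSeek-R1-671B", "standard"); // true
```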