Models#
Curated catalog of open-weight models served on hardware we own in Switzerland. No fine-tuning on your prompts, no hop outside the country.
Live catalog#
| Model ID | Tier | Hardware | Context | Use case |
|---|---|---|---|---|
siati/llama-3.1-405b |
fast | NVIDIA Blackwell B6000-Pro × 4 (TP=4, INT4) | 128K | Frontier-class reasoning, drop-in for GPT-4o |
siati/mistral-small-24b |
fast | NVIDIA RTX 5090 (FP8 dynamic) | 32K | Chat, summarization, coding (multilingual IT/EN/FR/DE) |
siati/bge-m3 |
embeddings | NVIDIA L4 24GB | 8K | RAG, semantic search, multilingual |
qwen2.5:7b-instruct-q4_K_M |
slow | Apple Silicon (M-series, Metal) × 2 | 32K | Batch jobs, lightweight chat, low energy |
Coming soon#
siati/qwen-72b(medium tier) — production-grade reasoningsiati/qwen-32b-coder(medium tier) — code generation specialistsiati/xtts-v2(tts tier) — multilingual text-to-speech- Whisper-class STT for audio transcription (medical visit verbalization, legal hearings, podcast captioning)
Need a model not in the catalog? Tell us — we curate based on actual customer demand.
Tiers explained#
| Tier | PAYG pricing | Free quota | Pro (19 CHF) | Max (49 CHF) |
|---|---|---|---|---|
| embeddings | 0.04 CHF / 1M | ✅ | ✅ | ✅ |
| slow | 0.40 CHF / 1M | 100K/day | 2M/day | 6M/day |
| medium | 1.50 CHF / 1M | ❌ | 2M/day | 6M/day |
| fast | 4.00 CHF / 1M | ❌ | PAYG | PAYG |
Tiers slow + medium + embeddings are included in subscriptions with a
daily cap. Tier fast (including the 405B model) requires PAYG credits.
Programmatic listing#
from openai import OpenAI
client = OpenAI(base_url="https://api.siati.ai/v1", api_key="siati_...")
models = client.models.list()
for m in models.data:
print(m.id, m.owned_by)
Returns only models accessible from your key (filtered by your plan tier + PAYG balance).
Public catalog endpoint#
https://api.siati.ai/api/v1/public/models/cards returns JSON with display
metadata (blurb, hardware, status). We use it internally for the homepage and
this page.
curl https://api.siati.ai/api/v1/public/models/cards | jq