Errors
The API returns standard HTTP status codes. Every error response has an OpenAI-compatible JSON body:
{
  "error": {
    "type": "invalid_request_error",
    "code": "model_not_found",
    "message": "Model 'siati/foo' not available for your API key."
  }
}
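As a rough illustration of consuming this shape, the sketch below parses an error body into a small dataclass. The names `ApiError` and `parse_error` are hypothetical helpers, not part of any SDK, and the sketch assumes the body always has a top-level `error` object as shown above.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApiError:
    """Hypothetical container mirroring the error object shown above."""
    type: str
    code: str
    message: str
    details: Optional[dict] = None  # only present on some errors


def parse_error(body: str) -> ApiError:
    """Parse a JSON error body into an ApiError (illustrative helper)."""
    err = json.loads(body)["error"]
    return ApiError(
        type=err.get("type", ""),
        code=err.get("code", ""),
        message=err.get("message", ""),
        details=err.get("details"),
    )


# Sample body copied from the example above.
sample = json.dumps({
    "error": {
        "type": "invalid_request_error",
        "code": "model_not_found",
        "message": "Model 'siati/foo' not available for your API key.",
    }
})
err = parse_error(sample)
print(err.type, err.code)
```

Branching on `code` rather than on `message` is probably the safer choice, since the message is human-readable text.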
Status code table
| HTTP | Type | When |
|---|---|---|
| 400 | Bad request | Malformed JSON, unknown field, invalid parameter. |
| 401 | Unauthorized | Missing / invalid / revoked API key. |
| 402 | Payment Required | PAYG credits exhausted. Top up from /dashboard/billing. |
| 404 | Model not found | Slug not in /v1/models for your tier. |
| 413 | Payload too large | Input exceeds the model's context window. |
| 429 | Too many requests | Rate limit exceeded or daily plan quota exhausted. Honour Retry-After. |
| 500 | Server error | Bug or panic on our side. Reported automatically. |
| 502 | Bad gateway | Upstream inference engine failed — retry idempotently. |
| 503 | Unavailable | Backend draining or capacity-saturated. Retry with backoff. |
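One way to act on the table is to collapse the status codes into a handful of client-side actions. The sketch below is an assumption about how a client might group them; the function name `action_for_status` and the action labels are made up for illustration.

```python
def action_for_status(status: int) -> str:
    """Map an HTTP status code to a coarse client-side action (illustrative)."""
    if 200 <= status < 300:
        return "ok"
    if status == 429:
        return "wait_and_retry"      # honour Retry-After
    if status == 402:
        return "top_up"              # PAYG credits exhausted
    if 500 <= status < 600:
        return "retry_with_backoff"  # 500, 502, 503
    return "fix_request"             # 400, 401, 404, 413: retrying will not help


for code in (400, 402, 429, 503):
    print(code, action_for_status(code))
```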
Plan quota exhausted
If you're on the Free, Pro, or Max plan and your daily quota is depleted, the API
returns a 429 with an extended error shape:
{
  "error": {
    "type": "quota_exhausted",
    "code": "daily_quota_exhausted",
    "message": "Daily quota exhausted for the Pro plan.",
    "details": {
      "plan": "pro",
      "quota_daily": 2000000,
      "quota_used": 2000000,
      "resets_in_seconds": 18432
    }
  }
}
Retry-After is set to the value of resets_in_seconds; quotas reset at midnight UTC.
Solutions:
- Wait for the reset
- Upgrade to the Max plan (6M/day)
- Buy PAYG credits and use them for the surplus (stackable with the plan)
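Because a quota 429 can ask for a wait of several hours, a client may want to tell it apart from an ordinary rate-limit 429 and fail fast instead of sleeping. The sketch below shows one possible check on the parsed error body; `quota_reset_time` is a hypothetical helper, and it assumes the `details` object is present whenever `code` is `daily_quota_exhausted`.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def quota_reset_time(error_body: dict, now: Optional[datetime] = None) -> Optional[datetime]:
    """Return the UTC time the daily quota resets, or None for any other error."""
    err = error_body.get("error", {})
    if err.get("code") != "daily_quota_exhausted":
        return None  # e.g. an ordinary rate-limit 429
    now = now or datetime.now(timezone.utc)
    return now + timedelta(seconds=err["details"]["resets_in_seconds"])


# Body copied from the example above (trimmed to the fields used here).
body = {
    "error": {
        "type": "quota_exhausted",
        "code": "daily_quota_exhausted",
        "details": {"plan": "pro", "resets_in_seconds": 18432},
    }
}
reset_at = quota_reset_time(body)
if reset_at is not None:
    print("Quota resets at", reset_at.isoformat())
```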
Recommended retry strategy
import random
import time

from openai import APIStatusError, OpenAI, RateLimitError

client = OpenAI(base_url="https://api.siati.ai/v1", api_key="siati_...")


def call_with_retry(messages, max_attempts=5):
    for attempt in range(max_attempts):
        try:
            return client.chat.completions.create(
                model="siati/mistral-small-24b",
                messages=messages,
            )
        except RateLimitError as e:
            # Prefer the server's Retry-After; fall back to exponential backoff
            # if the header is missing or is not a plain number of seconds.
            try:
                wait = float(e.response.headers.get("retry-after", ""))
            except ValueError:
                wait = 2 ** attempt
        except APIStatusError as e:
            # Retry only 5xx; other 4xx errors will not succeed on retry.
            if not 500 <= e.status_code < 600:
                raise
            wait = 2 ** attempt
        # Skip the sleep after the final attempt; jitter spreads out retries.
        if attempt < max_attempts - 1:
            time.sleep(wait + random.uniform(0, 1))
    raise RuntimeError("Max retries exceeded")
Use exponential backoff with jitter. We're lenient with well-behaved clients, but
we tighten rate-limit enforcement for clients that ignore Retry-After.
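The delay calculation can also be pulled out into pure functions, which makes it easier to test without sleeping. The following is a minimal sketch under a few assumptions: `parse_retry_after` and `backoff_delay` are hypothetical names, the 60-second cap is an arbitrary choice, and Retry-After is treated as either a number of seconds or an HTTP date (both forms are allowed by the HTTP specification; this page only shows the seconds form).

```python
import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Callable, Optional


def parse_retry_after(value: Optional[str], now: Optional[datetime] = None) -> Optional[float]:
    """Convert a Retry-After header value to seconds, or None if unusable."""
    if not value:
        return None
    try:
        return max(0.0, float(value))  # delay-seconds form
    except ValueError:
        pass
    try:
        when = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def backoff_delay(
    attempt: int,
    retry_after: Optional[float] = None,
    base: float = 1.0,
    cap: float = 60.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Seconds to wait before the next attempt, with up to 1 s of jitter."""
    if retry_after is not None:
        return retry_after + rng()  # the server's hint wins over our own schedule
    return min(cap, base * 2 ** attempt) + rng()


for attempt in range(4):
    print(attempt, round(backoff_delay(attempt), 2))
```

Passing `rng` as a parameter is only there so the jitter can be fixed in tests; production code can leave the default.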
Asking for help
If you see a recurring 5xx, email hello@siati.ai with:
- The X-Request-ID from the response (always returned)
- UTC timestamp of the issue
- Request body (redact API keys)
We respond within 1 business day (Standard SLA), 4h (Business), 30 min (Critical).
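To make collecting those three items less error-prone, a client could assemble them at the point of failure. The sketch below is illustrative only: `build_report` is a hypothetical helper, and the redaction pattern assumes keys start with `siati_` followed by letters, digits, underscores, or hyphens, which is a guess based on the key prefix used in the retry example.

```python
import json
import re
from datetime import datetime, timezone
from typing import Optional

# Assumed key format: "siati_" plus letters, digits, "_" or "-".
KEY_PATTERN = re.compile(r"siati_[A-Za-z0-9_-]+")


def build_report(request_id: str, body: dict, when: Optional[datetime] = None) -> str:
    """Format a support report with the request ID, UTC time, and redacted body."""
    when = when or datetime.now(timezone.utc)
    redacted = KEY_PATTERN.sub("siati_[REDACTED]", json.dumps(body, indent=2))
    return (
        f"X-Request-ID: {request_id}\n"
        f"Timestamp (UTC): {when.isoformat()}\n"
        f"Request body:\n{redacted}\n"
    )


report = build_report(
    "req_123",  # placeholder; use the real header value from the failed response
    {"model": "siati/mistral-small-24b", "metadata": {"auth": "Bearer siati_abc123secret"}},
)
print(report)
```

With the OpenAI Python SDK, the header should be readable from a caught `APIStatusError` via `e.response.headers.get("x-request-id")`.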