Errors#

The API returns standard HTTP status codes. Every error response carries an OpenAI-compatible JSON body:

{
  "error": {
    "type": "invalid_request_error",
    "code": "model_not_found",
    "message": "Model 'siati/foo' not available for your API key."
  }
}
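As a minimal sketch, the body above can be unpacked into a small record before deciding how to react. `ApiError` and `parse_error` are hypothetical names used for illustration, not part of any SDK, and the code assumes only the three fields shown above.

```python
import json
from dataclasses import dataclass


@dataclass
class ApiError:
    """Hypothetical container for the three documented error fields."""
    type: str
    code: str
    message: str


def parse_error(body: str) -> ApiError:
    """Parse an error response body; missing fields default to an empty string."""
    err = json.loads(body).get("error", {})
    return ApiError(
        type=err.get("type", ""),
        code=err.get("code", ""),
        message=err.get("message", ""),
    )


# Sample body copied from the example above.
sample = (
    '{"error": {"type": "invalid_request_error", "code": "model_not_found", '
    '"message": "Model \'siati/foo\' not available for your API key."}}'
)
parsed = parse_error(sample)
print(parsed.type, parsed.code)
```

Branching on `code` rather than on the human-readable `message` is likely the more robust choice, since message wording can change.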

Status code table#

HTTP	Type	When
400	Bad request	Malformed JSON, unknown field, invalid parameter.
401	Unauthorized	Missing / invalid / revoked API key.
402	Payment Required	PAYG credits exhausted. Top up from `/dashboard/billing`.
404	Model not found	Slug not in `/v1/models` for your tier.
413	Payload too large	Input exceeds the model's context window.
429	Too many requests	Rate limit exceeded. Honour `Retry-After`.
500	Server error	Bug or panic on our side. Reported automatically.
502	Bad gateway	Upstream inference engine failed. Retry the request.
503	Unavailable	Backend draining or capacity-saturated. Retry with backoff.
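One way to act on the table is to sort each status code into a coarse bucket before deciding what to do next. The sketch below is an illustration only: the `classify` helper and its three labels are hypothetical, and treating every 5xx as retryable is an assumption that goes slightly beyond the table, which mentions retries explicitly for 502 and 503.

```python
def classify(status: int) -> str:
    """Map an HTTP status code to a coarse handling bucket.

    'retry'         -> transient; try again with backoff (429 and 5xx)
    'caller_action' -> the caller must change something first (other 4xx)
    'ok'            -> not an error
    """
    if status == 429 or 500 <= status < 600:
        return "retry"
    if 400 <= status < 500:
        return "caller_action"
    return "ok"


for code in (200, 400, 402, 429, 503):
    print(code, classify(code))
```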

Plan quota exhausted#

If you're on the Free, Pro, or Max plan and your daily quota is depleted, you get a 429 with an extended shape:

{
  "error": {
    "type": "quota_exhausted",
    "code": "daily_quota_exhausted",
    "message": "Daily quota exhausted for the Pro plan.",
    "details": {
      "plan": "pro",
      "quota_daily": 2000000,
      "quota_used": 2000000,
      "resets_in_seconds": 18432
    }
  }
}

The `Retry-After` header is set to the value of `resets_in_seconds`. Quotas roll over at midnight UTC.

Solutions:

Wait for the reset
Upgrade to the Max plan (6M/day)
Buy PAYG credits and use them for the surplus (stackable with the plan)
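Because a quota 429 can ask for a wait of several hours, a client may want to tell it apart from an ordinary rate limit instead of sleeping through it. The following is a rough sketch under that assumption: `quota_reset_time` is a hypothetical helper, and the timestamp in the usage lines is an illustrative value chosen so that the sample `resets_in_seconds` lands on midnight UTC.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Optional


def quota_reset_time(body: str, now: datetime) -> Optional[datetime]:
    """Return when the daily quota resets, or None if this is not a quota error."""
    err = json.loads(body).get("error", {})
    if err.get("type") != "quota_exhausted":
        return None
    seconds = err.get("details", {}).get("resets_in_seconds")
    if seconds is None:
        return None
    return now + timedelta(seconds=seconds)


# Trimmed version of the extended error body shown above.
body = json.dumps({
    "error": {
        "type": "quota_exhausted",
        "code": "daily_quota_exhausted",
        "details": {"plan": "pro", "resets_in_seconds": 18432},
    }
})
now = datetime(2025, 1, 1, 18, 52, 48, tzinfo=timezone.utc)  # illustrative timestamp
reset_at = quota_reset_time(body, now)
print(reset_at.isoformat())
```

If the function returns a time, surfacing it to the user or queuing the work until then is usually more useful than blocking a worker for hours.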

Recommended retry strategy#

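import random
import time

from openai import APIStatusError, OpenAI, RateLimitError

# max_retries=0 turns off the SDK's built-in retries so this loop is the only retry layer.
client = OpenAI(base_url="https://api.siati.ai/v1", api_key="siati_...", max_retries=0)

def call_with_retry(messages, max_attempts=5):
    last_error = None
    for attempt in range(max_attempts):
        try:
            return client.chat.completions.create(
                model="siati/mistral-small-24b",
                messages=messages,
            )
        except RateLimitError as e:
            last_error = e
            # Honour Retry-After when it is a number of seconds; otherwise back off exponentially.
            # Note: a daily-quota 429 can carry a Retry-After of several hours.
            try:
                wait = float(e.response.headers.get("retry-after", ""))
            except ValueError:
                wait = 2 ** attempt
        except APIStatusError as e:
            if e.status_code < 500:
                raise  # Other 4xx errors will not succeed on retry.
            last_error = e
            wait = 2 ** attempt
        # Sleep with jitter, but not after the final attempt.
        if attempt < max_attempts - 1:
            time.sleep(wait + random.uniform(0, 1))
    raise RuntimeError("Max retries exceeded") from last_error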

Use exponential backoff with jitter. We're lenient with well-behaved clients, but we tighten rate-limit enforcement for clients that ignore `Retry-After`.
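To make the delay schedule concrete, here is a small sketch of how a wait time might be computed on its own, separate from any HTTP client. The `backoff_delay` helper and its 60-second `cap` are assumptions made for illustration; the API does not prescribe a cap.

```python
import random
from typing import Optional


def backoff_delay(attempt: int, retry_after: Optional[float] = None, cap: float = 60.0) -> float:
    """Return the number of seconds to wait before the next attempt.

    attempt     -- zero-based attempt counter
    retry_after -- server-provided Retry-After in seconds, if any (always honoured)
    cap         -- hypothetical upper bound on the exponential term
    """
    base = retry_after if retry_after is not None else min(cap, 2 ** attempt)
    # Up to one second of random jitter keeps many clients from retrying in lockstep.
    return base + random.uniform(0, 1)


for attempt in range(5):
    print(attempt, round(backoff_delay(attempt), 2))
```

Without a `Retry-After` value the base delay doubles on each attempt (1, 2, 4, 8, 16 seconds) until it reaches the cap.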

Asking for help#

If you see a recurring 5xx, email hello@siati.ai with:

The X-Request-ID from the response (always returned)
UTC timestamp of the issue
Request body (redact API keys)
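The three items above can be collected in one place when an error is caught. The sketch below is illustrative only: `build_support_report` is a hypothetical helper, it expects the caller to pass in the response headers and the API key to scrub, and it redacts by literal replacement rather than guessing at a key format.

```python
import json
from datetime import datetime, timezone


def build_support_report(response_headers: dict, request_body: dict,
                         when: datetime, api_key: str) -> str:
    """Assemble the details requested above, with the API key redacted."""
    # Header names are case-insensitive, so normalise them before the lookup.
    headers = {k.lower(): v for k, v in response_headers.items()}
    request_id = headers.get("x-request-id", "unknown")
    body = json.dumps(request_body, indent=2).replace(api_key, "[REDACTED]")
    return "\n".join([
        f"X-Request-ID: {request_id}",
        f"Timestamp (UTC): {when.astimezone(timezone.utc).isoformat()}",
        "Request body:",
        body,
    ])


# Illustrative values only; none of these come from a real request.
report = build_support_report(
    {"X-Request-ID": "req_123"},
    {"model": "siati/mistral-small-24b", "note": "key=secret-key"},
    datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc),
    api_key="secret-key",
)
print(report)
```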

We respond within 1 business day (Standard SLA), 4 hours (Business), or 30 minutes (Critical).