Errors

The API returns standard HTTP status codes. Every error response carries an OpenAI-compatible JSON body:

{
  "error": {
    "type": "invalid_request_error",
    "code": "model_not_found",
    "message": "Model 'siati/foo' not available for your API key."
  }
}
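
Because every error shares this shape, a client can decode it once and branch on type and code. The following is a minimal sketch, assuming the response body has already been read as text. The ApiError class and parse_error helper are hypothetical names for illustration, not part of any SDK.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ApiError:
    """Hypothetical container for the error shape shown above."""
    type: str
    code: str
    message: str
    details: dict = field(default_factory=dict)  # only some errors carry details


def parse_error(body: str) -> ApiError:
    """Decode an error body; missing fields fall back to empty values."""
    err = json.loads(body).get("error", {})
    return ApiError(
        type=err.get("type", ""),
        code=err.get("code", ""),
        message=err.get("message", ""),
        details=err.get("details") or {},
    )


# The sample body is the example from this page.
sample = (
    '{"error": {"type": "invalid_request_error", "code": "model_not_found", '
    '"message": "Model \'siati/foo\' not available for your API key."}}'
)
error = parse_error(sample)
print(error.type, error.code)
```

Branching on code rather than on the message text is likely to be more robust, since messages are written for humans and may change.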

Status code table

HTTP  Type               When
400   Bad request        Malformed JSON, unknown field, invalid parameter.
401   Unauthorized       Missing / invalid / revoked API key.
402   Payment Required   PAYG credits exhausted. Top up from /dashboard/billing.
404   Model not found    Slug not in /v1/models for your tier.
413   Payload too large  Input exceeds the model's context window.
429   Too many requests  Rate limit exceeded. Honour Retry-After.
500   Server error       Bug or panic on our side. Reported automatically.
502   Bad gateway        Upstream inference engine failed. Retry idempotently.
503   Unavailable        Backend draining or capacity-saturated. Retry with backoff.
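
One way to act on this table in client code is to map each status to a next step. The sketch below is an interpretation of the table, not an official policy: the RETRYABLE set and the ACTIONS strings are assumptions, and treating 500 as retryable is a choice made here for illustration.

```python
# Statuses treated as transient (assumption: all others need a change
# on the caller's side before a retry can succeed).
RETRYABLE = {429, 500, 502, 503}

# Hypothetical next-step hints derived from the "When" column.
ACTIONS = {
    400: "fix the request payload",
    401: "check the API key",
    402: "top up credits at /dashboard/billing",
    404: "check the model slug against /v1/models",
    413: "shorten the input",
    429: "wait for Retry-After, then retry",
    500: "retry with backoff",
    502: "retry with backoff",
    503: "retry with backoff",
}


def is_retryable(status: int) -> bool:
    """Return True when a later retry of the same request could succeed."""
    return status in RETRYABLE


def next_step(status: int) -> str:
    """Return a short hint for the given status, or a generic fallback."""
    return ACTIONS.get(status, "inspect the error body")


for status in (401, 429, 503):
    print(status, is_retryable(status), next_step(status))
```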

Plan quota exhausted

If you're on the Free, Pro, or Max plan and your daily quota is depleted, you get a 429 with an extended shape:

{
  "error": {
    "type": "quota_exhausted",
    "code": "daily_quota_exhausted",
    "message": "Daily quota exhausted for the Pro plan.",
    "details": {
      "plan": "pro",
      "quota_daily": 2000000,
      "quota_used": 2000000,
      "resets_in_seconds": 18432
    }
  }
}

The Retry-After header is set to resets_in_seconds, the number of seconds until the quota rolls over at midnight UTC.
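
A quota-exhausted 429 can carry a wait of several hours, so a client may prefer to surface the reset time instead of sleeping. The sketch below assumes the inner "error" object has already been decoded into a dict. The quota_reset_time helper is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def quota_reset_time(error: dict, now: datetime) -> Optional[datetime]:
    """Return when the daily quota resets, or None for any other error."""
    if error.get("type") != "quota_exhausted":
        return None  # an ordinary rate limit: rely on Retry-After instead
    seconds = error.get("details", {}).get("resets_in_seconds")
    if seconds is None:
        return None
    return now + timedelta(seconds=seconds)


# Inner "error" object from the example above.
error = {
    "type": "quota_exhausted",
    "code": "daily_quota_exhausted",
    "details": {"plan": "pro", "resets_in_seconds": 18432},
}
# Illustrative timestamp: 18432 s (5 h 7 min 12 s) before midnight UTC.
now = datetime(2024, 1, 1, 18, 52, 48, tzinfo=timezone.utc)
print(quota_reset_time(error, now))  # 2024-01-02 00:00:00+00:00
```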

Solutions:

  • Wait for the reset
  • Upgrade to the Max plan (6M/day)
  • Buy PAYG credits and use them for the surplus (stackable with the plan)
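
For rate limits and transient 5xx errors, a retry loop along these lines honours Retry-After:

import random
import time

from openai import APIStatusError, OpenAI, RateLimitError

client = OpenAI(base_url="https://api.siati.ai/v1", api_key="siati_...")

def call_with_retry(messages, max_attempts=5):
    for attempt in range(max_attempts):
        try:
            return client.chat.completions.create(
                model="siati/mistral-small-24b",
                messages=messages,
            )
        except RateLimitError as e:
            # Prefer the server's Retry-After; fall back to exponential backoff
            # if the header is missing or is not a plain number of seconds.
            # Note: a daily-quota 429 can carry a Retry-After of several hours.
            try:
                wait = float(e.response.headers.get("retry-after", ""))
            except ValueError:
                wait = 2 ** attempt
        except APIStatusError as e:
            # Retry only server-side (5xx) failures; other errors are not transient.
            if not 500 <= e.status_code < 600:
                raise
            wait = 2 ** attempt
        # Skip the sleep after the final attempt; jitter spreads out retries.
        if attempt < max_attempts - 1:
            time.sleep(wait + random.uniform(0, 1))
    raise RuntimeError("Max retries exceeded")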

Use exponential backoff with jitter. We are lenient with well-behaved clients, but rate-limit enforcement tightens for clients that ignore Retry-After.
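
The delay calculation can be isolated in a small pure function, which makes it easy to test. This is a sketch under stated assumptions: the backoff_delay name, the 1-second base, and the 60-second cap are illustrative choices, not values prescribed by the API.

```python
import random


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random.random):
    """Seconds to sleep before retrying after failed attempt `attempt` (0-based).

    A server-supplied retry_after takes precedence and is never shortened.
    Otherwise the delay doubles on each attempt, up to `cap`.
    Up to one second of jitter is added in both cases.
    """
    if retry_after is not None:
        delay = float(retry_after)
    else:
        delay = min(cap, base * 2 ** attempt)
    return delay + rng()


# Schedule without jitter, to show the doubling.
print([backoff_delay(a, rng=lambda: 0.0) for a in range(5)])
# [1.0, 2.0, 4.0, 8.0, 16.0]
```

Passing rng as a parameter keeps the function deterministic under test while still using random jitter by default.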

Asking for help

If you see a recurring 5xx, email hello@siati.ai with:

  • The X-Request-ID from the response (always returned)
  • UTC timestamp of the issue
  • Request body (redact API keys)
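
These three items can be gathered at the point of failure. The sketch below assumes the response headers and request body are available as plain dicts. The build_support_report and redact helpers, the SENSITIVE_KEYS set, and the sample request ID are all hypothetical.

```python
import json
from datetime import datetime, timezone

# Assumption: extend this set to match the secrets your payloads may contain.
SENSITIVE_KEYS = {"api_key", "authorization"}


def redact(value):
    """Recursively replace the values of sensitive keys with a placeholder."""
    if isinstance(value, dict):
        return {
            k: "[REDACTED]" if k.lower() in SENSITIVE_KEYS else redact(v)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact(v) for v in value]
    return value


def build_support_report(headers: dict, request_body: dict, when: datetime) -> str:
    """Bundle request ID, UTC timestamp, and redacted body as JSON text.

    `when` should be timezone-aware so the UTC conversion is unambiguous.
    """
    request_id = next(
        (v for k, v in headers.items() if k.lower() == "x-request-id"), "unknown"
    )
    return json.dumps(
        {
            "request_id": request_id,
            "timestamp_utc": when.astimezone(timezone.utc).isoformat(),
            "request_body": redact(request_body),
        },
        indent=2,
    )


print(build_support_report(
    {"X-Request-ID": "req_123"},  # made-up ID for illustration
    {"model": "siati/mistral-small-24b", "api_key": "siati_secret"},
    datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
))
```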

We respond within 1 business day (Standard SLA), 4 hours (Business), or 30 minutes (Critical).