Embeddings#

POST /v1/embeddings — convert text to dense vectors for semantic search, RAG, clustering, deduplication, and classification.

Available model#

siati/bge-m3 — BAAI bge-m3, 1024 dimensions, 8K context, multilingual (EN/IT/DE/FR/RM/ZH/JA/KO/...). p95 latency under 50 ms.

Python example#

from openai import OpenAI

# The API is OpenAI-compatible: point the official SDK at the siati.ai base URL.
client = OpenAI(base_url="https://api.siati.ai/v1", api_key="siati_...")

# Italian and English near-paraphrases, plus an unrelated sentence
# ("The bell tower's clock strikes every hour.") to show multilingual alignment.
texts = [
    "Il modello svizzero offre conformità nLPD.",
    "Swiss model offers nFADP compliance.",
    "L'orologio del campanile suona ogni ora.",
]

resp = client.embeddings.create(model="siati/bge-m3", input=texts)
for i, emb in enumerate(resp.data):
    print(f"text[{i}] → vector dim={len(emb.embedding)}")
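Because bge-m3 maps languages into a shared vector space, the Italian and English sentences above land close together while the unrelated clock sentence does not. You can verify this by comparing vectors with cosine similarity — a minimal sketch with numpy, using short toy vectors in place of real 1024-dimensional bge-m3 embeddings:

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two vectors: dot product over norms."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy stand-ins for embeddings of the three sentences above.
it = [0.9, 0.1, 0.0]      # Italian sentence
en = [0.85, 0.15, 0.05]   # English near-paraphrase
clock = [0.0, 0.2, 0.95]  # unrelated sentence

print(cosine_sim(it, en))     # high: near-paraphrases
print(cosine_sim(it, clock))  # low: unrelated topic
```

With real embeddings, substitute `resp.data[i].embedding` for the toy vectors; a higher score means closer meaning.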

TypeScript example#

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.siati.ai/v1",
  apiKey: "siati_...",
});

const resp = await client.embeddings.create({
  model: "siati/bge-m3",
  input: ["First sentence", "Second sentence"],
});

resp.data.forEach((e, i) => console.log(`vec[${i}] dim=${e.embedding.length}`));

Pricing#

0.04 CHF per 1M tokens on pay-as-you-go (PAYG). Embeddings are included in every plan, including Free.
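At that rate, embedding costs stay small even at corpus scale. A back-of-envelope calculation, assuming a hypothetical corpus of 100,000 document chunks averaging 500 tokens each:

```python
PRICE_CHF_PER_MILLION = 0.04  # PAYG embedding price

docs = 100_000        # assumption: number of chunks to embed
avg_tokens = 500      # assumption: average tokens per chunk
total_tokens = docs * avg_tokens

cost = total_tokens / 1_000_000 * PRICE_CHF_PER_MILLION
print(f"{cost:.2f} CHF")  # 2.00 CHF for 50M tokens
```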

Use cases#

  • RAG over your documents — see the RAG cookbook for an end-to-end example (bge-m3 + Llama 405B).
  • Semantic search in internal knowledge bases (Confluence, Notion, file storage).
  • Deduplication of documents / tickets / customer support history.
  • Clustering for segmentation, topic discovery.
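The deduplication use case reduces to a similarity threshold over embedding vectors. A minimal greedy sketch — the 0.95 threshold and the toy 2-D vectors are illustrative assumptions; real bge-m3 vectors have 1024 dimensions and a good threshold depends on your data:

```python
import numpy as np

def dedupe(vectors, threshold=0.95):
    """Greedy near-duplicate filter: keep an item only if its cosine
    similarity to every previously kept item is below the threshold."""
    kept = []       # indices of kept items
    kept_vecs = []  # their unit-normalized vectors
    for i, v in enumerate(vectors):
        v = np.asarray(v, dtype=float)
        v = v / np.linalg.norm(v)
        if all(float(v @ k) < threshold for k in kept_vecs):
            kept.append(i)
            kept_vecs.append(v)
    return kept

# Toy vectors: index 1 is a near-duplicate of index 0.
vecs = [[1.0, 0.0], [0.999, 0.01], [0.0, 1.0]]
print(dedupe(vecs))  # [0, 2]
```

For large corpora the pairwise scan becomes a bottleneck; an approximate nearest-neighbor index is the usual replacement for the inner loop.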

Why sovereign RAG matters#

A Swiss lawyer can't send contracts to OpenAI (professional secrecy, Art. 321 of the Swiss Criminal Code). A hospital can't send health records to Azure (nFADP Art. 5). With siati.ai, the embeddings are computed on an NVIDIA L4 GPU in Lugano: the PDF never leaves Switzerland.

End-to-end example → RAG cookbook#