Embeddings#
POST /v1/embeddings — convert text into dense vectors for semantic search,
RAG, clustering, deduplication, and classification.
Available model#
siati/bge-m3 — BAAI bge-m3, 1024 dimensions, 8K context, multilingual
(EN/IT/DE/FR/RM/ZH/JA/KO/...). p95 latency under 50 ms.
Python example#
```python
from openai import OpenAI

client = OpenAI(base_url="https://api.siati.ai/v1", api_key="siati_...")

texts = [
    "Il modello svizzero offre conformità nLPD.",  # Italian: "The Swiss model offers nFADP compliance."
    "Swiss model offers nFADP compliance.",
    "L'orologio del campanile suona ogni ora.",    # Italian: "The bell tower clock rings every hour."
]

resp = client.embeddings.create(model="siati/bge-m3", input=texts)
for i, emb in enumerate(resp.data):
    print(f"text[{i}] → vector dim={len(emb.embedding)}")
```
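Embedding vectors are typically compared with cosine similarity: semantically related texts (the Italian and English compliance sentences above) land close together, unrelated ones (the clock sentence) far apart. A minimal helper using only the Python standard library — the `cosine` name and the toy vectors below are illustrative, not part of the API:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# With a real response you would compare resp.data[i].embedding pairs:
# cosine(resp.data[0].embedding, resp.data[1].embedding) should score
# higher than either one paired with resp.data[2] (the clock sentence).
print(cosine([1.0, 0.0], [1.0, 0.0]))  # identical direction → 1.0
print(cosine([1.0, 0.0], [0.0, 1.0]))  # orthogonal → 0.0
```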
TypeScript example#
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.siati.ai/v1",
  apiKey: "siati_...",
});

const resp = await client.embeddings.create({
  model: "siati/bge-m3",
  input: ["First sentence", "Second sentence"],
});
resp.data.forEach((e, i) => console.log(`vec[${i}] dim=${e.embedding.length}`));
```
Pricing#
0.04 CHF per 1M tokens (pay-as-you-go). Embeddings are included in every plan, including Free.
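At 0.04 CHF per 1M tokens, cost scales linearly with corpus size. A back-of-the-envelope sketch — the document count and tokens-per-document below are made-up example figures, and `embedding_cost_chf` is just a helper for this page, not part of the SDK:

```python
PRICE_CHF_PER_M_TOKENS = 0.04

def embedding_cost_chf(n_docs: int, avg_tokens_per_doc: int) -> float:
    """Pay-as-you-go embedding cost at 0.04 CHF / 1M tokens."""
    total_tokens = n_docs * avg_tokens_per_doc
    return total_tokens / 1_000_000 * PRICE_CHF_PER_M_TOKENS

# Embedding 100,000 documents averaging 500 tokens each = 50M tokens:
print(embedding_cost_chf(100_000, 500))  # → 2.0 (CHF)
```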
Use cases#
- RAG over your documents — see the RAG cookbook for an end-to-end example (bge-m3 + Llama 405B).
- Semantic search in internal knowledge bases (Confluence, Notion, file storage).
- Deduplication of documents / tickets / customer support history.
- Clustering for segmentation, topic discovery.
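As an illustration of the deduplication use case, near-duplicates can be flagged by thresholding pairwise cosine similarity of the embeddings. A minimal sketch — the 0.9 threshold and the toy 2-D vectors are assumptions; in practice the vectors come from `client.embeddings.create` and the threshold is tuned on your own data:

```python
import math
from itertools import combinations

def find_near_duplicates(vectors, threshold=0.9):
    """Return index pairs whose cosine similarity exceeds the threshold."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(vectors), 2)
        if cos(a, b) > threshold
    ]

# Toy vectors: 0 and 1 point almost the same way, 2 is orthogonal to both.
vecs = [[1.0, 0.0], [0.99, 0.05], [0.0, 1.0]]
print(find_near_duplicates(vecs))  # → [(0, 1)]
```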
Why sovereign RAG matters#
A Swiss lawyer can't send client contracts to OpenAI (professional secrecy, art. 321 of the Swiss Penal Code), and a hospital can't send health records to Azure (nFADP art. 5). With siati.ai the embeddings are computed on an NVIDIA L4 GPU in Lugano: the PDF never leaves Switzerland.