AI inference API
The AI inference API powers the Schema designer in the dashboard. It exposes the same three-step flow — profile, propose, render — as endpoints you can call from your own code. Useful when you want to automate cube generation across many warehouses, embed cube authoring into your own product, or integrate Saiku Cloud into an upstream onboarding flow.
All endpoints require a Bearer API key. See Authentication for the basics.
The flow
The dashboard’s Schema designer is the canonical reference implementation of this flow, which has three steps:
- Profile a connection or an uploaded file. We sample its structure cheaply and return a SchemaProfile.
- Propose a cube. We send the profile plus an optional plain-English intent to Claude, and return a structured CubeProposal.
- Render the proposal as Mondrian schema XML. The result is a string ready to save as a schema in your workspace.
You can call the steps independently — profile once and propose
several times with different intents, or skip the propose step and
hand-roll a CubeProposal for the renderer.
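As a rough illustration of "profile once, propose several times", the sketch below builds one propose request body per intent from a single profile. The helper name `propose_bodies` is hypothetical; the `profile`, `intent`, and `factTable` field names follow the request body documented for POST /me/inference/propose further down this page.

```python
import json

def propose_bodies(profile, intents, fact_table=None):
    """Build one /me/inference/propose request body (JSON string) per intent.

    `profile` is the SchemaProfile returned by a profile endpoint.
    `fact_table`, if given, is a dict like {"schema": ..., "name": ...}.
    """
    bodies = []
    for intent in intents:
        body = {"profile": profile, "intent": intent}
        if fact_table is not None:
            body["factTable"] = fact_table
        bodies.append(json.dumps(body))
    return bodies

# Trimmed profile, for illustration only.
profile = {"databaseProductName": "PostgreSQL", "tables": []}
bodies = propose_bodies(
    profile,
    ["Revenue by customer.", "Order count by sale date."],
    fact_table={"schema": "public", "name": "fact_sales"},
)
print(len(bodies), "request bodies built from one profile")
```

Each string can be sent as the body of a separate propose call, so the warehouse is only profiled once.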
POST /me/inference/profile/connection/{id}
Profile a saved warehouse connection. Reads information_schema,
samples a few rows per column, returns a structured profile of
the warehouse’s tables and columns.
Path parameters:
- id: the connection ID from GET /me/connections.
Optional query parameters:
- schema: restrict to a specific database schema (e.g. public, analytics). Defaults to whatever schema the connection was saved against.
- maxTables: cap the number of tables profiled. Default 20.
- tableTypes: comma-separated list (TABLE, VIEW, MATERIALIZED VIEW). Defaults to TABLE only.
Response (200):
    {
      "databaseProductName": "PostgreSQL",
      "databaseProductVersion": "16.6",
      "tables": [
        {
          "schema": "public",
          "name": "fact_sales",
          "rowCount": 1245678,
          "columns": [
            {
              "name": "sale_id",
              "type": "BIGINT",
              "nullable": false,
              "distinctApprox": 1245678,
              "sampleValues": [1, 2, 3, 4, 5]
            },
            {
              "name": "customer_id",
              "type": "BIGINT",
              "nullable": false,
              "distinctApprox": 25000,
              "sampleValues": [101, 102, 103, 104, 105]
            }
          ]
        }
      ],
      "sampledAt": "2026-05-23T18:00:00Z",
      "sampleDurationMillis": 1832,
      "sampleCostUsd": 0.001
    }

Cost: typically under $0.05 per profile against a Postgres warehouse. Most of the cost is metadata queries, not data scans.
Failure modes:
- 404 not_found: connection ID doesn’t exist or isn’t visible to your tenant.
- 502 warehouse_unreachable: we couldn’t connect to the warehouse (credentials wrong, host down, network issue).
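A minimal sketch of calling this endpoint from Python's standard library, assuming the `https://api.saiku.bi` host used in the end-to-end example on this page. The function names are hypothetical, and the empty JSON body simply mirrors what the curl example sends.

```python
import json
from urllib.error import HTTPError
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

BASE_URL = "https://api.saiku.bi"  # host taken from the end-to-end example

# Documented failure modes for the connection profiler.
FAILURES = {
    404: "not_found: connection ID doesn't exist or isn't visible to your tenant",
    502: "warehouse_unreachable: credentials wrong, host down, or network issue",
}

def profile_request(api_key, connection_id, schema=None, max_tables=None, table_types=None):
    """Build (without sending) the POST request for the connection profiler."""
    params = {}
    if schema is not None:
        params["schema"] = schema
    if max_tables is not None:
        params["maxTables"] = max_tables
    if table_types:
        params["tableTypes"] = ",".join(table_types)  # comma-separated list
    url = f"{BASE_URL}/me/inference/profile/connection/{quote(connection_id, safe='')}"
    if params:
        url += "?" + urlencode(params)
    headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
    # An empty JSON object as the body mirrors the curl example on this page.
    return Request(url, data=b"{}", headers=headers, method="POST")

def fetch_profile(request):
    """Send the request; raise RuntimeError with a readable message on failure."""
    try:
        with urlopen(request) as response:
            return json.load(response)
    except HTTPError as err:
        raise RuntimeError(FAILURES.get(err.code, f"unexpected HTTP {err.code}")) from err

req = profile_request("sk-example", "abc-123-def", schema="analytics",
                      max_tables=10, table_types=["TABLE", "VIEW"])
print(req.get_method(), req.full_url)
```

`fetch_profile(req)` performs the actual network call; it is left uncalled here so the sketch runs offline.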
GET /me/inference/profile/connection/{id}/sample
Fetch a few sample rows from one table. Authentication and warehouse-reachability failures behave the same way as on the profile endpoint; only the scope is narrower.
Path + query parameters:
- id: connection ID.
- table: table name (required).
- schema: database schema (defaults to the connection’s default).
- rows: number of rows (default 5, max 50).
Response (200):
    {
      "schema": "public",
      "table": "fact_sales",
      "columns": ["sale_id", "customer_id", "amount", "sale_date"],
      "rows": [
        [1, 101, "29.99", "2024-01-15"],
        [2, 102, "149.00", "2024-01-15"]
      ]
    }

Useful for showing the user what their data actually looks like before they commit to a cube design.
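The response pairs a `columns` array with positional `rows`, so a small helper can turn each row into a dictionary for display. This is only a sketch: `sample_url` and `rows_as_dicts` are hypothetical names, the host is taken from the end-to-end example, and clamping `rows` on the client side (including the lower bound of 1) is an assumption rather than documented server behavior.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.saiku.bi"  # host taken from the end-to-end example

def sample_url(connection_id, table, schema=None, rows=5):
    """Build the GET URL for the sample endpoint."""
    rows = max(1, min(rows, 50))  # documented default is 5, documented max is 50
    params = {"table": table, "rows": rows}
    if schema is not None:
        params["schema"] = schema
    path = f"/me/inference/profile/connection/{quote(connection_id, safe='')}/sample"
    return f"{BASE_URL}{path}?{urlencode(params)}"

def rows_as_dicts(sample):
    """Zip the `columns` names onto each positional row."""
    return [dict(zip(sample["columns"], row)) for row in sample["rows"]]

# The response shape shown above.
sample = {
    "schema": "public",
    "table": "fact_sales",
    "columns": ["sale_id", "customer_id", "amount", "sale_date"],
    "rows": [[1, 101, "29.99", "2024-01-15"], [2, 102, "149.00", "2024-01-15"]],
}
for record in rows_as_dicts(sample):
    print(record)
```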
POST /me/inference/profile/file
Profile an uploaded file instead of a warehouse connection. The response has the same shape as the connection profiler’s; the uploaded file takes the place of the connection ID.
Multipart body:
- file: the file part (.parquet, .csv, or .json).
- tableTypes: defaults to TABLE (same shape as the connection profiler, allows future expansion).
Response (200): same SchemaProfile shape as the connection
profiler. Files surface as a single tables[0] entry.
DuckDB’s httpfs reads the file column-by-column, so a multi-GB
Parquet file profiles in seconds without us pulling the whole
thing into memory.
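For clients without a multipart helper, the body can be assembled by hand. The sketch below encodes the two documented parts (`file` and `tableTypes`) as multipart/form-data using only the standard library. The function name, the `application/octet-stream` part type, and the client-side extension check are assumptions, not documented requirements.

```python
import uuid

ALLOWED_EXTENSIONS = (".parquet", ".csv", ".json")  # formats listed above

def build_multipart(filename, content, table_types="TABLE"):
    """Return (body_bytes, content_type_header) for the file profiler.

    Assumes `filename` contains no double quotes or line breaks.
    """
    if not filename.lower().endswith(ALLOWED_EXTENSIONS):
        raise ValueError(f"unsupported file type: {filename}")
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii")
    lines = [
        delimiter,
        b'Content-Disposition: form-data; name="tableTypes"',
        b"",
        table_types.encode("utf-8"),
        delimiter,
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode("utf-8"),
        b"Content-Type: application/octet-stream",
        b"",
        content,
        delimiter + b"--",  # closing delimiter
        b"",
    ]
    return b"\r\n".join(lines), f"multipart/form-data; boundary={boundary}"

body, content_type = build_multipart("sales.csv", b"sale_id,amount\n1,29.99\n")
print(content_type.split(";")[0], len(body), "bytes")
```

Send `body` with a POST to /me/inference/profile/file, setting the `Content-Type` header to the returned value alongside the usual Bearer `Authorization` header.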
POST /me/inference/propose
Send a profile to Claude and get a cube proposal back.
Request body:
    {
      "profile": { /* SchemaProfile from the profile step */ },
      "intent": "Sales facts joined to customer and product dimensions, sum of revenue, count of orders.",
      "factTable": { "schema": "public", "name": "fact_sales" }
    }

- profile (required): the JSON returned by either profile endpoint.
- intent (optional but strongly recommended): one-line plain-English description of what you want. Without it Claude has much less to work with.
- factTable (optional): pin a specific fact table. Without it Claude picks one based on naming heuristics and column shapes.
Response (200):
    {
      "traceId": "ac2ab729-7fc3-4c3a-8e36-7cd77be87267",
      "proposal": {
        "schemaName": "Sales Analytics",
        "cubes": [
          {
            "name": "Sales",
            "factTableSchema": "public",
            "factTableName": "fact_sales",
            "measures": [
              { "name": "Revenue", "column": "amount", "aggregator": "sum" },
              { "name": "Orders", "column": "sale_id", "aggregator": "count" }
            ],
            "dimensions": [
              {
                "name": "Customer",
                "foreignKey": "customer_id",
                "tableSchema": "public",
                "tableName": "dim_customer",
                "primaryKey": "customer_id",
                "levels": [
                  { "name": "Name", "column": "customer_name", "type": "String", "uniqueMembers": false }
                ]
              }
            ]
          }
        ]
      },
      "mondrianXml": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Schema name=\"Sales Analytics\">\n <Cube name=\"Sales\">...</Cube>\n</Schema>",
      "validation": { "valid": true, "cubeCount": 1, "warnings": [] },
      "summary": { /* condensed proposal: measures + dim names + join graph */ },
      "detectorFindings": { /* heuristic insights about date hierarchies, etc */ },
      "usage": {
        "model": "claude-sonnet-4-6",
        "inputTokens": 4321,
        "outputTokens": 892,
        "totalTokens": 5213,
        "estimatedCostUsd": 0.018,
        "isPricingExact": true
      },
      "quota": { /* monthly LLM budget snapshot */ }
    }

Key thing to notice: the response already includes the rendered mondrianXml for the happy path. You don’t need to call /render separately unless you’ve edited the proposal first.
The traceId lets you retrieve the full LLM conversation later
via GET /me/inference/trace/{traceId}
— useful for auditing or debugging unexpected proposals.
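One way to consume the happy-path response is to check the `validation` block before using `mondrianXml`, and to keep the trace path at hand for later debugging. This is a defensive sketch with hypothetical helper names; whether a 200 response can ever carry `"valid": false` is not stated on this page.

```python
def usable_xml(response):
    """Return mondrianXml when validation passed; raise ValueError otherwise."""
    validation = response.get("validation", {})
    if not validation.get("valid", False):
        raise ValueError(f"proposal failed validation (trace {response.get('traceId')})")
    for warning in validation.get("warnings", []):
        print("warning:", warning)
    return response["mondrianXml"]

def trace_path(response):
    """Path of the trace endpoint for this propose call."""
    return f"/me/inference/trace/{response['traceId']}"

# Trimmed version of the response shown above.
response = {
    "traceId": "ac2ab729-7fc3-4c3a-8e36-7cd77be87267",
    "mondrianXml": "<Schema name=\"Sales Analytics\">...</Schema>",
    "validation": {"valid": True, "cubeCount": 1, "warnings": []},
}
xml = usable_xml(response)
print(trace_path(response))
```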
Failure modes:
- 400 invalid_proposal: Claude returned a structurally invalid proposal that our validator rejected. Trace ID still returned.
- 429 over_budget: your tenant has exceeded its monthly LLM budget. Comes with a Retry-After header and a reset_at field showing when the budget resets.
- 502 upstream_error: Claude returned a network error or refused the request.
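A client that hits 429 over_budget needs to turn the Retry-After header into a wait time. This page does not say which form the header takes, so the sketch below accepts both forms that HTTP allows: a number of seconds or an HTTP date. The function name is hypothetical.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def seconds_until_retry(retry_after, now=None):
    """Interpret a Retry-After value given as delta-seconds or as an HTTP date."""
    now = now or datetime.now(timezone.utc)
    value = retry_after.strip()
    if value.isdigit():
        return float(value)
    when = parsedate_to_datetime(value)
    if when.tzinfo is None:
        # Dates with a "-0000" zone parse as naive; treat them as UTC.
        when = when.replace(tzinfo=timezone.utc)
    return max(0.0, (when - now).total_seconds())

now = datetime(2026, 5, 23, 18, 0, 0, tzinfo=timezone.utc)
print(seconds_until_retry("120"))
print(seconds_until_retry("Sat, 23 May 2026 18:01:00 GMT", now=now))
```

Because the budget is monthly, the wait may be days long; a batch job would usually stop and reschedule rather than sleep.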
POST /me/inference/render
Convert a CubeProposal into Mondrian schema XML.
Request body — the proposal JSON directly (not wrapped in
{proposal: …}). Use the proposal field from a prior /propose
response, with any local edits applied:
    {
      "schemaName": "Sales Analytics",
      "cubes": [
        {
          "name": "Sales",
          "factTableSchema": "public",
          "factTableName": "fact_sales",
          "measures": [ /* … */ ],
          "dimensions": [ /* … */ ]
        }
      ]
    }

Response (200):
    {
      "mondrianXml": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Schema name=\"Sales Analytics\">\n <Cube name=\"Sales\">...</Cube>\n</Schema>",
      "validation": { "valid": true, "cubeCount": 1, "warnings": [] },
      "summary": { /* condensed view: measures + dim names + join graph */ }
    }

Pure transformation: no LLM call, no warehouse interaction. Free.
You can call render multiple times on the same proposal as you edit it — that’s exactly what the Schema designer does in the dashboard’s edit loop.
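The edit loop might look like the following sketch: copy the proposal, apply a local change, and serialize the edited proposal as the render request body with no `proposal` wrapper. The helper names are hypothetical. `max` is a standard Mondrian aggregator, but whether this API accepts it is an assumption; the examples on this page only show `sum` and `count`.

```python
import copy
import json

def add_measure(proposal, cube_name, name, column, aggregator):
    """Return an edited copy of the proposal with one extra measure."""
    edited = copy.deepcopy(proposal)  # leave the original proposal untouched
    for cube in edited["cubes"]:
        if cube["name"] == cube_name:
            cube["measures"].append(
                {"name": name, "column": column, "aggregator": aggregator}
            )
            return edited
    raise KeyError(f"no cube named {cube_name!r}")

def render_body(proposal):
    """Serialize the proposal itself as the body, not {"proposal": ...}."""
    return json.dumps(proposal).encode("utf-8")

proposal = {
    "schemaName": "Sales Analytics",
    "cubes": [{
        "name": "Sales",
        "factTableSchema": "public",
        "factTableName": "fact_sales",
        "measures": [{"name": "Revenue", "column": "amount", "aggregator": "sum"}],
        "dimensions": [],
    }],
}
edited = add_measure(proposal, "Sales", "Largest Sale", "amount", "max")
body = render_body(edited)
print(len(edited["cubes"][0]["measures"]), "measures in the edited proposal")
```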
Failure modes:
- 400 invalid_proposal: the proposal doesn’t satisfy the Mondrian XML constraints (missing required field, contradictory joins, etc). Response body: { error, kind, message }.
- 400 empty_proposal: request body was empty.
POST /me/inference/try-query
Run a sample MDX query against a draft cube before saving it. Useful for confirming the joins land where you expect.
Request body:
    {
      "proposal": { /* CubeProposal */ },
      "connectionId": "uuid-of-saved-connection",
      "mdx": "SELECT { [Measures].[Revenue] } ON COLUMNS, { [Customer].[Name].MEMBERS } ON ROWS FROM [Sales]"
    }

Response (200):
    {
      "columns": ["Customer", "Revenue"],
      "rows": [
        ["Acme Corp", "1234.56"],
        ["Beta Inc", "987.65"]
      ],
      "executionMillis": 234
    }

The cube is compiled in-memory; nothing is saved. Use this in your own iteration loop the way the dashboard’s Schema designer does.
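A small sketch of both halves of a try-query call: building the request body and reading the tabular result. Unlike /render, the proposal here sits inside a `proposal` field. The helper names are hypothetical, and converting measure values with `Decimal` is a client-side choice based on the string-encoded numbers in the example response.

```python
import json
from decimal import Decimal

def try_query_body(proposal, connection_id, mdx):
    """Build the JSON body for POST /me/inference/try-query."""
    return json.dumps({"proposal": proposal, "connectionId": connection_id, "mdx": mdx})

def result_records(result, numeric_columns=()):
    """Turn positional rows into dicts, parsing the named columns as Decimal."""
    records = []
    for row in result["rows"]:
        record = dict(zip(result["columns"], row))
        for column in numeric_columns:
            record[column] = Decimal(record[column])
        records.append(record)
    return records

# The response shape shown above.
result = {
    "columns": ["Customer", "Revenue"],
    "rows": [["Acme Corp", "1234.56"], ["Beta Inc", "987.65"]],
    "executionMillis": 234,
}
records = result_records(result, numeric_columns=["Revenue"])
total = sum(record["Revenue"] for record in records)
print(total)
```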
GET /me/inference/trace/{traceId}
Retrieve the LLM conversation for a previous propose call. Each
propose response includes a traceId; pass it here to get the
full prompt + response.
Response (200):
    {
      "traceId": "ac2ab729-7fc3-4c3a-8e36-7cd77be87267",
      "createdAt": "2026-05-23T18:00:00Z",
      "status": "success",
      "model": "claude-sonnet-4-6",
      "inputTokens": 4321,
      "outputTokens": 892,
      "estimatedCostUsd": 0.018,
      "promptText": "...full system + user prompt (often 20+ KB)...",
      "responseText": "...full LLM response, parsed and unparsed..."
    }

Useful for:
- Debugging unexpected proposals. See exactly what we sent Claude and what came back.
- Audit trail. Compliance teams sometimes want a record of what AI generated for human review.
- Iteration. Compare two consecutive proposals to see what Claude did differently.
Traces are retained for 30 days, then purged. Access is scoped to your tenant by row-level security (RLS), so you can only retrieve your own traces.
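If traces feed an audit trail, it helps to know when each one will disappear so it can be archived first. The sketch below derives an expiry from `createdAt`, assuming the purge happens exactly 30 days after creation; the real purge schedule may be coarser. The helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=30)  # retention window stated above

def parse_timestamp(value):
    """Parse an ISO 8601 UTC timestamp such as 2026-05-23T18:00:00Z."""
    # Replacing the trailing Z keeps this working on Python versions before 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))

def trace_expires_at(trace):
    """Earliest moment the trace may be purged, under the 30-day assumption."""
    return parse_timestamp(trace["createdAt"]) + RETENTION

def is_retrievable(trace, now=None):
    """True while the trace is still inside its retention window."""
    now = now or datetime.now(timezone.utc)
    return now < trace_expires_at(trace)

trace = {"traceId": "ac2ab729-7fc3-4c3a-8e36-7cd77be87267",
         "createdAt": "2026-05-23T18:00:00Z"}
print(trace_expires_at(trace).isoformat())
```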
End-to-end example
A complete script — profile a warehouse, propose a cube, render the XML, save it:
    KEY="$SAIKU_API_KEY"
    CONN_ID="abc-123-def"

    # 1. Profile. The response body is the SchemaProfile itself (see the
    #    response example above), which is what propose expects as "profile".
    PROFILE=$(curl -sS "https://api.saiku.bi/me/inference/profile/connection/$CONN_ID" \
      -X POST -H "Authorization: Bearer $KEY" \
      -H "Content-Type: application/json" -d '{}')

    # 2. Propose
    PROPOSAL=$(curl -sS https://api.saiku.bi/me/inference/propose \
      -X POST -H "Authorization: Bearer $KEY" \
      -H "Content-Type: application/json" \
      -d "{\"profile\": $PROFILE, \"intent\": \"Sales revenue and order count by customer and date.\"}")

    # Happy path: the propose response already includes mondrianXml.
    echo "$PROPOSAL" | jq -r '.mondrianXml' > sales-cube.xml

    # 3. Re-render (only needed if you've edited the proposal).
    #    Important: the render endpoint takes the proposal JSON
    #    directly, NOT wrapped in {proposal: ...}.
    XML=$(echo "$PROPOSAL" | jq '.proposal' | \
      curl -sS https://api.saiku.bi/me/inference/render \
        -X POST -H "Authorization: Bearer $KEY" \
        -H "Content-Type: application/json" \
        -d @- | jq -r '.mondrianXml')

    # 4. Save (via /me/schemas, not covered on this page yet)
    echo "$XML" > sales-cube.xml

Replace step 4 with whatever your workflow needs: save to your own repo, commit to git, hand to a human reviewer.
Where to go next
- Schema designer — the dashboard UI built on this API.
- Authentication — Bearer tokens, rate limits.
- MCP server — for LLM agents that want to query your cubes (this page is about authoring them).