AI inference API

The AI inference API powers the Schema designer in the dashboard. It exposes the same three-step flow — profile, propose, render — as endpoints you can call from your own code. Useful when you want to automate cube generation across many warehouses, embed cube authoring into your own product, or integrate Saiku Cloud into an upstream onboarding flow.

All endpoints require a Bearer API key. See Authentication for the basics.

The flow

The dashboard’s Schema designer is the canonical reference implementation. It runs three steps:

Profile a connection or an uploaded file. We sample its structure cheaply and return a SchemaProfile.
Propose a cube. We send the profile and an optional plain-English intent to Claude and return a structured CubeProposal.
Render the proposal as Mondrian schema XML. The result is a string ready to save as a schema in your workspace.

You can call the steps independently — profile once and propose several times with different intents, or skip the propose step and hand-roll a CubeProposal for the renderer.

`POST /me/inference/profile/connection/{id}`

Profile a saved warehouse connection. Reads information_schema, samples a few rows per column, and returns a structured profile of the warehouse’s tables and columns.

Path parameters:

id — the connection ID from GET /me/connections.

Optional query parameters:

schema — restrict to a specific database schema (e.g. public, analytics). Defaults to whatever schema the connection was saved against.
maxTables — cap the number of tables profiled. Default 20.
tableTypes — comma-separated list (TABLE, VIEW, MATERIALIZED VIEW). Defaults to TABLE only.
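The optional parameters combine into an ordinary query string. A minimal sketch in bash, assuming the `https://api.saiku.bi` host used in the end-to-end example on this page; `urlencode` and `profile_url` are hypothetical helper names, not part of the API:

```bash
# Hypothetical helpers. Only the path and parameter names come from this page.
API_BASE="https://api.saiku.bi"

# Percent-encode a query value (ASCII input assumed).
urlencode() {
  local LC_ALL=C s="$1" out="" c i
  for (( i = 0; i < ${#s}; i++ )); do
    c="${s:i:1}"
    case "$c" in
      [a-zA-Z0-9.~_-]) out+="$c" ;;
      *) printf -v c '%%%02X' "'$c"; out+="$c" ;;
    esac
  done
  printf '%s' "$out"
}

# profile_url CONN_ID [SCHEMA] [MAX_TABLES] [TABLE_TYPES]
# Empty arguments are skipped so the server-side defaults apply.
profile_url() {
  local url="$API_BASE/me/inference/profile/connection/$1" sep="?"
  [ -n "${2:-}" ] && { url+="${sep}schema=$(urlencode "$2")"; sep="&"; }
  [ -n "${3:-}" ] && { url+="${sep}maxTables=$(urlencode "$3")"; sep="&"; }
  [ -n "${4:-}" ] && { url+="${sep}tableTypes=$(urlencode "$4")"; sep="&"; }
  printf '%s\n' "$url"
}

# Example call (not run here):
#   curl -sS -X POST -H "Authorization: Bearer $SAIKU_API_KEY" \
#     "$(profile_url "$CONN_ID" analytics 50 'TABLE,MATERIALIZED VIEW')"
```

Encoding matters mainly for `MATERIALIZED VIEW`, whose space would otherwise break the URL.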

Response (200):

{
  "databaseProductName": "PostgreSQL",
  "databaseProductVersion": "16.6",
  "tables": [
    {
      "schema": "public",
      "name": "fact_sales",
      "rowCount": 1245678,
      "columns": [
        {
          "name": "sale_id",
          "type": "BIGINT",
          "nullable": false,
          "distinctApprox": 1245678,
          "sampleValues": [1, 2, 3, 4, 5]
        },
        {
          "name": "customer_id",
          "type": "BIGINT",
          "nullable": false,
          "distinctApprox": 25000,
          "sampleValues": [101, 102, 103, 104, 105]
        }
      ]
    }
  ],
  "sampledAt": "2026-05-23T18:00:00Z",
  "sampleDurationMillis": 1832,
  "sampleCostUsd": 0.001
}

Cost: typically under $0.05 per profile against a Postgres warehouse. Most of the cost is metadata queries, not data scans.

Failure modes:

404 not_found — connection ID doesn’t exist or isn’t visible to your tenant.
502 warehouse_unreachable — we couldn’t connect to the warehouse (credentials wrong, host down, network issue).

`GET /me/inference/profile/connection/{id}/sample`

Fetch a few sample rows from one table. Authentication and failure modes match the profile endpoint; only the scope is narrower.

Path + query parameters:

id — connection ID.
table — table name (required).
schema — database schema (defaults to the connection’s default).
rows — number of rows (default 5, max 50).

Response (200):

{
  "schema": "public",
  "table": "fact_sales",
  "columns": ["sale_id", "customer_id", "amount", "sale_date"],
  "rows": [
    [1, 101, "29.99", "2024-01-15"],
    [2, 102, "149.00", "2024-01-15"]
  ]
}

Useful for showing the user what their data actually looks like before they commit to a cube design.
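One way to show those rows to a user is to flatten the response into tab-separated text. The sketch below assumes `jq` is installed and that `SAIKU_API_KEY` holds your key; `fetch_sample` and `sample_to_tsv` are hypothetical names chosen for illustration:

```bash
# fetch_sample CONN_ID TABLE [ROWS]
# curl -G turns the --data-urlencode pairs into a query string on the GET.
fetch_sample() {
  local conn_id="$1" table="$2" rows="${3:-5}"
  curl -sS -G "https://api.saiku.bi/me/inference/profile/connection/$conn_id/sample" \
    -H "Authorization: Bearer $SAIKU_API_KEY" \
    --data-urlencode "table=$table" \
    --data-urlencode "rows=$rows"
}

# sample_to_tsv: read a sample response on stdin and print a header line
# followed by one tab-separated line per row.
sample_to_tsv() {
  jq -r '(.columns | @tsv), (.rows[] | @tsv)'
}

# Example call (not run here):
#   fetch_sample "$CONN_ID" fact_sales 10 | sample_to_tsv | column -t -s $'\t'
```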

`POST /me/inference/profile/file`

Profile an uploaded file instead of a warehouse table. The response has the same shape as the connection profiler's; the file travels in the request body rather than being referenced by a connection ID.

Multipart body:

file — the file part. .parquet, .csv, or .json.
tableTypes — defaults to TABLE (same shape as the connection profiler, allows future expansion).

Response (200): same SchemaProfile shape as the connection profiler. Files surface as a single tables[0] entry.

DuckDB’s httpfs reads the file column-by-column, so a multi-GB Parquet file profiles in seconds without us pulling the whole thing into memory.
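A multipart upload with curl could look like the sketch below. The form field names `file` and `tableTypes` come from the list above; the helper names and the client-side extension check are assumptions added for illustration, and the server may apply its own rules:

```bash
# Returns 0 when the extension is one of the formats listed on this page.
is_supported_file() {
  case "$1" in
    *.parquet|*.csv|*.json) return 0 ;;
    *) return 1 ;;
  esac
}

# profile_file PATH
# curl -F builds the multipart body and switches the request to POST.
profile_file() {
  local path="$1"
  if ! is_supported_file "$path"; then
    echo "unsupported file type: $path" >&2
    return 2
  fi
  curl -sS "https://api.saiku.bi/me/inference/profile/file" \
    -H "Authorization: Bearer $SAIKU_API_KEY" \
    -F "file=@$path" \
    -F "tableTypes=TABLE"
}

# Example call (not run here):
#   profile_file ./exports/sales.parquet > profile.json
```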

`POST /me/inference/propose`

Send a profile to Claude and get a cube proposal back.

Request body:

{
  "profile": { /* SchemaProfile from the profile step */ },
  "intent": "Sales facts joined to customer and product dimensions, sum of revenue, count of orders.",
  "factTable": { "schema": "public", "name": "fact_sales" }
}

profile (required): the JSON returned by either profile endpoint.
intent (optional but strongly recommended): a one-line plain-English description of what you want. Without it Claude has much less to work with.
factTable (optional): pin a specific fact table. Without it Claude picks one based on naming heuristics and column shapes.
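Building the body with `jq` rather than string interpolation keeps quotes in the intent from corrupting the JSON. A minimal sketch, assuming `jq` is installed; `propose_body` is a hypothetical helper name:

```bash
# propose_body PROFILE_JSON INTENT [FACT_SCHEMA FACT_NAME]
# Prints a request body for POST /me/inference/propose.
# factTable is included only when FACT_NAME is given.
propose_body() {
  local profile_json="$1" intent="$2" fact_schema="${3:-}" fact_name="${4:-}"
  jq -cn \
    --argjson profile "$profile_json" \
    --arg intent "$intent" \
    --arg fs "$fact_schema" \
    --arg fn "$fact_name" \
    '{profile: $profile, intent: $intent}
     + (if $fn != "" then {factTable: {schema: $fs, name: $fn}} else {} end)'
}

# Example call (not run here):
#   propose_body "$(cat profile.json)" "Revenue by customer" public fact_sales |
#     curl -sS -X POST https://api.saiku.bi/me/inference/propose \
#       -H "Authorization: Bearer $SAIKU_API_KEY" \
#       -H "Content-Type: application/json" -d @-
```

Because the profile is passed separately from the intent, the same profile can be reused in a loop over several intents without re-profiling.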

Response (200):

{
  "traceId": "ac2ab729-7fc3-4c3a-8e36-7cd77be87267",
  "proposal": {
    "schemaName": "Sales Analytics",
    "cubes": [
      {
        "name": "Sales",
        "factTableSchema": "public",
        "factTableName": "fact_sales",
        "measures": [
          { "name": "Revenue", "column": "amount", "aggregator": "sum" },
          { "name": "Orders",  "column": "sale_id", "aggregator": "count" }
        ],
        "dimensions": [
          {
            "name": "Customer",
            "foreignKey": "customer_id",
            "tableSchema": "public",
            "tableName": "dim_customer",
            "primaryKey": "customer_id",
            "levels": [
              { "name": "Name", "column": "customer_name", "type": "String", "uniqueMembers": false }
            ]
          }
        ]
      }
    ]
  },
  "mondrianXml": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Schema name=\"Sales Analytics\">\n  <Cube name=\"Sales\">...</Cube>\n</Schema>",
  "validation": { "valid": true, "cubeCount": 1, "warnings": [] },
  "summary": { /* condensed proposal — measures + dim names + join graph */ },
  "detectorFindings": { /* heuristic insights about date hierarchies, etc */ },
  "usage": {
    "model": "claude-sonnet-4-6",
    "inputTokens": 4321,
    "outputTokens": 892,
    "totalTokens": 5213,
    "estimatedCostUsd": 0.018,
    "isPricingExact": true
  },
  "quota": { /* monthly LLM budget snapshot */ }
}

Key thing to notice: the response already includes the rendered mondrianXml for the happy path. You don’t need to call /render separately unless you’ve edited the proposal first.
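A caller that takes the happy path may still want to check `validation` before writing the XML anywhere. The sketch below assumes `jq` is installed and that entries in `warnings` are printable as-is (the example response shows only an empty array); `save_xml_if_valid` is a hypothetical helper name:

```bash
# save_xml_if_valid RESPONSE_JSON OUT_FILE
# Writes mondrianXml to OUT_FILE only when validation.valid is true.
save_xml_if_valid() {
  local response="$1" out="$2"
  # jq -e exits non-zero when the expression evaluates to false or null.
  if ! printf '%s' "$response" | jq -e '.validation.valid == true' >/dev/null; then
    echo "proposal failed validation; not saving" >&2
    return 1
  fi
  # Surface any warnings on stderr without blocking the save.
  printf '%s' "$response" | jq -r '.validation.warnings[]?' >&2
  printf '%s' "$response" | jq -r '.mondrianXml' > "$out"
}

# Example call (not run here):
#   save_xml_if_valid "$PROPOSAL" sales-cube.xml
```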

The traceId lets you retrieve the full LLM conversation later via GET /me/inference/trace/{traceId} — useful for auditing or debugging unexpected proposals.

Failure modes:

400 invalid_proposal — Claude returned a structurally invalid proposal that our validator rejected. Trace ID still returned.
429 over_budget — your tenant has exceeded its monthly LLM budget. Comes with a Retry-After header and a reset_at field showing when the budget resets.
502 upstream_error — Claude returned a network error or refused the request.
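The three failure modes call for different reactions, which a client might encode as a small dispatch function. This is a sketch under assumptions: that `Retry-After` carries a plain number of seconds (the page does not say which format is used) and that a 502 is worth retrying. `next_action` is a hypothetical name:

```bash
# next_action STATUS [RETRY_AFTER]
# Prints one of: ok, inspect-trace, wait:<seconds>, wait:unknown, retry, fail
next_action() {
  local status="$1" retry_after="${2:-}"
  case "$status" in
    200) echo "ok" ;;
    400) echo "inspect-trace" ;;  # invalid_proposal: the trace ID is still returned
    429)                          # over_budget: wait until the budget resets
      if [[ "$retry_after" =~ ^[0-9]+$ ]]; then
        echo "wait:$retry_after"
      else
        echo "wait:unknown"       # fall back to the reset_at field in the body
      fi ;;
    502) echo "retry" ;;          # upstream_error: assumed transient here
    *)   echo "fail" ;;
  esac
}

# Example wiring (not run here):
#   status=$(curl -sS -o body.json -D headers.txt -w '%{http_code}' ...)
#   retry_after=$(awk 'tolower($1) == "retry-after:" { print $2 }' headers.txt | tr -d '\r')
#   next_action "$status" "$retry_after"
```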

`POST /me/inference/render`

Convert a CubeProposal into Mondrian schema XML.

Request body — the proposal JSON directly (not wrapped in {proposal: …}). Use the proposal field from a prior /propose response, with any local edits applied:

{
  "schemaName": "Sales Analytics",
  "cubes": [
    {
      "name": "Sales",
      "factTableSchema": "public",
      "factTableName": "fact_sales",
      "measures": [ /* … */ ],
      "dimensions": [ /* … */ ]
    }
  ]
}

Response (200):

{
  "mondrianXml": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<Schema name=\"Sales Analytics\">\n  <Cube name=\"Sales\">...</Cube>\n</Schema>",
  "validation": { "valid": true, "cubeCount": 1, "warnings": [] },
  "summary": { /* condensed view — measures + dim names + join graph */ }
}

Pure transformation — no LLM call, no warehouse interaction. Free.

You can call render multiple times on the same proposal as you edit it — that’s exactly what the Schema designer does in the dashboard’s edit loop.
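An edit loop of this kind can be scripted by patching the proposal JSON and posting it back to render. The sketch assumes `jq` is installed; `add_measure` and `render_proposal` are hypothetical helper names, and whether a given aggregator such as `max` is accepted is an assumption to verify against a real response:

```bash
# add_measure NAME COLUMN AGGREGATOR
# Reads a proposal on stdin and appends a measure to its first cube.
add_measure() {
  jq --arg name "$1" --arg column "$2" --arg agg "$3" \
    '.cubes[0].measures += [{name: $name, column: $column, aggregator: $agg}]'
}

# render_proposal: POST the proposal on stdin to the render endpoint.
# The body is the bare proposal, not wrapped in {proposal: ...}.
render_proposal() {
  curl -sS -X POST "https://api.saiku.bi/me/inference/render" \
    -H "Authorization: Bearer $SAIKU_API_KEY" \
    -H "Content-Type: application/json" \
    -d @-
}

# Example call (not run here):
#   jq '.proposal' propose-response.json |
#     add_measure "Largest Sale" amount max |
#     render_proposal | jq -r '.mondrianXml' > sales-cube.xml
```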

Failure modes:

400 invalid_proposal — the proposal doesn’t satisfy the Mondrian XML constraints (missing required field, contradictory joins, etc). Response body: { error, kind, message }.
400 empty_proposal — request body was empty.

`POST /me/inference/try-query`

Run a sample MDX query against a draft cube before saving it. Useful for confirming the joins land where you expect.

Request body:

{
  "proposal": { /* CubeProposal */ },
  "connectionId": "uuid-of-saved-connection",
  "mdx": "SELECT { [Measures].[Revenue] } ON COLUMNS, { [Customer].[Name].MEMBERS } ON ROWS FROM [Sales]"
}

Response (200):

{
  "columns": ["Customer", "Revenue"],
  "rows": [
    ["Acme Corp", "1234.56"],
    ["Beta Inc",  "987.65"]
  ],
  "executionMillis": 234
}

The cube is compiled in-memory; nothing is saved. Use this in your own iteration loop the way the dashboard’s Schema designer does.
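When iterating over several measures or levels, it can help to generate the smoke-test MDX rather than hand-write it. A minimal sketch; `mdx_quote` and `mdx_for` are hypothetical helpers, and the query shape simply mirrors the example request above:

```bash
# Wrap a name in MDX brackets, doubling any closing bracket inside it.
mdx_quote() {
  local s="$1"
  printf '[%s]' "${s//\]/]]}"
}

# mdx_for CUBE MEASURE DIMENSION LEVEL
# Prints a one-measure, one-level query suitable for the mdx field.
mdx_for() {
  local cube="$1" measure="$2" dim="$3" level="$4"
  printf 'SELECT { [Measures].%s } ON COLUMNS, { %s.%s.MEMBERS } ON ROWS FROM %s\n' \
    "$(mdx_quote "$measure")" "$(mdx_quote "$dim")" \
    "$(mdx_quote "$level")" "$(mdx_quote "$cube")"
}

# Example call:
#   mdx_for Sales Revenue Customer Name
```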

`GET /me/inference/trace/{traceId}`

Retrieve the LLM conversation for a previous propose call. Each propose response includes a traceId; pass it here to get the full prompt + response.

Response (200):

{
  "traceId": "ac2ab729-7fc3-4c3a-8e36-7cd77be87267",
  "createdAt": "2026-05-23T18:00:00Z",
  "status": "success",
  "model": "claude-sonnet-4-6",
  "inputTokens": 4321,
  "outputTokens": 892,
  "estimatedCostUsd": 0.018,
  "promptText": "...full system + user prompt (often 20+ KB)...",
  "responseText": "...full LLM response, parsed and unparsed..."
}

Useful for:

Debugging unexpected proposals. See exactly what we sent Claude and what came back.
Audit trail. Compliance teams sometimes want a record of what AI generated for human review.
Iteration. Compare two consecutive proposals to see what Claude did differently.

Traces are retained for 30 days, then purged. They are scoped to your tenant by row-level security (RLS), so you can only retrieve your own traces.

End-to-end example

A complete script — profile a warehouse, propose a cube, render the XML, save it:

set -euo pipefail

KEY="$SAIKU_API_KEY"
CONN_ID="abc-123-def"

# 1. Profile. The response body is the SchemaProfile itself, so it is
#    passed to propose unchanged.
PROFILE=$(curl -sS -X POST "https://api.saiku.bi/me/inference/profile/connection/$CONN_ID" \
  -H "Authorization: Bearer $KEY" \
  -H "Content-Type: application/json" -d '{}')

# 2. Propose. jq builds the body so the intent is escaped correctly.
PROPOSAL=$(echo "$PROFILE" | \
  jq --arg intent "Sales revenue and order count by customer and date." \
    '{profile: ., intent: $intent}' | \
  curl -sS -X POST https://api.saiku.bi/me/inference/propose \
  -H "Authorization: Bearer $KEY" \
  -H "Content-Type: application/json" \
  -d @-)

# Happy path: the propose response already includes mondrianXml.
echo "$PROPOSAL" | jq -r '.mondrianXml' > sales-cube.xml

# 3. Re-render (only needed if you've edited the proposal).
#    Important: the render endpoint takes the proposal JSON
#    directly, NOT wrapped in {proposal: ...}.
XML=$(echo "$PROPOSAL" | jq '.proposal' | \
  curl -sS -X POST https://api.saiku.bi/me/inference/render \
  -H "Authorization: Bearer $KEY" \
  -H "Content-Type: application/json" \
  -d @- | jq -r '.mondrianXml')

# 4. Save (via /me/schemas, not covered on this page yet)
echo "$XML" > sales-cube.xml

Replace step 4 with whatever your workflow needs — save to your own repo, commit to git, hand to a human reviewer.

Where to go next

Schema designer — the dashboard UI built on this API.
Authentication — Bearer tokens, rate limits.
MCP server — for LLM agents that want to query your cubes (this page is about authoring them).
AI Ask — natural-language layer that translates plain English into the requests documented here.
Agent Skills — admin-authored workflows the ask endpoint discovers automatically.
Agent Spaces — personas that scope the ask surface: system prompt, cube allowlist, skill allowlist.