Monitoring & observability
K-AI exports operational signals in standard formats so you can integrate them with your existing observability stack. This page covers what is emitted, how to consume it, and what you can observe self-service.
What K-AI emits
Per-pod metrics — every K-AI service exposes a Prometheus-compatible scrape endpoint covering CPU, memory, request latency, error rate, and queue depth.
Queue depth — the LLM service and the indexation pipeline expose pending counts and oldest-pending-request age as first-class signals.
Cost events — every billable action is emitted as a cost event aggregated into KCU (K-AI Consumption Units). See Cost events.
Auth events — login attempts, token issuance, refresh-token rotation, revocation, Dynamic Client Registration, Microsoft SSO callbacks.
MCP invocations — every MCP tool call is logged with tool name, target K-AI Instance, calling user, latency, and outcome.
All services log structured JSON to stdout. No extra configuration is required to ship logs to your collector.
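As a minimal sketch of consuming that stdout stream, the snippet below parses one JSON object per line and filters on a severity field. The field names (`level`, `service`, `message`) and the sample lines are hypothetical, chosen only for illustration; check the actual log schema before relying on them.

```python
import json
from typing import Iterable, Iterator


def parse_log_lines(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per well-formed JSON object line, skipping anything else."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            # Not JSON (for example a stray stack-trace fragment); skipped here.
            continue
        if isinstance(record, dict):
            yield record


# Hypothetical sample lines; the field names are assumptions, not K-AI's schema.
sample = [
    '{"level": "info", "service": "llm", "message": "request served"}',
    "plain text that is not JSON",
    '{"level": "error", "service": "indexer", "message": "parse failed"}',
]

errors = [r for r in parse_log_lines(sample) if r.get("level") == "error"]
print(errors)
```

In practice a log agent does this parsing for you; the sketch only shows that one JSON object per line needs no custom format handling.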
How to consume it
K-AI does not bundle an observability stack — you point your existing tooling at the metrics and logs the platform emits.
Datadog, Grafana, Splunk, Elastic, etc. — scrape the Prometheus endpoints (standard ServiceMonitor pattern on Kubernetes) and collect stdout logs via your usual agent.
PagerDuty, Opsgenie, … — wire alerts off your collector. Useful starter alerts: sustained LLM-queue backlog, abnormal failed-indexation rate, login-failure spikes from a single IP, per-instance KCU consumption beyond your expected envelope.
SIEM — the auth-event stream is the highest-signal input. Ingest it as-is.
For on-premise deployments, the same metrics and logs are emitted inside your cluster — no new networking required.
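One of the starter alerts mentioned above, login-failure spikes from a single IP, can be sketched as a sliding-window count over auth events. This is an illustration under stated assumptions, not K-AI's alerting logic: the event fields (`ip`, `ts`, `outcome`), the window length, and the threshold are all hypothetical, and in production this rule would normally live in your collector or SIEM.

```python
from collections import defaultdict, deque
from typing import Iterable


def ips_over_threshold(
    events: Iterable[dict], window_seconds: float, max_failures: int
) -> set[str]:
    """Return IPs with more than max_failures failed logins inside any window.

    Events must be sorted by ascending timestamp (seconds).
    """
    recent: dict[str, deque] = defaultdict(deque)
    flagged: set[str] = set()
    for event in events:
        if event["outcome"] != "failure":
            continue
        ip, ts = event["ip"], event["ts"]
        window = recent[ip]
        window.append(ts)
        # Drop failures that fell out of the window ending at ts.
        while window and ts - window[0] > window_seconds:
            window.popleft()
        if len(window) > max_failures:
            flagged.add(ip)
    return flagged


# Hypothetical auth events; field names and values are assumptions.
sample_events = [
    {"ip": "203.0.113.7", "ts": 0, "outcome": "failure"},
    {"ip": "203.0.113.7", "ts": 10, "outcome": "failure"},
    {"ip": "198.51.100.2", "ts": 12, "outcome": "failure"},
    {"ip": "203.0.113.7", "ts": 20, "outcome": "failure"},
    {"ip": "203.0.113.7", "ts": 25, "outcome": "success"},
    {"ip": "203.0.113.7", "ts": 30, "outcome": "failure"},
]

print(ips_over_threshold(sample_events, window_seconds=60, max_failures=3))
```

Keeping one deque of timestamps per IP means memory grows with the number of failures inside the window rather than with the whole event history.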
Cost events (PICSOU)
PICSOU aggregates billable actions across the platform into KCU per K-AI Instance and per organisation. Cost dashboards live at picsou.kai-studio.ai for SaaS customers; the same dashboards ship with on-premise deployments.
Eight cost types are tracked:
LLM: LLM model calls (prompt + completion tokens)
FILEPARSER: Per-document parsing and extraction
SEARCH_INDEX: Vector-index footprint per day
INSTANCE_ACCESS: Per-instance availability
INDEXING_JOB: Bulk re-indexation batches
AUDIT_TASK: Audit AI crew runs
CRAWL_URL_TASK: Web-crawler activity
RETRIEVAL_TASK: MCP tool invocations
Cost events are scoped either to a specific K-AI Instance or to an organisation (for org-level activity such as the web crawler). RBAC distinguishes admin (full configuration and invoice access) from sales (consumption dashboards only). K-AI platform operators see all orgs; client users see only their own.
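To make the scoping rule concrete, here is a hedged sketch of rolling cost events up into KCU totals per scope and cost type. The eight cost-type names come from the list above; the event shape (`cost_type`, `kcu`, `instance_id`, `organisation_id`) is a hypothetical one invented for this example and is not PICSOU's actual schema.

```python
from collections import defaultdict
from typing import Iterable

COST_TYPES = {
    "LLM", "FILEPARSER", "SEARCH_INDEX", "INSTANCE_ACCESS",
    "INDEXING_JOB", "AUDIT_TASK", "CRAWL_URL_TASK", "RETRIEVAL_TASK",
}


def total_kcu(events: Iterable[dict]) -> dict[tuple[str, str, str], float]:
    """Sum KCU per (scope kind, scope id, cost type).

    An event with an instance_id is scoped to that instance; otherwise it is
    treated as organisation-level activity (assumed event shape).
    """
    totals: dict[tuple[str, str, str], float] = defaultdict(float)
    for event in events:
        cost_type = event["cost_type"]
        if cost_type not in COST_TYPES:
            raise ValueError(f"unknown cost type: {cost_type}")
        if event.get("instance_id"):
            scope = ("instance", event["instance_id"])
        else:
            scope = ("organisation", event["organisation_id"])
        totals[(*scope, cost_type)] += event["kcu"]
    return dict(totals)


# Hypothetical events; identifiers and KCU amounts are made up.
sample_costs = [
    {"cost_type": "LLM", "kcu": 1.5, "instance_id": "inst-a", "organisation_id": "org-1"},
    {"cost_type": "LLM", "kcu": 2.0, "instance_id": "inst-a", "organisation_id": "org-1"},
    {"cost_type": "CRAWL_URL_TASK", "kcu": 0.25, "instance_id": None, "organisation_id": "org-1"},
]

for key, kcu in total_kcu(sample_costs).items():
    print(key, kcu)
```

Keying the totals on scope first mirrors the drill-down order of a per-instance or per-organisation report.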
What you can observe self-service
Per-instance indexation status — call GET /documents/list?state=… to see in-flight, indexed, or errored documents. See Instance API — Documents.
Cost reports — PICSOU dashboards with drill-down by cost type, instance, and day.
OAuth lifecycle — the auth service exposes active-session listing and revocation endpoints. See Authentication — OAuth 2.1.
MCP tool calls — visible inside your MCP client (Claude Desktop, Cursor, Le Chat, etc.) for end-user debugging.
For incidents that go beyond self-service observability, contact your K-AI support representative.