Deployment models

Operational depth on the three supported deployment modes. For the commercial overview, see Platform — Deployment models.

All three modes run the same K-AI Platform application code. What changes is the substrate: where Kubernetes runs, where the database lives, where documents are stored, and who owns operations.

SaaS (multi-tenant)

K-AI-operated, hosted on managed Kubernetes in France.

| Resource | Value |
| --- | --- |
| Cluster | Managed Kubernetes (France) |
| Database | Managed PostgreSQL with dedicated schemas per platform module |
| Search & vectors | Managed vector index |
| File storage | Managed object storage |
| Ingress | Standard ingress controller with automated TLS |
| DNS | kai-studio.ai and subdomains |
| LLM endpoint | Routed to the K-AI LLM service |

Per-customer isolation

One K-AI Instance = one dedicated pod + one database schema + one vector index + one storage bucket. Compute, vector store, and storage are physically separate per instance; noisy-neighbour effects are isolated at the pod level.

The shared management plane holds no document content — only metadata, identity, access rules, and cost events.
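
The one-to-one mapping between an instance and its resources can be sketched in a few lines of Python. This is only an illustration of the isolation model described above: the naming convention, the `InstanceResources` type, and the `resources_for` helper are all hypothetical and do not reflect how K-AI actually names pods, schemas, indexes, or buckets.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class InstanceResources:
    """Resources dedicated to a single K-AI Instance (names are hypothetical)."""
    pod: str
    db_schema: str
    vector_index: str
    bucket: str


def resources_for(instance_id: str) -> InstanceResources:
    # Assumed convention: every resource name is derived from the instance id,
    # so two instances never share a pod, schema, vector index, or bucket.
    slug = instance_id.lower().replace("_", "-")
    return InstanceResources(
        pod=f"kai-instance-{slug}",
        db_schema=f"instance_{slug.replace('-', '_')}",
        vector_index=f"vectors-{slug}",
        bucket=f"kai-docs-{slug}",
    )


a = resources_for("acme")
b = resources_for("globex")

# The intersection is empty: no resource is shared between the two instances.
shared = set(astuple(a)) & set(astuple(b))
```

Deriving every name from the instance identifier makes it easy to audit that no two instances point at the same schema or bucket.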

Operational ownership

K-AI operates the cluster, runs upgrades, performs backups, and monitors the platform. The customer's only operational responsibilities are managing connector credentials and user access.

On-premise (Kubernetes in customer cluster)

Same chart, same images, same isolation model — deployed in a Kubernetes cluster the customer owns.

The platform is distributed as the K-AI Helm chart, with an optional companion chart for customers without an external LLM endpoint. Both ship as offline bundles for air-gapped environments.

Customer-supplied dependencies:

  • Kubernetes cluster (1.27+) with cluster-admin access during install

  • OCI-compatible container registry

  • Ingress controller and TLS (managed certificates or customer PKI)

  • PostgreSQL (managed or self-hosted) and an S3-compatible object store

  • IAM federation for SSO (Azure AD, Okta, Ping, or any OIDC-compliant IdP)

  • Observability stack (Datadog, Grafana, Splunk, …): K-AI exports Prometheus metrics, and the customer's stack scrapes them
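
A pre-install check for the Kubernetes version requirement above could look like the following Python sketch. It is not part of the K-AI chart or tooling; the function names are hypothetical, and the input is assumed to be a version string in the form reported by the cluster (for example `v1.29.4`).

```python
import re

MINIMUM_KUBERNETES = (1, 27)  # minimum version from the dependency list


def parse_kubernetes_version(git_version: str) -> tuple[int, int]:
    """Extract (major, minor) from strings such as 'v1.29.4' or 'v1.27.3+k3s1'."""
    match = re.match(r"v?(\d+)\.(\d+)", git_version.strip())
    if match is None:
        raise ValueError(f"unrecognised Kubernetes version: {git_version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(git_version: str, minimum: tuple[int, int] = MINIMUM_KUBERNETES) -> bool:
    # Compare as integer tuples: a plain string comparison would wrongly
    # rank "1.9" above "1.27".
    return parse_kubernetes_version(git_version) >= minimum
```

For example, `meets_minimum("v1.26.9")` is false and `meets_minimum("v1.27.3+k3s1")` is true.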

Hardware sizing scales with document estate size and indexation throughput; K-AI provides sizing guidance during onboarding.

Air-gapped support

The on-premise mode is designed for environments without internet egress. K-AI produces an offline bundle on a connected machine; the customer transfers it (SFTP, disk, …) and loads it into their private registry. Subsequent upgrades ship as smaller delta bundles.
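
Because the bundle travels over SFTP or physical media, it is reasonable to check its integrity before loading it into the private registry. The Python sketch below assumes a SHA-256 digest is made available alongside the bundle; this page does not say whether K-AI publishes one, and the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so a large bundle never has to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_bundle(path: Path, expected_sha256: str) -> bool:
    """Return True only if the transferred file matches the expected digest."""
    return sha256_of(path) == expected_sha256.strip().lower()
```

The same check applies unchanged to delta bundles, since it only depends on the file contents.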

See On-premise installation for the install walkthrough.

Snowflake Native App (SPCS)

Deployed as a Snowflake Native App on Snowpark Container Services (SPCS) inside the customer's own Snowflake account. The platform runs inside Snowflake's container runtime, with no external Kubernetes cluster involved.

| Resource | Where it lives |
| --- | --- |
| Application services | Snowflake Compute Pools (SPCS) |
| Documents | Snowflake Stages |
| Embeddings | VECTOR columns in dedicated schemas |
| Management metadata | Dedicated Snowflake schemas inside the customer account |
| Authentication | Same auth service, deployed as an SPCS service |

Data residency is defined by the customer's Snowflake region. Operational responsibility is shared: Snowflake operates the substrate, K-AI provides the application updates, and the customer manages account-level governance.

The Snowflake mode inherits Snowflake's resource model and operational SLA. Scaling is driven by Snowflake compute pool sizing.

LLM serving

K-AI's LLM serving runs on a separate, dedicated platform independent of the SaaS cluster and of any customer on-premise cluster. The serving topology — model selection, sizing, autoscaling, and fallback strategy — is operated by K-AI and out of scope for customer integration.

For on-premise deployments without local GPU capacity, route LLM requests to the K-AI SaaS LLM endpoint or to your own LLM endpoint. A companion chart is available for customers who want to self-host completion and embedding inside their cluster.
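
The three routing options for on-premise deployments can be summarised as a small resolution function. This Python sketch is illustrative only: the `LlmConfig` fields, the precedence order, and the placeholder URL are assumptions, since this page does not give the real endpoint address or configuration keys.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder only: the real K-AI SaaS LLM endpoint URL is not given on this page.
KAI_SAAS_LLM_ENDPOINT = "https://llm.example.invalid/v1"


@dataclass
class LlmConfig:
    self_hosted_url: Optional[str] = None  # companion chart running in-cluster
    customer_url: Optional[str] = None     # the customer's own LLM endpoint


def resolve_llm_endpoint(config: LlmConfig) -> str:
    """Pick an LLM endpoint; the precedence order here is an assumption."""
    if config.self_hosted_url:
        return config.self_hosted_url
    if config.customer_url:
        return config.customer_url
    # The fallback requires network egress, so it does not apply to
    # air-gapped clusters.
    return KAI_SAAS_LLM_ENDPOINT
```

In an air-gapped environment, one of the first two options has to be configured, because the SaaS endpoint is unreachable without internet egress.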
