# Deployment models

Operational depth on the three supported deployment modes. For the commercial overview, see [Platform — Deployment models](/knowledge-ai/the-k-ai-platform/deployment.md).

All three modes run the same K-AI Platform application code. What changes is the substrate: where Kubernetes runs, where the database lives, where documents are stored, and who owns operations.

## SaaS (multi-tenant)

K-AI-operated, hosted on managed Kubernetes in France.

| Resource         | Value                                                         |
| ---------------- | ------------------------------------------------------------- |
| Cluster          | Managed Kubernetes (France)                                   |
| Database         | Managed PostgreSQL with dedicated schemas per platform module |
| Search & vectors | Managed vector index                                          |
| File storage     | Managed object storage                                        |
| Ingress          | Standard ingress controller with automated TLS                |
| DNS              | `kai-studio.ai` and subdomains                                |
| LLM endpoint     | Routed to the K-AI LLM service                                |

### Per-customer isolation

One K-AI Instance = one dedicated pod + one database schema + one vector index + one storage bucket. Compute, vector store, and storage are physically separate per instance; noisy-neighbour effects are isolated at the pod level.

The shared management plane holds no document content — only metadata, identity, access rules, and cost events.
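
The one-to-one mapping described in this section can be illustrated with a minimal sketch. The naming scheme, the `InstanceResources` type, and both helper functions are hypothetical and are not K-AI's actual conventions; the sketch only shows that each instance resolves to its own pod, schema, vector index, and bucket, with nothing shared between instances.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceResources:
    """Resources dedicated to one K-AI Instance (all names are hypothetical)."""
    pod: str
    db_schema: str
    vector_index: str
    bucket: str


def resources_for(instance_id: str) -> InstanceResources:
    """Derive one dedicated resource name per layer from an instance identifier."""
    # Assumed identifier format: lowercase alphanumerics separated by single hyphens.
    if not re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", instance_id):
        raise ValueError(f"invalid instance id: {instance_id!r}")
    # Unquoted SQL identifiers cannot contain hyphens, so the schema uses underscores.
    return InstanceResources(
        pod=f"kai-instance-{instance_id}",
        db_schema=f"kai_{instance_id.replace('-', '_')}",
        vector_index=f"kai-vectors-{instance_id}",
        bucket=f"kai-docs-{instance_id}",
    )


def shares_nothing(a: InstanceResources, b: InstanceResources) -> bool:
    """Return True when two instances have no resource name in common."""
    names_a = {a.pod, a.db_schema, a.vector_index, a.bucket}
    names_b = {b.pod, b.db_schema, b.vector_index, b.bucket}
    return not (names_a & names_b)
```

For example, `shares_nothing(resources_for("acme"), resources_for("globex-eu"))` holds because every resource name is derived from the instance identifier alone.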

### Operational ownership

K-AI operates the cluster, runs upgrades, performs backups, and monitors the platform. The customer's only operational responsibilities are connector credentials and access management.

## On-premise (Kubernetes in customer cluster)

Same chart, same images, same isolation model — deployed in a Kubernetes cluster the customer owns.

Distribution is the **K-AI Helm chart**, with an optional companion chart for customers without an external LLM endpoint. Both ship as offline bundles for air-gapped environments.

Customer-supplied dependencies:

* Kubernetes cluster (1.27+) with cluster-admin access during install
* OCI-compatible container registry
* Ingress controller and TLS (managed certificates or customer PKI)
* PostgreSQL (managed or self-hosted) and an S3-compatible object store
* IAM federation for SSO (Azure AD, Okta, Ping, or any OIDC-compliant IdP)
* Observability stack (Datadog, Grafana, Splunk, …) — K-AI exports Prometheus metrics; the customer scrapes
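
The Kubernetes version requirement in the list above lends itself to a pre-install check. The following is a minimal sketch, not part of the K-AI tooling: it assumes the server version is available as a string such as the `gitVersion` field reported by `kubectl version -o json`, and the function names are hypothetical.

```python
import re

MIN_KUBERNETES = (1, 27)  # minimum version from the dependency list


def parse_k8s_version(git_version: str) -> tuple[int, int]:
    """Extract (major, minor) from strings such as 'v1.29.4' or 'v1.27.3+k3s1'."""
    match = re.match(r"v?(\d+)\.(\d+)", git_version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {git_version!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(git_version: str, minimum: tuple[int, int] = MIN_KUBERNETES) -> bool:
    """Return True when the cluster version is at or above the minimum."""
    # Tuples compare element by element, so (1, 30) >= (1, 27) and (1, 9) < (1, 27).
    return parse_k8s_version(git_version) >= minimum
```

Comparing parsed integer tuples rather than raw strings avoids the common mistake of treating `1.9` as newer than `1.27`.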

Hardware sizing scales with document estate size and indexation throughput; K-AI provides sizing guidance during onboarding.

### Air-gapped support

The on-premise mode is designed for environments without internet egress. K-AI produces an offline bundle on a connected machine; the customer transfers it (SFTP, disk, …) and loads it into their private registry. Subsequent upgrades ship as smaller delta bundles.
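
Because the bundle crosses a manual transfer step, it can be worth confirming that it arrived intact before loading it into the registry. The sketch below assumes a SHA-256 digest is communicated alongside the bundle; this page does not state whether K-AI publishes one, and the function names and file path are hypothetical.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large bundles do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def bundle_is_intact(path: Path, expected_hex: str) -> bool:
    """Compare the computed digest with the expected one (case-insensitive)."""
    return hmac.compare_digest(sha256_of(path), expected_hex.strip().lower())
```

A call such as `bundle_is_intact(Path("bundle.tar"), expected_digest)` would run on the disconnected side, after the transfer and before the images are pushed to the private registry.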

See [On-premise installation](/knowledge-ai/operate/on-premise.md) for the install walkthrough.

## Snowflake Native App (SPCS)

Deployed as Snowpark Container Services (SPCS) inside the customer's own Snowflake account. The platform runs inside Snowflake's container runtime; no external Kubernetes cluster is involved.

| Resource             | Where it lives                                          |
| -------------------- | ------------------------------------------------------- |
| Application services | Snowflake Compute Pools (SPCS)                          |
| Documents            | Snowflake Stages                                        |
| Embeddings           | `VECTOR` columns in dedicated schemas                   |
| Management metadata  | Dedicated Snowflake schemas inside the customer account |
| Authentication       | Same auth service, deployed as an SPCS service          |

Data residency is defined by the customer's Snowflake region. Operational responsibility is shared: Snowflake operates the substrate, K-AI provides the application updates, and the customer manages account-level governance.

The Snowflake mode inherits Snowflake's resource model and operational SLA. Scaling is driven by Snowflake compute pool sizing.
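
To make the `VECTOR` column row in the table above concrete, the sketch below builds two Snowflake SQL statements as strings: a table with a `VECTOR(FLOAT, n)` column and a nearest-neighbour query using Snowflake's `VECTOR_COSINE_SIMILARITY` function. The schema name, the `CHUNKS` table, its columns, and the embedding dimension are assumptions for illustration and do not describe the schema K-AI actually creates.

```python
import math
import re

_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def _checked(identifier: str) -> str:
    """Reject anything that is not a plain unquoted SQL identifier."""
    if not _IDENTIFIER.fullmatch(identifier):
        raise ValueError(f"unsafe identifier: {identifier!r}")
    return identifier


def create_chunks_table_sql(schema: str, dim: int) -> str:
    """DDL for a hypothetical chunk table holding one embedding per row."""
    if dim <= 0:
        raise ValueError("embedding dimension must be positive")
    return (
        f"CREATE TABLE IF NOT EXISTS {_checked(schema)}.CHUNKS (\n"
        "    CHUNK_ID STRING,\n"
        "    CONTENT STRING,\n"
        f"    EMBEDDING VECTOR(FLOAT, {dim})\n"
        ")"
    )


def top_k_sql(schema: str, query_vector: list[float], k: int) -> str:
    """Query returning the k chunks most similar to the query vector."""
    if not query_vector or not all(math.isfinite(x) for x in query_vector):
        raise ValueError("query vector must be non-empty and finite")
    if k <= 0:
        raise ValueError("k must be positive")
    literal = ", ".join(repr(float(x)) for x in query_vector)
    dim = len(query_vector)
    return (
        "SELECT CHUNK_ID, VECTOR_COSINE_SIMILARITY("
        f"EMBEDDING, [{literal}]::VECTOR(FLOAT, {dim})) AS SCORE\n"
        f"FROM {_checked(schema)}.CHUNKS\n"
        "ORDER BY SCORE DESC\n"
        f"LIMIT {k}"
    )
```

The query vector's length must match the dimension declared on the column, which is why the cast in `top_k_sql` is derived from the vector itself.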

## LLM serving

K-AI's LLM serving runs on a separate, dedicated platform independent of the SaaS cluster and of any customer on-premise cluster. The serving topology — model selection, sizing, autoscaling, and fallback strategy — is operated by K-AI and out of scope for customer integration.

For on-premise deployments without local GPU capacity, route LLM requests to the K-AI SaaS LLM endpoint or to your own LLM endpoint. A companion chart is available for customers who want to self-host completion and embedding inside their cluster.
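
The routing choices in the paragraph above can be summarised as a small decision function. This is a sketch under stated assumptions: the `ClusterProfile` fields, the preference order (self-hosting first when GPUs are available, then a customer endpoint, then the K-AI SaaS endpoint), and the rule that the SaaS endpoint requires internet egress are illustrative inferences, not documented K-AI behaviour.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class LLMTarget(Enum):
    COMPANION_CHART = "self-hosted via the companion chart"
    CUSTOMER_ENDPOINT = "customer-operated LLM endpoint"
    KAI_SAAS = "K-AI SaaS LLM endpoint"


@dataclass(frozen=True)
class ClusterProfile:
    """Hypothetical description of an on-premise cluster."""
    has_local_gpu: bool
    customer_endpoint_url: Optional[str] = None
    internet_egress: bool = True


def choose_llm_target(profile: ClusterProfile) -> LLMTarget:
    """Pick an LLM target using the assumed preference order."""
    if profile.has_local_gpu:
        return LLMTarget.COMPANION_CHART
    if profile.customer_endpoint_url:
        return LLMTarget.CUSTOMER_ENDPOINT
    if profile.internet_egress:
        return LLMTarget.KAI_SAAS
    # Assumption: an air-gapped cluster cannot reach the SaaS endpoint.
    raise ValueError("no reachable LLM endpoint for this cluster profile")
```

An air-gapped cluster with neither GPUs nor a customer endpoint falls through to the error, which flags that one of the two must be provisioned before install.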


