Architecture

How Gateco fits in your stack

Gateco sits between your AI application and your vector databases. It enforces policies, syncs identity, and logs every decision, without changing your vector DB or ingestion pipeline.

Multi-tenant SaaS: Standard RAG Security Deployment

The standard deployment: a Gateco-hosted policy engine sits in the retrieval path and enforces access before chunks reach your AI.

Your AI application sends each query with a principal ID, resolved from your session or JWT.
Gateco resolves the principal's attributes from your IDP (roles, groups, department, clearance).
Policies are evaluated against both the principal's attributes and each returned chunk's metadata. Evaluation is deny-by-default.
Every decision (allowed or denied) is written to the audit log before results are returned.
Your vector DB schema and ingestion pipeline are unchanged. Gateco only touches the read path.
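
The steps above can be read as a single read-path function. The following is a minimal sketch under stated assumptions, not Gateco's implementation: Principal, Chunk, resolve_principal, is_allowed, IDP_DIRECTORY, and AUDIT_LOG are hypothetical names, and the single department-and-clearance rule stands in for real policies.

```python
from dataclasses import dataclass, field


@dataclass
class Principal:
    id: str
    attributes: dict = field(default_factory=dict)


@dataclass
class Chunk:
    id: str
    text: str
    metadata: dict = field(default_factory=dict)


# Hypothetical stand-ins for the IDP and the audit log.
IDP_DIRECTORY = {"alice": {"department": "finance", "clearance": 2}}
AUDIT_LOG = []


def resolve_principal(principal_id):
    """Look up the principal's attributes; unknown principals get none."""
    return Principal(principal_id, IDP_DIRECTORY.get(principal_id, {}))


def is_allowed(principal, chunk):
    """Example policy: same department and sufficient clearance.

    Deny-by-default: missing attributes or metadata never match.
    """
    dept = chunk.metadata.get("department")
    required = chunk.metadata.get("min_clearance")
    if dept is None or required is None:
        return False
    return (
        principal.attributes.get("department") == dept
        and principal.attributes.get("clearance", -1) >= required
    )


def filter_chunks(principal_id, chunks):
    """Evaluate each returned chunk and record every decision before returning."""
    principal = resolve_principal(principal_id)
    allowed = []
    for chunk in chunks:
        decision = "allowed" if is_allowed(principal, chunk) else "denied"
        AUDIT_LOG.append(
            {"principal": principal.id, "chunk": chunk.id, "decision": decision}
        )
        if decision == "allowed":
            allowed.append(chunk)
    return allowed
```

In this sketch the vector DB is never modified: filter_chunks only inspects what the search already returned, which mirrors the read-path-only behavior described above.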

Enterprise: Private Data Plane for AI Access Control

For enterprises requiring that connector credentials never leave their network. Gateco's policy engine runs inside your VPC. Waitlist open for Q3 2026.

Policy engine deployed as a container in your VPC. No vector DB credentials leave your network.
Audit logs remain in your customer-controlled storage (S3, GCS, or Azure Blob).
IDP sync uses outbound TLS from Gateco to your identity provider, with no inbound network openings required.
SIEM streaming connects from your audit log storage to your existing SIEM pipeline.

Multi-region: EU Data Residency

For organizations with EU AI Act or GDPR data residency requirements. EU tenant data, including policy evaluation and audit logs, stays in the EU region.

US and EU Gateco instances are independently operated, with no cross-region retrieval traffic.
EU tenant audit logs stay in the EU region and are never replicated to the US instance.
Principal data from shared IDPs is synced per-region. Each instance maintains its own principal cache.
EU AI Act audit evidence is available as a region-scoped export that never leaves the EU region.

Deployment model matrix

Choose the deployment that fits your security and data residency requirements.

Model	Description	Availability
SaaS Shared	Gateco-hosted, multi-tenant. Policy evaluation and audit logs in Gateco infrastructure.	All plans
SaaS Dedicated	Gateco-hosted, dedicated tenant namespace. Isolated compute and storage.	Enterprise
Private Data Plane	Gateco policy engine runs in your VPC. Your credentials never leave your network.	Enterprise (waitlist)
Self-Host	Full Gateco stack in your own infrastructure. No Gateco telemetry.	Q3 2026 waitlist

Recent improvements (May 2026)

Platform changes shipped between 2026-05-22 and 2026-05-27 that affect the retrieval and security architecture.

Azure AI Search + Vertex AI connectors

Three new connectors: Azure AI Search (native RRF hybrid), Vertex AI Vector Search (ANN-only, <100ms p95 at 1M vectors), and Vertex AI Search (hybrid unstructured-data search). All three support the standard 4-mode search API and late-binding policy enforcement.

KMS per-tenant credential binding

Every connector credential DEK (data encryption key) is now wrapped with an EncryptionContext keyed to the organization ID. A decrypt request that carries a different organization's context is rejected by AWS KMS itself, at the key-management layer, before any Gateco code runs. This is not an application-layer check.
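
To illustrate why a context-bound key cannot be unwrapped for another organization, here is a toy model using only the Python standard library. It is not AWS KMS and not real cryptography: ToyKms, wrap, unwrap, and the org_id context key are hypothetical names for illustration. With AWS KMS, the same context is supplied as the EncryptionContext parameter on both the encrypt and the decrypt request, and a mismatch causes the decrypt to fail.

```python
import hashlib
import hmac
import json
import secrets


class ToyKms:
    """In-memory stand-in for a KMS. Illustration only, not production crypto."""

    def __init__(self):
        self._master_key = secrets.token_bytes(32)

    @staticmethod
    def _context_bytes(context):
        # Canonical encoding so equal contexts always yield equal bytes.
        return json.dumps(context, sort_keys=True).encode()

    def _tag(self, nonce, ciphertext, context):
        # The tag covers the context, binding the wrapped key to it.
        message = nonce + ciphertext + self._context_bytes(context)
        return hmac.new(self._master_key, message, hashlib.sha256).digest()

    def _stream(self, nonce, length):
        return hashlib.shake_256(self._master_key + nonce).digest(length)

    def wrap(self, dek, context):
        nonce = secrets.token_bytes(16)
        stream = self._stream(nonce, len(dek))
        ciphertext = bytes(a ^ b for a, b in zip(dek, stream))
        return nonce + ciphertext + self._tag(nonce, ciphertext, context)

    def unwrap(self, blob, context):
        nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
        expected = self._tag(nonce, ciphertext, context)
        if not hmac.compare_digest(tag, expected):
            # Rejected inside the "KMS", before any caller logic runs.
            raise PermissionError("encryption context mismatch")
        stream = self._stream(nonce, len(ciphertext))
        return bytes(a ^ b for a, b in zip(ciphertext, stream))
```

A DEK wrapped with {"org_id": "org-a"} unwraps only when the same context is presented. Presenting {"org_id": "org-b"} raises PermissionError inside the key service, which is the property the paragraph above describes.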

Per-org LLM API key (BYOK)

Organizations can configure their own OpenAI API key for answer synthesis. Keys are stored with the same envelope encryption as connector credentials. The server-side fallback (100 lifetime calls on paid tiers) remains available as a bootstrap mechanism.

ReBAC: 1-hop relationship policies

A new relationships table enables resource-owner policies. The condition {"field": "relation.owner_of", "operator": "eq", "value": true} is resolved by looking up a direct Relationship row between the requesting principal and the retrieved resource. Results are cached with a 60-second TTL.
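
A 1-hop relationship check with a 60-second cache might look like the sketch below. The storage layout (a set of tuples), the function names, and the injectable now parameter are assumptions made for illustration. Only the condition shape and the TTL come from the description above.

```python
import time

# Hypothetical relationship rows: (principal_id, relation, resource_id).
RELATIONSHIPS = {("alice", "owner_of", "doc-1")}

CACHE_TTL_SECONDS = 60
_cache = {}  # (principal_id, relation, resource_id) -> (result, expires_at)


def has_relation(principal_id, relation, resource_id, now=None):
    """1-hop check: is there a direct row linking principal and resource?"""
    now = time.monotonic() if now is None else now
    key = (principal_id, relation, resource_id)
    cached = _cache.get(key)
    if cached is not None and cached[1] > now:
        return cached[0]
    result = key in RELATIONSHIPS
    _cache[key] = (result, now + CACHE_TTL_SECONDS)
    return result


def evaluate_condition(condition, principal_id, resource_id, now=None):
    """Evaluate a relation condition; any other condition shape is denied."""
    field_name = condition.get("field", "")
    if not field_name.startswith("relation."):
        return False
    if condition.get("operator") != "eq":
        return False
    relation = field_name[len("relation."):]
    found = has_relation(principal_id, relation, resource_id, now)
    return found == condition.get("value")
```

One consequence of caching worth noting: in this sketch a newly added or revoked relationship can take up to 60 seconds to affect decisions.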

Fail-closed as org default

New organizations now default to failure_mode=closed. A policy evaluation error produces a denial and a decision=error_deny audit event. No ambiguous access. Fail-open is available only to Enterprise customers under a signed agreement.
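
Fail-closed behavior can be sketched as a wrapper around policy evaluation. This is an illustration under assumptions: evaluate_fail_closed, AUDIT_EVENTS, and the error_allow label for the fail-open case are hypothetical. Only failure_mode=closed and decision=error_deny come from the description above.

```python
AUDIT_EVENTS = []


def evaluate_fail_closed(policy, principal, chunk, failure_mode="closed"):
    """Run a policy callable; if it raises, deny unless fail-open is configured."""
    try:
        decision = "allow" if policy(principal, chunk) else "deny"
    except Exception as exc:
        # An evaluation error is never treated as an implicit allow by default.
        if failure_mode == "closed":
            decision = "error_deny"
        else:
            decision = "error_allow"  # hypothetical label for fail-open
        AUDIT_EVENTS.append({"decision": decision, "error": repr(exc)})
        return decision == "error_allow"
    AUDIT_EVENTS.append({"decision": decision})
    return decision == "allow"
```

The audit event is written on the error path as well, so a denial caused by a broken policy is distinguishable from an ordinary deny.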

Rate limiting on high-frequency paths

Per-org, per-minute caps are enforced on three paths: 60 retrievals, 20 answer synthesis calls, and 10 Access Simulator calls. Redis is optional for all three limits. Single-instance deployments fall back to in-memory counters and require no infrastructure changes.
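
A single-instance, in-memory limiter for these caps could be as small as the sketch below. It assumes a fixed one-minute window, which may differ from the algorithm Gateco actually uses. The class name, the path names, and the injectable clock are hypothetical.

```python
import time
from collections import defaultdict

# Per-org, per-minute caps from the text; the path names are hypothetical.
LIMITS = {"retrieval": 60, "answer_synthesis": 20, "access_simulator": 10}


class InMemoryRateLimiter:
    """Fixed-window counter per (org, path, minute). Single-process only."""

    def __init__(self, limits=LIMITS, clock=time.time):
        self._limits = limits
        self._clock = clock
        self._counts = defaultdict(int)

    def allow(self, org_id, path):
        """Return True and count the call if the org is under its cap."""
        window = int(self._clock() // 60)
        key = (org_id, path, window)
        if self._counts[key] >= self._limits[path]:
            return False
        self._counts[key] += 1
        return True
```

Counters here live in one process and old windows are never evicted, so a real deployment would add cleanup. Sharing counts across several instances is the case where a shared store such as Redis becomes necessary.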

Reference architectures

Practical patterns for putting permission-aware retrieval into a real RAG pipeline.

Discuss your deployment

Enterprise deployments (Private Data Plane, VPC, multi-region EU) are scoped individually. Talk to us about your requirements.
