Team+

Answers from allowed chunks only. Never from denied ones.

Grounded Answers retrieves policy-filtered chunks, passes them to an LLM, and returns a cited answer. Denied content never enters the model context. Every call ends in one of three outcome states: answered, no_access, or insufficient_context.

Requires Team plan or above. Also requires a configured LLM API key (Organization Settings → LLM API Key) or the 100-call fallback for paid tiers.

How it works

POST /api/answers/execute takes a question, a connector_id, a principal_id, and a search_mode (vector/keyword/hybrid). Gateco runs the retrieval, applies all active policies, and builds an LLM prompt from the allowed chunks only. The synthesized answer is returned with citations — each citation includes the source resource ID, a short content preview, and the relevance score.
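As a rough sketch of the request described above, the following builds the JSON body for POST /api/answers/execute using only the Python standard library. The four field names come from the description. The base URL, the bearer-token header, and the helper names build_answer_request and make_http_request are assumptions for illustration, not documented Gateco API.

```python
import json
import urllib.request

# Search modes named in the endpoint description.
ALLOWED_SEARCH_MODES = {"vector", "keyword", "hybrid"}


def build_answer_request(question, connector_id, principal_id, search_mode="hybrid"):
    """Build the JSON-serializable body for POST /api/answers/execute."""
    if search_mode not in ALLOWED_SEARCH_MODES:
        raise ValueError(f"unsupported search_mode: {search_mode!r}")
    return {
        "question": question,
        "connector_id": connector_id,
        "principal_id": principal_id,
        "search_mode": search_mode,
    }


def make_http_request(base_url, api_token, body):
    """Wrap the body in a urllib Request (base_url and auth header are assumed)."""
    return urllib.request.Request(
        url=f"{base_url}/api/answers/execute",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",  # assumed auth scheme
        },
        method="POST",
    )
```

The resulting Request object can be passed to urllib.request.urlopen. The sketch stops short of sending it because the real host and authentication scheme are not stated here.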

The endpoint returns one of three outcome states: answered (synthesis succeeded with at least one allowed chunk), no_access (all relevant chunks were denied by policy — the principal does not have access to this information), or insufficient_context (allowed chunks exist but the model abstained from answering because they did not contain enough information to form a confident response).
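A caller will usually branch on the outcome state. The sketch below shows one way to do that, assuming the response has been parsed into a dict with outcome and answer keys (the dict shape and the user-facing messages are illustrative, not part of the documented response).

```python
def describe_outcome(response):
    """Map a parsed Grounded Answers response to text suitable for an end user."""
    outcome = response.get("outcome")
    if outcome == "answered":
        # Synthesis succeeded with at least one allowed chunk.
        return response["answer"]
    if outcome == "no_access":
        # Every relevant chunk was denied by policy for this principal.
        return "You do not have access to information that answers this question."
    if outcome == "insufficient_context":
        # Allowed chunks exist, but the model abstained.
        return "The documents you can access do not contain enough information to answer."
    raise ValueError(f"unexpected outcome: {outcome!r}")
```

Treating no_access and insufficient_context as separate branches matters: the first is a permissions result, the second is a content gap, and they call for different follow-up actions.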

The LLM key model

Each organization can configure its own OpenAI API key in Organization Settings. When a key is configured, every Grounded Answers call uses that key. If no key is configured, paid-tier organizations (Team/Growth/Enterprise) have a 100-call lifetime fallback on Gateco's shared key. Free-tier organizations must configure their own key.
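The key-selection rules above can be read as a small decision function. This is a sketch of the described behavior only: the function name, the tier strings, and the return labels are hypothetical and do not reflect Gateco's internal implementation.

```python
PAID_TIERS = {"team", "growth", "enterprise"}
FALLBACK_CALL_LIMIT = 100  # lifetime fallback calls on the shared key


def resolve_llm_key(org_key_configured, tier, fallback_calls_used):
    """Return which key source a Grounded Answers call would use."""
    if org_key_configured:
        # A configured organization key always takes precedence.
        return "org_key"
    if tier in PAID_TIERS:
        if fallback_calls_used < FALLBACK_CALL_LIMIT:
            return "shared_fallback"
        return "fallback_exhausted"
    # Free tier has no fallback and must configure its own key.
    return "key_required"
```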

Keys are stored using AES-256-GCM envelope encryption with per-tenant KMS context binding — the same architecture as connector credentials. See the BYOK blog post for the full encryption model.

Citations and partial answers

Every answered response includes a citations array. Each citation carries the resource_id, a content_preview (the first 200 characters of the chunk), and the retrieval score. Partial answers are flagged when some relevant chunks were denied: the answer is based only on what the principal is allowed to see, and the response includes an is_partial flag and the count of denied chunks.
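To make the citation shape concrete, here is a minimal sketch that derives citations and the partial-answer flag from a list of retrieved chunks. The input chunk shape (resource_id, content, score, allowed) and the field name denied_count are assumptions. Only resource_id, content_preview, the score, and is_partial are named in the text.

```python
PREVIEW_LENGTH = 200  # content_preview is the first 200 characters of the chunk


def build_citation_summary(chunks):
    """Build citations from allowed chunks and flag partial answers."""
    allowed = [c for c in chunks if c["allowed"]]
    denied_count = len(chunks) - len(allowed)
    citations = [
        {
            "resource_id": c["resource_id"],
            "content_preview": c["content"][:PREVIEW_LENGTH],
            "score": c["score"],
        }
        for c in allowed
    ]
    return {
        "citations": citations,
        # Partial only when an answer exists and some chunks were withheld.
        "is_partial": bool(allowed) and denied_count > 0,
        "denied_count": denied_count,  # hypothetical field name
    }
```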

The audit trail records every Grounded Answers request as a retrieval event (for the policy-filtered retrieval leg) plus an answer event (for the synthesis step). Both events are linked by a shared request_id.
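Because the two audit events share a request_id, they can be joined when reviewing an export. The sketch below assumes each event is a dict with request_id and type keys, where type is either retrieval or answer. That event shape is hypothetical and may differ from the actual audit schema.

```python
from collections import defaultdict


def pair_audit_events(events):
    """Group audit events by request_id, keyed by event type."""
    by_request = defaultdict(dict)
    for event in events:
        by_request[event["request_id"]][event["type"]] = event
    return dict(by_request)


def requests_missing_answer(events):
    """Return request_ids that have a retrieval event but no answer event."""
    paired = pair_audit_events(events)
    return sorted(
        request_id
        for request_id, by_type in paired.items()
        if "retrieval" in by_type and "answer" not in by_type
    )
```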

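# Python SDK: Grounded Answers
# Assumes `client` is an already-initialized Gateco SDK client.
result = client.answers.execute(
    connector_id="conn_abc",
    principal_id="user_123",
    question="What is our data retention policy?",
    search_mode="hybrid",  # vector, keyword, or hybrid
    top_k=8,
)
print(result.outcome)     # "answered", "no_access", or "insufficient_context"
print(result.answer)      # synthesized text when the outcome is "answered"
# Each citation points back to a source chunk the principal is allowed to see.
for c in result.citations:
    print(c.resource_id, c.score)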

Frequently asked questions

What model does Gateco use for synthesis?

The default is gpt-4o-mini. Organizations on self-hosted deployments can override via the GATECO_ANSWER_MODEL environment variable. Additional LLM providers are on the roadmap.

Can I use Grounded Answers with grep mode?

No. Grep is excluded from answer synthesis. Exact-match results without semantic ranking are not useful LLM context. Use vector, keyword, or hybrid mode for Grounded Answers.

What happens when the 100-call fallback is exhausted?

The endpoint returns 422 with error code LLM_CREDIT_EXHAUSTED. The retrieval step still runs and returns policy-filtered chunks — only the synthesis step is blocked. Configure your own OpenAI key in Organization Settings to resume synthesis.
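One way a client might degrade gracefully when the fallback is exhausted is sketched below. The status code and the LLM_CREDIT_EXHAUSTED error code come from the answer above. The body keys error_code and chunks, and the function name, are assumptions about the response shape rather than documented fields.

```python
def handle_answer_response(status_code, body):
    """Classify a parsed response from the answers endpoint.

    Returns a (kind, payload) tuple so callers can fall back to raw chunks.
    """
    if status_code == 422 and body.get("error_code") == "LLM_CREDIT_EXHAUSTED":
        # Retrieval still ran, so policy-filtered chunks may be available.
        return ("synthesis_blocked", body.get("chunks", []))
    if status_code == 200:
        return ("ok", body)
    raise RuntimeError(f"unexpected response: HTTP {status_code}")
```

A caller that receives synthesis_blocked could display the returned chunks directly and prompt an administrator to configure an OpenAI key in Organization Settings.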