Guardrails (tenant)
Input and output safety filters applied to every assistant in the workspace.
Tenant guardrails are the input and output safety filters that run on every assistant in your workspace. They apply at the tenant level, so a change here takes effect across all assistants immediately — there is no opt-out for an individual assistant from this screen.
Permission required — and superadmin-managed — This screen is gated on setting:update, held by the Tenant Administrator role. The console renders the form to everyone, but saving without the grant returns a 403. More importantly, the tenant-wide guardrail config is editable by VeriCite superadmins only: even with setting:update, tenant admins see a read-only platform-managed notice in place of the editable controls. Treat this page as a window into the active policy, not a free editing surface.
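The two layers of gating described above can be sketched as a small decision function. This is an illustration only, not VeriCite's implementation: the `Caller` type, its field names, and the returned labels are assumptions. Only the `setting:update` permission, the 403 on save, and the read-only notice for tenant admins come from this page. The sketch also assumes a superadmin can edit regardless of other grants.

```python
from dataclasses import dataclass, field


@dataclass
class Caller:
    # Hypothetical caller model; field names are assumptions for illustration.
    permissions: set[str] = field(default_factory=set)
    is_superadmin: bool = False


def guardrails_access(caller: Caller) -> str:
    """Return how the tenant guardrails screen behaves for this caller."""
    if caller.is_superadmin:
        # Assumption: superadmins get the editable controls.
        return "editable"
    if "setting:update" in caller.permissions:
        # Tenant admins hold the grant but still see the platform-managed notice.
        return "read-only"
    # The form renders, but a save attempt is rejected with HTTP 403.
    return "forbidden-on-save"
```

For example, `guardrails_access(Caller(permissions={"setting:update"}))` yields `"read-only"`, which matches the notice a Tenant Administrator should expect to see.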
What the filters cover
Guardrails are organized into two stages, input and output. Input filters inspect the user's message before it reaches generation; output filters inspect the assistant's draft answer before it is returned.
- Input · pii (input filter): Detects personal data in the incoming message.
- Input · injection (input filter): Catches prompt-injection attempts aimed at overriding the assistant's instructions.
- Input · toxicity (input filter): Flags abusive or harmful user input.
- Output · hallucination (output filter): Checks the draft answer for claims unsupported by the cited sources.
- Output · pii (output filter): Detects personal data leaking into the generated answer.
Tenant level has fewer filters than per-assistant — The per-assistant Guardrails panel is a separate surface and adds an output tool_use filter on top of these. That filter does not exist at the tenant level — the tenant config carries only the five families above.
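To make the difference between the two surfaces concrete, here is a minimal sketch of a check that a config names only the five tenant-level filter families. The config shape (stage, then filter name, then sensitivity), the constant names, and the `validate_tenant_filters` helper are all hypothetical; only the filter names themselves come from this page.

```python
# Filter families available at the tenant level, per this page.
TENANT_FILTERS = {
    "input": {"pii", "injection", "toxicity"},
    "output": {"hallucination", "pii"},
}

# Filters that exist only on the per-assistant Guardrails panel.
PER_ASSISTANT_ONLY = {("output", "tool_use")}


def validate_tenant_filters(config: dict[str, dict[str, str]]) -> list[str]:
    """Return a list of problems found in a hypothetical tenant config."""
    errors = []
    for stage, filters in config.items():
        allowed = TENANT_FILTERS.get(stage)
        if allowed is None:
            errors.append(f"unknown stage: {stage}")
            continue
        for name in filters:
            if (stage, name) in PER_ASSISTANT_ONLY:
                errors.append(f"{stage}.{name} exists only on the per-assistant panel")
            elif name not in allowed:
                errors.append(f"unknown filter: {stage}.{name}")
    return errors
```

A config that tries to set `output.tool_use` at the tenant level would be reported rather than silently accepted, which is the behavior you would likely want from any tooling that reviews guardrail settings.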
Sensitivity, not raw thresholds
You do not tune confidence numbers or threshold sliders here. Each filter family is set by a single sensitivity level, and VeriCite maps that level to the underlying action and threshold for you.
- lenient (sensitivity): Lowest enforcement; only high-confidence matches act.
- moderate (sensitivity): Balanced default for most workspaces.
- strict (sensitivity): Highest enforcement; acts on lower-confidence matches.
The tenant enum differs from the per-assistant one — The tenant level uses lenient / moderate / strict. The per-assistant panel uses low / moderate / strict, where low quietly falls back to moderate. The two enums are not interchangeable — lenient is a real tenant-only level, and there is no low at this layer.
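The difference between the two enums can be modeled as two separate types, so a value from one layer cannot be passed to the other by accident. This is a sketch under assumptions: the class and function names are hypothetical, and only the level names and the `low` to `moderate` fallback come from this page.

```python
from enum import Enum


class TenantSensitivity(str, Enum):
    LENIENT = "lenient"
    MODERATE = "moderate"
    STRICT = "strict"


class AssistantSensitivity(str, Enum):
    LOW = "low"
    MODERATE = "moderate"
    STRICT = "strict"


def effective_assistant_level(level: AssistantSensitivity) -> AssistantSensitivity:
    """Per-assistant `low` quietly falls back to `moderate`."""
    if level is AssistantSensitivity.LOW:
        return AssistantSensitivity.MODERATE
    return level


def parse_tenant_level(raw: str) -> TenantSensitivity:
    """Reject values outside the tenant enum, including per-assistant `low`."""
    try:
        return TenantSensitivity(raw)
    except ValueError:
        raise ValueError(f"{raw!r} is not a tenant sensitivity level") from None
```

Keeping the enums distinct means `parse_tenant_level("low")` fails loudly, while `lenient` is accepted as a real tenant level rather than being coerced to something else.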
Common pitfalls
- minimum_confidence is a reserved no-op. The knob is wired into the schema but is not yet enforced by the classifier — setting it changes nothing about which inputs or outputs are blocked. Do not rely on it.
- There is no per-assistant override at this level. Loosening or tightening a family here moves it for every assistant at once. If you need assistant-specific behavior, that lives on the separate per-assistant Guardrails panel, not here.
- Tenant admins cannot edit this page even with setting:update. The config is superadmin-owned; the read-only platform-managed notice is expected, not a bug. Request a change through your VeriCite contact.
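If you keep guardrail settings in exported or reviewed config, a small check can surface any use of the reserved knob so nobody assumes it is doing work. The sketch below walks a nested dictionary and reports every path where `minimum_confidence` appears. The config layout and the `find_reserved_knobs` name are hypothetical; only the key name and its no-op status come from this page.

```python
def find_reserved_knobs(config: dict, path: str = "") -> list[str]:
    """Return dotted paths of every `minimum_confidence` key in a nested config."""
    hits = []
    for key, value in config.items():
        here = f"{path}.{key}" if path else key
        if key == "minimum_confidence":
            # Reserved no-op: present in the schema, not enforced by the classifier.
            hits.append(here)
        elif isinstance(value, dict):
            hits.extend(find_reserved_knobs(value, here))
    return hits
```

An empty result means the config does not depend on the knob; any reported path is a setting that currently has no effect on what is blocked.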
This is the one page published both ways — Guardrails is the only article in this set surfaced as both public documentation and in-console help, and its shipped module carries superadmin: true — a signal that the underlying policy is managed above the tenant.
Related controls
- Scope Gate (feature): 0.0–1.0; reserved knob that sets the classifier confidence floor.
- Fallback policy (feature): What the assistant does when no approved source supports an answer.
- Guardrails (feature): Per-assistant guardrails. Sensitivity is low/moderate/strict (low falls back to moderate); the output set adds tool_use.
- PII detection (feature): Detect and redact personal data in inputs and answers.
- Escalation by default (concept): When sources are silent, VeriCite escalates to a human instead of guessing.
- Answers are being blocked (troubleshooting): Diagnose why an assistant refuses or escalates more than expected.