AI Agent Studio implements defense-in-depth security with multiple layers of protection for AI agent operations on Salesforce.
Platform-Native Security
User context execution, CRUD/FLS enforcement, sharing rules, field-level access control with type coercion.
Trust Layers
PII masking before LLM calls, prompt injection detection, tool dependency validation, declarative sequencing constraints.
Human-in-the-Loop
Configurable approval workflows with Confirmation, Approval, and hybrid modes. Atomic state tracking with PendingHITLAction__c.
Audit & Observability
Complete execution traces, tool rationale capture, decision step logging, token tracking, cost analytics.
Platform-Native Security

- No Privilege Escalation: Agents always run in the context of the user who initiated the execution (`OriginalUserId__c`).
- Sharing Mode: All classes use `with sharing` or `inherited sharing` to respect Salesforce sharing rules.
- Record-Level Access: Users can only interact with records they have access to through org-wide defaults, sharing rules, and manual shares.
- Automatic Enforcement: All SOQL queries use `WITH USER_MODE` to enforce object- and field-level security.
```apex
// Framework pattern - always enforces security
List<Account> accounts = [
    SELECT Id, Name, Industry
    FROM Account
    WHERE Id IN :accountIds
    WITH USER_MODE // Enforces CRUD + FLS
];
```

DML Security: All DML operations use `Security.stripInaccessible()` to remove inaccessible fields.
```apex
// Framework pattern for DML
SObjectAccessDecision decision = Security.stripInaccessible(
    AccessType.CREATABLE,
    recordsToInsert
);
insert decision.getRecords();
```

Type Coercion with FLS: `TypeCoercionService.coerceArgumentTypesForSObject()` validates field access when converting LLM-provided arguments to SObject field values.
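To illustrate what an FLS-aware coercion step involves, here is a minimal sketch built on standard Schema describe calls. The class and method names are hypothetical, not the framework's actual internals:

```apex
// Hypothetical sketch: coerce an LLM-provided string into a typed field value,
// refusing fields the running user cannot write to.
public class FlsCoercionSketch {
    public class CoercionException extends Exception {}

    public static void setFieldIfWritable(SObject record, String fieldName, String rawValue) {
        // fields.getMap() keys are lowercase field API names
        Schema.SObjectField fieldToken = record.getSObjectType()
            .getDescribe().fields.getMap().get(fieldName.toLowerCase());
        if (fieldToken == null) {
            throw new CoercionException('Unknown field: ' + fieldName);
        }
        Schema.DescribeFieldResult dfr = fieldToken.getDescribe();
        // FLS check: reject fields the user cannot create or update
        if (!dfr.isCreateable() && !dfr.isUpdateable()) {
            throw new CoercionException('No write access to field: ' + fieldName);
        }
        // Coerce based on the field's declared type before assignment
        Schema.DisplayType fieldType = dfr.getType();
        if (fieldType == Schema.DisplayType.INTEGER) {
            record.put(fieldToken, Integer.valueOf(rawValue));
        } else if (fieldType == Schema.DisplayType.BOOLEAN) {
            record.put(fieldToken, Boolean.valueOf(rawValue));
        } else if (fieldType == Schema.DisplayType.DATE) {
            record.put(fieldToken, Date.valueOf(rawValue));
        } else {
            record.put(fieldToken, rawValue);
        }
    }
}
```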
Object Permissions: `Utils.checkObjectPermission()` validates CRUD access before operations.
```apex
// Validate read access before querying
Utils.checkObjectPermission(
    Account.SObjectType,
    AccessType.READABLE
);
```

Field Accessibility: The framework respects field-level security when building SOQL queries and processing DML operations.
PII Masking

Prevents sensitive data from reaching LLM providers in raw form.

Architecture: `PIIMaskingService` orchestrates `SchemaBasedMasker` (Salesforce Data Classification) and `PIIPatternMatcher` (regex patterns).
How It Works:

1. Incoming text is scanned by both maskers; detected values are replaced with indexed placeholder tokens (e.g., [SSN:001]) before the message is sent to the LLM.
2. The token-to-value map is retained so that tool executions and user-facing responses receive the original, unmasked values.

Configuration:
Per-agent via `AIAgentDefinition__c`:

- `PIIMaskingMode__c`: Hybrid (both) / Schema-Only / Pattern-Only
- `SensitiveClassifications__c`: which Salesforce Data Classifications to mask (PII, Sensitive, Confidential, etc.)
- `PIIPatternCategories__c`: which regex pattern categories to enable

Org-level via `AIAgentFrameworkSettings__c.EnablePIIMasking__c`.
Pattern Coverage:

- SSNs in ###-##-#### format, with validation (other pattern categories are configurable via `PIIPatternCategories__c`)

Key Features:

- Hybrid schema-based and pattern-based detection
- Reversible, indexed placeholder tokens that are restored before tool execution and user display
- Per-agent modes plus an org-level switch
Example:
User: "Update case for customer SSN 123-45-6789"Masked: "Update case for customer SSN [SSN:001]"→ LLM processes masked version→ Tool execution receives unmasked value→ Response shown to user with original valuesProtects against prompt injection and instruction override attacks using three detection layers.
Prompt Injection Defense

Protects against prompt injection and instruction override attacks using three detection layers.

Architecture: `PromptSafetyService` orchestrates three analyzers:

- Pattern-Based Detection (`JailbreakPatternMatcher`): regex patterns from `JailbreakPattern__mdt` custom metadata detect known attack signatures (DAN, jailbreak keywords, ignore instructions); fast and deterministic.
- Heuristic Analysis (`PromptHeuristicAnalyzer`): detects instruction override ("ignore previous instructions"), role manipulation ("you are now in developer mode"), delimiter injection (attempts to close/open prompt delimiters), and conversation reset attempts.
- Structural Analysis (`PromptStructureAnalyzer`): encoding detection (base64, hex, unicode escapes), N-gram similarity to known jailbreak patterns, and suspicious structure patterns.
Threat Scoring: Each analyzer returns a score from 0.0 to 1.0, combined into an aggregate threat assessment:

- NONE (0.0-0.2): safe
- LOW (0.2-0.4): minimal concern
- MEDIUM (0.4-0.6): suspicious
- HIGH (0.6-0.8): likely attack
- CRITICAL (0.8-1.0): definite attack
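A minimal sketch of this step, assuming the aggregate is simply the worst analyzer score (the framework's actual combination logic may differ):

```apex
// Hypothetical sketch: map per-analyzer scores to a threat level.
public class ThreatScoringSketch {
    public enum ThreatLevel { NONE, LOW, MEDIUM, HIGH, CRITICAL }

    public static ThreatLevel assess(List<Decimal> analyzerScores) {
        // Assumption: aggregate = maximum (worst) analyzer score
        Decimal aggregate = 0.0;
        for (Decimal score : analyzerScores) {
            aggregate = Math.max(aggregate, score);
        }
        if (aggregate < 0.2) { return ThreatLevel.NONE; }
        if (aggregate < 0.4) { return ThreatLevel.LOW; }
        if (aggregate < 0.6) { return ThreatLevel.MEDIUM; }
        if (aggregate < 0.8) { return ThreatLevel.HIGH; }
        return ThreatLevel.CRITICAL;
    }
}
```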
Response Modes (per-agent configurable via `PromptSafetyMode__c`):

- Block: rejects the request entirely with a safe error message; the user sees a generic denial and execution stops.
- Sanitize: removes detected threats and continues with the cleaned input; replacements are marked as [REMOVED:<category>] and sanitized spans are recorded.
- Flag: marks the request for review in audit logs and continues execution, creating an AgentDecisionStep__c with threat details.
- Log Only: records the threat assessment and takes no action (for monitoring in non-production).
Configuration: Per-agent via `AIAgentDefinition__c` fields:

- `PromptSafetyMode__c`: Block / Sanitize / Flag / LogOnly
- `SafetyThreshold__c`: threat score threshold (0.0-1.0) that triggers the response
- `SafetyPatternCategories__c`: which jailbreak categories to enable

Org-level via `AIAgentFrameworkSettings__c.EnablePromptSafety__c`.
Optimizations: message-level caching (the same message is not re-analyzed within an execution), early exit on high-severity pattern matches, and evaluation caps to prevent CPU time spikes.
Detection Categories:

- Role manipulation ("you are now", "pretend to be")
- Instruction override ("ignore previous", "forget your instructions")
- Delimiter injection (closing the system prompt, opening a new context)
- Encoding attacks (base64, hex, unicode obfuscation)
- Prompt leaking ("repeat your instructions")
- Context manipulation ("this is a simulation", "hypothetically")
Tool Dependency Validation

Prevents workflow hallucinations where the LLM calls tools in an illogical order.
Problem: Without constraints, the LLM might call `send_email` before `create_record`, or `update_record` before `get_record_details`.
Solution: Shadow Graph Pattern - the LLM generates a dependency graph, an admin approves it, and the system enforces it at runtime.
How It Works:
1. `ToolDependencyGraphService` uses the LLM to analyze agent capabilities and suggest a dependency graph
2. An admin reviews and approves the graph in the `ToolDependencyGraphEditorController` UI
3. The approved graph is stored on `AIAgentDefinition__c.ToolDependencyGraph__c` as JSON
4. `ToolDependencyValidator` checks dependencies before tool execution

Dependency Logic:
{ "version": "1.0", "dependencies": { "update_record": { "allOf": ["get_record_details"] }, "send_email": { "allOf": ["update_record"], "anyOf": ["get_email_address", "get_contact_info"] } }}allOf: ALL tools must be executed first (AND logic)anyOf: AT LEAST ONE tool must be executed first (OR logic)send_email requires update_record AND (at least one of get_email_address OR get_contact_info)Two-Phase Validation:
- Pre-Flight Validation (before executing any tools in a batch): the requested tool calls are checked as a set, so a batch whose dependencies cannot be satisfied is rejected before anything runs.
- Runtime Validation (during the execution loop): each tool call is checked against the set of tools already executed this turn; a sketch of this check follows below.
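A minimal sketch of the runtime check under the allOf/anyOf rules above; the class and method names are illustrative, not ToolDependencyValidator's real API:

```apex
// Hypothetical sketch: given the approved graph and the set of tools already
// executed this turn, decide whether a tool may run. Returns null if allowed,
// or a guidance message if blocked.
public class DependencyCheckSketch {
    public static String validate(String toolName, Map<String, Object> dependencies, Set<String> executedTools) {
        Map<String, Object> rules = (Map<String, Object>) dependencies.get(toolName);
        if (rules == null) { return null; } // no constraints on this tool

        // allOf: every listed tool must already have run (AND logic)
        List<Object> allOf = (List<Object>) rules.get('allOf');
        if (allOf != null) {
            for (Object dep : allOf) {
                if (!executedTools.contains((String) dep)) {
                    return 'Blocked: run ' + (String) dep + ' before ' + toolName;
                }
            }
        }

        // anyOf: at least one listed tool must already have run (OR logic)
        List<Object> anyOf = (List<Object>) rules.get('anyOf');
        if (anyOf != null) {
            Boolean satisfied = false;
            for (Object dep : anyOf) {
                if (executedTools.contains((String) dep)) {
                    satisfied = true;
                    break;
                }
            }
            if (!satisfied) {
                return 'Blocked: run one of ' + String.join(anyOf, ', ') + ' before ' + toolName;
            }
        }
        return null;
    }
}
```

In this framing, the returned message is the kind of structured guidance fed back to the LLM, and each non-null result would increment the circuit-breaker counter described below.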
Circuit Breaker: `ToolCallResponseHandler` tracks total dependency violations across the execution. If the threshold is exceeded (default 10, configurable via `AIAgentFrameworkSettings__c.MaxDependencyViolations__c`), it fails the execution immediately to prevent infinite loops.
LLM Guidance on Violation: When a tool is blocked, the system provides a structured error message explaining the required dependencies and the next action.
Configuration: Enable via `AIAgentDefinition__c.EnableDependencyValidation__c`.
Limitations: Dependencies are enforced only for synchronous tools in the same batch; async tools (separate jobs) cannot have dependencies enforced. Scope is turn-scoped for Conversational/Email agents (reset each turn) and execution-scoped for Function/Workflow agents.
Human-in-the-Loop (HITL)

Configurable approval requirements for sensitive actions via `AgentCapability__c.HITLMode__c`.
Modes:

- Disabled: no HITL; the action executes immediately.
- Confirmation: the LLM asks the user for confirmation in chat before executing.
- Approval: formal approval process via PendingHITLAction__c, with notification.
- ConfirmationThenApproval: requires both confirmation AND formal approval.
Notification Preferences (`HITLNotificationPreference__c`):

- Always Notify: sends notifications for approvals, rejections, and errors (default).
- Notify on Rejection Only: sends notifications only when actions are rejected.
Object: `PendingHITLAction__c` tracks approval state with atomic locking.
Lifecycle:

1. Action requires approval
2. Create PendingHITLAction__c record
3. Set ExecutionStatus__c to 'Awaiting Action'
4. Notify approver (if configured)
5. Approver reviews and approves or rejects
6. On approval: execute the action and update the execution. On rejection: log the rejection and mark the execution failed/cancelled.
Security: Approvers must have access to both the source record and the capability in order to approve.
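A hedged sketch of the atomic transition: it assumes only the ExecutionStatus__c field and 'Awaiting Action' value documented here (the 'Approved' value is illustrative), and uses a FOR UPDATE row lock so two approvers cannot both claim the same action:

```apex
// Hypothetical sketch: claim a pending action atomically.
public class HitlApprovalSketch {
    public class AlreadyProcessedException extends Exception {}

    public static void approve(Id pendingActionId) {
        // Row lock: concurrent transactions block until this one commits
        PendingHITLAction__c action = [
            SELECT Id, ExecutionStatus__c
            FROM PendingHITLAction__c
            WHERE Id = :pendingActionId
            FOR UPDATE
        ];
        if (action.ExecutionStatus__c != 'Awaiting Action') {
            throw new AlreadyProcessedException('Action was already approved or rejected.');
        }
        action.ExecutionStatus__c = 'Approved'; // illustrative status value
        update action;
        // ...then execute the approved tool call and update the execution
    }
}
```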
Audit & Observability

ExecutionStep__c: detailed execution log capturing tool calls, arguments, results, token usage, and tool rationale (when EnableToolReasoning__c is enabled).

AgentDecisionStep__c: user-friendly decision timeline for the storyboard UI.
When `AIAgentDefinition__c.EnableToolReasoning__c` is enabled:
- A `_rationale` parameter is added to all tools in the LLM schema
- `LLMFormattingService.extractAndStripRationale()` extracts the rationale and removes it from the arguments (sketched below)
- The rationale is stored in `ExecutionStep__c.ToolRationale__c` and `AgentDecisionStep__c.ToolRationale__c`
- The rationale is surfaced in the `agentStoryboardStep` component for user visibility

Benefits: every tool call carries an auditable explanation of why the agent chose it.
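A minimal sketch of the extract-and-strip step (the real extractAndStripRationale() signature may differ):

```apex
// Hypothetical sketch: pull _rationale out of the raw tool-call arguments
// so the tool only receives its real parameters.
public class RationaleSketch {
    public class Result {
        public String rationale;
        public Map<String, Object> cleanedArguments;
    }

    public static Result extractAndStrip(String rawArgumentsJson) {
        Map<String, Object> args = (Map<String, Object>) JSON.deserializeUntyped(rawArgumentsJson);
        Result r = new Result();
        r.rationale = (String) args.remove('_rationale'); // remove() returns the value
        r.cleanedArguments = args;
        return r;
    }
}
```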
Per-Step Tracking: `ExecutionStep__c` captures:
- PromptTokens__c: input tokens consumed
- CompletionTokens__c: output tokens generated
- TotalTokens__c: sum of prompt + completion
- EstimatedCostUSD__c: calculated cost based on model pricing

Aggregation: build dashboards to track token consumption and cost per agent, per user, and over time; a sketch follows below.
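As an illustration of what these fields enable, here is a sketch with placeholder pricing constants and a roll-up over the documented fields (actual model pricing lives in framework configuration):

```apex
// Hypothetical sketch: per-step cost estimation plus a per-user roll-up.
public class CostTrackingSketch {
    // Placeholder prices in USD per 1,000 tokens - not real model pricing
    private static final Decimal PROMPT_PRICE_PER_1K = 0.0025;
    private static final Decimal COMPLETION_PRICE_PER_1K = 0.0100;

    public static Decimal estimateCostUsd(Integer promptTokens, Integer completionTokens) {
        return (promptTokens * PROMPT_PRICE_PER_1K / 1000)
             + (completionTokens * COMPLETION_PRICE_PER_1K / 1000);
    }

    // Aggregate documented fields by user for a cost dashboard
    public static List<AggregateResult> costByUser() {
        return [
            SELECT CreatedById,
                   SUM(EstimatedCostUSD__c) totalCost,
                   SUM(TotalTokens__c) totalTokens
            FROM ExecutionStep__c
            GROUP BY CreatedById
        ];
    }
}
```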
Start in Sandbox
Deploy to sandbox first with representative data. Test with various user profiles to validate CRUD/FLS enforcement.
Principle of Least Privilege
Create dedicated integration users with minimal permissions needed. Don’t grant system admin to agent service users.
Enable Trust Layers Incrementally
Start with LogOnly mode for prompt safety and PII masking. Monitor detection rates, tune thresholds, then enable Block/Sanitize modes.
Route Sensitive Actions Through Approvals
Use HITL Approval mode for data deletion, external integrations, financial transactions, and high-impact operations.
Monitor Execution Anomalies
Build dashboards on ExecutionStep__c and AgentDecisionStep__c. Alert on:

- Spikes in prompt-safety detections
- Repeated tool dependency violations
- Unusual HITL rejection rates
- Sudden token consumption or cost increases

A sample query follows below.
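For example, a minimal aggregate query over the documented token fields can feed a daily spend alert (thresholding and notification are left to your monitoring setup):

```apex
// Hypothetical sketch: daily token totals as a basis for spike alerts.
List<AggregateResult> dailyTokens = [
    SELECT DAY_ONLY(CreatedDate) d, SUM(TotalTokens__c) total
    FROM ExecutionStep__c
    GROUP BY DAY_ONLY(CreatedDate)
    ORDER BY DAY_ONLY(CreatedDate) DESC
    LIMIT 30
];
for (AggregateResult row : dailyTokens) {
    System.debug(String.valueOf(row.get('d')) + ' => ' + String.valueOf(row.get('total')) + ' tokens');
}
```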
Review Tool Dependencies
Use ToolDependencyGraphService to generate initial graph, but have domain experts review and refine before production.
Regular Audit Reviews
Schedule periodic reviews of:

- Flagged prompt-safety events and their threat details
- HITL approval and rejection history
- Tool dependency graphs against current agent capabilities
- Agent user permissions and sharing settings
Before deploying agents to production:

- Test in a sandbox with representative data and multiple user profiles
- Grant agent users least-privilege permissions
- Tune trust layers past LogOnly mode and verify detection rates
- Route sensitive actions through HITL approvals
- Have domain experts review the tool dependency graph
- Stand up dashboards and alerts on ExecutionStep__c and AgentDecisionStep__c