
Audit Trails

PDaaS provides comprehensive audit logging: it automatically captures all API activity and also records security-sensitive operations and business logic events emitted by application code. Audit trails support compliance (GDPR, SOC2, HIPAA), security monitoring, and operational visibility.

Overview

The audit system captures:

  • HTTP Requests/Responses: All API calls with full context
  • Security Events: Authentication, authorization, policy changes
  • Business Logic: Custom events from application code
  • Operational Events: Background jobs, system operations

All audit events are:

  • Multi-tenant Isolated: Separated by organization and account
  • Immutable: Cannot be modified after creation
  • Searchable: Indexed in OpenSearch for powerful queries
  • Sanitized: Sensitive data automatically redacted

Automatic Request Auditing

How It Works

PDaaS automatically captures all HTTP requests and responses through the AuditMiddleware. This middleware:

  1. Captures request metadata (method, path, headers, body, IP)
  2. Measures request duration
  3. Captures response metadata (status, headers, body)
  4. Extracts tenant context (organization, account)
  5. Extracts actor information (user, service account)
  6. Emits audit event asynchronously (non-blocking)
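
The six steps above can be sketched as a plain ASGI middleware. This is a minimal illustration using only the standard library, not the actual AuditMiddleware: the class name AuditMiddlewareSketch, the x-organization-id header, and the reduced set of event fields are assumptions, and body capture and actor extraction are left out.

```python
import asyncio
import time
from datetime import datetime, timezone


class AuditMiddlewareSketch:
    """Hypothetical stand-in for AuditMiddleware (pure ASGI, stdlib only)."""

    def __init__(self, app, queue: asyncio.Queue):
        self.app = app
        self.queue = queue  # assumed to be drained by a separate writer task

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        started = time.perf_counter()
        captured = {"status": None}

        async def send_wrapper(message):
            # Step 3: record the status code when the response starts.
            if message["type"] == "http.response.start":
                captured["status"] = message["status"]
            await send(message)

        try:
            await self.app(scope, receive, send_wrapper)
        finally:
            headers = {
                k.decode("latin-1").lower(): v.decode("latin-1")
                for k, v in scope.get("headers", [])
            }
            event = {
                "occurred_at": datetime.now(timezone.utc).isoformat(),
                "action": f"http.{scope['method'].lower()}",  # step 1
                "target": scope["path"],
                "request_method": scope["method"],
                "request_path": scope["path"],
                "response_status_code": captured["status"],
                # Step 2: duration measured around the downstream app.
                "response_duration_ms": (time.perf_counter() - started) * 1000,
                # Step 4: assumed header; the real tenant source is not shown here.
                "organization_id": headers.get("x-organization-id", "unknown"),
            }
            try:
                # Step 6: enqueue without awaiting so the request is not blocked.
                self.queue.put_nowait(event)
            except asyncio.QueueFull:
                pass  # drop the event rather than slow down the request
```

Dropping events when the queue is full is one possible trade-off; a real implementation might instead count drops or apply back-pressure.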

What Gets Captured

Every HTTP request generates an audit event with:

Request Information:

  • HTTP method (GET, POST, PUT, DELETE, etc.)
  • Request path and query parameters
  • Request headers (sanitized)
  • Request body (sanitized, truncated if large)
  • Client IP address
  • User agent

Response Information:

  • HTTP status code
  • Response headers (sanitized)
  • Response body (sanitized, truncated if large)
  • Request duration (milliseconds)

Context Information:

  • Actor (who performed the action)
  • Organization and account (tenant context)
  • Trace ID (for distributed tracing)
  • Timestamp (ISO 8601 UTC)

Operation Result:

  • Success (2xx status codes)
  • Failure (4xx, 5xx status codes)
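
The mapping above can be expressed as a small function. This is a sketch, not the PDaaS implementation: the function name is hypothetical, and treating 1xx and 3xx responses as success is an assumption, since only 2xx, 4xx, and 5xx are listed.

```python
def operation_result(status_code: int) -> str:
    """Map an HTTP status code to the operation_result field."""
    if not 100 <= status_code <= 599:
        raise ValueError(f"not an HTTP status code: {status_code}")
    # 4xx and 5xx are failures; everything below 400 is assumed to be success.
    return "failure" if status_code >= 400 else "success"
```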

Path Exclusion

High-traffic, low-value endpoints are automatically excluded from auditing:

  • /health - Health check endpoint
  • /healthz - Kubernetes health check
  • /metrics - Prometheus metrics
  • /docs - API documentation
  • /redoc - ReDoc documentation
  • /openapi.json - OpenAPI schema

Additional paths can be excluded with the AUDIT_EXCLUDED_PATHS environment variable (see Configuration).

Manual Event Emission

For business logic events that don't map to HTTP requests, use the emit() function:

Basic Usage

from fastapi import Request
from backend.audit import emit

@app.post("/policies")
async def create_policy(policy: PolicyCreate, request: Request):
    # Create policy
    policy_id = await policy_service.create(policy)

    # Manually emit audit event
    await emit(
        action="policy.create",
        target=f"policy:{policy_id}",
        metadata={"policy_name": policy.name},
        request=request,  # Auto-extracts tenant, actor, trace_id
    )

    return {"id": policy_id}

Background Jobs

For events from background jobs or system operations:

from backend.audit import emit
from backend.utils.actor import ActorInfo

async def cleanup_expired_sessions():
    deleted_count = await session_service.cleanup_expired()

    await emit(
        action="session.cleanup",
        target="system",
        metadata={"deleted_count": deleted_count},
        actor=ActorInfo(actor_type="system", actor_id="system"),
        organization_id="system",
        account_id="system",
    )

Audit Event Model

Basic Audit Event

{
  "occurred_at": "2025-09-30T12:34:56.789Z",
  "actor_type": "user",
  "actor_id": "user_123",
  "organization_id": "org_abc",
  "account_id": "acc_xyz",
  "action": "user.login",
  "target": "user:123",
  "metadata": {"method": "password"},
  "trace_id": "trace_456"
}
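
In application code, the same shape could be modeled as a frozen dataclass. This is a sketch under assumptions: AuditEventSketch is a hypothetical name, and the real event class (if any) is not shown in this document. The field names mirror the JSON above.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def _utc_now_iso() -> str:
    # ISO 8601 UTC with millisecond precision and a trailing "Z".
    now = datetime.now(timezone.utc)
    return now.isoformat(timespec="milliseconds").replace("+00:00", "Z")


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned in-process
class AuditEventSketch:
    actor_type: str
    actor_id: str
    organization_id: str
    account_id: str
    action: str
    target: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    trace_id: Optional[str] = None
    occurred_at: str = field(default_factory=_utc_now_iso)


event = AuditEventSketch(
    actor_type="user",
    actor_id="user_123",
    organization_id="org_abc",
    account_id="acc_xyz",
    action="user.login",
    target="user:123",
    metadata={"method": "password"},
    trace_id="trace_456",
)
document = asdict(event)  # plain dict, ready for JSON serialization
```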

Enhanced HTTP Audit Event

{
  // Base fields
  "occurred_at": "2025-09-30T12:34:56.789Z",
  "actor_type": "user",
  "actor_id": "user_123",
  "organization_id": "org_abc",
  "account_id": "acc_xyz",
  "action": "http.post",
  "target": "/users",
  "trace_id": "trace_789",

  // Service context
  "service": "api",
  "service_version": "1.0.0",
  "environment": "production",

  // Request details
  "request_method": "POST",
  "request_path": "/users",
  "request_query_params": {"invite": "true"},
  "request_headers": {"content-type": "application/json"},
  "request_body": "{\"email\": \"[email protected]\"}",
  "request_client_ip": "192.168.1.1",
  "request_user_agent": "Mozilla/5.0",

  // Response details
  "response_status_code": 201,
  "response_headers": {"content-type": "application/json"},
  "response_body": "{\"id\": \"user_456\"}",
  "response_duration_ms": 45.6,

  // Resource information
  "resource_type": "user",
  "resource_id": "user_456",

  // Result
  "operation_result": "success"
}

Data Sanitization

PDaaS automatically sanitizes sensitive data before persisting audit events.

Header Sanitization

These headers are automatically removed from audit events:

  • Authorization
  • Cookie / Set-Cookie
  • X-API-Key
  • X-Auth-Token
  • Proxy-Authorization
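
Header removal of this kind can be sketched as a case-insensitive filter. The function name is hypothetical and this is only an illustration of the rule above, not the PDaaS sanitizer itself.

```python
from typing import Dict

# Lower-cased versions of the headers listed above.
SENSITIVE_HEADERS = frozenset({
    "authorization",
    "cookie",
    "set-cookie",
    "x-api-key",
    "x-auth-token",
    "proxy-authorization",
})


def sanitize_headers(headers: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of `headers` without the sensitive entries."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in SENSITIVE_HEADERS
    }
```

HTTP header names are case-insensitive, which is why the comparison is done on the lower-cased name.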

Body Sanitization

These fields are automatically redacted in request/response bodies:

  • password, passwd, pwd
  • token, access_token, refresh_token
  • secret, client_secret
  • api_key, apikey
  • credit_card, card_number, cvv
  • ssn, social_security
  • private_key

Example:

// Original request body
{
  "username": "alice",
  "password": "secret123",
  "email": "[email protected]"
}

// Sanitized in audit event
{
  "username": "alice",
  "password": "[REDACTED]",
  "email": "[email protected]"
}
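
A recursive redaction pass over a parsed JSON body could look like the sketch below. The function name is hypothetical, and matching keys exactly (case-insensitively) rather than by substring is an assumption; the document does not say which rule PDaaS applies.

```python
from typing import Any

# Field names from the list above.
SENSITIVE_FIELDS = frozenset({
    "password", "passwd", "pwd",
    "token", "access_token", "refresh_token",
    "secret", "client_secret",
    "api_key", "apikey",
    "credit_card", "card_number", "cvv",
    "ssn", "social_security",
    "private_key",
})
REDACTED = "[REDACTED]"


def sanitize_body(value: Any) -> Any:
    """Return a copy of `value` with sensitive fields replaced at any depth."""
    if isinstance(value, dict):
        return {
            key: REDACTED
            if isinstance(key, str) and key.lower() in SENSITIVE_FIELDS
            else sanitize_body(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [sanitize_body(item) for item in value]
    return value  # scalars are returned unchanged
```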

Body Truncation

Request and response bodies larger than 10 KB (10,240 bytes, the AUDIT_MAX_BODY_SIZE value shown under Configuration) are automatically truncated to limit storage usage. Truncated bodies include metadata indicating the original size.
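
One way to implement such a limit is sketched below. The function name and the shape of the returned metadata (truncated, original_size) are assumptions; the actual field names PDaaS uses are not given in this document.

```python
MAX_BODY_SIZE = 10240  # bytes, matching the AUDIT_MAX_BODY_SIZE value


def truncate_body(body: str, max_bytes: int = MAX_BODY_SIZE) -> dict:
    """Cap `body` at `max_bytes` of UTF-8 and report the original size."""
    raw = body.encode("utf-8")
    if len(raw) <= max_bytes:
        return {"body": body, "truncated": False, "original_size": len(raw)}
    # errors="ignore" drops a multi-byte character cut in half at the limit.
    kept = raw[:max_bytes].decode("utf-8", errors="ignore")
    return {"body": kept, "truncated": True, "original_size": len(raw)}
```

Measuring in bytes rather than characters keeps the stored size predictable for non-ASCII payloads.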

Multi-Tenant Isolation

Audit events are stored in tenant-specific OpenSearch indices to ensure complete data isolation between organizations.

Index Naming Pattern

audit-{organization_id}-{account_id}-{service}-{date}

Examples:

  • Daily rotation: audit-org123-acc456-api-2025-09-30
  • Weekly rotation: audit-org123-acc456-api-2025-W39
  • Monthly rotation: audit-org123-acc456-api-2025-09

This ensures:

  • Complete isolation between organizations
  • Efficient queries (organization-scoped indices)
  • Flexible retention policies (per-organization)
  • Index lifecycle management (automatic deletion)
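
The naming pattern can be sketched as a small function. The function name is hypothetical, and the week-numbering convention is not stated in this document, so ISO 8601 week numbers are assumed here.

```python
from datetime import date


def audit_index_name(
    organization_id: str,
    account_id: str,
    service: str,
    day: date,
    rotation: str = "daily",
    prefix: str = "audit",
) -> str:
    """Build an index name following audit-{org}-{account}-{service}-{date}."""
    if rotation == "daily":
        suffix = day.strftime("%Y-%m-%d")
    elif rotation == "weekly":
        # Assumption: ISO week numbering (weeks start on Monday).
        iso_year, iso_week, _ = day.isocalendar()
        suffix = f"{iso_year}-W{iso_week:02d}"
    elif rotation == "monthly":
        suffix = day.strftime("%Y-%m")
    else:
        raise ValueError(f"unknown rotation: {rotation!r}")
    return f"{prefix}-{organization_id}-{account_id}-{service}-{suffix}"
```

For example, audit_index_name("org123", "acc456", "api", date(2025, 9, 30)) returns "audit-org123-acc456-api-2025-09-30". OpenSearch only accepts lowercase index names, so identifiers and the weekly "W" may need lowercasing before the name is used; whether PDaaS does this is not stated here.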

Searching Audit Events

OpenSearch Dashboards

Access your organization's audit logs through OpenSearch Dashboards:

  1. Navigate to OpenSearch Dashboards
  2. Select your organization's indices: audit-{your_org_id}-*
  3. Use Discover to search and filter events
  4. Create visualizations and dashboards

Common Queries

Find all actions by a user:

{
  "query": {
    "term": {"actor_id": "user_123"}
  }
}

Find failed operations:

{
  "query": {
    "bool": {
      "must": [
        {"term": {"operation_result": "failure"}},
        {"range": {"response_status_code": {"gte": 400}}}
      ]
    }
  }
}

Find all policy changes in last 24 hours:

{
  "query": {
    "bool": {
      "must": [
        {"prefix": {"action": "policy."}},
        {"range": {"occurred_at": {"gte": "now-24h"}}}
      ]
    }
  }
}

Trace a specific request:

{
  "query": {
    "term": {"trace_id": "trace_abc123"}
  }
}
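
When queries like these are assembled programmatically, a small builder keeps them consistent. The sketch below only constructs the request body as a dict; the function name is hypothetical, and it uses the bool query's filter clause (which skips relevance scoring) instead of must, which is a design choice of this sketch rather than something the examples above require.

```python
import json
from typing import Optional


def build_audit_query(
    actor_id: Optional[str] = None,
    action_prefix: Optional[str] = None,
    result: Optional[str] = None,
    since: Optional[str] = None,
) -> dict:
    """Combine the common filters into one OpenSearch search body."""
    filters = []
    if actor_id:
        filters.append({"term": {"actor_id": actor_id}})
    if action_prefix:
        filters.append({"prefix": {"action": action_prefix}})
    if result:
        filters.append({"term": {"operation_result": result}})
    if since:
        filters.append({"range": {"occurred_at": {"gte": since}}})

    query = {"bool": {"filter": filters}} if filters else {"match_all": {}}
    # Newest events first.
    return {"query": query, "sort": [{"occurred_at": {"order": "desc"}}]}


# Failed operations by one user in the last 24 hours.
body = build_audit_query(actor_id="user_123", result="failure", since="now-24h")
print(json.dumps(body, indent=2))
```

The resulting body can be pasted into OpenSearch Dashboards Dev Tools or sent to the _search endpoint of the organization's audit indices.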

Configuration

Audit behavior can be configured via environment variables:

Enable/Disable Auditing

AUDIT_ENABLED=true  # Set to false to disable all auditing

Service Information

AUDIT_SERVICE_NAME=api
AUDIT_ENVIRONMENT=production
AUDIT_SERVICE_VERSION=1.0.0

OpenSearch Connection

AUDIT_OPENSEARCH_HOST=localhost
AUDIT_OPENSEARCH_PORT=9200
AUDIT_OPENSEARCH_USE_SSL=true
AUDIT_OPENSEARCH_USERNAME=admin
AUDIT_OPENSEARCH_PASSWORD=secret

Performance Tuning

AUDIT_BATCH_SIZE=100              # Events per batch write
AUDIT_FLUSH_INTERVAL_SECONDS=5.0  # Max time before flush
AUDIT_MAX_BODY_SIZE=10240         # 10KB max body size
AUDIT_MAX_QUEUE_SIZE=10000        # Max events in queue

Path Exclusion

# JSON array format
AUDIT_EXCLUDED_PATHS='["/health","/metrics","/docs","/internal/*"]'
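
Parsing this variable and matching request paths against it might look like the following sketch. Both function names are hypothetical, and two behaviors are assumptions: that the variable replaces the built-in list rather than extending it, and that "*" is a shell-style wildcard (matched here with fnmatchcase).

```python
import json
import os
from fnmatch import fnmatchcase
from typing import List, Mapping

DEFAULT_EXCLUDED = ["/health", "/healthz", "/metrics", "/docs", "/redoc", "/openapi.json"]


def load_excluded_paths(environ: Mapping[str, str] = os.environ) -> List[str]:
    """Read AUDIT_EXCLUDED_PATHS (a JSON array of strings) or use the defaults."""
    raw = environ.get("AUDIT_EXCLUDED_PATHS")
    if not raw:
        return list(DEFAULT_EXCLUDED)
    paths = json.loads(raw)
    if not isinstance(paths, list) or not all(isinstance(p, str) for p in paths):
        raise ValueError("AUDIT_EXCLUDED_PATHS must be a JSON array of strings")
    return paths


def is_path_excluded(path: str, patterns: List[str]) -> bool:
    """True if `path` matches any pattern; '*' matches any run of characters."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)
```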

Index Settings

AUDIT_INDEX_PREFIX=audit
AUDIT_INDEX_ROTATION=daily # daily, weekly, or monthly

Performance Impact

The audit middleware is designed for minimal performance impact:

Latency Overhead

  • Target: < 5ms added latency (p95)
  • Actual: < 2ms in production
  • Mechanism: Async event emission (fire-and-forget)

Throughput

  • Capacity: 10,000+ events/second per instance
  • Batching: Groups events for efficient bulk writes
  • Buffering: In-memory queue with configurable size
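
The batching and buffering described above can be sketched with an asyncio queue and a writer task that flushes when a batch is full or when the flush interval elapses. This is an illustration only: batch_writer and write_batch are hypothetical names, and the actual bulk write to OpenSearch is left to the caller-supplied coroutine.

```python
import asyncio
from typing import Awaitable, Callable, List


async def batch_writer(
    queue: asyncio.Queue,
    write_batch: Callable[[List[dict]], Awaitable[None]],
    batch_size: int = 100,
    flush_interval: float = 5.0,
) -> None:
    """Drain `queue` until cancelled, flushing on size or on timeout."""
    loop = asyncio.get_running_loop()
    while True:
        batch = [await queue.get()]  # block until there is at least one event
        deadline = loop.time() + flush_interval
        while len(batch) < batch_size:
            remaining = deadline - loop.time()
            if remaining <= 0:
                break
            try:
                batch.append(await asyncio.wait_for(queue.get(), timeout=remaining))
            except asyncio.TimeoutError:
                break  # interval elapsed: flush what we have
        await write_batch(batch)
```

Request handlers would call queue.put_nowait(event) on a queue created with maxsize equal to AUDIT_MAX_QUEUE_SIZE, so memory use stays bounded. A production version would also flush the in-flight batch on shutdown, which this sketch does not do.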

Resource Usage

  • Memory: < 100MB for audit buffers
  • CPU: < 5% overhead
  • Network: Minimal (batched writes to OpenSearch)

Compliance Features

GDPR

  • Right to Access: Search by user ID to retrieve all events
  • Right to Erasure: Anonymization support (replace user ID with hash)
  • Data Minimization: Configurable retention policies
  • Privacy by Design: Automatic PII redaction
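
Replacing a user ID with a hash, as mentioned under Right to Erasure, could be done with a keyed hash so that IDs cannot be recovered by hashing guesses without the key. The function names, the anon_ prefix, and the use of HMAC-SHA256 are assumptions of this sketch; the document does not describe the actual anonymization scheme.

```python
import hashlib
import hmac


def pseudonymize_actor_id(actor_id: str, secret_key: bytes) -> str:
    """Derive a stable, non-reversible replacement for an actor ID."""
    digest = hmac.new(secret_key, actor_id.encode("utf-8"), hashlib.sha256)
    return f"anon_{digest.hexdigest()[:32]}"


def anonymize_event(event: dict, secret_key: bytes) -> dict:
    """Return a copy of `event` with the actor ID replaced."""
    anonymized = dict(event)
    anonymized["actor_id"] = pseudonymize_actor_id(event["actor_id"], secret_key)
    return anonymized
```

Because the same input always maps to the same output, events from one erased user remain correlated with each other while no longer naming that user.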

SOC2

  • Complete Audit Trail: All data access and changes logged
  • Immutability: Events cannot be modified after creation
  • Access Control: OpenSearch role-based access
  • Retention: Configurable per-organization

HIPAA

  • PHI Access Logging: All protected health information access logged
  • Audit Reports: Generate compliance reports from OpenSearch
  • 6-Year Retention: Configurable retention policies
  • Encryption: At rest (OpenSearch) and in transit (TLS)

Retention and Cleanup

Default Retention

  • Default: 90 days
  • Configurable: Per organization
  • Automatic: Daily cleanup job

Manual Cleanup

# CLI tool for manual cleanup
python -m backend.audit.cleanup --organization-id org_123 --older-than 90
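
For daily-rotated indices, selecting what a cleanup job should delete comes down to parsing the date suffix of each index name. The sketch below is not the logic of backend.audit.cleanup, whose internals are not shown here: the function name is hypothetical, and treating "older than" as strictly before today minus the retention period is an assumption.

```python
from datetime import date, datetime, timedelta
from typing import Iterable, List


def expired_daily_indices(
    index_names: Iterable[str], today: date, retention_days: int = 90
) -> List[str]:
    """Return the daily indices whose date suffix is past the retention period."""
    cutoff = today - timedelta(days=retention_days)
    expired = []
    for name in index_names:
        try:
            # Daily indices end in YYYY-MM-DD, i.e. the last 10 characters.
            index_day = datetime.strptime(name[-10:], "%Y-%m-%d").date()
        except ValueError:
            continue  # not a daily index; leave it alone
        if index_day < cutoff:
            expired.append(name)
    return expired
```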

Lifecycle Management

OpenSearch Index State Management (ISM) policies, the OpenSearch counterpart of Elasticsearch ILM, move indices through these phases:

  1. Hot: Recent indices (0-30 days) - High performance
  2. Warm: Medium-aged indices (31-60 days) - Reduced replicas
  3. Cold: Old indices (61-90 days) - Compressed, snapshot
  4. Delete: Indices older than retention period

Best Practices

When to Use Manual Emit

Use manual emit() for:

  • Business logic events (policy changes, grants, etc.)
  • Background job results
  • System operations
  • Security-sensitive actions

Don't use manual emit for:

  • HTTP requests (automatic via middleware)
  • Health checks or metrics
  • Internal debugging (use logging instead)

Action Naming Conventions

Use dot-notation for hierarchical actions:

  • user.login - User authentication
  • user.create - User creation
  • policy.create - Policy creation
  • policy.update - Policy modification
  • grant.attach - Grant attachment
  • session.cleanup - Session cleanup

Target Format

Use resource type prefix:

  • user:{user_id} - User resources
  • policy:{policy_id} - Policy resources
  • grant:{grant_id} - Grant resources
  • organization:{org_id} - Organization resources
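
These two conventions are easy to enforce with small helpers. The sketch below is illustrative: the function names and the exact pattern (lowercase segments of letters, digits, and underscores, separated by dots) are assumptions, not rules stated by PDaaS.

```python
import re

# At least two lowercase segments separated by dots, e.g. "policy.create".
ACTION_PATTERN = re.compile(r"[a-z][a-z0-9_]*(\.[a-z][a-z0-9_]*)+")


def is_valid_action(action: str) -> bool:
    """Check that an action name follows the dot-notation convention."""
    return ACTION_PATTERN.fullmatch(action) is not None


def make_target(resource_type: str, resource_id: str) -> str:
    """Build a target string of the form {resource_type}:{resource_id}."""
    if not resource_type or ":" in resource_type:
        raise ValueError(f"invalid resource type: {resource_type!r}")
    return f"{resource_type}:{resource_id}"
```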

Metadata Guidelines

Include relevant context in metadata:

  • Keep metadata small (< 1KB)
  • Use structured data (JSON-serializable)
  • Avoid sensitive information
  • Include business-relevant details

Good metadata:

{
  "policy_name": "admin-access",
  "resource_count": 5,
  "changes": ["added_action", "modified_condition"]
}

Bad metadata:

{
  "password": "secret123",  // Sensitive data
  "huge_list": [...],       // Too large
  "debug_info": "..."       // Not business-relevant
}
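
The guidelines above can be checked before calling emit(). The helper below is a sketch with a hypothetical name; the short list of sensitive keys and the choice to inspect only top-level keys are assumptions made to keep the example small.

```python
import json
from typing import List

MAX_METADATA_BYTES = 1024  # "keep metadata small (< 1KB)"
SENSITIVE_KEYS = frozenset({"password", "token", "secret", "api_key", "private_key"})


def check_metadata(metadata: dict) -> List[str]:
    """Return a list of guideline violations; an empty list means OK."""
    try:
        encoded = json.dumps(metadata).encode("utf-8")
    except (TypeError, ValueError) as exc:
        return [f"not JSON-serializable: {exc}"]

    problems = []
    if len(encoded) >= MAX_METADATA_BYTES:
        problems.append(f"too large: {len(encoded)} bytes")
    for key in metadata:
        if str(key).lower() in SENSITIVE_KEYS:
            problems.append(f"sensitive key: {key}")
    return problems
```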

Troubleshooting

Audit Events Not Appearing

  1. Check if auditing is enabled:

    # Verify environment variable
    echo "$AUDIT_ENABLED"  # Should be "true"

  2. Check OpenSearch connectivity (run inside an async function or an asyncio REPL):

    from backend.audit.client import OpenSearchClientFactory
    from backend.audit.config import get_audit_config

    config = get_audit_config()
    client = await OpenSearchClientFactory.get_client(config)
    is_healthy = await client.health_check()

  3. Check path exclusion:

    from backend.audit.config import get_audit_config

    config = get_audit_config()
    is_excluded = config.is_path_excluded("/your/path")

High Memory Usage

Reduce batch size and queue size:

AUDIT_BATCH_SIZE=50        # Reduce from default 100
AUDIT_MAX_QUEUE_SIZE=5000  # Reduce from default 10000

Slow Performance

  1. Enable batching (should be default)
  2. Increase batch size for higher throughput
  3. Use appropriate index rotation (daily for high volume)
  4. Ensure OpenSearch cluster is properly sized

Missing Context (Unknown Organization)

Ensure the tenant middleware is registered before the audit middleware:

# Correct order
app.add_middleware(TenantMiddleware)  # First
app.add_middleware(AuditMiddleware)   # After tenant

Security Considerations

Audit Log Tampering

Audit logs are protected against tampering:

  • Write-Once: Indices configured for append-only
  • No Updates/Deletes: OpenSearch security policies prevent modifications
  • Backup: Daily snapshots to S3 for disaster recovery
  • Integrity: Checksums verify log integrity
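
The document does not describe how the integrity checksums are computed. One common approach, shown here purely as an assumed illustration, is a hash chain: each checksum covers the event's canonical JSON plus the previous checksum, so changing or removing any earlier event invalidates everything after it. The function names are hypothetical.

```python
import hashlib
import hmac
import json
from typing import List


def event_checksum(event: dict, previous_checksum: str = "") -> str:
    """SHA-256 over the previous checksum plus the event's canonical JSON."""
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    payload = (previous_checksum + canonical).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def verify_chain(events: List[dict], checksums: List[str]) -> bool:
    """Recompute the chain and compare it with the stored checksums."""
    if len(events) != len(checksums):
        return False
    previous = ""
    for event, expected in zip(events, checksums):
        actual = event_checksum(event, previous)
        if not hmac.compare_digest(actual, expected):
            return False
        previous = actual
    return True
```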

Access Control

Restrict access to audit logs:

  • Authentication: Require username/password or IAM
  • SSL/TLS: Encrypt connections to OpenSearch
  • RBAC: Role-based access control in OpenSearch
  • Isolation: Organization-specific indices prevent cross-tenant access

Sensitive Data Leakage

Multiple layers of protection:

  • Automatic Sanitization: Headers and body fields redacted
  • Size Limits: Large bodies truncated
  • Code Review: Security review of sanitization rules
  • Testing: Comprehensive tests verify sanitization

Support

For audit-related questions or issues:

  • Review the API Reference in the documentation
  • Check the Troubleshooting section above
  • Contact your PDaaS administrator