Audit Configuration Guide

This guide covers configuration and setup of the PDaaS audit system for developers and system administrators.

Quick Start

Enable Auditing in Development

# .env file
AUDIT_ENABLED=true
AUDIT_SERVICE_NAME=api
AUDIT_ENVIRONMENT=development

# For local development, use LoggerAuditSink (no OpenSearch required)
# Events will be written to application logs

Enable Auditing in Production

# .env file
AUDIT_ENABLED=true
AUDIT_SERVICE_NAME=api
AUDIT_ENVIRONMENT=production

# OpenSearch connection
AUDIT_OPENSEARCH_HOST=opensearch.internal
AUDIT_OPENSEARCH_PORT=9200
AUDIT_OPENSEARCH_USE_SSL=true
AUDIT_OPENSEARCH_USERNAME=audit_writer
AUDIT_OPENSEARCH_PASSWORD=secure_password

# Performance tuning
AUDIT_BATCH_SIZE=100
AUDIT_FLUSH_INTERVAL_SECONDS=5.0

Configuration Reference

Core Settings

| Variable | Type | Default | Description |
|---|---|---|---|
| AUDIT_ENABLED | bool | true | Enable/disable all auditing |
| AUDIT_SERVICE_NAME | str | "api" | Service identifier in audit events |
| AUDIT_ENVIRONMENT | str | "development" | Environment name (dev/staging/prod) |
| AUDIT_SERVICE_VERSION | str | "1.0.0" | Service version in audit events |

OpenSearch Connection

| Variable | Type | Default | Description |
|---|---|---|---|
| AUDIT_OPENSEARCH_HOST | str | "localhost" | OpenSearch hostname |
| AUDIT_OPENSEARCH_PORT | int | 9200 | OpenSearch port |
| AUDIT_OPENSEARCH_USE_SSL | bool | false | Use SSL/TLS for connection |
| AUDIT_OPENSEARCH_USERNAME | str | "admin" | Authentication username |
| AUDIT_OPENSEARCH_PASSWORD | str | "admin" | Authentication password |
| AUDIT_OPENSEARCH_VERIFY_CERTS | bool | true | Verify SSL certificates |

Performance Tuning

| Variable | Type | Default | Range | Description |
|---|---|---|---|---|
| AUDIT_BATCH_SIZE | int | 100 | 1-1000 | Events per batch write |
| AUDIT_FLUSH_INTERVAL_SECONDS | float | 5.0 | 0.1-60.0 | Max time before flush |
| AUDIT_MAX_BODY_SIZE | int | 10240 | 1024-1048576 | Max body size (bytes) |
| AUDIT_MAX_QUEUE_SIZE | int | 10000 | 100-100000 | Max events in queue |

Path Exclusion

| Variable | Type | Default | Description |
|---|---|---|---|
| AUDIT_EXCLUDED_PATHS | JSON array | ["/health", "/metrics"] | Paths to exclude from auditing |

Format:

AUDIT_EXCLUDED_PATHS='["/health","/metrics","/docs","/redoc","/openapi.json"]'

Wildcard Support:

AUDIT_EXCLUDED_PATHS='["/internal/*","/debug/*"]'

Index Settings

| Variable | Type | Default | Options | Description |
|---|---|---|---|---|
| AUDIT_INDEX_PREFIX | str | "audit" | - | Index name prefix |
| AUDIT_INDEX_ROTATION | str | "daily" | daily, weekly, monthly | Index rotation frequency |

Retry Settings

| Variable | Type | Default | Description |
|---|---|---|---|
| AUDIT_RETRY_ATTEMPTS | int | 3 | Number of retry attempts |
| AUDIT_RETRY_BACKOFF_SECONDS | float | 1.0 | Initial backoff delay |
| AUDIT_RETRY_BACKOFF_MULTIPLIER | float | 2.0 | Backoff multiplier |
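
With the defaults above, retry delays grow geometrically: 1.0 s, then 2.0 s, then 4.0 s. A minimal sketch of that schedule (the backoff_delays helper is illustrative, not part of the audit API):

def backoff_delays(attempts: int = 3, initial: float = 1.0, multiplier: float = 2.0):
    """Yield the delay before each retry attempt: initial * multiplier**n."""
    for n in range(attempts):
        yield initial * multiplier ** n

print(list(backoff_delays()))  # [1.0, 2.0, 4.0]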

Application Integration

FastAPI Middleware Setup

Add audit middleware to your FastAPI application:

from fastapi import FastAPI
from backend.audit.middleware import AuditMiddleware
from backend.audit.config import get_audit_config

app = FastAPI()

# Add audit middleware (must run after tenant middleware;
# see the ordering note below)
app.add_middleware(
    AuditMiddleware,
    config=get_audit_config()
)

Middleware Order (Important):

# Correct registration order. Starlette applies middleware in reverse
# registration order (the middleware added last sees the request first),
# so register AuditMiddleware first to make it run last:
app.add_middleware(AuditMiddleware)        # 3. Audit requests
app.add_middleware(TenantMiddleware)       # 2. Extract tenant context
app.add_middleware(CorrelationMiddleware)  # 1. Add trace_id (runs first)

Custom Configuration

Override global configuration for specific use cases:

from backend.audit.config import AuditConfig
from backend.audit.middleware import AuditMiddleware

# Create custom config
custom_config = AuditConfig(
    service_name="worker",
    environment="staging",
    batch_size=200,
    excluded_paths=["/health", "/metrics", "/internal/*"]
)

# Use custom config
app.add_middleware(AuditMiddleware, config=custom_config)

Programmatic Configuration

Configure audit settings programmatically:

from backend.audit.config import AuditConfig, set_audit_config

# Create configuration
config = AuditConfig(
    enabled=True,
    service_name="api",
    environment="production",
    opensearch_host="opensearch.internal",
    opensearch_port=9200,
    opensearch_use_ssl=True,
    batch_size=100,
    flush_interval_seconds=5.0
)

# Set as global configuration
set_audit_config(config)

Sink Configuration

Development: LoggerAuditSink

For local development without OpenSearch:

from backend.audit.sinks import LoggerAuditSink
from backend.audit.config import get_audit_config

config = get_audit_config()
sink = LoggerAuditSink(config)

# Events written to application logs
await sink.write_async(event)

Characteristics:

  • No external dependencies
  • Events written to structured logs
  • Always available (health_check returns True)
  • Automatic sanitization applied

Production: OpenSearchAuditSink

For production with OpenSearch:

from backend.audit.sinks import OpenSearchAuditSink
from backend.audit.config import AuditConfig

config = AuditConfig(
    opensearch_host="opensearch.internal",
    opensearch_port=9200,
    opensearch_use_ssl=True,
    batch_size=100,
    flush_interval_seconds=5.0
)

sink = OpenSearchAuditSink(config)

# Events batched and written to OpenSearch
await sink.write_async(event)

# Force flush on shutdown
await sink.flush_async()
await sink.close_async()

Characteristics:

  • High performance batching
  • Automatic index creation
  • Retry logic with exponential backoff
  • Graceful degradation on failures
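
Because events are batched in memory, anything still queued at shutdown is lost unless the sink is flushed. A sketch of wiring the flush/close calls above into a FastAPI lifespan handler (assumes the sink variable from the previous snippet):

from contextlib import asynccontextmanager
from fastapi import FastAPI

@asynccontextmanager
async def lifespan(app: FastAPI):
    yield  # application handles traffic
    # On shutdown, drain any events still sitting in the batch queue
    await sink.flush_async()
    await sink.close_async()

app = FastAPI(lifespan=lifespan)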

Hybrid: MultiAuditSink

For redundancy or migration:

from backend.audit.sinks import (
    LoggerAuditSink,
    OpenSearchAuditSink,
    MultiAuditSink,
)

# Create individual sinks
logger_sink = LoggerAuditSink(config)
opensearch_sink = OpenSearchAuditSink(config)

# Combine with MultiAuditSink
multi_sink = MultiAuditSink([logger_sink, opensearch_sink])

# Writes to both sinks in parallel
await multi_sink.write_async(event)

Use Cases:

  • Migration period (logs + OpenSearch)
  • Redundancy (multiple OpenSearch clusters)
  • Development (logs for debugging + OpenSearch for testing)

Manual Event Emission

Basic Usage

from backend.audit import emit

@app.post("/policies")
async def create_policy(policy: PolicyCreate, request: Request):
    policy_id = await policy_service.create(policy)

    # Emit audit event
    await emit(
        action="policy.create",
        target=f"policy:{policy_id}",
        metadata={"policy_name": policy.name},
        request=request
    )

    return {"id": policy_id}

Without Request Context

For background jobs or system operations:

from backend.audit import emit
from backend.utils.actor import ActorInfo

async def background_task():
    result = await perform_operation()

    await emit(
        action="system.operation",
        target="system",
        metadata={"result": result},
        actor=ActorInfo(actor_type="system", actor_id="system"),
        organization_id="system",
        account_id="system"
    )

Custom Actor

Override actor information:

from backend.utils.actor import ActorInfo

await emit(
    action="external.sync",
    target="external:api",
    metadata={"records_synced": 100},
    actor=ActorInfo(
        actor_type="service_account",
        actor_id="sa_integration"
    ),
    request=request
)

Performance Optimization

Batching Strategy

Configure batching based on your traffic pattern:

Low Traffic (< 100 req/s):

AUDIT_BATCH_SIZE=50
AUDIT_FLUSH_INTERVAL_SECONDS=10.0

Medium Traffic (100-1000 req/s):

AUDIT_BATCH_SIZE=100
AUDIT_FLUSH_INTERVAL_SECONDS=5.0

High Traffic (> 1000 req/s):

AUDIT_BATCH_SIZE=500
AUDIT_FLUSH_INTERVAL_SECONDS=2.0
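
As a rule of thumb, a batch fills roughly every AUDIT_BATCH_SIZE divided by events-per-second seconds, and AUDIT_FLUSH_INTERVAL_SECONDS is the fallback that caps latency during quiet periods. At 1000 req/s with AUDIT_BATCH_SIZE=500, a batch fills about every 0.5 s, so the 2.0 s interval rarely fires; at 5 req/s, a 50-event batch takes 10 s to fill, so the 10.0 s interval is what bounds how long an event waits before being written.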

Memory Management

Control memory usage with queue limits:

# Conservative (low memory)
AUDIT_MAX_QUEUE_SIZE=5000

# Standard (balanced)
AUDIT_MAX_QUEUE_SIZE=10000

# Aggressive (high throughput)
AUDIT_MAX_QUEUE_SIZE=50000

Body Size Limits

Reduce storage and bandwidth:

# Minimal (1KB)
AUDIT_MAX_BODY_SIZE=1024

# Standard (10KB)
AUDIT_MAX_BODY_SIZE=10240

# Large (100KB)
AUDIT_MAX_BODY_SIZE=102400

Path Exclusion

Exclude high-traffic, low-value endpoints:

AUDIT_EXCLUDED_PATHS='[
  "/health",
  "/healthz",
  "/metrics",
  "/docs",
  "/redoc",
  "/openapi.json",
  "/internal/*",
  "/debug/*"
]'

Index Management

Index Rotation

Choose rotation based on volume:

Daily Rotation (High Volume):

AUDIT_INDEX_ROTATION=daily
# Index: audit-org123-acc456-api-2025-09-30

Weekly Rotation (Medium Volume):

AUDIT_INDEX_ROTATION=weekly
# Index: audit-org123-acc456-api-2025-W39

Monthly Rotation (Low Volume):

AUDIT_INDEX_ROTATION=monthly
# Index: audit-org123-acc456-api-2025-09
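
A sketch of how the rotation suffix in the index names above could be derived (the index_name helper is illustrative; the strftime codes are standard Python, with %G-W%V producing the ISO year-week form shown in the weekly example):

from datetime import datetime, timezone

def index_name(prefix: str, org: str, acct: str, service: str, rotation: str) -> str:
    now = datetime.now(timezone.utc)
    suffix = {
        "daily": now.strftime("%Y-%m-%d"),   # e.g. 2025-09-30
        "weekly": now.strftime("%G-W%V"),    # e.g. 2025-W39 (ISO week)
        "monthly": now.strftime("%Y-%m"),    # e.g. 2025-09
    }[rotation]
    return f"{prefix}-{org}-{acct}-{service}-{suffix}"

print(index_name("audit", "org123", "acc456", "api", "weekly"))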

Index Lifecycle Management

Configure an Index State Management (ISM) policy in OpenSearch (the OpenSearch counterpart to Elasticsearch ILM):

{
  "policy": {
    "description": "Audit log lifecycle policy",
    "default_state": "hot",
    "states": [
      {
        "name": "hot",
        "actions": [],
        "transitions": [{
          "state_name": "warm",
          "conditions": {"min_index_age": "30d"}
        }]
      },
      {
        "name": "warm",
        "actions": [
          {"replica_count": {"number_of_replicas": 1}}
        ],
        "transitions": [{
          "state_name": "cold",
          "conditions": {"min_index_age": "60d"}
        }]
      },
      {
        "name": "cold",
        "actions": [
          {"replica_count": {"number_of_replicas": 0}},
          {"read_only": {}}
        ],
        "transitions": [{
          "state_name": "delete",
          "conditions": {"min_index_age": "90d"}
        }]
      },
      {
        "name": "delete",
        "actions": [{"delete": {}}],
        "transitions": []
      }
    ]
  }
}
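
One way to install the policy, assuming a raw opensearchpy.OpenSearch client and OpenSearch's standard ISM endpoint _plugins/_ism/policies/<policy_id> (the policy ID audit-lifecycle and the file name are illustrative):

import json

# Load the policy document shown above (file name is illustrative)
with open("audit-lifecycle-policy.json") as f:
    policy = json.load(f)

# `client` is assumed to be an opensearchpy.OpenSearch instance
client.transport.perform_request(
    "PUT",
    "/_plugins/_ism/policies/audit-lifecycle",
    body=policy,
)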

Monitoring and Health Checks

Health Check Endpoint

Check audit system health:

from backend.audit.client import OpenSearchClientFactory

@app.get("/health/audit")
async def audit_health():
    client = await OpenSearchClientFactory.get_client(config)
    is_healthy = await client.health_check()

    return {
        "status": "healthy" if is_healthy else "unhealthy",
        "opensearch_connected": is_healthy
    }

Metrics

Monitor audit system metrics:

from prometheus_client import Counter, Histogram

# Events emitted
audit_events_total = Counter(
    "audit_events_total",
    "Total audit events emitted",
    ["organization", "action", "result"]
)

# Event processing time
audit_processing_duration = Histogram(
    "audit_processing_duration_seconds",
    "Audit event processing duration",
    ["sink"]
)

# Failures
audit_failures_total = Counter(
    "audit_failures_total",
    "Total audit failures",
    ["sink", "error_type"]
)
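
A sketch of recording these metrics around a sink write (the sink and event variables, and the event attributes used as label values, are assumptions for illustration):

import time

start = time.perf_counter()
try:
    await sink.write_async(event)
    audit_events_total.labels(
        organization=event.organization_id,
        action=event.action,
        result="success"
    ).inc()
except Exception as exc:
    audit_failures_total.labels(sink="opensearch", error_type=type(exc).__name__).inc()
    raise
finally:
    audit_processing_duration.labels(sink="opensearch").observe(time.perf_counter() - start)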

Security Best Practices

OpenSearch Authentication

Use strong authentication:

# Prefer IAM roles where available; otherwise use a dedicated
# user with a strong password
AUDIT_OPENSEARCH_USERNAME=audit_writer
AUDIT_OPENSEARCH_PASSWORD=generate_strong_password_here

# Use SSL/TLS in production
AUDIT_OPENSEARCH_USE_SSL=true
AUDIT_OPENSEARCH_VERIFY_CERTS=true

Least Privilege

Create dedicated OpenSearch user with minimal permissions:

{
  "cluster_permissions": ["cluster:monitor/health"],
  "index_permissions": [{
    "index_patterns": ["audit-*"],
    "allowed_actions": ["write", "create_index"]
  }]
}

Network Security

Restrict OpenSearch access:

# Use private network
AUDIT_OPENSEARCH_HOST=opensearch.internal.vpc

# Or use VPN/bastion
AUDIT_OPENSEARCH_HOST=10.0.1.100

Data Encryption

Enable encryption at rest and in transit:

# TLS for connections
AUDIT_OPENSEARCH_USE_SSL=true

# OpenSearch encryption at rest (configured in OpenSearch)
# - Enable node-to-node encryption
# - Enable encryption at rest
# - Use AWS KMS for key management

Troubleshooting

Debug Mode

Enable debug logging:

import logging
logging.getLogger("backend.audit").setLevel(logging.DEBUG)

Verify Configuration

Check loaded configuration:

from backend.audit.config import get_audit_config

config = get_audit_config()
print(config.model_dump_json(indent=2))

Test OpenSearch Connection

from backend.audit.client import OpenSearchClientFactory

client = await OpenSearchClientFactory.get_client(config)
is_healthy = await client.health_check()
print(f"OpenSearch healthy: {is_healthy}")

Verify Events Are Written

from backend.audit import emit
from backend.utils.actor import ActorInfo

# Emit test event
await emit(
    action="test.event",
    target="test",
    metadata={"test": True},
    actor=ActorInfo(actor_type="system", actor_id="test"),
    organization_id="test_org",
    account_id="test_acc"
)

# Check OpenSearch for event
# Index: audit-test_org-test_acc-{service}-{date}
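
To confirm the event landed, a search against the expected index pattern, sketched with a raw opensearchpy.AsyncOpenSearch client (the term query assumes the action field is mapped as a keyword):

result = await client.search(
    index="audit-test_org-test_acc-*",
    body={"query": {"term": {"action": "test.event"}}},
)
print(result["hits"]["hits"])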

Migration Guide

From Legacy Logging

Before (legacy logging):

logger.info(f"User {user_id} created policy {policy_id}")

After (audit module):

await emit(
    action="policy.create",
    target=f"policy:{policy_id}",
    metadata={"created_by": user_id},
    request=request
)

Gradual Rollout

  1. Enable in development - Test with local OpenSearch
  2. Enable in staging - Validate performance and storage
  3. Enable for subset of production - A/B test (a gating sketch follows this list)
  4. Full production rollout - Monitor metrics closely
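
For step 3, one way to gate auditing to a fraction of production traffic is deterministic hashing of the organization ID, so each organization is consistently in or out of the cohort. A sketch (the ROLLOUT_PERCENT knob and helper are hypothetical, not audit-module settings):

import zlib

ROLLOUT_PERCENT = 10  # hypothetical: audit 10% of organizations

def audit_enabled_for(organization_id: str) -> bool:
    # crc32 is stable across processes, so the cohort does not change on restart
    return zlib.crc32(organization_id.encode()) % 100 < ROLLOUT_PERCENT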

Backward Compatibility

Keep legacy logging during migration:

# Dual logging during migration
logger.info(f"User {user_id} created policy {policy_id}")

await emit(
    action="policy.create",
    target=f"policy:{policy_id}",
    request=request
)

Support

For configuration assistance:

  • Review the Audit Trails Overview documentation
  • Check the Troubleshooting section in the documentation
  • Contact your PDaaS administrator