Production Deployment Guide

This document covers deploying Tinman in a production environment with full safety controls, observability, and operational tooling.

Prerequisites

  • Python 3.10+
  • PostgreSQL 13+ (for persistence)
  • Docker (optional, for containerized deployment)

Quick Production Setup

# Install with all dependencies
pip install AgentTinman[all]

# Set required environment variables
export ANTHROPIC_API_KEY=your-key
export TINMAN_DATABASE_URL=postgresql://user:pass@localhost/tinman
export TINMAN_MODE=shadow

# Initialize database
alembic upgrade head

# Start service
tinman serve --host 0.0.0.0 --port 8000

Database Setup

Alembic Migrations

Tinman uses Alembic for database migrations:

# Run all migrations
alembic upgrade head

# Check current version
alembic current

# Generate new migration
alembic revision --autogenerate -m "description"

# Rollback one version
alembic downgrade -1

Required Tables

The initial migration creates the following tables (a quick verification sketch follows the list):

  • hypotheses - Generated research hypotheses
  • experiments - Experiment definitions and results
  • failures - Discovered failure modes
  • interventions - Proposed interventions
  • simulations - Simulation results
  • memory_nodes - Knowledge graph nodes
  • memory_edges - Knowledge graph relationships
  • audit_logs - Immutable audit trail
  • approval_decisions - Human approval records
  • mode_transitions - Mode change history
  • tool_executions - Tool call records
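
After running alembic upgrade head, a quick way to confirm the tables exist is to inspect the database directly. A minimal sketch, assuming SQLAlchemy and a synchronous PostgreSQL driver (e.g. psycopg2) are available in the environment:

import os

from sqlalchemy import create_engine, inspect

# Use the same connection URL as the service
engine = create_engine(os.environ["TINMAN_DATABASE_URL"])
existing = set(inspect(engine).get_table_names())

expected = {
    "hypotheses", "experiments", "failures", "interventions", "simulations",
    "memory_nodes", "memory_edges", "audit_logs", "approval_decisions",
    "mode_transitions", "tool_executions",
}
missing = expected - existing
print("Missing tables:", ", ".join(sorted(missing)) or "none")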

Service Mode

FastAPI Endpoints

Start the HTTP service:

tinman serve --host 0.0.0.0 --port 8000

Available endpoints (a minimal client sketch follows the table):

Endpoint                 Method  Description
/health                  GET     Health check with component status
/ready                   GET     Kubernetes readiness probe
/live                    GET     Kubernetes liveness probe
/status                  GET     Current Tinman state
/research/cycle          POST    Run a research cycle
/approvals/pending       GET     List pending approvals
/approvals/{id}/decide   POST    Decide on approval
/discuss                 POST    Interactive discussion
/report                  GET     Generate report
/mode                    GET     Current mode
/mode/transition         POST    Transition modes
/metrics                 GET     Prometheus metrics
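
For scripted checks, a minimal client sketch against the read-only endpoints (assumes the requests package; fields beyond the HTTP status are illustrative, since the exact response schemas are not documented here):

import requests

BASE = "http://localhost:8000"

# Current Tinman state
resp = requests.get(f"{BASE}/status", timeout=10)
resp.raise_for_status()
print("Status:", resp.json())

# Pending approvals (assumed to return a JSON list)
pending = requests.get(f"{BASE}/approvals/pending", timeout=10).json()
print(f"Pending approvals: {len(pending)}")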

Docker Deployment

# Dockerfile
FROM python:3.11-slim

WORKDIR /app
COPY . .
RUN pip install -e .[all]

ENV TINMAN_MODE=shadow
EXPOSE 8000

CMD ["uvicorn", "tinman.service.app:app", "--host", "0.0.0.0", "--port", "8000"]
# docker-compose.yml
version: '3.8'

services:
  tinman:
    build: .
    ports:
      - "8000:8000"
      - "9090:9090"  # Prometheus metrics
    environment:
      - TINMAN_MODE=shadow
      - TINMAN_DATABASE_URL=postgresql://postgres:postgres@db/tinman
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    depends_on:
      - db

  db:
    image: postgres:15
    environment:
      - POSTGRES_PASSWORD=postgres
      - POSTGRES_DB=tinman
    volumes:
      - pgdata:/var/lib/postgresql/data

volumes:
  pgdata:

Kubernetes Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: tinman
spec:
  replicas: 1
  selector:
    matchLabels:
      app: tinman
  template:
    metadata:
      labels:
        app: tinman
    spec:
      containers:
        - name: tinman
          image: tinman:latest
          ports:
            - containerPort: 8000
            - containerPort: 9090
          env:
            - name: TINMAN_MODE
              value: "shadow"
            - name: TINMAN_DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: tinman-secrets
                  key: database-url
          readinessProbe:
            httpGet:
              path: /ready
              port: 8000
            initialDelaySeconds: 10
          livenessProbe:
            httpGet:
              path: /live
              port: 8000
            initialDelaySeconds: 30
          resources:
            requests:
              memory: "512Mi"
              cpu: "500m"
            limits:
              memory: "2Gi"
              cpu: "2000m"

Cost Controls

Budget Configuration

from tinman.core.cost_tracker import CostTracker, BudgetConfig, BudgetPeriod

# Configure budget
config = BudgetConfig(
    limit_usd=100.0,           # Max spend
    period=BudgetPeriod.DAILY,  # Reset daily
    warn_threshold=0.8,         # Warn at 80%
    hard_limit=True,            # Block when exceeded
)

tracker = CostTracker(budget_config=config)

Budget Enforcement

from tinman.core.cost_tracker import BudgetExceededError

try:
    # Check before operation
    tracker.enforce_budget(estimated_cost=5.0)

    # Perform operation
    result = await expensive_operation()

    # Record actual cost
    tracker.record_cost(
        amount_usd=4.50,
        source="llm_call",
        model="claude-3-opus",
        operation="research",
    )
except BudgetExceededError as e:
    print(f"Budget exceeded: ${e.current:.2f} / ${e.limit:.2f}")

Cost Monitoring

# Get cost summary
summary = tracker.get_summary()
print(f"Current period: ${summary['current_period_cost_usd']:.2f}")
print(f"Remaining: ${summary['remaining_budget_usd']:.2f}")
print(f"By model: {summary['by_model']}")

Observability

Prometheus Metrics

Start the metrics server:

from tinman.core.metrics import start_metrics_server

start_metrics_server(port=9090)

Key metrics (a scrape sketch follows the table):

Metric                              Type       Description
tinman_research_cycles_total        Counter    Total research cycles
tinman_failures_discovered_total    Counter    Failures by severity/class
tinman_approval_decisions_total     Counter    Approvals by decision/tier
tinman_cost_usd_total               Counter    Costs by source/model
tinman_llm_requests_total           Counter    LLM requests by model/status
tinman_llm_latency_seconds          Histogram  LLM request latency
tinman_tool_executions_total        Counter    Tool calls by status
tinman_pending_approvals            Gauge      Current pending approvals
tinman_current_mode                 Gauge      Active operating mode
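
To verify the exporter is serving data, the sketch below scrapes the endpoint and prints only the Tinman series (plain Prometheus text format, no client library needed; assumes the requests package):

import requests

# Scrape the metrics endpoint started above
body = requests.get("http://localhost:9090/metrics", timeout=10).text

# Print tinman_* samples, skipping the # HELP / # TYPE comment lines
for line in body.splitlines():
    if line.startswith("tinman_"):
        print(line)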

Grafana Dashboard

An example dashboard JSON is available at examples/grafana-dashboard.json.

Structured Logging

from tinman.utils import get_logger

logger = get_logger("my_component")
logger.info("Operation completed", extra={
    "operation": "research_cycle",
    "duration_ms": 1500,
    "failures_found": 3,
})

Security

Audit Trail

All consequential actions are logged:

from datetime import datetime, timedelta

from tinman.db.audit import AuditLogger

audit = AuditLogger(session)

# Query recent activity
logs = audit.query(
    event_types=["approval_decision", "mode_transition"],
    since=datetime.now() - timedelta(hours=24),
)

Risk Policy

Configure risk evaluation via YAML:

# risk_policy.yaml
base_matrix:
  lab:
    S0: safe
    S1: safe
    S2: review
    S3: review
    S4: block
  shadow:
    S0: safe
    S1: review
    S2: review
    S3: block
    S4: block
  production:
    S0: review
    S1: review
    S2: block
    S3: block
    S4: block

action_overrides:
  DEPLOY_INTERVENTION:
    production: block
  DESTRUCTIVE_TEST:
    shadow: block
    production: block

cost_thresholds:
  low_cost_max: 1.0
  high_cost_min: 10.0
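
Given a policy file like the one above, resolving a decision is a nested lookup in which action-specific overrides take precedence over the base matrix. A minimal illustrative sketch using PyYAML (not how Tinman itself loads the policy, which is not shown here):

import yaml

with open("risk_policy.yaml") as f:
    policy = yaml.safe_load(f)

def decide(action: str, mode: str, severity: str) -> str:
    """Action-specific overrides win; otherwise fall back to the base matrix."""
    override = policy.get("action_overrides", {}).get(action, {})
    if mode in override:
        return override[mode]
    return policy["base_matrix"][mode][severity]

print(decide("DEPLOY_INTERVENTION", "production", "S1"))  # block (override)
print(decide("TOOL_CALL", "shadow", "S2"))                # review (base matrix)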

Guarded Tool Execution

All tool calls go through the safety pipeline:

from tinman.core.tools import guarded_call, ToolRegistry

@ToolRegistry.register(
    name="search",
    risk_level=ToolRiskLevel.LOW,
)
async def search_tool(query: str) -> list[str]:
    return await do_search(query)

# Execution is automatically guarded
result = await guarded_call(
    search_tool,
    action_type=ActionType.TOOL_CALL,
    description="Search for relevant documents",
    approval_handler=handler,
    mode=Mode.PRODUCTION,
    query="AI safety",
)

Trace Ingestion

Supported Formats

  • OpenTelemetry (OTLP)
  • Datadog APM
  • AWS X-Ray
  • Generic JSON

Example: OTLP Integration

from tinman.ingest import OTLPAdapter, parse_traces

# From OTLP collector
adapter = OTLPAdapter()
traces = list(adapter.parse(otlp_data))

# Analyze for failures
for trace in traces:
    for span in trace.error_spans:
        print(f"Error: {span.name} - {span.status_message}")

Auto-Detection

from tinman.ingest import parse_traces

# Automatically detects format
traces = parse_traces(unknown_data)

Reporting

Generate Reports

from tinman.reporting import (
    ExecutiveSummaryReport,
    TechnicalAnalysisReport,
    ComplianceReport,
    export_report,
    ReportFormat,
)

# Executive summary
exec_gen = ExecutiveSummaryReport(graph=tinman.graph)
exec_report = await exec_gen.generate(
    period_start=week_ago,
    period_end=now,
)

# Export
export_report(exec_report, "report.html", ReportFormat.HTML)
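
The technical and compliance reports can be produced the same way; the sketch below assumes they share the constructor and generate() signature shown for ExecutiveSummaryReport above.

# Assumed to mirror the ExecutiveSummaryReport interface
tech_report = await TechnicalAnalysisReport(graph=tinman.graph).generate(
    period_start=week_ago,
    period_end=now,
)
export_report(tech_report, "technical.html", ReportFormat.HTML)

compliance_report = await ComplianceReport(graph=tinman.graph).generate(
    period_start=week_ago,
    period_end=now,
)
export_report(compliance_report, "compliance.html", ReportFormat.HTML)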

Report Types

Type        Audience     Content
Executive   Leadership   KPIs, trends, risk score
Technical   Engineering  Failure details, triggers, patterns
Compliance  Audit        Approvals, transitions, violations

Operational Runbook

Health Check

curl http://localhost:8000/health

Expected response when healthy:

{
  "status": "healthy",
  "version": "0.1.0",
  "mode": "shadow",
  "database_connected": true,
  "llm_available": true,
  "uptime_seconds": 3600,
  "checks": {
    "tinman_initialized": true,
    "database_connected": true,
    "llm_available": true
  }
}

Mode Transition

# Transition to production
curl -X POST http://localhost:8000/mode/transition \
  -H "Content-Type: application/json" \
  -d '{"target_mode": "production"}'

Emergency Rollback

# Force return to shadow mode via the API
curl -X POST http://localhost:8000/mode/transition \
  -H "Content-Type: application/json" \
  -d '{"target_mode": "shadow"}'

# Or set TINMAN_MODE=shadow and restart the service (the variable is read at startup)

Database Backup

pg_dump -h localhost -U tinman tinman > backup.sql

Troubleshooting

Common Issues

Service won't start:

# Check database connection
alembic current

# Verify environment
env | grep TINMAN

High latency:

# Check metrics
curl http://localhost:9090/metrics | grep latency

Budget exceeded:

# Reset budget period
tracker.reset_period()

Approval queue growing:

# Check pending approvals
curl http://localhost:8000/approvals/pending

Support

  • GitHub Issues: https://github.com/tinman/tinman/issues
  • Documentation: https://tinman.dev/docs