# Production Inbox Deployment Guide

Complete guide for deploying Synqed's production-grade inbox system with cryptographic identity, guaranteed delivery, and distributed tracing.

## Table of Contents

1. [Prerequisites](#prerequisites)
2. [Installation](#installation)
3. [Configuration](#configuration)
4. [Deployment Options](#deployment-options)
5. [Migration from Prototype](#migration-from-prototype)
6. [Monitoring](#monitoring)
7. [Troubleshooting](#troubleshooting)
8. [Security Checklist](#security-checklist)

---

## Prerequisites

### Required Services

1. **Redis 7.0+** (for message queue)
   - Used for: Message queuing, retry logic, DLQ
   - Minimum: Single instance
   - Recommended: Redis Cluster or managed service (AWS ElastiCache, Redis Cloud)

2. **Python 3.10+**
   - Required for: All runtime components

### Required Libraries

```bash
pip install synqed "cryptography>=41.0.0" "redis>=5.0.0" "httpx>=0.24.0"
```

Or with all optional dependencies:

```bash
pip install "synqed[all]"
```

---

## Installation

### 1. Update Dependencies

Update your `pyproject.toml` or `requirements.txt`:

```toml
# pyproject.toml
[project]
dependencies = [
    "synqed>=1.0.93",
    "cryptography>=41.0.0",
    "redis>=5.0.0",
    "httpx>=0.24.0",
    "uvicorn>=0.20.0",
]
```

Or in `requirements.txt`:

```txt
synqed>=1.0.93
cryptography>=41.0.0
redis>=5.0.0
httpx>=0.24.0
uvicorn>=0.20.0
```

### 2. Install

```bash
pip install -r requirements.txt
```

---

## Configuration

### Environment Variables

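Create a `.env` file. `os.getenv` does not read this file on its own, so load it through your process manager, container runtime, or a library such as python-dotenv: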

```bash
# Redis Configuration
REDIS_URL=redis://localhost:6379
# For Redis with AUTH:
# REDIS_URL=redis://:password@localhost:6379
# For Redis Cluster:
# REDIS_URL=redis://node1:6379,node2:6379,node3:6379

# Rate Limiting
SENDER_RATE_LIMIT=100          # requests per minute per sender
IP_RATE_LIMIT=500              # requests per minute per IP

# Queue Configuration
MAX_RETRIES=5                  # max retry attempts before DLQ
INITIAL_BACKOFF_MS=100         # initial backoff in milliseconds
MAX_BACKOFF_MS=30000          # max backoff in milliseconds

# HTTP Configuration
HTTP_TIMEOUT=30.0              # timeout for remote forwarding

# Logging
LOG_LEVEL=INFO
```
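The variables above can be read into one typed object at startup, so that a malformed value (for example a non-numeric limit) fails fast instead of surfacing later. The following is a minimal sketch of that idea, not Synqed's own loader: `InboxSettings`, `load_settings`, and `backoff_ms` are hypothetical names, and the doubling backoff schedule is an assumption about how `INITIAL_BACKOFF_MS` and `MAX_BACKOFF_MS` interact.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class InboxSettings:
    """Typed view of the environment variables listed above (hypothetical)."""

    redis_url: str
    sender_rate_limit: int
    ip_rate_limit: int
    max_retries: int
    initial_backoff_ms: int
    max_backoff_ms: int
    http_timeout: float
    log_level: str

    def backoff_ms(self, attempt: int) -> int:
        """Delay before retry number `attempt` (1-based).

        Assumption: the delay doubles per attempt and is capped at
        max_backoff_ms. The real schedule may differ.
        """
        if attempt < 1:
            raise ValueError("attempt must be >= 1")
        return min(self.initial_backoff_ms * 2 ** (attempt - 1), self.max_backoff_ms)


def load_settings(env: Mapping[str, str] = os.environ) -> InboxSettings:
    """Read settings from `env`, falling back to the defaults shown above."""
    settings = InboxSettings(
        redis_url=env.get("REDIS_URL", "redis://localhost:6379"),
        sender_rate_limit=int(env.get("SENDER_RATE_LIMIT", "100")),
        ip_rate_limit=int(env.get("IP_RATE_LIMIT", "500")),
        max_retries=int(env.get("MAX_RETRIES", "5")),
        initial_backoff_ms=int(env.get("INITIAL_BACKOFF_MS", "100")),
        max_backoff_ms=int(env.get("MAX_BACKOFF_MS", "30000")),
        http_timeout=float(env.get("HTTP_TIMEOUT", "30.0")),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
    if settings.max_retries < 0:
        raise ValueError("MAX_RETRIES must not be negative")
    if settings.initial_backoff_ms > settings.max_backoff_ms:
        raise ValueError("INITIAL_BACKOFF_MS must not exceed MAX_BACKOFF_MS")
    return settings
```

Under the doubling assumption and the default values, the five retries would wait 100, 200, 400, 800, and 1600 ms before a message moves to the DLQ.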

### Application Setup

Create `main.py`:

```python
import os
from fastapi import FastAPI
from synqed.agent_email.inbox import router
from synqed.agent_email.inbox.startup import create_lifespan

# Get configuration from environment
REDIS_URL = os.getenv("REDIS_URL", "redis://localhost:6379")

# Create FastAPI app with lifespan
app = FastAPI(
    title="Synqed Agent Email System",
    version="2.0.0",
    lifespan=create_lifespan(redis_url=REDIS_URL),
)

# Include inbox router
app.include_router(router)

# Health check endpoint
@app.get("/health")
async def health():
    return {"status": "healthy"}

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
```

### Initialize Agents

Create `init_agents.py`:

```python
"""Initialize agent registry with keypairs."""

import asyncio
from synqed.agent_email.registry.api import get_registry
from synqed.agent_email.registry.models import AgentRegistryEntry
from synqed.agent_email.inbox import generate_keypair
import json

async def initialize_agents():
    """Generate keypairs and register agents."""
    
    registry = get_registry()
    keypairs = {}
    
    # Define your agents
    agents = [
        {
            "agent_id": "agent://yourorg/agent1",
            "email_like": "agent1@yourorg",
            "inbox_url": "https://agent1.yourorg.com/v1/a2a/inbox",
            "capabilities": ["a2a/1.0", "task-processing"],
        },
        {
            "agent_id": "agent://yourorg/agent2",
            "email_like": "agent2@yourorg",
            "inbox_url": "https://agent2.yourorg.com/v1/a2a/inbox",
            "capabilities": ["a2a/1.0", "data-analysis"],
        },
    ]
    
    # Generate keypairs and register
    for agent in agents:
        private_key, public_key = generate_keypair()
        
        # Register agent
        registry.register(AgentRegistryEntry(
            agent_id=agent["agent_id"],
            email_like=agent["email_like"],
            inbox_url=agent["inbox_url"],
            public_key=public_key,
            capabilities=agent.get("capabilities", []),
            metadata=agent.get("metadata", {}),
        ))
        
        # Store keypair (save securely!)
        keypairs[agent["agent_id"]] = {
            "private_key": private_key,
            "public_key": public_key,
        }
        
        print(f"Registered: {agent['agent_id']}")
    
    # Save keypairs to secure storage
    # WARNING: In production, use a secrets manager (AWS Secrets Manager, HashiCorp Vault, etc.)
    with open("keypairs.json", "w") as f:
        json.dump(keypairs, f, indent=2)
    
    print(f"\n⚠️  IMPORTANT: Store keypairs.json securely!")
    print(f"✓ Registered {len(agents)} agents")

if __name__ == "__main__":
    asyncio.run(initialize_agents())
```

Run initialization:

```bash
python init_agents.py
# Store keypairs.json in your secrets manager
```
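Until the keypairs reach a secrets manager, the file on disk holds private keys in plain text. If a temporary file is unavoidable, it could at least be created with owner-only permissions. The sketch below shows one way to do that on POSIX systems; `write_private_json` is a hypothetical helper, not part of Synqed.

```python
import json
import os
from typing import Any, Mapping


def write_private_json(path: str, data: Mapping[str, Any]) -> None:
    """Write `data` as JSON to `path`, readable and writable by the owner only."""
    # The 0o600 mode applies when the file is created. fchmod also tightens
    # the permissions if the file already existed with a looser mode.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        os.fchmod(f.fileno(), 0o600)
        json.dump(data, f, indent=2)
```

In `init_agents.py`, this would stand in for the plain `open("keypairs.json", "w")` call. It does not replace a secrets manager.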

---

## Deployment Options

### Option 1: Single Server (Development/Small Scale)

**Setup Redis:**

```bash
# Using Docker
docker run -d \
  --name redis \
  -p 6379:6379 \
  redis:7-alpine
```

**Run Application:**

```bash
# Direct execution
python main.py

# Or with uvicorn
uvicorn main:app --host 0.0.0.0 --port 8000
```

**Pros:**
- Simple setup
- Good for development/testing
- Low resource usage

**Cons:**
- Single point of failure
- Limited scalability

---

### Option 2: Multi-Worker (Production)

**Setup Redis:**

```bash
# Managed Redis recommended (AWS ElastiCache, Redis Cloud, etc.)
# Or run Redis with persistence:
docker run -d \
  --name redis \
  -p 6379:6379 \
  -v redis-data:/data \
  redis:7-alpine redis-server --appendonly yes
```

**Run with Multiple Workers:**

```bash
# Using uvicorn with workers
uvicorn main:app \
  --host 0.0.0.0 \
  --port 8000 \
  --workers 4 \
  --log-level info

# Or using gunicorn with uvicorn workers
gunicorn main:app \
  --workers 4 \
  --worker-class uvicorn.workers.UvicornWorker \
  --bind 0.0.0.0:8000 \
  --timeout 60
```

**Systemd Service (`/etc/systemd/system/synqed.service`):**

```ini
[Unit]
Description=Synqed Agent Email Service
After=network.target redis.service

[Service]
# uvicorn does not send sd_notify readiness messages, so Type=notify would time out
Type=simple
User=synqed
Group=synqed
WorkingDirectory=/opt/synqed
Environment="REDIS_URL=redis://localhost:6379"
Environment="LOG_LEVEL=INFO"
ExecStart=/opt/synqed/venv/bin/uvicorn main:app \
  --host 0.0.0.0 \
  --port 8000 \
  --workers 4 \
  --log-level info
Restart=always
RestartSec=10

[Install]
WantedBy=multi-user.target
```

Enable and start:

```bash
sudo systemctl enable synqed
sudo systemctl start synqed
sudo systemctl status synqed
```

**Pros:**
- Uses multiple CPU cores on one host
- Automatic restart on failure (`Restart=always`)
- Scales to moderate load

**Cons:**
- Requires proper Redis setup
- More complex configuration

---

### Option 3: Docker Container

**Dockerfile:**

```dockerfile
FROM python:3.11-slim

# Install dependencies
RUN apt-get update && apt-get install -y \
    gcc \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Copy requirements
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application
COPY . .

# Create non-root user
RUN useradd -m -u 1000 synqed && \
    chown -R synqed:synqed /app
USER synqed

# Expose port
EXPOSE 8000

# Run application
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000", "--workers", "2"]
```

**docker-compose.yml:**

```yaml
version: '3.8'

services:
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    volumes:
      - redis-data:/data
    command: redis-server --appendonly yes
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 3s
      retries: 3

  synqed:
    build: .
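    ports:
      # Container port only: a fixed "8000:8000" host mapping conflicts as soon
      # as more than one replica runs. Put a reverse proxy in front, or look up
      # the assigned host ports with `docker-compose port synqed 8000`.
      - "8000"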
    environment:
      - REDIS_URL=redis://redis:6379
      - LOG_LEVEL=INFO
      - SENDER_RATE_LIMIT=100
      - IP_RATE_LIMIT=500
    depends_on:
      redis:
        condition: service_healthy
    restart: unless-stopped
    deploy:
      replicas: 2
      resources:
        limits:
          cpus: '1'
          memory: 1G

volumes:
  redis-data:
```

**Deploy:**

```bash
# Build and start
docker-compose up -d

# Scale workers
docker-compose up -d --scale synqed=4

# View logs
docker-compose logs -f synqed

# Stop
docker-compose down
```

**Pros:**
- Isolated environment
- Easy scaling
- Reproducible deployments

**Cons:**
- Requires Docker knowledge
- Additional overhead

---

### Option 4: Kubernetes (Large Scale)

**deployment.yaml:**

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: synqed-inbox
  labels:
    app: synqed
spec:
  replicas: 4
  selector:
    matchLabels:
      app: synqed
  template:
    metadata:
      labels:
        app: synqed
    spec:
      containers:
      - name: synqed
        image: yourregistry/synqed:latest
        ports:
        - containerPort: 8000
        env:
        - name: REDIS_URL
          valueFrom:
            secretKeyRef:
              name: synqed-secrets
              key: redis-url
        - name: LOG_LEVEL
          value: "INFO"
        resources:
          requests:
            memory: "512Mi"
            cpu: "500m"
          limits:
            memory: "1Gi"
            cpu: "1000m"
        livenessProbe:
          httpGet:
            path: /health
            port: 8000
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /health
            port: 8000
          initialDelaySeconds: 10
          periodSeconds: 5

---
apiVersion: v1
kind: Service
metadata:
  name: synqed-service
spec:
  selector:
    app: synqed
  ports:
  - port: 80
    targetPort: 8000
  type: LoadBalancer
```

**Deploy:**

```bash
# deployment.yaml above contains both the Deployment and the Service
kubectl apply -f deployment.yaml
kubectl get pods -l app=synqed
```

**Pros:**
- Horizontal scaling (automatic once a HorizontalPodAutoscaler is added)
- Self-healing
- Production-grade orchestration

**Cons:**
- Complex setup
- Requires K8s expertise

---

## Migration from Prototype

### Step 1: Backup Current System

```bash
# Backup registry data
python -c "
from synqed.agent_email.registry.api import get_registry
import json
registry = get_registry()
agents = [e.model_dump() for e in registry.list_all()]
with open('registry_backup.json', 'w') as f:
    json.dump(agents, f, indent=2)
"
```

### Step 2: Install New Dependencies

```bash
pip install cryptography>=41.0.0 redis>=5.0.0
```

### Step 3: Setup Redis

```bash
docker run -d -p 6379:6379 redis:7-alpine
```

### Step 4: Generate Keypairs for Existing Agents

```python
"""migrate_agents.py - Add keypairs to existing agents."""

import json
from synqed.agent_email.registry.api import get_registry
from synqed.agent_email.registry.models import AgentRegistryEntry
from synqed.agent_email.inbox import generate_keypair

# Load backup
with open('registry_backup.json', 'r') as f:
    agents = json.load(f)

registry = get_registry()
keypairs = {}

for agent in agents:
    # Generate keypair
    private_key, public_key = generate_keypair()
    
    # Update agent with public_key
    agent['public_key'] = public_key
    
    # Re-register
    registry.register(AgentRegistryEntry(**agent))
    
    # Store keypair
    keypairs[agent['agent_id']] = {
        'private_key': private_key,
        'public_key': public_key,
    }

# Save keypairs securely
with open('keypairs.json', 'w') as f:
    json.dump(keypairs, f, indent=2)

print(f"✓ Migrated {len(agents)} agents")
print("⚠️  Store keypairs.json securely!")
```

### Step 5: Update Application Code

**Before (prototype):**

```python
from fastapi import FastAPI
from synqed.agent_email.inbox import router

app = FastAPI()
app.include_router(router)
```

**After (production):**

```python
from fastapi import FastAPI
from synqed.agent_email.inbox import router
from synqed.agent_email.inbox.startup import create_lifespan

app = FastAPI(
    lifespan=create_lifespan("redis://localhost:6379")
)
app.include_router(router)
```

### Step 6: Update Message Sending Code

**Before (prototype):**

```python
response = httpx.post(
    inbox_url,
    json={
        "sender": sender_id,
        "recipient": recipient_id,
        "message": message,
    }
)
```

**After (production):**

```python
import httpx
from synqed.agent_email.inbox import sign_message

# Sign message
signature = sign_message(
    private_key_b64=private_key,
    sender=sender_id,
    recipient=recipient_id,
    message=message,
    thread_id=message['thread_id'],
)

# Send with signature
response = httpx.post(
    inbox_url,
    json={
        "sender": sender_id,
        "recipient": recipient_id,
        "message": message,
        "signature": signature,
    }
)
```

### Step 7: Test Migration

```bash
# Run tests
pytest tests/

# Start application
python main.py

# Send test message
python test_send.py
```

### Step 8: Deploy

```bash
# Stop old version
sudo systemctl stop synqed-old

# Deploy new version
sudo systemctl start synqed

# Monitor logs
sudo journalctl -u synqed -f
```

---

## Monitoring

### Health Checks

```bash
# Application health
curl http://localhost:8000/health

# Redis health
redis-cli ping
```
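During a rollout it can help to wait for `/health` to answer before sending traffic, rather than checking once by hand. The sketch below polls a check function a fixed number of times. `http_ok` and `wait_until_healthy` are hypothetical helper names, and the attempt count and delay are arbitrary defaults.

```python
import http.client
import time
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        return False


def wait_until_healthy(
    check: Callable[[], bool],
    attempts: int = 10,
    delay_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `check` up to `attempts` times, sleeping between failed attempts."""
    for attempt in range(1, attempts + 1):
        if check():
            return True
        if attempt < attempts:
            sleep(delay_s)
    return False
```

A deploy script might call `wait_until_healthy(lambda: http_ok("http://localhost:8000/health"))` and abort the rollout if it returns `False`.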

### Queue Monitoring

```python
"""monitor_queues.py - Monitor queue health."""

import asyncio
from synqed.agent_email.inbox import get_message_queue
from synqed.agent_email.registry.api import get_registry

async def monitor():
    queue = get_message_queue()
    await queue.connect()
    
    registry = get_registry()
    
    for entry in registry.list_all():
        pending = await queue.get_queue_length(entry.agent_id)
        dlq = await queue.get_dlq_length(entry.agent_id)
        
        print(f"{entry.agent_id}:")
        print(f"  Pending: {pending}")
        print(f"  DLQ: {dlq}")
    
    await queue.close()

asyncio.run(monitor())
```
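The counts printed by `monitor_queues.py` become more useful once they are compared against limits. The sketch below turns one agent's counts into alert strings. `QueueThresholds` and `queue_alerts` are hypothetical names, and the default limits are placeholders to tune against real traffic.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class QueueThresholds:
    """Hypothetical alert limits; tune them to your traffic."""

    max_pending: int = 1000
    max_dlq: int = 0


def queue_alerts(
    agent_id: str,
    pending: int,
    dlq: int,
    limits: QueueThresholds = QueueThresholds(),
) -> List[str]:
    """Return human-readable alerts for one agent's queue counts."""
    alerts: List[str] = []
    if pending > limits.max_pending:
        alerts.append(
            f"{agent_id}: {pending} pending messages (limit {limits.max_pending})"
        )
    if dlq > limits.max_dlq:
        alerts.append(f"{agent_id}: {dlq} messages in DLQ (limit {limits.max_dlq})")
    return alerts
```

Inside the `monitor()` loop, the result of `queue_alerts(entry.agent_id, pending, dlq)` could be logged or forwarded to whatever alerting channel is in use.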

### Redis Metrics

```bash
# Queue lengths
redis-cli XLEN agent_inbox:agent://yourorg/agent1

# DLQ lengths
redis-cli XLEN agent_inbox_dlq:agent://yourorg/agent1

# Memory usage
redis-cli INFO memory

# Connection stats
redis-cli INFO clients
```

### Logging

Configure structured logging:

```python
import logging
import json

class JSONFormatter(logging.Formatter):
    def format(self, record):
        log_data = {
            'timestamp': self.formatTime(record),
            'level': record.levelname,
            'message': record.getMessage(),
            'module': record.module,
        }
        return json.dumps(log_data)

handler = logging.StreamHandler()
handler.setFormatter(JSONFormatter())
logging.root.addHandler(handler)
logging.root.setLevel(logging.INFO)
```
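The security checklist asks for logs without sensitive data. One way to approach that is a `logging.Filter` that masks known secret fields before a record is formatted. The sketch below is an illustration only: `RedactSecrets` is a hypothetical class, and the field names it masks (`private_key`, `password`) are assumptions about what might appear in log messages.

```python
import logging
import re

# Matches key/value pairs such as  "private_key": "abc"  or  password=abc.
# The set of field names is an assumption; extend it for your payloads.
_SENSITIVE = re.compile(
    r'(["\']?(?:private_key|password)["\']?\s*[:=]\s*)(["\']?)[^"\',\s}]+\2'
)


class RedactSecrets(logging.Filter):
    """Replace the values of sensitive fields with [REDACTED]."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message with its %-style arguments first so values passed
        # as arguments are masked too, then drop the arguments.
        record.msg = _SENSITIVE.sub(r"\1\2[REDACTED]\2", record.getMessage())
        record.args = None
        return True  # never drop the record, only rewrite it
```

Attach the filter to the handler (`handler.addFilter(RedactSecrets())`) rather than to a single logger, so that records propagated from child loggers are covered as well. Pattern-based masking is best-effort and does not replace keeping secrets out of log calls in the first place.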

### Metrics Collection (Prometheus)

```python
from prometheus_client import CONTENT_TYPE_LATEST, Counter, Histogram, generate_latest
from fastapi import Response

# Define metrics
messages_received = Counter('inbox_messages_received_total', 'Total messages received')
messages_processed = Counter('inbox_messages_processed_total', 'Total messages processed')
processing_duration = Histogram('inbox_processing_duration_seconds', 'Message processing duration')

# `app` is the FastAPI instance created in main.py.
# With --workers > 1, each worker keeps its own counters unless
# prometheus_client's multiprocess mode is configured.
@app.get("/metrics")
async def metrics():
    return Response(content=generate_latest(), media_type=CONTENT_TYPE_LATEST)
```

---

## Troubleshooting

### Messages Not Being Processed

**Symptom:** Messages queued but not delivered

**Check:**

```bash
# 1. Verify workers are running
ps aux | grep uvicorn

# 2. Check Redis connection
redis-cli ping

# 3. Check queue length
redis-cli XLEN agent_inbox:agent://yourorg/agent1

# 4. Check application logs (journald when run under the systemd unit above)
sudo journalctl -u synqed -f
```

**Fix:**

```bash
# Restart application
sudo systemctl restart synqed

# Check for errors
sudo journalctl -u synqed -n 100
```

### High DLQ Count

**Symptom:** Many messages in dead letter queue

**Check:**

```bash
# View DLQ messages
redis-cli XRANGE agent_inbox_dlq:agent://yourorg/agent1 - + COUNT 10
```

**Common causes:**

1. **Remote endpoint down:** Check if remote inbox_url is accessible
2. **Invalid signatures:** Verify public keys are correct
3. **Network timeouts:** Increase HTTP_TIMEOUT

**Fix:**

```python
# Replay messages from DLQ
async def replay_dlq(agent_id: str):
    # Manually retrieve and reprocess DLQ messages
    # Implementation depends on error type
    pass
```
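As a starting point for the stub above, the sketch below moves entries from an agent's DLQ stream back onto its inbox stream. It rests on several assumptions: the stream keys follow the `agent_inbox:` and `agent_inbox_dlq:` prefixes used in the `redis-cli` examples in this guide, the client is a synchronous object exposing `xrange`, `xadd`, and `xdel` with the same call shape as redis-py, and DLQ entries can be re-queued unchanged. If Synqed stores retry counters or error details in the entry fields, those would need resetting first. `StreamClient` and this `replay_dlq` signature are hypothetical.

```python
from typing import Any, Dict, List, Optional, Protocol, Tuple

StreamEntry = Tuple[Any, Dict[Any, Any]]

# Key prefixes taken from the redis-cli examples in this guide.
INBOX_PREFIX = "agent_inbox:"
DLQ_PREFIX = "agent_inbox_dlq:"


class StreamClient(Protocol):
    """The three Redis stream commands this sketch relies on."""

    def xrange(
        self, name: str, min: str = "-", max: str = "+", count: Optional[int] = None
    ) -> List[StreamEntry]: ...

    def xadd(self, name: str, fields: Dict[Any, Any]) -> Any: ...

    def xdel(self, name: str, *ids: Any) -> int: ...


def replay_dlq(client: StreamClient, agent_id: str, batch_size: int = 100) -> int:
    """Move up to `batch_size` DLQ entries back to the inbox; return the count."""
    dlq_key = DLQ_PREFIX + agent_id
    inbox_key = INBOX_PREFIX + agent_id
    entries = client.xrange(dlq_key, min="-", max="+", count=batch_size)
    for entry_id, fields in entries:
        # Re-add first, delete second: a crash in between duplicates a
        # message instead of losing it.
        client.xadd(inbox_key, fields)
        client.xdel(dlq_key, entry_id)
    return len(entries)
```

With redis-py, a call might look like `replay_dlq(redis.Redis.from_url(REDIS_URL), "agent://yourorg/agent1")`. Fix the underlying cause (endpoint down, wrong key, timeout) before replaying, or the messages are likely to land in the DLQ again.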

### Rate Limiting Issues

**Symptom:** Legitimate requests getting 429 errors

**Fix:**

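Raise the limits in the service environment (`.env`, the systemd unit, or the container spec), then restart the application so the new values are picked up: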

```bash
export SENDER_RATE_LIMIT=200
export IP_RATE_LIMIT=1000
```

Or programmatically:

```python
from synqed.agent_email.inbox import get_rate_limiter

limiter = get_rate_limiter()
limiter.sender_limit = 200
limiter.ip_limit = 1000
```
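To reason about what "100 requests per minute per sender" means in practice, it can help to look at a concrete limiter. The sketch below is a generic in-memory sliding-window limiter, not Synqed's implementation, whose algorithm is not described in this guide. `SlidingWindowLimiter` is a hypothetical name.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` events per `window_s` seconds for each key."""

    def __init__(
        self,
        limit: int,
        window_s: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window_s = window_s
        self._clock = clock
        self._events: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        """Record an event for `key` and return True if it is within the limit."""
        now = self._clock()
        events = self._events[key]
        # Drop timestamps that have aged out of the window.
        while events and now - events[0] >= self.window_s:
            events.popleft()
        if len(events) >= self.limit:
            return False  # the caller would answer with HTTP 429
        events.append(now)
        return True
```

A limiter like this lives in one process. With several workers or replicas, each would keep its own counters, so the effective limit multiplies unless the counters are held in a shared store such as Redis.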

### Signature Verification Failing

**Symptom:** All messages rejected with "signature verification failed"

**Check:**

1. Sender registered with correct public_key
2. Message includes all required fields (thread_id, role, content)
3. Signature computed correctly

**Debug:**

```python
from synqed.agent_email.inbox import verify_signature

is_valid = verify_signature(
    public_key_b64=public_key,
    signature_b64=signature,
    sender=sender_id,
    recipient=recipient_id,
    message=message,
    thread_id=thread_id,
)

print(f"Signature valid: {is_valid}")
```
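A frequent cause of a correct key still failing verification is that sender and receiver serialize the same message to different bytes (key order, whitespace, escaping). The sketch below shows deterministic JSON serialization as an illustration of the problem. It is an assumption, not the canonical form `sign_message` actually uses, and `canonical_bytes` is a hypothetical helper.

```python
import json
from typing import Any, Dict


def canonical_bytes(
    sender: str, recipient: str, thread_id: str, message: Dict[str, Any]
) -> bytes:
    """Serialize the signed fields to bytes that do not depend on dict key order."""
    payload = {
        "sender": sender,
        "recipient": recipient,
        "thread_id": thread_id,
        "message": message,
    }
    # sort_keys fixes the key order and separators removes optional whitespace,
    # so equal payloads always produce identical bytes.
    text = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")
```

When debugging, comparing the bytes each side feeds into signing and verification usually shows quickly whether the mismatch is in the payload or in the key.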

---

## Security Checklist

### ✓ Pre-Deployment

- [ ] Redis secured with AUTH password
- [ ] Redis accessible only from application hosts
- [ ] TLS enabled for Redis connections
- [ ] Keypairs stored in secrets manager (not files)
- [ ] HTTPS enforced for all inbox URLs
- [ ] Rate limits configured appropriately
- [ ] Logging configured (no sensitive data)
- [ ] Network firewalls configured
- [ ] Application runs as non-root user

### ✓ Post-Deployment

- [ ] Verify signature verification working
- [ ] Test rate limiting
- [ ] Monitor queue depths
- [ ] Set up alerting for DLQ growth
- [ ] Configure log aggregation
- [ ] Test disaster recovery
- [ ] Document runbooks
- [ ] Schedule key rotation

### ✓ Ongoing

- [ ] Rotate Ed25519 keypairs quarterly
- [ ] Update dependencies monthly
- [ ] Review access logs weekly
- [ ] Monitor error rates
- [ ] Backup Redis data
- [ ] Test failover procedures

---

## Production Checklist

Before going live:

- [ ] Redis cluster with replication
- [ ] Multiple application workers
- [ ] Load balancer configured
- [ ] Health checks enabled
- [ ] Monitoring and alerting
- [ ] Log aggregation
- [ ] Backup and disaster recovery
- [ ] Security audit completed
- [ ] Load testing performed
- [ ] Documentation updated
- [ ] Team trained on operations

---

## Support

For issues or questions:

1. Check logs: `journalctl -u synqed -f`
2. Review queue metrics: `python monitor_queues.py`
3. Check Redis: `redis-cli INFO`
4. See README: `src/synqed/agent_email/inbox/README.md`
5. Run example: `python examples/production_inbox_demo.py`

---

## Performance Tuning

### Redis Optimization

```conf
# redis.conf
maxmemory 2gb
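# noeviction: reject writes when memory is full instead of evicting keys;
# an LRU policy could silently drop queued messages
maxmemory-policy noeviction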
appendonly yes
appendfsync everysec
```

### Application Tuning

```bash
# Increase worker pool
uvicorn main:app --workers 8

# Adjust timeouts (set in .env or the service environment)
HTTP_TIMEOUT=60.0

# Increase rate limits
SENDER_RATE_LIMIT=200
IP_RATE_LIMIT=1000
```
### Load Testing

```bash
# Install hey
go install github.com/rakyll/hey@latest

# Test inbox endpoint
hey -n 1000 -c 10 -m POST \
  -H "Content-Type: application/json" \
  -d '{"sender":"...","recipient":"...","message":{...},"signature":"..."}' \
  http://localhost:8000/v1/a2a/inbox
```

---

**You now have a production-grade agent email system! 🚀**

