# Production-Grade A2A Inbox System

This module provides an enterprise-ready message inbox system for agent-to-agent communication with guaranteed delivery, cryptographic identity, and abuse protection.

## Architecture Overview

The system uses a **queue-based architecture** that separates message ingestion from processing:

```
┌─────────────────────────────────────────────────────────────┐
│                     POST /v1/a2a/inbox                       │
│                                                               │
│  1. Validate auth                                            │
│  2. Check rate limits (per-sender + per-IP)                 │
│  3. Validate sender/recipient URIs                          │
│  4. Verify Ed25519 signature                                │
│  5. Queue message to Redis Streams                          │
│  6. Return accepted immediately                             │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
                    ┌─────────────────┐
                    │  Redis Streams  │
                    │  agent_inbox:*  │
                    └─────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                      Worker (Async)                          │
│                                                               │
│  1. Consume from queue                                       │
│  2. Try local runtime                                        │
│  3. Else forward to remote inbox_url                        │
│  4. Retry with exponential backoff                          │
│  5. Move to DLQ after max retries                           │
└─────────────────────────────────────────────────────────────┘
```

## Features

### 1. Cryptographic Identity (Ed25519)

Every message must be signed with the sender's private key:

```python
import httpx
from synqed.agent_email.inbox import generate_keypair, sign_message
from synqed.agent_email.registry.api import get_registry
from synqed.agent_email.registry.models import AgentRegistryEntry

# Generate keypair
private_key, public_key = generate_keypair()

# Register agent with public key
registry = get_registry()
registry.register(AgentRegistryEntry(
    agent_id="agent://example/alice",
    email_like="alice@example",
    inbox_url="https://alice.example.com/v1/a2a/inbox",
    public_key=public_key,  # Required!
))

# Build the message once so the signed payload and the sent payload match
message = {"thread_id": "123", "role": "user", "content": "Hello"}

# Sign message
signature = sign_message(
    private_key_b64=private_key,
    sender="agent://example/alice",
    recipient="agent://example/bob",
    message=message,
    thread_id=message["thread_id"],
)

# Send with signature
response = httpx.post(
    "https://bob.example.com/v1/a2a/inbox",
    json={
        "sender": "agent://example/alice",
        "recipient": "agent://example/bob",
        "message": message,
        "signature": signature,
    }
)
```

### 2. Guaranteed Delivery (Redis Streams)

Messages are queued in Redis Streams with automatic retry:

- **Exponential backoff**: 100ms → 200ms → 400ms → 800ms → 1600ms
- **Max retries**: 5 attempts
- **Dead Letter Queue**: Failed messages move to `agent_inbox_dlq:<agent_id>`
- **Consumer groups**: Scalable distributed processing
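
The retry behaviour listed above can be sketched in plain Python. This is an illustration only, not the worker's actual code: `backoff_schedule`, `deliver_with_retry`, and the in-memory `dead_letters` list are hypothetical names, and the sketch assumes one initial attempt followed by up to five retries, which the list above does not spell out.

```python
import asyncio
from typing import Awaitable, Callable, List


def backoff_schedule(initial_ms: int = 100, max_retries: int = 5) -> List[int]:
    """Delay in milliseconds before each retry, doubling every time."""
    return [initial_ms * (2 ** attempt) for attempt in range(max_retries)]


async def deliver_with_retry(
    send: Callable[[], Awaitable[None]],
    dead_letters: list,
    envelope: dict,
    initial_ms: int = 100,
    max_retries: int = 5,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> bool:
    """Attempt delivery, retrying with backoff; park the envelope on final failure.

    Assumption: ConnectionError stands in for every retryable failure.
    """
    # A leading 0 represents the first attempt, which happens without waiting.
    for delay_ms in [0] + backoff_schedule(initial_ms, max_retries):
        if delay_ms:
            await sleep(delay_ms / 1000)
        try:
            await send()
            return True
        except ConnectionError:
            continue
    # Every attempt failed: this list plays the role of the dead letter queue.
    dead_letters.append(envelope)
    return False
```

With the defaults, `backoff_schedule()` yields the delays listed above. Passing a custom `sleep` keeps the loop testable without real waiting.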

### 3. Rate Limiting

Protection against abuse:

- **Per-sender limit**: 100 requests/minute
- **Per-IP limit**: 500 requests/minute
- **Response**: 429 Too Many Requests
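
To make the two limits concrete, here is a minimal in-memory sketch of a fixed-window counter. It is not the library's rate limiter: `FixedWindowLimiter` and `is_rate_limited` are hypothetical names, the windowing strategy is an assumption, and a real deployment would need state shared across processes (for example in Redis).

```python
import time
from collections import defaultdict
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Counts requests per key within fixed windows (illustrative only)."""

    def __init__(
        self,
        limit: int,
        window_seconds: int = 60,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        # Old buckets are never purged here; a real limiter would expire them.
        self._counts: Dict[Tuple[str, int], int] = defaultdict(int)

    def allow(self, key: str) -> bool:
        window = int(self.clock() // self.window_seconds)
        bucket = (key, window)
        if self._counts[bucket] >= self.limit:
            return False
        self._counts[bucket] += 1
        return True


# Limits taken from the list above.
sender_limiter = FixedWindowLimiter(limit=100)
ip_limiter = FixedWindowLimiter(limit=500)


def is_rate_limited(sender: str, ip: str) -> bool:
    """True when the request should be answered with 429 Too Many Requests."""
    return not (sender_limiter.allow(sender) and ip_limiter.allow(ip))
```

A fixed window is the simplest choice but allows short bursts at window boundaries; a sliding window or token bucket smooths that out at the cost of more bookkeeping.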

### 4. Distributed Tracing

Every message gets a `trace_id` for end-to-end tracking:

```json
{
  "status": "accepted",
  "message_id": "550e8400-e29b-41d4-a716-446655440000",
  "trace_id": "7c9e6679-7425-40de-944b-e07fc1f90ae7"
}
```

Chain traces with `parent_trace_id`:

```json
{
  "trace_id": "new-trace-id",
  "parent_trace_id": "parent-trace-id"
}
```
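
A client-side sketch of chaining might look like the following. `new_trace` is a hypothetical helper, and the use of UUID4 values is an assumption based on the sample IDs shown above.

```python
import uuid
from typing import Optional


def new_trace(parent_trace_id: Optional[str] = None) -> dict:
    """Build the trace fields for one outgoing message."""
    trace = {"trace_id": str(uuid.uuid4())}
    if parent_trace_id is not None:
        # Link this message to the one that caused it.
        trace["parent_trace_id"] = parent_trace_id
    return trace


# A root message, then a follow-up message that points back to it.
root = new_trace()
child = new_trace(parent_trace_id=root["trace_id"])
```

The resulting dictionaries can be merged into the request body alongside `sender`, `recipient`, `message`, and `signature`.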

### 5. Error Classification

Error responses include a `retryable` flag:

```json
{
  "status": "error",
  "error": "timeout forwarding to remote inbox",
  "retryable": true
}
```

**Retryable errors**:
- Network timeouts
- 5xx server errors
- Connection failures

**Non-retryable errors**:
- Invalid signature
- 4xx client errors
- Sender/recipient not found
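
The classification above could be expressed as a small predicate. This is a sketch under assumptions: `is_retryable` is a hypothetical name, and the mapping of the built-in `TimeoutError` and `ConnectionError` to "network timeouts" and "connection failures" is an illustrative choice rather than the library's own logic.

```python
from typing import Optional


def is_retryable(
    status_code: Optional[int] = None,
    exc: Optional[BaseException] = None,
) -> bool:
    """Decide whether a failed delivery is worth retrying."""
    if exc is not None:
        # Timeouts and connection failures are transient by nature.
        return isinstance(exc, (TimeoutError, ConnectionError))
    if status_code is not None:
        # 5xx means the remote side failed; 4xx means the request itself is bad.
        return 500 <= status_code <= 599
    # Anything else (invalid signature, unknown agent) is treated as permanent.
    return False
```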

## Installation

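```bash
# Install with dependencies (quoted so the shell does not expand the brackets)
pip install "synqed[all]"

# Or manually add (quoted so ">=" is not treated as a shell redirect):
pip install "cryptography>=41.0.0" "redis>=5.0.0"
```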

## Quick Start

### 1. Setup Redis

```bash
# Local development
docker run -d -p 6379:6379 redis:7

# Or use managed Redis (AWS ElastiCache, Redis Cloud, etc.)
```

### 2. Initialize System

```python
from fastapi import FastAPI
from synqed.agent_email.inbox import router
from synqed.agent_email.inbox.startup import create_lifespan

# Create app with lifespan
app = FastAPI(lifespan=create_lifespan("redis://localhost:6379"))

# Include inbox router
app.include_router(router)
```

### 3. Register Agents

```python
from synqed.agent_email.registry.api import get_registry
from synqed.agent_email.registry.models import AgentRegistryEntry
from synqed.agent_email.inbox import generate_keypair

# Generate keypair
private_key, public_key = generate_keypair()

# Register agent
registry = get_registry()
registry.register(AgentRegistryEntry(
    agent_id="agent://example/myagent",
    email_like="myagent@example",
    inbox_url="https://example.com/v1/a2a/inbox",
    public_key=public_key,
))
```

### 4. Send Messages

```python
import asyncio

import httpx
from synqed.agent_email.inbox import sign_message

# Build the message once so the signed payload and the sent payload match
message = {
    "thread_id": "thread-123",
    "role": "user",
    "content": "Hello, Bob!"
}

# Sign message (private_key is the sender's key from generate_keypair())
signature = sign_message(
    private_key_b64=private_key,
    sender="agent://example/alice",
    recipient="agent://example/bob",
    message=message,
    thread_id=message["thread_id"],
)

# Send
async def send() -> None:
    async with httpx.AsyncClient() as client:
        response = await client.post(
            "https://bob.example.com/v1/a2a/inbox",
            json={
                "sender": "agent://example/alice",
                "recipient": "agent://example/bob",
                "message": message,
                "signature": signature,
            }
        )
        response.raise_for_status()  # surface 4xx/5xx instead of parsing an error body

        result = response.json()
        print(f"Message {result['message_id']} queued with trace {result['trace_id']}")

asyncio.run(send())
```

## Local Runtime vs Remote Forwarding

The system supports two delivery modes:

### Local Runtime

For agents hosted in the same process:

```python
from synqed.agent_email.inbox import LocalAgentRuntime, register_agent_runtime

class MyAgent(LocalAgentRuntime):
    async def handle_a2a_envelope(self, sender: str, recipient: str, envelope: dict):
        print(f"Received from {sender}: {envelope}")
        return {"status": "processed"}

# Register runtime
register_agent_runtime("agent://example/myagent", MyAgent())
```

### Remote Forwarding

For agents on different hosts, the worker automatically forwards the message via HTTP POST to the recipient's registered `inbox_url`.

**No code needed!** Just register with `inbox_url` and the system handles routing.
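
The routing decision between the two modes can be pictured roughly as follows. This is a conceptual sketch, not the worker's implementation: the `runtimes` dictionary, the `inbox_urls` mapping, and the `route` and `forward` names are all hypothetical stand-ins for the runtime registration, the agent registry, and the HTTP client.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

# Hypothetical stand-in for whatever register_agent_runtime stores internally.
runtimes: Dict[str, Any] = {}


async def route(
    sender: str,
    recipient: str,
    envelope: dict,
    inbox_urls: Dict[str, str],
    forward: Callable[[str, dict], Awaitable[dict]],
) -> dict:
    """Deliver locally when a runtime is registered, otherwise forward."""
    runtime = runtimes.get(recipient)
    if runtime is not None:
        # Same-process agent: call its handler directly.
        return await runtime.handle_a2a_envelope(sender, recipient, envelope)

    inbox_url = inbox_urls.get(recipient)
    if inbox_url is None:
        # Neither local nor registered remotely: a non-retryable failure.
        raise LookupError(f"no route to {recipient}")

    # Remote agent: hand the envelope to the HTTP forwarder.
    return await forward(inbox_url, envelope)
```

Checking the local runtime first avoids a network round trip for agents that live in the same process.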

## Production Deployment

### Multi-Process Setup

For production with multiple workers:

```python
# main.py
from fastapi import FastAPI
from synqed.agent_email.inbox import router
from synqed.agent_email.inbox.startup import create_lifespan

app = FastAPI(lifespan=create_lifespan("redis://production-redis:6379"))
app.include_router(router)

# Run with multiple workers
# uvicorn main:app --workers 4 --host 0.0.0.0 --port 8000
```

Redis consumer groups coordinate the workers, so each queued message is handed to only one of them at a time.

### Environment Variables

```bash
# Redis connection
REDIS_URL=redis://localhost:6379

# Rate limits
SENDER_RATE_LIMIT=100
IP_RATE_LIMIT=500

# Queue configuration
MAX_RETRIES=5
INITIAL_BACKOFF_MS=100
```
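
If the application reads these variables itself, a small settings loader keeps the defaults in one place. This is a sketch under assumptions: `InboxSettings` and `load_settings` are hypothetical names, and whether the library consumes these variables directly is not stated here. The defaults mirror the values shown above.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class InboxSettings:
    redis_url: str
    sender_rate_limit: int
    ip_rate_limit: int
    max_retries: int
    initial_backoff_ms: int


def load_settings(env: Mapping[str, str] = os.environ) -> InboxSettings:
    """Read inbox settings from the environment, falling back to defaults."""
    return InboxSettings(
        redis_url=env.get("REDIS_URL", "redis://localhost:6379"),
        sender_rate_limit=int(env.get("SENDER_RATE_LIMIT", "100")),
        ip_rate_limit=int(env.get("IP_RATE_LIMIT", "500")),
        max_retries=int(env.get("MAX_RETRIES", "5")),
        initial_backoff_ms=int(env.get("INITIAL_BACKOFF_MS", "100")),
    )
```

The resulting `redis_url` could then be passed to `create_lifespan(...)` instead of a hard-coded string.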

### Monitoring

Check queue health:

```python
from synqed.agent_email.inbox import get_message_queue

async def report_queue_health(agent_id: str) -> None:
    queue = get_message_queue()

    # Check queue length
    pending = await queue.get_queue_length(agent_id)
    print(f"Pending messages: {pending}")

    # Check DLQ
    failed = await queue.get_dlq_length(agent_id)
    print(f"Failed messages: {failed}")

# Call from code already running in the app's event loop, for example:
# await report_queue_health("agent://example/myagent")
```

## API Reference

### POST /v1/a2a/inbox

**Request:**

```json
{
  "sender": "agent://example/alice",
  "recipient": "agent://example/bob",
  "message": {
    "thread_id": "thread-123",
    "role": "user",
    "content": "Hello"
  },
  "signature": "base64-encoded-signature",
  "trace_id": "optional-trace-id",
  "parent_trace_id": "optional-parent-trace"
}
```

**Response (Success):**

```json
{
  "status": "accepted",
  "message_id": "550e8400-e29b-41d4-a716-446655440000",
  "trace_id": "7c9e6679-7425-40de-944b-e07fc1f90ae7"
}
```

**Response (Error):**

```json
{
  "status": "error",
  "message_id": "550e8400-e29b-41d4-a716-446655440000",
  "trace_id": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
  "error": "timeout forwarding to remote inbox",
  "retryable": true
}
```

## Security Considerations

1. **Always verify signatures**: The system rejects unsigned or incorrectly signed messages
2. **Rate limiting**: Mitigates denial-of-service (DoS) attacks
3. **TLS required**: Always use HTTPS in production
4. **Key rotation**: Regularly rotate Ed25519 keypairs
5. **Redis security**: Use Redis AUTH and TLS in production

## Testing

```python
import pytest
import pytest_asyncio  # provided by the pytest-asyncio plugin
from synqed.agent_email.inbox import generate_keypair, sign_message
from synqed.agent_email.inbox.queue import get_message_queue

@pytest.fixture
def keypair():
    return generate_keypair()

# Async fixtures need the pytest-asyncio decorator
@pytest_asyncio.fixture
async def queue():
    queue = get_message_queue("redis://localhost:6379")
    await queue.connect()
    yield queue
    await queue.close()

def test_sign_message_returns_signature(keypair):
    private_key, _public_key = keypair

    signature = sign_message(
        private_key_b64=private_key,
        sender="agent://test/alice",
        recipient="agent://test/bob",
        message={"thread_id": "1", "role": "user", "content": "test"},
        thread_id="1",
    )

    # Only checks that a signature was produced, not that it verifies
    assert len(signature) > 0

@pytest.mark.asyncio
async def test_queue_operations(queue):
    # Push message
    stream_id = await queue.push(
        agent_id="agent://test/alice",
        envelope={"sender": "test", "message": "hello"},
        message_id="msg-123",
    )

    assert stream_id is not None

    # Check length (assumes the stream was empty before this test ran)
    length = await queue.get_queue_length("agent://test/alice")
    assert length == 1
```

## Troubleshooting

### Messages not being processed

1. Check if workers are running: `ps aux | grep worker`
2. Check Redis connection: `redis-cli ping`
3. Check queue length: `redis-cli XLEN agent_inbox:<agent_id>`

### High DLQ count

1. Check worker logs for errors
2. Inspect DLQ: `redis-cli XRANGE agent_inbox_dlq:<agent_id> - +`
3. Verify remote endpoints are accessible

### Signature verification failing

1. Ensure sender is registered with correct `public_key`
2. Verify message includes all required fields
3. Check that signature is computed over correct payload
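
When chasing a payload mismatch, it can help to fingerprint the signed fields on both the sending and receiving side and compare the results. The sketch below is a debugging aid only: `canonical_bytes` and `payload_fingerprint` are hypothetical helpers, and the field set and JSON canonicalization are assumptions, since the exact bytes `sign_message` signs are not documented here.

```python
import hashlib
import json


def canonical_bytes(sender: str, recipient: str, message: dict, thread_id: str) -> bytes:
    """Serialize the fields deterministically (assumed scheme, for debugging)."""
    payload = {
        "sender": sender,
        "recipient": recipient,
        "message": message,
        "thread_id": thread_id,
    }
    # Sorted keys and fixed separators make the output independent of dict order.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def payload_fingerprint(sender: str, recipient: str, message: dict, thread_id: str) -> str:
    """Short, loggable digest of the payload for side-by-side comparison."""
    return hashlib.sha256(canonical_bytes(sender, recipient, message, thread_id)).hexdigest()
```

If the fingerprints logged by sender and receiver differ, the message was altered or rebuilt between signing and sending, for example by changing `content` or `thread_id` after calling `sign_message`.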

## Migration from Prototype

If upgrading from the old synchronous system:

1. **Add dependencies**: Install `cryptography` and `redis`
2. **Generate keypairs**: For all existing agents
3. **Update registry**: Add `public_key` to all entries
4. **Setup Redis**: Start Redis instance
5. **Add lifespan**: Update FastAPI app to use `create_lifespan()`
6. **Update clients**: Add signature generation to message sends

The API remains backward compatible apart from the new signature requirement, so existing code will work with minimal changes.

## License

See project LICENSE file.

