# WISTX - MCP Context Server for DevOps

WISTX is a Model Context Protocol (MCP) server that supplies compliance, pricing, and best-practice context for DevOps infrastructure work.

## 🎯 What is WISTX?

WISTX is an MCP server that provides context to LLMs (Claude, GPT-4, etc.) about:
- **Compliance Requirements** (PCI-DSS, HIPAA, CIS, SOC2, NIST, ISO 27001)
- **Infrastructure Pricing** (AWS, GCP, Azure)
- **Code Examples** (Terraform, Kubernetes, Docker)
- **Best Practices** (DevOps, security, cost optimization)

Users interact with WISTX through:
- **MCP Protocol** - Native integration with Claude Desktop, Cursor, Windsurf, **Google Antigravity**
- **REST API** - For CI/CD pipelines, scripts, and programmatic access

## Features

- 🚀 **MCP Server** - Native Claude Desktop, Cursor, Windsurf, and **Google Antigravity** integration
- 📊 **REST API** - Simple HTTP endpoints for context retrieval
- 🔍 **Vector Search** - Semantic search across 50K+ compliance controls
- 💰 **Cost Calculator** - Real-time infrastructure pricing
- 📝 **Code Examples** - 500K+ Terraform/K8s examples
- 🔐 **API Key Authentication** - Secure access control
- 📈 **Usage Tracking** - Monitor API usage and billing
- ⚡ **One-Click Installation** - Available in Antigravity's MCP Server Store

## Requirements

- Python 3.11 or higher
- [uv](https://github.com/astral-sh/uv) (install with: `curl -LsSf https://astral.sh/uv/install.sh | sh`)
- MongoDB (for data storage)
- OpenAI and Pinecone API keys (for the data processing pipeline)
- Docker (optional, for containerized deployment)

## Installation

### Development Setup

1. Clone the repository:
   ```bash
   git clone <repository-url>
   cd wistx-model
   ```

2. Install uv (if not already installed):
   ```bash
   curl -LsSf https://astral.sh/uv/install.sh | sh
   ```

3. Copy environment variables:
   ```bash
   cp .env.example .env
   # Edit .env with your configuration
   # IMPORTANT: Add your OpenAI API key for data processing:
   # OPENAI_API_KEY=your-openai-api-key-here
   ```

4. Install the project and dependencies:
   ```bash
   uv sync
   ```

5. Run the API server:
   ```bash
   uv run uvicorn api.main:app --reload
   ```

## Architecture

```
┌─────────────────────────────────────────┐
│  Claude Desktop / Cursor / Windsurf /   │
│  Google Antigravity                     │
│  (User asks: "Create compliant RDS")    │
└──────────────┬──────────────────────────┘
               │ MCP Protocol
               ↓
┌─────────────────────────────────────────┐
│  WISTX MCP Server                       │
│  ├─ get_compliance_requirements         │
│  ├─ calculate_infrastructure_cost       │
│  ├─ get_devops_infra_code_examples      │
│  └─ search_best_practices               │
└──────────────┬──────────────────────────┘
               │
               ↓
┌─────────────────────────────────────────┐
│  MongoDB Atlas                          │
│  ├─ compliance_controls (50K+)          │
│  ├─ pricing_data (105K+)                │
│  ├─ code_examples (500K+)               │
│  └─ best_practices (100K+)              │
└─────────────────────────────────────────┘
```
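
The flow in the diagram can be sketched in plain Python. This is only an illustration: the in-memory `FAKE_DB` dictionary stands in for MongoDB, and the handler signature, field names, and sample records are hypothetical rather than taken from the WISTX codebase.

```python
from typing import Any, Callable

# Hypothetical stand-in for the MongoDB collections shown in the diagram.
FAKE_DB: dict[str, list[dict[str, Any]]] = {
    "compliance_controls": [
        {"standard": "PCI-DSS", "resource": "rds", "control": "Encrypt data at rest"},
        {"standard": "HIPAA", "resource": "s3", "control": "Enable access logging"},
    ],
}


def get_compliance_requirements(resource: str, standard: str) -> list[dict[str, Any]]:
    """Return the controls matching a resource type and standard."""
    return [
        doc
        for doc in FAKE_DB["compliance_controls"]
        if doc["resource"] == resource and doc["standard"] == standard
    ]


# Tool registry: the server looks up a handler by the tool name the client sends.
TOOLS: dict[str, Callable[..., Any]] = {
    "get_compliance_requirements": get_compliance_requirements,
}


def call_tool(name: str, arguments: dict[str, Any]) -> Any:
    """Dispatch a tool call to its handler, rejecting unknown tool names."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**arguments)
```

Calling `call_tool("get_compliance_requirements", {"resource": "rds", "standard": "PCI-DSS"})` returns the single matching sample record. A real server would register the remaining tools the same way and query MongoDB instead of a dictionary.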

## Project Structure

```
wistx-model/
├── api/                          # REST API (FastAPI)
│   ├── routers/v1/               # API endpoints
│   │   ├── health.py             # GET /health
│   │   ├── usage.py              # GET /v1/usage
│   │   ├── compliance.py         # GET /v1/compliance (TODO)
│   │   ├── pricing.py            # GET /v1/pricing (TODO)
│   │   └── code.py               # GET /v1/code-examples (TODO)
│   ├── middleware/               # API middleware
│   │   ├── auth.py               # API key authentication
│   │   ├── rate_limit.py         # Rate limiting
│   │   └── logging.py            # Request logging
│   ├── services/                 # Business logic
│   │   ├── billing_service.py   # Billing logic
│   │   ├── usage_tracker.py      # Usage tracking
│   │   └── token_counter.py      # Token counting
│   ├── database/                 # MongoDB connection
│   └── auth/                     # Authentication
│
├── wistx_mcp/                    # MCP Server (TODO - Week 5)
│   ├── server.py                 # MCP server main
│   └── tools/                    # MCP tools
│       ├── compliance.py         # Compliance context
│       ├── pricing.py            # Pricing context
│       ├── code_examples.py      # Code examples
│       └── lib/                  # Shared utilities
│           ├── mongodb_client.py # MongoDB queries
│           ├── vector_search.py  # Vector search
│           └── context_builder.py # Context formatting
│
├── data-pipelines/               # Data collection & processing
│   ├── collectors/               # Data collectors
│   ├── processors/               # Data processors
│   ├── loaders/                  # MongoDB loaders
│   └── models/                   # Data models
│
├── scripts/                       # Utility scripts
│   ├── setup_mongodb.py          # MongoDB setup
│   ├── run_compliance_collection.py  # Data collection
│   └── validate_mongodb_complete.py  # Validation
│
├── tests/                        # Test suite
├── pyproject.toml                # Project configuration
├── docker-compose.yml            # Docker Compose setup
└── README.md
```

## Data Processing Pipeline

The data processing pipeline collects, processes, embeds, and loads compliance data into MongoDB and Pinecone.

### Prerequisites

1. **Environment Variables** - Ensure your `.env` file includes:
   ```bash
   # Required for data processing
   OPENAI_API_KEY=your-openai-api-key-here
   MONGODB_URL=mongodb://localhost:27017
   MONGODB_DATABASE=wistx-production
   PINECONE_API_KEY=your-pinecone-api-key
   PINECONE_INDEX_NAME=wistx-index
   ```

2. **MongoDB Setup** - Initialize MongoDB collections and indexes:
   ```bash
   uv run python scripts/setup_mongodb.py
   ```

3. **Dependencies** - Install required packages:
   ```bash
   uv sync
   # Optional: For PDF processing
   uv add docling
   ```
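
Before a long pipeline run it can help to confirm that the variables listed above are actually set. The following is a minimal sketch, not part of the repository: `missing_env_vars` is a hypothetical helper, and it inspects the process environment only, so the `.env` file must already be loaded into the environment.

```python
import os
from collections.abc import Mapping

# Variable names copied from the .env example above.
REQUIRED_VARS = (
    "OPENAI_API_KEY",
    "MONGODB_URL",
    "MONGODB_DATABASE",
    "PINECONE_API_KEY",
    "PINECONE_INDEX_NAME",
)


def missing_env_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
```

A script could call `missing_env_vars()` at startup and exit with a clear message if the returned list is not empty.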

### Running the Pipeline

**Process a single compliance standard (full pipeline):**
```bash
uv run python scripts/run_pipeline.py --standard PCI-DSS --mode streaming
```

**Process all compliance standards:**
```bash
uv run python scripts/run_pipeline.py --mode streaming
```

**Skip collection stage (use existing raw data):**
```bash
uv run python scripts/run_pipeline.py --standard PCI-DSS --mode streaming --no-collection
```

**Development mode (saves intermediate files for debugging):**
```bash
uv run python scripts/run_pipeline.py --standard PCI-DSS --mode checkpointing
```

### Pipeline Modes

- **Streaming Mode** (`--mode streaming`): Production mode - no intermediate files saved, faster execution
- **Checkpointing Mode** (`--mode checkpointing`): Development mode - saves intermediate files at each stage for debugging and resuming

### Available Standards

- `PCI-DSS`, `CIS`, `HIPAA`, `SOC2`, `NIST-800-53`, `ISO-27001`, `GDPR`, `FedRAMP`, `CCPA`, `SOX`, `GLBA`
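
For orientation, the command-line options used above could be parsed along these lines. This is a sketch of one possible `argparse` setup, not the actual contents of `scripts/run_pipeline.py`. Treating `--mode` as required and `--no-collection` as a boolean switch are assumptions, and `build_parser` and `standards_to_run` are hypothetical names.

```python
import argparse

# Standards listed above.
STANDARDS = [
    "PCI-DSS", "CIS", "HIPAA", "SOC2", "NIST-800-53", "ISO-27001",
    "GDPR", "FedRAMP", "CCPA", "SOX", "GLBA",
]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Run the compliance data pipeline (sketch).")
    # Omitting --standard means "process every standard".
    parser.add_argument("--standard", choices=STANDARDS, default=None)
    # Assumption: the mode must always be given explicitly.
    parser.add_argument("--mode", choices=["streaming", "checkpointing"], required=True)
    # Assumption: --no-collection is a plain boolean switch.
    parser.add_argument("--no-collection", action="store_true")
    return parser


def standards_to_run(args: argparse.Namespace) -> list[str]:
    """Return the single requested standard, or all of them when none is given."""
    return [args.standard] if args.standard else list(STANDARDS)
```

Restricting `--standard` to a fixed list of choices makes a typo such as `PCI_DSS` fail immediately instead of partway through a run.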

### Pipeline Stages

1. **Collection** - Scrapes URLs/web pages (Crawl4AI) and processes PDFs (Docling)
2. **Processing** - Standardizes raw data into `ComplianceControl` models
3. **Embedding** - Generates vector embeddings using OpenAI API
4. **Loading** - Stores documents in MongoDB and vectors in Pinecone

For detailed pipeline documentation, see [data-pipelines/HOW_TO_RUN_PIPELINE.md](data-pipelines/HOW_TO_RUN_PIPELINE.md).
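
The four stages and the two modes can be illustrated with a toy, dependency-free sketch. Everything here is a stand-in: the `ComplianceControl` fields are hypothetical, `collect` returns a hard-coded placeholder record instead of scraping, `embed` produces a two-number vector instead of calling OpenAI, and a single dictionary replaces both MongoDB and Pinecone.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Optional


@dataclass
class ComplianceControl:
    # Hypothetical fields; the real model lives in data-pipelines/models/.
    control_id: str
    standard: str
    text: str


def collect(standard: str) -> list[dict]:
    """Stage 1 stand-in: return a placeholder record instead of scraping or parsing PDFs."""
    return [{"id": "1.1", "body": "  Restrict inbound traffic.  "}]


def process(standard: str, raw: list[dict]) -> list[ComplianceControl]:
    """Stage 2: normalize raw records into ComplianceControl objects."""
    return [ComplianceControl(r["id"], standard, r["body"].strip()) for r in raw]


def embed(controls: list[ComplianceControl]) -> list[list[float]]:
    """Stage 3 stand-in: a toy two-number vector instead of an OpenAI embedding."""
    return [[float(len(c.text)), float(len(c.text.split()))] for c in controls]


def load(controls: list[ComplianceControl], vectors: list[list[float]], store: dict) -> None:
    """Stage 4 stand-in: one dict replaces both MongoDB and Pinecone."""
    for control, vector in zip(controls, vectors):
        store[control.control_id] = {"doc": asdict(control), "vector": vector}


def run(standard: str, mode: str, store: dict, checkpoint_dir: Optional[Path] = None) -> None:
    raw = collect(standard)
    controls = process(standard, raw)
    if mode == "checkpointing" and checkpoint_dir is not None:
        # Checkpointing mode writes intermediate output; streaming mode skips this.
        checkpoint_dir.mkdir(parents=True, exist_ok=True)
        path = checkpoint_dir / f"{standard}_processed.json"
        path.write_text(json.dumps([asdict(c) for c in controls]))
    load(controls, embed(controls), store)
```

In this sketch the only difference between the modes is the optional write of processed records to disk, which is what makes a later rerun able to inspect or reuse them.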

## Development

### Running Tests

```bash
# Run all tests
uv run pytest

# Run with coverage
uv run pytest --cov

# Run specific test file
uv run pytest tests/api/test_messages.py
```

### Code Quality

```bash
# Format code with black
uv run black .

# Lint with ruff
uv run ruff check .

# Type check with mypy
uv run mypy .
```

### Adding Dependencies

```bash
# Add a runtime dependency
uv add <package-name>

# Add a development dependency
uv add --group dev <package-name>
```

## Docker Deployment

```bash
# Start all services
docker-compose up -d

# View logs
docker-compose logs -f

# Stop services
docker-compose down
```

## API Endpoints

### Current
- `GET /health` - Health check
- `GET /v1/usage` - Get usage statistics

### Planned (Week 6)
- `GET /v1/compliance` - Get compliance requirements
- `GET /v1/pricing` - Calculate infrastructure costs
- `GET /v1/code-examples` - Search code examples
- `GET /v1/best-practices` - Search best practices

## OpenAPI Specification

The REST API exposes an OpenAPI specification:
- **OpenAPI spec**: http://localhost:8000/openapi.json
- **Interactive docs**: http://localhost:8000/docs

SDKs can be generated from the OpenAPI spec using `openapi-generator` if needed.
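
As a lighter alternative to generating an SDK, the spec can be inspected directly. This sketch lists the operations declared in an OpenAPI document; `list_operations` and `fetch_spec` are hypothetical helper names, and the default URL assumes the local development server.

```python
import json
import urllib.request

HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def list_operations(spec: dict) -> list[str]:
    """Return sorted 'METHOD /path' strings from an OpenAPI document."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        for method in item:
            if method.lower() in HTTP_METHODS:  # skip keys such as "parameters"
                operations.append(f"{method.upper()} {path}")
    return sorted(operations)


def fetch_spec(url: str = "http://localhost:8000/openapi.json") -> dict:
    """Download and decode the OpenAPI document from a running server."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)
```

With the API server running, `list_operations(fetch_spec())` gives a quick view of which endpoints are currently implemented.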

## MCP Tools

The MCP server provides these tools:
- `get_compliance_requirements` - Get compliance controls for resources
- `calculate_infrastructure_cost` - Calculate costs for infrastructure
- `get_code_examples` - Get relevant code examples
- `search_best_practices` - Search DevOps best practices
- `check_compliance_violations` - Check code for compliance issues
- `suggest_cost_optimizations` - Suggest cost savings
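
MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a message for illustration. MCP-aware clients such as Claude Desktop or Cursor construct these automatically, and any argument names passed in (for example `standards`) are hypothetical, since the tools' input schemas are not documented here.

```python
import json
from itertools import count

# Monotonic request IDs, as JSON-RPC expects each request to carry its own ID.
_request_ids = count(1)


def build_tool_call(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)
```

For example, `build_tool_call("get_compliance_requirements", {"standards": ["PCI-DSS"]})` yields the JSON text a client would send over the MCP transport.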

## Documentation

- [MCP Architecture Guide](mcp-doc.md) - Complete MCP implementation guide
- [MongoDB Setup](mongodb.md) - MongoDB configuration and schema
- [Data Pipeline](data-pipeline.md) - Data collection and processing
- [Architecture Review](ARCHITECTURE_REVIEW.md) - Migration details
- [Google Antigravity Setup](wistx_mcp/docs/ANTIGRAVITY_SETUP.md) - Antigravity IDE integration guide

## License

MIT License

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.