Metadata-Version: 2.4
Name: pb-dolphin
Version: 0.1.9
Summary: Full-stack AI enablement platform
Author-email: "Plastic Beach, LLC" <taylor@plasticbeach.email>, tdc93 <taylor@plasticbeach.email>
Maintainer-email: "Plastic Beach, LLC" <taylor@plasticbeach.email>, tdc93 <taylor@plasticbeach.email>
License: MIT
Keywords: ai,knowledge-base,ml,nlp
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Requires-Python: >=3.12
Requires-Dist: docker>=7.1.0
Requires-Dist: fastapi
Requires-Dist: lancedb
Requires-Dist: markdown-it-py
Requires-Dist: mcp-server-git>=0.0.1
Requires-Dist: mcp-server-time>=0.1.0
Requires-Dist: mcpo>=0.0.19
Requires-Dist: openai
Requires-Dist: pathspec
Requires-Dist: pydantic
Requires-Dist: python-dotenv
Requires-Dist: pyyaml
Requires-Dist: requests>=2.32.4
Requires-Dist: sentence-transformers>=2.3.0
Requires-Dist: sqlite-utils
Requires-Dist: sqlmodel
Requires-Dist: tiktoken
Requires-Dist: tomli; python_full_version < '3.11'
Requires-Dist: torch>=2.2.0
Requires-Dist: tree-sitter-javascript>=0.25.0
Requires-Dist: tree-sitter-python>=0.25.0
Requires-Dist: tree-sitter>=0.25.0
Requires-Dist: typer
Requires-Dist: uvicorn
Provides-Extra: dev
Requires-Dist: black>=23.0.0; extra == 'dev'
Requires-Dist: isort>=5.12.0; extra == 'dev'
Requires-Dist: mypy>=1.5.0; extra == 'dev'
Requires-Dist: pre-commit>=3.4.0; extra == 'dev'
Requires-Dist: ruff>=0.1.0; extra == 'dev'
Provides-Extra: test
Requires-Dist: fakeredis>=2.18.0; extra == 'test'
Requires-Dist: freezegun>=1.2.0; extra == 'test'
Requires-Dist: httpx>=0.25.0; extra == 'test'
Requires-Dist: pytest-asyncio>=0.21.0; extra == 'test'
Requires-Dist: pytest-cov>=4.1.0; extra == 'test'
Requires-Dist: pytest-mock>=3.11.0; extra == 'test'
Requires-Dist: pytest-xdist>=3.3.0; extra == 'test'
Requires-Dist: pytest>=7.4.0; extra == 'test'
Requires-Dist: responses>=0.24.0; extra == 'test'
Description-Content-Type: text/markdown

# 🐬 dolphin

**⚠️ EXPERIMENTAL - This is a developmental library under active development. APIs and interfaces are unstable and subject to change without notice.**

A semantic code search and knowledge management system with AI-native interfaces (MCP, REST API, CLI).

## Quick Start

### Installation

```bash
# Install from PyPI
uvx install pb-dolphin

# ⚠️ IMPORTANT: Ensure OPENAI_API_KEY is set as env var
```

### Basic Usage

```bash
# Initialize global knowledge store and index a repository
dolphin init
dolphin add-repo my-project /path/to/project
dolphin index my-project

# Search your indexed code
dolphin search "authentication logic"

# Start API server
dolphin serve
```

## Core Commands

- `dolphin init` - Initialize configuration (auto-creates `~/.dolphin/config.toml`)
- `dolphin init --repo` - Create repo-specific config in current directory
- `dolphin add-repo <name> <path>` - Register a repository for indexing
- `dolphin index <name>` - Index a repository with language-aware chunking
- `dolphin search <query>` - Search indexed code semantically
- `dolphin serve` - Start REST API server (port 7777)
- `dolphin config --show` - Display current configuration

## Architecture

### High-Level Overview

```
┌──────────────────────────────────────────┐
│   AI Interfaces (Claude, Continue, etc)  │
└──────────────┬───────────────────────────┘
               │ MCP Protocol
               ▼
┌──────────────────────────────────────────┐
│          Dolphin Knowledge Base          │
│  ┌─────────────┐    ┌────────────────┐  │
│  │ MCP Bridge  │◄──►│ REST API       │  │
│  │ (TypeScript)│    │ (Python/FastAPI)│  │
│  └─────────────┘    └────────┬────────┘  │
└──────────────────────────────┼───────────┘
                               │
               ┌───────────────┴────────────┐
               ▼                            ▼
          ┌─────────┐                ┌──────────┐
          │LanceDB  │                │ SQLite   │
          │(Vectors)│                │(Metadata)│
          └─────────┘                └──────────┘
```

### Key Features

- **Language-Aware Chunking** - Intelligent code parsing for Python, TypeScript, JavaScript, Markdown
- **Semantic Search** - OpenAI embeddings with LanceDB vector storage
- **MCP Support** - Native Model Context Protocol integration for Claude Desktop
- **REST API** - FastAPI server with search, retrieval, and metadata endpoints
- **Unified CLI** - Single `dolphin` command for all operations
- **Auto-Configuration** - Smart config hierarchy (repo → user → defaults)

## Environment Variables

Dolphin requires the following environment variables depending on your usage:

### Required for OpenAI Embeddings

```bash
# Required when using OpenAI embeddings (recommended for production)
export OPENAI_API_KEY="sk-your-openai-api-key-here"
```

### Getting Your OpenAI API Key

1. Visit [OpenAI Platform](https://platform.openai.com/)
2. Sign up or log in to your account
3. Navigate to [API Keys](https://platform.openai.com/api-keys)
4. Click "Create new secret key"
5. Copy the key and set it as `OPENAI_API_KEY`

## Configuration

Dolphin uses a multi-level configuration system:

1. **Repo-specific** (`./.dolphin/config.toml`) - Per-repository chunking settings
2. **User-global** (`~/.dolphin/config.toml`) - Auto-created on first use

### Example Config

```toml
# ~/.dolphin/config.toml
default_embed_model = "large"  # or "small"

[embedding]
provider = "openai"
batch_size = 100

[retrieval]
top_k = 8
score_cutoff = 0.15
```

## Claude Desktop Integration (MCP)

Add to your `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "dolphin": {
      "command": "bun",
      "args": ["run", "/path/to/dolphin/mcp-bridge/src/index.ts"],
      "env": {
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}
```

Start the server: `dolphin serve`

Available MCP tools: `search_knowledge`, `fetch_chunk`, `fetch_lines`, `get_vector_store_info`

## REST API

```bash
# Start server
dolphin serve

# Search
curl -X POST http://127.0.0.1:7777/v1/search \
  -H "Content-Type: application/json" \
  -d '{"query": "authentication", "top_k": 5}'

# List repositories
curl http://127.0.0.1:7777/v1/repos

# Health check
curl http://127.0.0.1:7777/v1/health
```

## Development Status

**Current**: Pre-alpha (0.1.x)

- ✅ Core indexing and search pipeline
- ✅ Language-aware chunking (Python, TS, JS, Markdown)
- ✅ REST API with MCP bridge
- ⚠️ Developmental stage

**Upcoming**:
- Performance optimization
- Production hardening
- Evaluation framework
- Expanded language support

## Requirements

- Python ≥3.12
- OpenAI API key (for embeddings)
- Bun (for MCP bridge)
- Git (for repository scanning)

## Testing

```bash
# Run all tests
uv run pytest

# Run specific test suite
uv run pytest tests/unit/
uv run pytest tests/integration/
```

## License

MIT License

## Acknowledgments

Built with [LanceDB](https://lancedb.com/), [OpenAI](https://openai.com/), [FastAPI](https://fastapi.tiangolo.com/), and [Bun](https://bun.sh/)

---

**⚠️ Remember**: This is experimental software under active development. Use at your own risk.
