Metadata-Version: 2.4
Name: eval-protocol
Version: 0.2.89
Summary: The official Python SDK for Eval Protocol (EP.) EP is an open protocol that standardizes how developers author evals for large language model (LLM) applications.
Author-email: Fireworks AI <info@fireworks.ai>
License-Expression: MIT
Project-URL: Homepage, https://github.com/fireworks-ai/eval-protocol
Classifier: Programming Language :: Python :: 3
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: requests>=2.25.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: dataclasses-json>=0.5.7
Requires-Dist: uvicorn>=0.15.0
Requires-Dist: python-dotenv>=0.19.0
Requires-Dist: openai>=1.78.1
Requires-Dist: aiosqlite
Requires-Dist: aiohttp
Requires-Dist: mcp>=1.9.2
Requires-Dist: PyYAML>=5.0
Requires-Dist: hydra-core>=1.3.2
Requires-Dist: omegaconf>=2.3.0
Requires-Dist: httpx>=0.24.0
Requires-Dist: anthropic>=0.59.0
Requires-Dist: litellm<1.75.0
Requires-Dist: pytest>=6.0.0
Requires-Dist: pytest-asyncio>=0.21.0
Requires-Dist: peewee>=3.18.2
Requires-Dist: backoff>=2.2.0
Requires-Dist: questionary>=2.0.0
Requires-Dist: toml>=0.10.0
Requires-Dist: loguru>=0.6.0
Requires-Dist: docstring-parser>=0.15
Requires-Dist: rich>=12.0.0
Requires-Dist: psutil>=5.8.0
Requires-Dist: addict>=2.4.0
Requires-Dist: deepdiff>=6.0.0
Requires-Dist: websockets>=15.0.1
Requires-Dist: fastapi>=0.116.1
Provides-Extra: dev
Requires-Dist: build; extra == "dev"
Requires-Dist: twine; extra == "dev"
Requires-Dist: pytest-httpserver; extra == "dev"
Requires-Dist: werkzeug>=2.0.0; extra == "dev"
Requires-Dist: ruff>=0.5.0; extra == "dev"
Requires-Dist: transformers>=4.0.0; extra == "dev"
Requires-Dist: pandas>=1.5.0; extra == "dev"
Requires-Dist: types-setuptools; extra == "dev"
Requires-Dist: types-requests; extra == "dev"
Requires-Dist: types-PyYAML; extra == "dev"
Requires-Dist: types-docker; extra == "dev"
Requires-Dist: versioneer>=0.20; extra == "dev"
Requires-Dist: openai>=1.78.1; extra == "dev"
Requires-Dist: pre-commit; extra == "dev"
Requires-Dist: e2b; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: pytest-xdist; extra == "dev"
Requires-Dist: docker==7.1.0; extra == "dev"
Requires-Dist: ipykernel>=6.30.0; extra == "dev"
Requires-Dist: jupyter>=1.1.1; extra == "dev"
Requires-Dist: pip>=25.1.1; extra == "dev"
Requires-Dist: haikus==0.3.8; extra == "dev"
Requires-Dist: syrupy>=4.0.0; extra == "dev"
Requires-Dist: gymnasium>=1.2.0; extra == "dev"
Provides-Extra: trl
Requires-Dist: torch>=1.9; extra == "trl"
Requires-Dist: trl>=0.7.0; extra == "trl"
Requires-Dist: peft>=0.7.0; extra == "trl"
Requires-Dist: transformers>=4.0.0; extra == "trl"
Requires-Dist: accelerate>=0.28.0; extra == "trl"
Provides-Extra: openevals
Requires-Dist: openevals>=0.1.0; extra == "openevals"
Provides-Extra: fireworks
Requires-Dist: fireworks-ai>=0.19.19; extra == "fireworks"
Provides-Extra: box2d
Requires-Dist: swig; extra == "box2d"
Requires-Dist: gymnasium[box2d]>=0.29.0; extra == "box2d"
Requires-Dist: Pillow; extra == "box2d"
Provides-Extra: langfuse
Requires-Dist: langfuse>=2.0.0; extra == "langfuse"
Provides-Extra: huggingface
Requires-Dist: datasets>=3.0.0; extra == "huggingface"
Requires-Dist: transformers>=4.0.0; extra == "huggingface"
Provides-Extra: langsmith
Requires-Dist: langsmith>=0.1.86; extra == "langsmith"
Provides-Extra: bigquery
Requires-Dist: google-cloud-bigquery>=3.0.0; extra == "bigquery"
Requires-Dist: google-auth>=2.0.0; extra == "bigquery"
Provides-Extra: svgbench
Requires-Dist: selenium>=4.0.0; extra == "svgbench"
Provides-Extra: pydantic
Requires-Dist: pydantic-ai>=1.0.2; extra == "pydantic"
Provides-Extra: supabase
Requires-Dist: supabase>=2.18.1; extra == "supabase"
Provides-Extra: chinook
Requires-Dist: psycopg2-binary>=2.9.10; extra == "chinook"
Provides-Extra: langchain
Requires-Dist: langchain-core>=0.3.0; extra == "langchain"
Provides-Extra: braintrust
Requires-Dist: braintrust[otel]; extra == "braintrust"
Provides-Extra: langgraph
Requires-Dist: langgraph>=0.6.7; extra == "langgraph"
Requires-Dist: langchain-core>=0.3.75; extra == "langgraph"
Provides-Extra: langgraph-tools
Requires-Dist: langgraph>=0.6.7; extra == "langgraph-tools"
Requires-Dist: langchain>=0.3.0; extra == "langgraph-tools"
Requires-Dist: langchain-fireworks>=0.3.0; extra == "langgraph-tools"
Provides-Extra: proxy
Requires-Dist: redis>=5.0.0; extra == "proxy"
Requires-Dist: langfuse>=2.0.0; extra == "proxy"
Requires-Dist: uuid6>=2025.0.0; extra == "proxy"
Dynamic: license-file

# Eval Protocol

[![PyPI - Version](https://img.shields.io/pypi/v/eval-protocol)](https://pypi.org/project/eval-protocol/)
[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/eval-protocol/python-sdk)

**Eval Protocol (EP) is an open solution for doing reinforcement learning fine-tuning on existing agents — across any language, container, or framework.**

![Eval Protocol overview](https://github.com/eval-protocol/python-sdk/raw/main/docs/intro.png)

Most teams already have complex agents running in production — often across remote services with heavy dependencies, Docker containers, or TypeScript backends deployed on Vercel. When they try to train or fine-tune these agents with reinforcement learning, connecting them to a trainer quickly becomes painful.

Eval Protocol makes this possible in two ways:

1. **Expose your agent through a simple API**
   Wrap your existing agent (Python, TypeScript, Docker, etc.) in a simple HTTP service using EP’s rollout interface. EP handles the rollout orchestration, metadata passing, and trace storage automatically.
2. **Connect with any trainer**
   Once your agent speaks the EP standard, it can be fine-tuned or evaluated with any supported trainer — Fireworks RFT, TRL, Unsloth, or your own — with no environment rewrites.

The result: RL that works out-of-the-box for existing production agents.

## Who This Is For

- **Applied AI teams** adding RL to existing production agents.
- **Research engineers** experimenting with fine-tuning complex, multi-turn or tool-using agents.
- **MLOps teams** building reproducible, language-agnostic rollout pipelines.

## Quickstart

- See the Quickstart repository: [eval-protocol/quickstart](https://github.com/eval-protocol/quickstart/tree/main)

## Resources

- **[Documentation](https://evalprotocol.io)** – Guides and API reference
- **[Discord](https://discord.com/channels/1137072072808472616/1400975572405850155)** – Community
- **[GitHub](https://github.com/eval-protocol/python-sdk)** – Source and examples

## License

[MIT](LICENSE)
