# PredictionData Python Client

A Python library for streaming historical market data from the PredictionData API. Access order books, trades, and on-chain fills for Polymarket prediction markets.

## Installation

```bash
pip install predictiondata
```

## Quick Start

```python
import asyncio

from predictiondata import PredictionDataClient, Channel


async def main():
    # Initialize client with your API key
    client = PredictionDataClient(api_key="<YOUR_API_KEY>")

    # Stream historical data
    messages = client.replay(
        exchange="polymarket",
        from_date="2024-11-01",
        to_date="2024-11-15",
        filters=[Channel(name="trades", symbols=["will-trump-win-2024/YES"])]
    )

    # replay() returns an async iterator, so it must be consumed inside a coroutine
    async for exchange_timestamp, message in messages:
        print(f"Time: {exchange_timestamp}ms, Trade: {message}")

    await client.close()


asyncio.run(main())
```

## Features

- **Async streaming** - Efficiently stream large amounts of historical data
- **Multiple data types** - Access order books, trades, and on-chain fills
- **Flexible filtering** - Filter by market slug or token ID
- **Type hints** - Full type annotations for better IDE support

## Data Types

### Order Books

Incremental order book reconstructions with bid/ask prices and sizes.

```python
Channel(name="books", symbols=["will-trump-win-2024/YES"])
```

**Schema:**
- `exchange_timestamp` (int): Exchange timestamp in milliseconds
- `local_timestamp` (int): Server capture timestamp in milliseconds
- `ask_prices` (str): Comma-separated ask prices
- `ask_sizes` (str): Comma-separated ask sizes
- `bid_prices` (str): Comma-separated bid prices
- `bid_sizes` (str): Comma-separated bid sizes
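
Because prices and sizes arrive as comma-separated strings, they usually need to be parsed before use. The following is a minimal sketch, assuming that the n-th price and the n-th size on each side describe the same level; `parse_levels` and the sample message are hypothetical and not part of the library.

```python
from typing import List, Tuple


def parse_levels(prices: str, sizes: str) -> List[Tuple[float, float]]:
    """Pair comma-separated prices and sizes into (price, size) tuples."""
    if not prices and not sizes:
        return []  # an empty side of the book
    price_list = [float(p) for p in prices.split(",")]
    size_list = [float(s) for s in sizes.split(",")]
    if len(price_list) != len(size_list):
        raise ValueError("prices and sizes have different lengths")
    return list(zip(price_list, size_list))


# Hypothetical message shaped like the schema above (values are made up)
message = {
    "bid_prices": "0.52,0.51",
    "bid_sizes": "100,250",
    "ask_prices": "0.54,0.55",
    "ask_sizes": "80,300",
}

bids = parse_levels(message["bid_prices"], message["bid_sizes"])
asks = parse_levels(message["ask_prices"], message["ask_sizes"])
print(bids, asks)
```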

### Trades

Executed trades from the order book.

```python
Channel(name="trades", symbols=["will-trump-win-2024/YES"])
```

**Schema:**
- `exchange_timestamp` (int): Exchange timestamp in milliseconds
- `local_timestamp` (int): Server capture timestamp in milliseconds
- `side` (str): "BUY" or "SELL"
- `size` (float): Trade size
- `price` (float): Trade price
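
As an illustration of working with these fields, here is a small sketch that computes a volume-weighted average price (VWAP) over a batch of trade messages. The `vwap` helper and the sample trades are hypothetical; only the `size` and `price` field names come from the schema above.

```python
from typing import Any, Dict, Iterable


def vwap(trades: Iterable[Dict[str, Any]]) -> float:
    """Volume-weighted average price: sum(size * price) / sum(size)."""
    total_size = 0.0
    total_notional = 0.0
    for trade in trades:
        total_size += trade["size"]
        total_notional += trade["size"] * trade["price"]
    if total_size == 0:
        raise ValueError("cannot compute VWAP without traded size")
    return total_notional / total_size


# Made-up trades shaped like the schema above
sample_trades = [
    {"side": "BUY", "size": 100.0, "price": 0.50},
    {"side": "SELL", "size": 300.0, "price": 0.60},
]

print(round(vwap(sample_trades), 4))
```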

### On-chain Fills

On-chain settlement data from the Polygon blockchain.

```python
Channel(name="onchain_fills", symbols=["will-trump-win-2024/YES"])
```

**Schema:**
- `block_number` (int): Blockchain block number
- `block_timestamp` (int): Block timestamp in milliseconds
- `side` (str): "BUY" or "SELL"
- `size` (float): Fill size
- `price` (float): Fill price
- `maker` (str): Maker address
- `taker` (str): Taker address
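
A short sketch of how fills in this shape might be post-processed: converting the millisecond `block_timestamp` to a UTC `datetime` and summing fill size per maker address. The helper names, addresses, and values are hypothetical and only illustrate the schema above.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Any, Dict, Iterable


def block_time(fill: Dict[str, Any]) -> datetime:
    """Convert the millisecond block_timestamp to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(fill["block_timestamp"] / 1000, tz=timezone.utc)


def size_by_maker(fills: Iterable[Dict[str, Any]]) -> Dict[str, float]:
    """Sum fill sizes per maker address."""
    totals: Dict[str, float] = defaultdict(float)
    for fill in fills:
        totals[fill["maker"]] += fill["size"]
    return dict(totals)


# Made-up fills; the addresses are placeholders, not real accounts
sample_fills = [
    {"block_number": 1, "block_timestamp": 1731628800000, "side": "BUY",
     "size": 10.0, "price": 0.5, "maker": "0xmakerA", "taker": "0xtakerX"},
    {"block_number": 2, "block_timestamp": 1731628802000, "side": "SELL",
     "size": 2.5, "price": 0.6, "maker": "0xmakerB", "taker": "0xtakerX"},
    {"block_number": 3, "block_timestamp": 1731628804000, "side": "BUY",
     "size": 5.0, "price": 0.55, "maker": "0xmakerA", "taker": "0xtakerY"},
]

print(block_time(sample_fills[0]).isoformat())
print(size_by_maker(sample_fills))
```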

## Usage Examples

### Stream Multiple Markets

```python
import asyncio

from predictiondata import PredictionDataClient, Channel

async def main():
    client = PredictionDataClient(api_key="your_api_key")

    messages = client.replay(
        exchange="polymarket",
        from_date="2024-11-01",
        to_date="2024-11-15",
        filters=[
            Channel(name="trades", symbols=[
                "will-trump-win-2024/YES",
                "will-biden-win-2024/YES"
            ])
        ]
    )

    async for exchange_timestamp, message in messages:
        print(f"Market: {message['_symbol']}")
        print(f"Side: {message['side']}, Size: {message['size']}, Price: {message['price']}")

    await client.close()

# Run with asyncio
asyncio.run(main())
```

### Use Token IDs Instead of Slugs

```python
Channel(name="onchain_fills", token_ids=["0x1234567890abcdef..."])
```

### Fetch Single Day

For non-streaming use cases, fetch a complete day of data:

```python
async def fetch_example():
    client = PredictionDataClient(api_key="your_api_key")
    
    data = await client.fetch_day(
        exchange="polymarket",
        data_type="trades",
        identifier="will-trump-win-2024/YES",
        date="2024-11-15"
    )
    
    print(f"Found {len(data)} trades")
    await client.close()
```

### Context Manager

Use the client as an async context manager for automatic cleanup:

```python
async def main():
    async with PredictionDataClient(api_key="your_api_key") as client:
        messages = client.replay(
            exchange="polymarket",
            from_date="2024-11-01",
            to_date="2024-11-15",
            filters=[Channel(name="books", symbols=["btc-above-100k/YES"])]
        )
        
        async for exchange_timestamp, message in messages:
            # Process messages
            pass
```

## API Reference

### PredictionDataClient

Main client class for accessing the PredictionData API.

**Constructor:**
```python
PredictionDataClient(api_key: str, base_url: str = "http://datasets.predictiondata.dev")
```

**Methods:**

- `replay(exchange, from_date, to_date, filters)` - Stream historical data
  - Returns: `AsyncIterator[Tuple[int, Dict[str, Any]]]` (yields exchange_timestamp, message)
  
- `fetch_day(exchange, data_type, identifier, date, use_slug=True)` - Fetch single day
  - Returns: `List[Dict[str, Any]]`
  
- `close()` - Close the client session

### Channel

Represents a data channel filter.

**Constructor:**
```python
Channel(name: str, symbols: List[str] = None, token_ids: List[str] = None)
```

- `name`: Data type - "books", "trades", or "onchain_fills"
- `symbols`: List of market slugs (format: "event-slug/OUTCOME")
- `token_ids`: List of token IDs (alternative to symbols)

## Market Identifiers

Markets can be identified by either:

1. **Slug** (format: `event-slug/OUTCOME`):
   - Example: `will-trump-win-2024/YES`
   - Use for human-readable queries
   
2. **Token ID** (contract address):
   - Example: `0x1234567890abcdef...`
   - Use for programmatic queries
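
When a slug needs to be taken apart, for example to group messages by event, a helper along these lines may be useful. This is a sketch that assumes the outcome is everything after the last `/`; `split_slug` is a hypothetical name, not a library function.

```python
from typing import Tuple


def split_slug(symbol: str) -> Tuple[str, str]:
    """Split 'event-slug/OUTCOME' into (event_slug, outcome)."""
    event_slug, sep, outcome = symbol.rpartition("/")
    if not sep or not event_slug or not outcome:
        raise ValueError(f"expected 'event-slug/OUTCOME', got {symbol!r}")
    return event_slug, outcome


event_slug, outcome = split_slug("will-trump-win-2024/YES")
print(event_slug, outcome)
```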

## Error Handling

Wrap per-message processing in a `try`/`except` so that one malformed message does not abort the whole stream:

```python
async for exchange_timestamp, message in client.replay(...):
    try:
        # Process message
        pass
    except Exception as e:
        print(f"Error processing message: {e}")
```

Missing data files (404 responses) are skipped automatically.
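
The pattern can be tried without network access. The sketch below uses a stand-in async generator (`fake_replay`, hypothetical) in place of `client.replay(...)`, catches only the exceptions a malformed message would plausibly raise, and counts successes and failures. It assumes the real stream yields `(exchange_timestamp, message)` tuples as described in the API reference.

```python
import asyncio
from typing import Any, AsyncIterator, Callable, Dict, Tuple

Message = Tuple[int, Dict[str, Any]]


async def process_stream(
    messages: AsyncIterator[Message],
    handler: Callable[[Dict[str, Any]], None],
) -> Tuple[int, int]:
    """Run handler on every message; return (processed, failed) counts."""
    processed = failed = 0
    async for exchange_timestamp, message in messages:
        try:
            handler(message)
            processed += 1
        except (KeyError, TypeError, ValueError) as exc:
            failed += 1
            print(f"Skipping message at {exchange_timestamp}ms: {exc!r}")
    return processed, failed


async def fake_replay() -> AsyncIterator[Message]:
    """Stand-in for client.replay(...); yields made-up messages."""
    yield 1731628800000, {"side": "BUY", "size": 10.0, "price": 0.5}
    yield 1731628801000, {"side": "SELL"}  # deliberately missing fields


def notional(message: Dict[str, Any]) -> None:
    # Raises KeyError when 'size' or 'price' is absent
    print(message["size"] * message["price"])


counts = asyncio.run(process_stream(fake_replay(), notional))
print(counts)
```

Catching a narrow set of exceptions, rather than a bare `Exception`, keeps genuine bugs in the handler visible.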

## Development

### Install Development Dependencies

```bash
pip install -e ".[dev]"
```

### Run Tests

```bash
pytest
```

### Format Code

```bash
black predictiondata/
```

## License

MIT License

## Support

- Documentation: https://predictiondata.dev/docs
- Issues: https://github.com/predictiondata/predictiondata_client/issues
- Email: support@predictiondata.dev

## Changelog

### 0.1.0 (2025-11-17)

- Initial release
- Support for books, trades, and on-chain fills
- Async streaming API
- Market filtering by slug or token ID

