# datawrapper-mcp Project Rules

## ⚠️ Critical: For AI Assistants Using This MCP Server

**THIS MCP SERVER IS THE DATAWRAPPER INTEGRATION**

When users ask to create Datawrapper charts, you MUST:
- ✅ Use the `create_chart` MCP tool from this server
- ✅ Use `get_chart_schema` to explore configuration options
- ✅ Use `update_chart`, `publish_chart`, and other MCP tools as needed

You MUST NOT:
- ❌ Install the `datawrapper` Python package
- ❌ Use the Datawrapper API directly
- ❌ Write `from datawrapper import ...` statements
- ❌ Run `pip install datawrapper`

This server handles ALL Datawrapper operations internally. The MCP tools are the ONLY interface you should use for Datawrapper chart creation and management.

## Project Overview

This is a Model Context Protocol (MCP) server that provides tools for creating Datawrapper charts through AI assistants. The server leverages Pydantic models from the datawrapper library for validation and schema generation.

## Architecture

### Two-Tier Tool Design

The server uses a handler/tool separation pattern:

1. **Handlers** (in `datawrapper_mcp/handlers/`): Contain business logic
   - Return `list[TextContent]` or `list[TextContent | ImageContent]`
   - Handle Datawrapper API interactions
   - Perform Pydantic validation
   - Format responses with structured data

2. **Tools** (in `datawrapper_mcp/server.py`): Thin MCP wrappers
   - Return `Sequence[TextContent | ImageContent]`
   - Pass through handler results directly
   - Wrap exceptions in TextContent for error handling
   - Provide MCP protocol interface
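
The separation can be sketched with stand-in types. The real server imports `TextContent` from the MCP SDK, and `handle_get_chart`/`get_chart_tool` are illustrative names rather than the actual handler signatures:

```python
from dataclasses import dataclass
from typing import Sequence


# Stand-in for mcp.types.TextContent; the real server imports it from the MCP SDK.
@dataclass
class TextContent:
    type: str
    text: str


# Handler layer: business logic, returns a list of content objects.
async def handle_get_chart(chart_id: str) -> list[TextContent]:
    # A real handler would call the Datawrapper factory function here.
    return [TextContent(type="text", text=f"Chart {chart_id} retrieved")]


# Tool layer: thin wrapper that passes handler results through and wraps errors.
async def get_chart_tool(chart_id: str) -> Sequence[TextContent]:
    try:
        return await handle_get_chart(chart_id)
    except Exception as exc:
        return [TextContent(type="text", text=f"Error: {exc}")]
```

The try/except in the tool layer is what guarantees the MCP client always receives content, never an unhandled exception.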

### Key Tools

1. **Chart Creation** (`create_chart`): Full control using Pydantic models
   - Takes data, chart_type, and complete chart_config dict
   - Validates config using Pydantic's model_validate()
   - Allows specification of all chart properties (colors, axes, labels, etc.)
   - Uses Pydantic class instance method: `chart.create(access_token)`

2. **Schema Discovery** (`get_chart_schema`): Explore available options
   - Returns Pydantic JSON schema for a chart type
   - Shows all properties, types, defaults, and descriptions
   - Enables AI to discover chart capabilities dynamically

### Key Design Decisions

- **Pydantic-Only**: Uses ONLY Pydantic class instance methods, never the base Datawrapper client
- **No Deprecated Methods**: Avoids `dw.create_chart()`, `dw.publish_chart()`, etc., which will be deprecated
- **Instance Methods**: Uses `chart.create()`, `chart.publish()`, `chart.update()`, `chart.delete()`
- **Factory Pattern**: Uses `get_chart(chart_id, access_token)` from `datawrapper.chart_factory`
- **Type-Safe**: All configurations validated through Pydantic
- **Flexible Data**: Supports JSON strings, lists of dicts, or dicts of arrays

## File Structure

```
datawrapper_mcp/
├── __init__.py          # Package initialization
├── server.py            # Main MCP server implementation
├── config.py            # Configuration and chart type mappings
├── utils.py             # Utility functions (data conversion, API token)
└── handlers/            # Handler modules for each operation
    ├── __init__.py
    ├── create.py        # Chart creation handler
    ├── update.py        # Chart update handler
    ├── delete.py        # Chart deletion handler
    ├── publish.py       # Chart publishing handler
    ├── export.py        # Chart export handler
    ├── retrieve.py      # Chart retrieval handler
    └── schema.py        # Schema retrieval handler
```

## Dependencies

- `datawrapper>=2.0.7`: Python wrapper for Datawrapper API with Pydantic models
- `mcp[cli]>=1.20.0`: Model Context Protocol SDK
- `pandas>=2.0.0`: Data manipulation for chart data

## Environment Variables

- `DATAWRAPPER_ACCESS_TOKEN`: Required. Get from https://app.datawrapper.de/account/api-tokens

## Supported Chart Types

The server supports 8 chart types from the datawrapper library:
- `bar`: BarChart
- `line`: LineChart
- `area`: AreaChart
- `arrow`: ArrowChart
- `column`: ColumnChart
- `multiple_column`: MultipleColumnChart
- `scatter`: ScatterPlot
- `stacked_bar`: StackedBarChart

## Available Tools

1. **create_chart**: Chart creation with full Pydantic config
2. **get_chart_schema**: Get JSON schema for a chart type
3. **publish_chart**: Publish chart to make it public (only on explicit user request)
4. **get_chart**: Retrieve chart information
5. **update_chart**: Update chart data or configuration (Pydantic-validated)
6. **delete_chart**: Delete a chart
7. **export_chart_png**: Export chart as PNG image (only on explicit user request)

### Tool Usage Guidelines

**Publishing and Exporting**: The `publish_chart` and `export_chart_png` tools should ONLY be used when the user explicitly requests these actions. AI assistants should not automatically publish charts or export PNGs after chart creation unless specifically asked. This prevents unwanted automatic actions and gives users control over when charts are made public or exported.

## MCP Resources

- `datawrapper://chart-types`: Returns schemas for all chart types

## Data Format

The server accepts four data formats:

1. **List of records** (RECOMMENDED):
   ```json
   [{"year": 2020, "value": 100}, {"year": 2021, "value": 150}]
   ```

2. **Dict of arrays**:
   ```json
   {"year": [2020, 2021], "value": [100, 150]}
   ```

3. **JSON string**: String representation of formats 1 or 2

4. **File paths** (only for extremely large datasets):
   ```
   "/path/to/data.csv"
   "/path/to/data.json"
   ```

All formats are converted to pandas DataFrame internally.
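
As a sketch, the three inline formats can all be normalized with pandas. `to_dataframe` below is a simplified illustration, not the server's implementation; the real `json_to_dataframe()` utility additionally handles file paths and produces richer error messages:

```python
import json

import pandas as pd


def to_dataframe(data):
    """Simplified sketch of inline-format normalization; the server's real
    json_to_dataframe() also handles file paths and validation errors."""
    if isinstance(data, str):
        data = json.loads(data)  # format 3: JSON string of format 1 or 2
    return pd.DataFrame(data)    # formats 1 and 2 both work directly


df1 = to_dataframe([{"year": 2020, "value": 100}, {"year": 2021, "value": 150}])
df2 = to_dataframe({"year": [2020, 2021], "value": [100, 150]})
df3 = to_dataframe('[{"year": 2020, "value": 100}, {"year": 2021, "value": 150}]')
```

All three calls produce the same two-row DataFrame with `year` and `value` columns.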

### Important Data Guidelines for AI Assistants

**RECOMMENDED: Pass data inline as lists or dicts**
- Use list of records format for most use cases
- Use dict of arrays format when appropriate
- JSON strings are also supported

**What to pass:**
- Python data structures (lists, dicts) - PREFERRED
- JSON strings representing those structures
- File paths to CSV or JSON files (ONLY for extremely large datasets where inline data is impractical)

**What NOT to pass:**
- Raw CSV strings (e.g., "col1,col2\nval1,val2")
- File objects or file handles

**Handling Extremely Large Datasets:**
Only when inline data would be too large to pass directly:
1. Save the data to a temporary file (CSV or JSON format)
2. Pass the file path directly to create_chart or update_chart
3. The server will read and process the file automatically

Example workflow for extremely large datasets:
```python
# AI assistant saves extremely large dataset to file
import json
with open('/tmp/chart_data.json', 'w') as f:
    json.dump(large_dataset, f)

# Then pass the file path to the MCP tool
# data parameter: "/tmp/chart_data.json"
```

**CSV File Support:**
- CSV files are read directly using pandas.read_csv()
- First row is treated as column headers
- All standard CSV formats are supported

**JSON File Support:**
- JSON files must contain either:
  - List of dicts: `[{"col": val}, ...]`
  - Dict of arrays: `{"col": [vals]}`
- Null values are preserved as NaN in the DataFrame

**Error Handling:**
The `json_to_dataframe()` utility:
- Accepts file paths and reads them automatically
- Detects raw CSV strings and suggests saving them to a file first
- Validates the data structure (non-empty, correct types)
- Supports files with null/None values
- Produces helpful error messages that show examples of correct formats

## Implementation Notes

### Critical Implementation Rules

**NEVER use these deprecated methods:**
- `dw.create_chart()` - Use `chart.create(access_token)` instead
- `dw.publish_chart()` - Use `chart.publish(access_token)` instead
- `dw.add_data()` - Use `chart.data = df` then `chart.update(access_token)` instead
- `dw.update_metadata()` - Use `setattr(chart, key, value)` then `chart.update(access_token)` instead
- `dw.delete_chart()` - Use `chart.delete(access_token)` instead
- `dw.get_chart()` - Use `get_chart(chart_id, access_token)` from datawrapper.chart_factory instead

**Always use Pydantic class instance methods:**
- Create: `chart.create(access_token=api_token)`
- Publish: `chart.publish(access_token=api_token)`
- Update: `chart.update(access_token=api_token)`
- Delete: `chart.delete(access_token=api_token)`
- Retrieve: `get_chart(chart_id, access_token=api_token)` (factory function)

### Return Types and Error Handling

**All tools return `Sequence[TextContent | ImageContent]`:**
- Consistent return type across all tools
- Preserves structured data capabilities (annotations, audience, priority)
- Handlers return sequences, tools pass them through directly
- Exception handling wraps errors in TextContent

**Error Handling:**
- Missing API token raises ValueError with helpful message
- Invalid chart configs return TextContent with validation errors
- All tool calls wrapped in try/except to return TextContent with error messages
- Chart type mapping errors provide clear feedback on supported types

### Update Chart Implementation

The `update_chart` tool strictly enforces Pydantic validation:

1. **Retrieves existing chart** using `get_chart()` factory function (returns correct Pydantic class instance)
2. **Gets chart class directly** from instance using `type(chart)` - no manual type mapping needed
3. **Gets current config** using `chart.model_dump()`
4. **Merges new config** with existing config
5. **Validates through Pydantic** using `model_validate()` on the chart's class
6. **Updates chart attributes** from validated model
7. **Rejects low-level structures** like 'metadata' or 'visualize' - only accepts high-level Pydantic fields

This simplified approach:
- Eliminates manual type mapping dictionary (no exposure of low-level API types like "d3-bars-stacked")
- Uses the chart class directly from the `get_chart()` factory function
- Ensures chatbots cannot bypass the Pydantic abstraction layer and must use the intended API surface (title, intro, byline, source_name, etc.)
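
The merge-and-validate steps can be sketched with plain Pydantic. `Chart` below is a hypothetical stand-in for the library's chart classes, and `extra="forbid"` stands in for the handler's rejection of low-level keys (the real classes and rejection logic may differ):

```python
from pydantic import BaseModel, ConfigDict, ValidationError


# Hypothetical stand-in for a datawrapper chart class; extra="forbid"
# models the handler's rejection of low-level keys like "metadata".
class Chart(BaseModel):
    model_config = ConfigDict(extra="forbid")
    title: str = ""
    intro: str = ""


chart = Chart(title="Old title", intro="Some context")

# Steps 2-5: get the class from the instance, dump, merge, re-validate.
current = chart.model_dump()
merged = {**current, "title": "New title"}
updated = type(chart).model_validate(merged)

# Step 7: a low-level key fails validation instead of passing through.
rejected = False
try:
    type(chart).model_validate({**current, "metadata": {}})
except ValidationError:
    rejected = True
```

Because the class is taken from the instance via `type(chart)`, no chart-type lookup table is needed at update time.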

### Chart Creation Flow

1. Get API token from environment
2. Convert JSON data to pandas DataFrame
3. Get appropriate Pydantic chart class
4. Validate configuration using model_validate()
5. Set data on chart instance: `chart.data = df`
6. Create chart using instance method: `chart.create(access_token)`
7. Return chart ID and URLs using `chart.get_editor_url()`

### Pydantic Integration

- Uses `model_validate()` for config validation
- Uses `model_json_schema()` for schema generation
- Uses instance methods: `.create()`, `.publish()`, `.update()`, `.delete()`
- Uses factory function: `get_chart(chart_id, access_token)` for retrieval
- Uses helper methods: `.get_editor_url()`, `.get_public_url()`
- Provides helpful validation errors from Pydantic

### Type Annotations for Mypy

**Critical for mypy type checking:**
- `CHART_CLASSES` dictionary in `config.py` is annotated as `dict[str, type[Any]]`
- When retrieving chart classes from `CHART_CLASSES`, annotate the variable as `type[Any]`
- This tells mypy that these are class objects (not instances) with Pydantic methods
- Example: `chart_class: type[Any] = CHART_CLASSES[chart_type]`
- Required in: `config.py`, `handlers/schema.py`, `handlers/create.py`, `server.py`
- Without these annotations, mypy infers `ModelMetaclass` which lacks Pydantic method signatures
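
A minimal sketch of the annotation pattern (the chart classes here are stand-ins for the library's models):

```python
from typing import Any

from pydantic import BaseModel


# Stand-ins for the library's chart models.
class BarChart(BaseModel):
    ...


class LineChart(BaseModel):
    ...


# dict[str, type[Any]] keeps mypy from inferring ModelMetaclass.
CHART_CLASSES: dict[str, type[Any]] = {"bar": BarChart, "line": LineChart}

chart_class: type[Any] = CHART_CLASSES["bar"]
chart = chart_class()  # instantiate the selected Pydantic model
```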

**Dependencies for mypy:**
- `pandas-stubs` must be installed for pandas type checking
- Added to `[project.optional-dependencies]` mypy section in `pyproject.toml`
- Install with: `uv pip install pandas-stubs` or `pip install pandas-stubs`

## Testing Considerations

To test the server:
1. Set DATAWRAPPER_ACCESS_TOKEN environment variable
2. Install package: `pip install -e .`
3. Run server: `datawrapper-mcp`
4. Configure MCP client to connect to server
5. Test with AI assistant or MCP inspector

## Future Enhancements

Potential improvements:
- Add more chart types as datawrapper library adds them
- Support for map visualizations
- Batch chart operations
- Chart templates/presets
- Data validation helpers
- Additional export formats (SVG, PDF); PNG export already exists via `export_chart_png`

## Chart Styling Guidelines

When users ask about styling charts (colors, line widths, axes, tooltips, etc.):

1. **Always suggest `get_chart_schema` first**: This tool shows all available styling options, enum values, and defaults for the specific chart type
2. **Reference the documentation**: Point users to https://datawrapper.readthedocs.io/en/latest/ for detailed examples and patterns
3. **Use high-level Pydantic fields**: Never use low-level API structures like 'metadata' or 'visualize'
4. **Common styling patterns**:
   - Colors: `color_category={"column_name": "#hex_color"}`
   - Line styling: `lines=[{"column": "name", "width": "style1", "interpolation": "curved"}]`
   - Axis ranges: `custom_range_y=[min, max]`
   - Grid formatting: `y_grid_format="0"`, `x_grid="on"`
   - Tooltips: `tooltip_number_format="00.00"`
5. **Don't hardcode enum values**: Enum values (like LineWidth.THIN) are defined in the datawrapper library and may change. Always refer users to `get_chart_schema` for current values.

### Styling Workflow for AI Assistants

1. User asks about styling → Suggest `get_chart_schema` to explore options
2. Show relevant schema properties and their types
3. Reference documentation for detailed examples: https://datawrapper.readthedocs.io/en/latest/
4. Provide chart_config dict using high-level Pydantic fields
5. Use `create_chart` or `update_chart` with the validated config
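
Putting these patterns together, a styling config might look like the following. This is a hypothetical example built from the high-level fields listed above; exact field names and enum values should always be confirmed via `get_chart_schema`:

```python
# Hypothetical line-chart config using the high-level fields listed above;
# verify exact field names and enum values with get_chart_schema first.
chart_config = {
    "title": "Average temperature, 2015-2024",
    "color_category": {"temperature": "#c71e1d"},
    "lines": [{"column": "temperature", "interpolation": "curved"}],
    "custom_range_y": [0, 30],
    "y_grid_format": "0",
    "tooltip_number_format": "00.00",
}
```

This dict is then passed as the `chart_config` argument to `create_chart` or `update_chart`, where it is validated by Pydantic.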

## Common Issues

1. **Missing API token**: Set DATAWRAPPER_ACCESS_TOKEN environment variable
2. **Invalid chart config**: Use get_chart_schema to see valid options
3. **Data format errors**: Ensure data is a list of dicts, a dict of arrays, a JSON string, or a file path
4. **Type errors**: Check Pydantic validation messages for required fields
5. **Styling questions**: Always start with get_chart_schema, then refer to documentation
6. **Mypy errors**: Ensure pandas-stubs is installed and type annotations use `type[Any]`
