ryzenai-lite
============

ryzenai-lite is a minimal ONNX-based inference helper library. Its public API
loads ONNX models and runs inference on CPU or GPU (DirectML, CUDA, or ROCm,
when available). The small surface area is intentional: it keeps the library
easy to plug into small projects and demos.
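
The "when available" fallback can be pictured as filtering a preference list against the execution providers that ONNX Runtime reports. The sketch below is only an illustration, not the library's actual code: `resolve_providers` is a hypothetical helper, and every short device name other than `dml` is an assumption. The provider strings themselves are the standard ONNX Runtime names.

```python
from typing import List, Optional

# Short device names mapped to ONNX Runtime execution provider names.
# Only "dml" appears in this README; the other keys are assumptions.
PROVIDERS = {
    "dml": "DmlExecutionProvider",
    "cuda": "CUDAExecutionProvider",
    "rocm": "ROCMExecutionProvider",
    "cpu": "CPUExecutionProvider",
}
CPU = PROVIDERS["cpu"]


def resolve_providers(preference: Optional[List[str]], available: List[str]) -> List[str]:
    """Return the preferred providers that are available, always ending with CPU."""
    chosen: List[str] = []
    for name in preference or []:
        provider = PROVIDERS.get(name.lower())
        # Skip unknown names, unavailable providers, duplicates, and CPU (added last).
        if provider and provider != CPU and provider in available and provider not in chosen:
            chosen.append(provider)
    # CPU is kept as the last resort so inference still runs without a GPU.
    chosen.append(CPU)
    return chosen


# In practice `available` could come from onnxruntime.get_available_providers().
available = ["DmlExecutionProvider", "CPUExecutionProvider"]
print(resolve_providers(["cuda", "dml"], available))
# → ['DmlExecutionProvider', 'CPUExecutionProvider']
```

Keeping CPU as the final entry means a machine without a supported GPU still gets a working session instead of an error.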

Quick start
-----------

1. Inspect model inputs:

```powershell
python example/inspect_model.py
```

2. Run a short greedy generation with a small language model (SLM), preferring GPU (DirectML) and falling back to CPU:

```powershell
python example/slm_run.py --model example/models/bert-tiny.onnx --prompt "Once upon a time" --max-new-tokens 16 --device dml
```

3. Run the GPU test to verify that DirectML is used:

```powershell
python example/gpu_simple_test.py
```
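
For readers unfamiliar with the "greedy" mode mentioned in step 2, the idea is to append the highest-scoring token at each step until the token budget is spent. The following is a minimal sketch of that loop, not the code in `example/slm_run.py`: `greedy_generate`, `toy_logits`, and the `eos_id` parameter are hypothetical names, and the toy scoring function stands in for a real model call.

```python
from typing import Callable, List, Optional, Sequence


def greedy_generate(
    next_logits: Callable[[List[int]], Sequence[float]],
    prompt_ids: List[int],
    max_new_tokens: int,
    eos_id: Optional[int] = None,
) -> List[int]:
    """Append the highest-scoring token at each step, up to max_new_tokens."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = next_logits(ids)
        # Greedy choice: index of the largest logit.
        token = max(range(len(logits)), key=lambda i: logits[i])
        ids.append(token)
        # Stop early if an end-of-sequence token was produced.
        if eos_id is not None and token == eos_id:
            break
    return ids


def toy_logits(ids: List[int]) -> List[float]:
    """Toy stand-in for a model: favours (last token + 1) modulo a vocabulary of 5."""
    target = (ids[-1] + 1) % 5
    return [1.0 if i == target else 0.0 for i in range(5)]


print(greedy_generate(toy_logits, [0], max_new_tokens=4))
# → [0, 1, 2, 3, 4]
```

Here `max_new_tokens` plays the same role as the `--max-new-tokens` flag: it bounds how many tokens are appended after the prompt.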

Public API
----------

- `load_model(model_path: str, device_preference: Optional[List[str]] = None) -> Model`
- `Model`: class with methods `warmup`, `run`, `run_once`, `dispose`
- `device_info() -> dict`: available providers and platform
- `benchmark_model(...)`: benchmarking helper
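
Because `Model` exposes a `dispose` method, one way to use it is to wrap loading in a context manager so that cleanup happens even if inference raises. This is a sketch under assumptions, not part of the library: `managed_model` and `StubModel` are hypothetical, `warmup` is assumed to take no arguments, and `dispose` is assumed to release the underlying session. In real use, the `load_model` function listed above would be passed as `loader`.

```python
from contextlib import contextmanager
from typing import Any, Callable, Iterator, List, Optional


@contextmanager
def managed_model(
    loader: Callable[[str, Optional[List[str]]], Any],
    model_path: str,
    device_preference: Optional[List[str]] = None,
) -> Iterator[Any]:
    """Load a model, hand it to the caller, and always dispose of it afterwards."""
    model = loader(model_path, device_preference)
    try:
        yield model
    finally:
        model.dispose()


class StubModel:
    """Hypothetical stand-in so the sketch runs without ryzenai-lite installed."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def warmup(self) -> None:
        self.calls.append("warmup")

    def dispose(self) -> None:
        self.calls.append("dispose")


stub = StubModel()
# The lambda plays the role of load_model(model_path, device_preference).
with managed_model(lambda path, prefs: stub, "example/models/bert-tiny.onnx", ["dml"]) as model:
    model.warmup()
print(stub.calls)
# → ['warmup', 'dispose']
```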

See `DOCUMENTATION.md` for full details, examples, and troubleshooting tips.

License
-------

MIT

Generated on: 2025-11-01
