Metadata-Version: 2.4
Name: training-hub
Version: 0.4.0
Summary: An algorithm-focused interface for common language model training, continual learning, and reinforcement learning techniques
License-Expression: Apache-2.0
Project-URL: homepage, https://ai-innovation.team/
Project-URL: source, https://github.com/Red-Hat-AI-Innovation-Team/training_hub
Project-URL: issues, https://github.com/Red-Hat-AI-Innovation-Team/training_hub/issues
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: setuptools>=80.0
Requires-Dist: packaging>=24.2
Requires-Dist: wheel>=0.43
Requires-Dist: instructlab-training>=0.12.1
Requires-Dist: rhai-innovation-mini-trainer>=0.3.0
Requires-Dist: torch>=2.6.0
Requires-Dist: numba>=0.62.0
Requires-Dist: transformers>=4.57.0
Requires-Dist: datasets>=4.0.0
Requires-Dist: numpy>=1.26.4
Requires-Dist: rich>=14.1.0
Requires-Dist: peft>=0.15
Requires-Dist: pydantic>=2.7.0
Requires-Dist: aiofiles>=23.2.1
Requires-Dist: accelerate>=0.34.2
Requires-Dist: sympy>=1.0
Requires-Dist: networkx>=3.0
Requires-Dist: jinja2>=3.1.0
Requires-Dist: fsspec>=2025.0
Requires-Dist: pandas>=2.2
Requires-Dist: multiprocess>=0.70.16
Requires-Dist: aiohttp>=3.12
Requires-Dist: pyparsing>=3.0
Requires-Dist: regex>=2025.0
Requires-Dist: llvmlite>=0.42
Requires-Dist: filelock>=3.0
Requires-Dist: psutil>=6.0
Requires-Dist: urllib3>=2.4
Requires-Dist: frozenlist>=1.7
Requires-Dist: xxhash>=3.0
Requires-Dist: requests>=2.32.5
Requires-Dist: attr>=0.3.2
Requires-Dist: filelock>=3.19.1
Requires-Dist: mpmath>=1.3.0
Requires-Dist: pytest>=8.0
Provides-Extra: cuda
Requires-Dist: instructlab-training[cuda]>=0.12.1; extra == "cuda"
Requires-Dist: rhai-innovation-mini-trainer[cuda]>=0.3.0; extra == "cuda"
Requires-Dist: flash-attn>=2.8; extra == "cuda"
Requires-Dist: einops>=0.8; extra == "cuda"
Requires-Dist: kernels>=0.9.0; extra == "cuda"
Requires-Dist: bitsandbytes>=0.47.0; extra == "cuda"
Requires-Dist: liger-kernel>=0.5.10; extra == "cuda"
Requires-Dist: mamba-ssm[causal-conv1d]>=2.2.5; extra == "cuda"
Provides-Extra: lora
Requires-Dist: unsloth>=2025.10.11; extra == "lora"
Requires-Dist: trl>=0.18.0; extra == "lora"
Requires-Dist: xformers>=0.0.33.post1; extra == "lora"
Provides-Extra: dev
Requires-Dist: ipykernel; extra == "dev"
Requires-Dist: ipython; extra == "dev"
Dynamic: license-file

# Training Hub

**Training Hub** is an algorithm-focused interface for common LLM training, continual learning, and reinforcement learning techniques developed by the [Red Hat AI Innovation Team](https://ai-innovation.team).

<p align="center">
  <a href="https://pypi.org/project/training-hub/">
    <img src="https://img.shields.io/pypi/v/training-hub?style=for-the-badge" alt="PyPI version">
  </a>
  <a href="https://github.com/Red-Hat-AI-Innovation-Team/training_hub/blob/main/LICENSE">
    <img src="https://img.shields.io/github/license/Red-Hat-AI-Innovation-Team/training_hub?style=for-the-badge" alt="License">
  </a>
  <a href="https://ai-innovation.team/training_hub">
    <img src="https://img.shields.io/badge/📚_Documentation_(WIP)-blue?style=for-the-badge" alt="Documentation (in progress)">
  </a>
</p>

**New to Training Hub?** Read our comprehensive introduction: [Get Started with Language Model Post-Training Using Training Hub](https://developers.redhat.com/articles/2025/11/19/get-started-language-model-post-training-using-training-hub)

## Support Matrix

| Algorithm | InstructLab-Training | RHAI Innovation Mini-Trainer | PEFT | Unsloth | VERL | Status |
|-----------|----------------------|------------------------------|------|---------|------|--------|
| **Supervised Fine-tuning (SFT)** | ✅ | - | - | - | - | Implemented |
| Continual Learning (OSFT) | 🔄 | ✅ | 🔄 | - | - | Implemented |
| **Low-Rank Adaptation (LoRA) + SFT** | - | - | - | ✅ | - | Implemented |
| Direct Preference Optimization (DPO) | - | - | - | - | 🔄 | Planned |
| Group Relative Policy Optimization (GRPO) | - | - | - | - | 🔄 | Planned |

**Legend:**
- ✅ Implemented and tested
- 🔄 Planned for future implementation
- \- Not applicable or not planned

## Implemented Algorithms

### [Supervised Fine-tuning (SFT)](./algorithms/sft)

Fine-tune language models on supervised datasets with support for:
- Single-node and multi-node distributed training
- Configurable training parameters (epochs, batch size, learning rate, etc.)
- InstructLab-Training backend integration

```python
from training_hub import sft

result = sft(
    model_path="Qwen/Qwen2.5-1.5B-Instruct",
    data_path="/path/to/data",
    ckpt_output_dir="/path/to/checkpoints",
    num_epochs=3,
    effective_batch_size=8,
    learning_rate=1e-5,
    max_seq_len=256,
    max_tokens_per_gpu=1024,
)
```

### [Orthogonal Subspace Fine-Tuning (OSFT)](./algorithms/osft)

OSFT allows you to fine-tune models while controlling how much of its
existing behavior to preserve. Currently we have support for:

- Single-node and multi-node distributed training
- Configurable training parameters (epochs, batch size, learning rate, etc.)
- RHAI Innovation Mini-Trainer backend integration

Here's a quick and minimal way to get started with OSFT:

```python
from training_hub import osft

result = osft(
    model_path="/path/to/model",
    data_path="/path/to/data.jsonl", 
    ckpt_output_dir="/path/to/outputs",
    unfreeze_rank_ratio=0.25,
    effective_batch_size=16,
    max_tokens_per_gpu=2048,
    max_seq_len=1024,
    learning_rate=5e-6,
)
```

### [Low-Rank Adaptation (LoRA) + SFT](./algorithms/lora)


Parameter-efficient fine-tuning using LoRA with supervised fine-tuning. Features:
- Memory-efficient training with significantly reduced VRAM requirements
- Single-GPU and multi-GPU distributed training support
- Unsloth backend for 2x faster training and 70% less memory usage
- Support for QLoRA (4-bit quantization) for even lower memory usage
- Compatible with messages and Alpaca dataset formats

```python
from training_hub import lora_sft

result = lora_sft(
    model_path="Qwen/Qwen2.5-1.5B-Instruct",
    data_path="/path/to/data.jsonl",
    ckpt_output_dir="/path/to/outputs",
    lora_r=16,
    lora_alpha=32,
    num_epochs=3,
    learning_rate=2e-4
)
```


## Installation

### Basic Installation

This installs the base package, but doesn't install the CUDA-related dependencies which are required for GPU training.

```bash
pip install training-hub
```

### Development Installation
```bash
git clone https://github.com/Red-Hat-AI-Innovation-Team/training_hub
cd training_hub
pip install -e .
```

**For developers:** See the [Development Guide](./DEVELOPING.md) for detailed instructions on setting up your development environment, running local documentation, and contributing to Training Hub.


### LoRA Support
For LoRA training with optimized dependencies:
```bash
pip install training-hub[lora]
# or for development
pip install -e .[lora]
```

**Note:** The LoRA extras include Unsloth optimizations and PyTorch-optimized xformers for better performance and compatibility.

### CUDA Support
For GPU training with CUDA support:
```bash
pip install training-hub[cuda] --no-build-isolation
# or for development
pip install -e .[cuda] --no-build-isolation
```

**Note:** If you encounter build issues with flash-attn, install the base package first:
```bash
# Install base package (provides torch, packaging, wheel, ninja)
pip install training-hub
# Then install with CUDA extras
pip install training-hub[cuda] --no-build-isolation

# For development installation:
pip install -e . && pip install -e .[cuda] --no-build-isolation
```

If you're using uv, you can use the following commands to install the package:

```bash
# Installs training-hub from PyPI
uv pip install training-hub && uv pip install training-hub[cuda] --no-build-isolation

# For development:
git clone https://github.com/Red-Hat-AI-Innovation-Team/training_hub
cd training_hub
uv pip install -e . && uv pip install -e .[cuda] --no-build-isolation
```

## Getting Started

For comprehensive tutorials, examples, and documentation, see the [examples directory](./examples/).
