Metadata-Version: 2.4
Name: gbrl_gpu
Version: 1.1.1
Summary: Gradient Boosted Trees for RL
Author: Benjamin Fuhrer, Chen Tessler, Gal Dalal
Author-email: Benjamin Fuhrer <bfuhrer@nvidia.com>, Chen Tessler <ctessler@nvidia.com>, Gal Dalal <galal@nvidia.com>
License: NVIDIA Proprietary Software
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: License :: Other/Proprietary License
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
License-File: LICENSES.txt
Requires-Dist: pybind11[global]==2.13.1
Requires-Dist: numpy>=1.21.0
Requires-Dist: cmake>=3.18
Requires-Dist: torch>=1.13.1
Requires-Dist: scikit-learn>=1.5.0
Requires-Dist: scipy>=1.7.0
Requires-Dist: shap>=0.40.0
Requires-Dist: matplotlib>=3.5.0
Dynamic: author
Dynamic: license
Dynamic: license-file

# Gradient Boosting Reinforcement Learning (GBRL)
GBRL is a Python-based Gradient Boosting Trees (GBT) library, similar to popular packages such as [XGBoost](https://xgboost.readthedocs.io/en/stable/), [CatBoost](https://catboost.ai/), but specifically designed and optimized for reinforcement learning (RL). GBRL is implemented in C++/CUDA aimed to seamlessly integrate within popular RL libraries. 

[![License](https://img.shields.io/badge/license-NVIDIA-green.svg)](https://github.com/NVlabs/gbrl/blob/master/LICENSE)
[![PyPI version](https://badge.fury.io/py/gbrl.svg)](https://badge.fury.io/py/gbrl)

## Overview

GBRL adapts the power of Gradient Boosting Trees to the unique challenges of RL environments, including non-stationarity and the absence of predefined targets. The following diagram illustrates how GBRL uses gradient boosting trees in RL:

![GBRL Diagram](https://github.com/NVlabs/gbrl/raw/master/docs/images/gbrl_diagram.png)

GBRL features a shared tree-based structure for policy and value functions, significantly reducing memory and computational overhead, enabling it to tackle complex, high-dimensional RL problems.

## Key Features: 
- GBT Tailored for RL: GBRL adapts the power of Gradient Boosting Trees to the unique challenges of RL environments, including non-stationarity and the absence of predefined targets.
- Optimized Actor-Critic Architecture: GBRL features a shared tree-based structure for policy and value functions. This significantly reduces memory and computational overhead, enabling it to tackle complex, high-dimensional RL problems.
- Hardware Acceleration: GBRL leverages CUDA for hardware-accelerated computation, ensuring efficiency and speed.
- Seamless Integration: GBRL is designed for easy integration with popular RL libraries. We implemented GBT-based actor-critic algorithm implementations (A2C, PPO, and AWR) in stable_baselines3 [GBRL_SB3](https://github.com/NVlabs/gbrl_sb3). 

## Performance

The following results, obtained using the `GBRL_SB3` repository, demonstrate the performance of PPO with GBRL compared to neural-networks across various scenarios and environments:

![PPO GBRL results in stable_baselines3](https://github.com/NVlabs/gbrl/raw/master/docs/images/relative_ppo_performance.png)

## Getting started
### Dependencies
- Python 3.9 or higher

### Installation
GBRL provides pre-compiled binaries for easy installation. Choose **one** of the following options:

**CPU-only installation** (default):  
```pip install gbrl```

**GPU-enabled installation** (requires CUDA 12 runtime libraries):  
```pip install gbrl-gpu```

For further installation details and dependencies see the documentation. 

### Usage Example
For a detailed usage example, see `tutorial.ipynb`

## Current Supported Features
### Tree Fitting
- Greedy (Depth-wise) tree building - (CPU/GPU)  
- Oblivious (Symmetric) tree building - (CPU/GPU)  
- L2 split score - (CPU/GPU)  
- Cosine split score - (CPU/GPU) 
- Uniform based candidate generation - (CPU/GPU)
- Quantile based candidate generation - (CPU/GPU)
- Supervised learning fitting / Multi-iteration fitting - (CPU/GPU)
    - MultiRMSE loss (only)
- Categorical inputs
- Input feature weights - (CPU/GPU)
### GBT Inference
- SGD optimizer - (CPU/GPU)
- ADAM optimizer - (CPU only)
- Control Variates (gradient variance reduction technique) - (CPU only)
- Shared Tree for policy and value function - (CPU/GPU)
- Linear and constant learning rate scheduler - (CPU/GPU only constant)
- Support for up to two different optimizers (e.g, policy/value) - **(CPU/GPU if both are SGD)
- SHAP value calculation

# Documentation 
For comprehensive documentation, visit the [GBRL documentation](https://nvlabs.github.io/gbrl/).

# Citation
``` 
@article{gbrl,
  title={Gradient Boosting Reinforcement Learning},
  author={Benjamin Fuhrer, Chen Tessler, Gal Dalal},
  year={2024},
  eprint={2407.08250},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2407.08250}, 
}
```
# Licenses
Copyright © 2024, NVIDIA Corporation. All rights reserved.

This work is made available under the NVIDIA Source Code License-NC. Click [here](https://github.com/NVlabs/gbrl/blob/master/LICENSE). to view a copy of this license.

