Metadata-Version: 2.4
Name: pybandits
Version: 4.0.15
Summary: Python Multi-Armed Bandit Library
License: MIT
License-File: LICENSE
Keywords: multi-armed bandits,reinforcement-learning,optimization
Author: Dario d'Andrea
Author-email: dariod@playtika.com
Requires-Python: >=3.9,<3.13
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Dist: bokeh (>=3.1,<4.0)
Requires-Dist: loguru (>=0.6,<0.7)
Requires-Dist: numpy (>=1.25) ; python_version >= "3.9" and python_version < "3.12"
Requires-Dist: numpy (>=1.26) ; python_version == "3.12"
Requires-Dist: optuna (>=3.6,<4.0)
Requires-Dist: pydantic (>=1.10,<3)
Requires-Dist: pymc (>=5.10,<6.0) ; python_version >= "3.9"
Requires-Dist: pymc (>=5.3,<6.0) ; python_version == "3.8"
Requires-Dist: scikit-learn (>=1.1,<2.0)
Requires-Dist: scipy (>=1.11,<1.13) ; python_version == "3.12"
Requires-Dist: scipy (>=1.9,<1.13) ; python_version >= "3.8" and python_version < "3.12"
Project-URL: Homepage, https://github.com/PlaytikaOSS/pybandits
Project-URL: Repository, https://github.com/PlaytikaOSS/pybandits
Description-Content-Type: text/markdown

PyBandits
=========

![GitHub Actions Workflow Status](https://img.shields.io/github/actions/workflow/status/PlaytikaOSS/pybandits/continuous_integration.yml)
![PyPI - Version](https://img.shields.io/pypi/v/pybandits)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/pybandits)
![MIT License](https://img.shields.io/badge/license-MIT-blue)
![Ask DeepWiki](https://deepwiki.com/badge.svg)
![Coverage](https://codecov.io/gh/PlaytikaOSS/pybandits/branch/develop/graph/badge.svg)

**PyBandits** is a `Python` library for Multi-Armed Bandits. It provides implementations of the stochastic Multi-Armed Bandit (sMAB) and the contextual Multi-Armed Bandit (cMAB) based on Thompson Sampling.

For the sMAB, we implemented a Bernoulli multi-armed bandit based on the Thompson Sampling algorithm [Agrawal and Goyal, 2012](http://proceedings.mlr.press/v23/agrawal12/agrawal12.pdf). If context information is available, we provide a generalisation of Thompson Sampling for the cMAB [Agrawal and Goyal, 2014](https://arxiv.org/pdf/1209.3352.pdf), implemented with [PyMC](https://peerj.com/articles/cs-55/), an open-source probabilistic programming framework for automatic Bayesian inference on user-defined probabilistic models.
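To build intuition for the algorithm, here is a minimal, pure-Python sketch of Bernoulli Thompson Sampling (this is an illustration of the idea only, not the PyBandits API): each arm keeps a Beta posterior over its success probability, a sample is drawn from each posterior, and the arm with the highest sample is pulled.

~~~python
import random

def thompson_sampling(true_probs, n_rounds, seed=0):
    """Illustrative Bernoulli Thompson Sampling loop (not the PyBandits API)."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    # Beta(1, 1) uniform prior for each arm: alpha = successes + 1, beta = failures + 1
    alpha = [1] * n_arms
    beta = [1] * n_arms
    pulls = [0] * n_arms
    for _ in range(n_rounds):
        # sample an estimated success probability for each arm from its posterior
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(n_arms)]
        # pull the arm with the highest sampled probability
        arm = max(range(n_arms), key=samples.__getitem__)
        # observe a Bernoulli reward from the (simulated) environment
        reward = 1 if rng.random() < true_probs[arm] else 0
        # update the pulled arm's posterior with the observed reward
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

# with true success rates 0.2 vs 0.8, the better arm quickly dominates
pulls = thompson_sampling([0.2, 0.8], n_rounds=2000)
~~~

Because sampling from the posterior naturally balances exploration and exploitation, the better arm accumulates the vast majority of pulls without any explicit exploration schedule.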

Installation
------------

This library is distributed on [PyPI](https://pypi.org/project/pybandits/) and can be installed with `pip`.

~~~bash
pip install pybandits
~~~

Following the guidelines of the `pymc` authors, it is highly recommended to install the library in a conda environment with the following commands.

~~~bash
conda install -c conda-forge pymc
pip install pybandits
~~~

The commands above automatically install all the dependencies listed in `pyproject.toml`. Please visit the
[installation](https://playtikaoss.github.io/pybandits/installation.html)
page for more details.

Getting started
---------------

A short example illustrating its use: the sMAB model predicts actions, and the model is then updated with the rewards observed from the environment.

~~~python
import numpy as np
from pybandits.model import Beta
from pybandits.smab import SmabBernoulli

n_samples = 100

# define action model
actions = {
    "a1": Beta(),
    "a2": Beta(),
}

# init stochastic Multi-Armed Bandit model
smab = SmabBernoulli(actions=actions)

# predict actions
pred_actions, _ = smab.predict(n_samples=n_samples)
simulated_rewards = np.random.randint(2, size=n_samples)

# update model
smab.update(actions=pred_actions, rewards=simulated_rewards)
~~~
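Conceptually, the `update` step above adds each observed reward to the pulled action's Beta posterior counts. The bookkeeping can be sketched in a few lines of plain Python (hypothetical helper names, not the PyBandits API):

~~~python
def update_counts(posteriors, actions, rewards):
    """Add each (action, reward) observation to that action's Beta counts.

    posteriors maps action id -> (alpha, beta), where alpha - 1 counts
    successes and beta - 1 counts failures under a Beta(1, 1) prior.
    """
    for action, reward in zip(actions, rewards):
        alpha, beta = posteriors[action]
        posteriors[action] = (alpha + reward, beta + 1 - reward)
    return posteriors

# Beta(1, 1) uniform priors for two actions
posteriors = {"a1": (1, 1), "a2": (1, 1)}
posteriors = update_counts(posteriors, ["a1", "a1", "a2"], [1, 0, 1])
# posteriors["a1"] == (2, 2); posteriors["a2"] == (2, 1)
~~~

As rewards accumulate, each posterior concentrates around the action's empirical success rate, which is what makes the sampled predictions increasingly favour the better actions.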

Documentation
-------------

For more information please read the full
[documentation](https://playtikaoss.github.io/pybandits/pybandits.html)
and
[tutorials](https://playtikaoss.github.io/pybandits/tutorials.html).

You can also explore the codebase on [DeepWiki](https://deepwiki.com/PlaytikaOSS/pybandits).

Info for developers
-------------------

The source code of the project is available on [GitHub](https://github.com/playtikaoss/pybandits).

~~~bash
git clone https://github.com/playtikaoss/pybandits.git
~~~

You can install the library and the dependencies from the source code with one of the following commands:

~~~bash
poetry install                 # install library + dependencies
poetry install --without dev   # install library + dependencies, excluding developer dependencies
~~~

To create the HTML documentation run the following commands:

~~~bash
cd docs/src
make html
~~~

Run tests
---------

Tests can be executed with `pytest` by running the following commands. Make sure the library is installed before
running any tests.

~~~bash
cd tests
pytest -vv                                      # run all tests
pytest -vv test_testmodule.py                   # run all tests within a module
pytest -vv test_testmodule.py -k test_testname  # run a single test
pytest -vv -k 'not time'                        # run all tests except execution-time tests
~~~

License
-------

[MIT License](LICENSE)

