Metadata-Version: 2.4
Name: medilinda-ml
Version: 0.1.9
Summary: Machine learning pipeline for predicting the causality of Adverse Drug Reactions (ADRs)
Project-URL: Homepage, https://github.com/kraigochieng/medilinda
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.11
Description-Content-Type: text/markdown
Requires-Dist: build>=1.3.0
Requires-Dist: faker>=37.8.0
Requires-Dist: imbalanced-learn>=0.14.0
Requires-Dist: ipykernel>=6.30.1
Requires-Dist: matplotlib>=3.10.6
Requires-Dist: mlflow>=3.4.0
Requires-Dist: numpy>=2.3.3
Requires-Dist: pandas>=2.3.3
Requires-Dist: pydantic-settings>=2.11.0
Requires-Dist: python-dotenv>=1.1.1
Requires-Dist: scikit-learn>=1.7.2
Requires-Dist: seaborn>=0.13.2
Requires-Dist: shap>=0.48.0
Requires-Dist: twine>=6.2.0

# Medilinda-ML 💊

[![PyPI version](https://badge.fury.io/py/medilinda-ml.svg)](https://badge.fury.io/py/medilinda-ml)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python Version](https://img.shields.io/pypi/pyversions/medilinda-ml)](https://pypi.org/project/medilinda-ml/)

A machine learning pipeline that predicts the causality of Adverse Drug Reactions (ADRs) from patient and medication data. The package provides tools for data preprocessing, feature engineering, model training, and evaluation, and integrates with MLflow for experiment tracking.

## Overview

Medilinda-ML aims to provide a reproducible, easy-to-use system for assessing how likely it is that a suspected drug caused an adverse reaction. The pipeline is built with `scikit-learn` and handles two common challenges in clinical data: missing values and class imbalance, which it addresses with SMOTE oversampling.

## Features

-   **End-to-End Pipeline**: From raw data to a trained model.
-   **Feature Engineering**: Automatically calculates features such as patient BMI and drug administration duration.
-   **Class Imbalance Handling**: Uses SMOTE to oversample under-represented classes so the training set is balanced.
-   **Hyperparameter Tuning**: Uses `RandomizedSearchCV` to search for a well-performing model configuration.
-   **Experiment Tracking**: Integrated with MLflow to log parameters, metrics, and models.
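
The feature engineering and tuning steps listed above can be sketched roughly as follows. This is a minimal example under stated assumptions, not the package's implementation: the column names (`weight_kg`, `height_cm`, `drug_start_date`, `drug_end_date`), the helper `add_engineered_features`, the random forest, and the search space are all hypothetical.

```python
import pandas as pd
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV


def add_engineered_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with BMI and administration duration added.

    Column names are hypothetical, not Medilinda-ML's schema.
    """
    out = df.copy()
    # BMI = weight in kilograms divided by height in metres, squared.
    out["bmi"] = out["weight_kg"] / (out["height_cm"] / 100) ** 2
    # Duration in whole days between the first and last dose.
    start = pd.to_datetime(out["drug_start_date"])
    end = pd.to_datetime(out["drug_end_date"])
    out["administration_days"] = (end - start).dt.days
    return out


patients = pd.DataFrame(
    {
        "weight_kg": [70.0, 54.0],
        "height_cm": [175.0, 160.0],
        "drug_start_date": ["2024-01-01", "2024-03-10"],
        "drug_end_date": ["2024-01-15", "2024-03-12"],
    }
)
features = add_engineered_features(patients)

# Randomized search samples a fixed number of candidates (n_iter)
# from the search space instead of trying every combination.
X, y = make_classification(n_samples=200, n_features=6, random_state=0)
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={
        "n_estimators": randint(20, 60),  # assumed range
        "max_depth": [3, 5, None],        # assumed choices
    },
    n_iter=4,
    cv=3,
    scoring="f1",
    random_state=0,
)
search.fit(X, y)
best_params = search.best_params_
```

Because only `n_iter` candidates are evaluated, the result is the best configuration among those sampled, not necessarily the best in the whole search space.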

## Installation

Install Medilinda-ML directly from PyPI (requires Python 3.11 or later):

```bash
pip install medilinda-ml
```
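
One way to confirm the installation is to query the installed distribution's version from Python's standard library. The helper name `installed_version` is an assumption for this sketch; only the distribution name `medilinda-ml` comes from the package metadata.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# Prints the version string if the package is installed, otherwise None.
print(installed_version("medilinda-ml"))
```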

## License

This project is licensed under the MIT License. See the `LICENSE` file for more details.
