Metadata-Version: 2.4
Name: rodin
Version: 1.9.11
Summary: A comprehensive toolkit for processing and analyzing metabolomics data.
Home-page: https://github.com/BM-Boris/rodin
Author: Boris Minasenko
Author-email: boris.minasenko@emory.edu
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy>=1.21.4
Requires-Dist: pandas>=1.3.4
Requires-Dist: scipy>=1.7.3
Requires-Dist: scikit-learn>=1.0
Requires-Dist: umap-learn>=0.5.1
Requires-Dist: matplotlib>=3.5.0
Requires-Dist: seaborn>=0.11.2
Requires-Dist: statsmodels>=0.13.0
Requires-Dist: tqdm>=4.62.3
Requires-Dist: dash-bio>=0.8.0
Requires-Dist: dash>=2.7.0
Requires-Dist: pickle-mixin>=1.0.2
Requires-Dist: networkx>=2.6
Requires-Dist: plotly>=5.19.0
Requires-Dist: fastcluster
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

## **Rodin: Metabolomics Data Analysis Toolkit**
[![DOI](https://img.shields.io/badge/DOI-10.1093%2Fbioadv%2Fvbaf088-blue.svg)](https://doi.org/10.1093/bioadv/vbaf088)

_Rodin_ is a Python library for processing and analyzing metabolomics and other omics data. It is a class-based toolkit whose methods cover tasks ranging from basic data manipulation to statistical testing, visualization, and metabolic pathway analysis.

Most of its functionality is also available in the web app at https://rodin-meta.com.

### **Features**

- **Efficient Data Handling**: Streamlined manipulation and transformation of metabolomics and other omics data.
- **Robust Statistical Analysis**: Includes ANOVA, t-tests, and more.
- **Machine Learning Methods**: Random forest, logistic regression, and linear regression.
- **Advanced Dimensionality Reduction**: PCA, t-SNE, and UMAP.
- **Interactive Data Visualization**: Scatter plots of reduced dimensions, volcano plots, box plots, and clustergrams.
- **Pathway Analysis**: Features for metabolic pathway analysis.
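
Running a test such as a t-test or ANOVA on every feature produces one p-value per feature, and these are commonly adjusted for multiple testing before interpretation. The sketch below shows the Benjamini-Hochberg adjustment using only the standard library. It is an illustration of the general technique, not Rodin's implementation: this README does not state which adjustment Rodin applies, and the function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Hypothetical helper for illustration; not part of Rodin's API.
    """
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]
    print(benjamini_hochberg(raw))
```

Each adjusted value is the raw p-value scaled by the number of tests divided by its rank, capped at 1 and made monotone across ranks.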

### **Installation**

We recommend installing Rodin in a separate environment for effective dependency management.

#### Prerequisites

- Python (3.10 or higher)

#### Install Rodin
```bash
pip install rodin
```

or install Rodin directly from GitHub:
```bash
pip install git+https://github.com/BM-Boris/rodin.git
```
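
To confirm that the interpreter meets the stated prerequisite and that the package was installed, a short check like the following can help. It is a minimal sketch using only the standard library; the helper names `meets_python_requirement` and `installed_version` are hypothetical and not part of Rodin.

```python
import sys
from importlib import metadata


def meets_python_requirement(version_info=sys.version_info, minimum=(3, 10)):
    """Return True if the given interpreter version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


def installed_version(distribution="rodin"):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python >= 3.10:", meets_python_requirement())
    print("rodin version:", installed_version())
```

If `installed_version()` returns `None`, the package is not visible in the active environment, which usually means a different environment is active than the one used for `pip install`.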

### **Basic Example**

Here's a basic example of using Rodin for data analysis. Comprehensive Jupyter notebook guides can be found in the 'guides' folder.
```python
import rodin

# Assume 'features.csv' and 'class_labels.csv' are your datasets
features_path = 'path/to/features.csv'
classes_path = 'path/to/class_labels.csv'

# Creating an instance of Rodin_Class
rodin_instance = rodin.create(features_path, classes_path)

# Transform the data (imputation, normalization, and log-transformation steps)
rodin_instance.transform()

# Run t-test comparing two groups based on 'age'
rodin_instance.ttest('age')

# Run two-way ANOVA comparing groups based on 'age' and 'region'
rodin_instance.twoway_anova(['age','region'])

# Run logistic and linear regressions to get a p-value for each feature
rodin_instance.sf_lg('sex')
rodin_instance.sf_lr('age')

# Run a random forest classifier and regressor: k-fold validation yields the metrics
# of the trained model, and each feature is assigned an importance score
rodin_instance.rf_class('region')
rodin_instance.rf_regress('age')

# Slice the whole object using pandas-style indexing
rodin_instance = rodin_instance[rodin_instance.features[rodin_instance.features['imp(rf) age']>0]]

# Perform PCA with 2 principal components (UMAP and t-SNE are available as well)
rodin_instance.run_pca(n_components=2)

# Plotting the PCA results
# 'region' column in the 'samples' DataFrame is used for coloring the points
rodin_instance.plot(dr_name='pca', hue='region', title='PCA Plot')
# Volcano Plot (the 'p' and 'effect_size' columns must already exist in rodin_instance.features)
rodin_instance.volcano(p='p_adj(owa) region', effect_size='lfc (New York vs Georgia)', sign_line=0.01)
# Box Plot
rodin_instance.boxplot(rows=[9999,4561], hue='region')
# Clustergram 
rodin_instance.clustergram(hue='sex',standardize='row')

# Pathway analysis
# Replace 'p_value' and 'statistic' with the actual column names in your
# 'features' DataFrame (rodin_instance.features), and set 'mode' to the
# mass spectrometry analysis mode.
rodin_instance.analyze_pathways(pvals='p_value', stats='statistic', mode='positive')
```
An up-to-date guide is available at https://bm-boris.github.io/rodin_guide/basics.html. Test data from the guide can be found at https://github.com/BM-Boris/rodin_guide/tree/main/data.
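
A volcano plot conventionally places an effect size such as a log fold change on the x-axis and the negative base-10 logarithm of the p-value on the y-axis. The sketch below computes those coordinates for a single feature using only the standard library. It is not Rodin's implementation: the function name `volcano_point` is hypothetical, the fold change is assumed to be a ratio of group means on the raw (not log-transformed) scale, and treating a value like `sign_line=0.01` as a p-value cutoff is an assumption.

```python
import math


def volcano_point(mean_a, mean_b, p_value, threshold=0.01):
    """Return (log2 fold change, -log10 p, is_significant) for one feature.

    Hypothetical helper for illustration; assumes positive group means
    on the raw scale and a p-value in the interval (0, 1].
    """
    if mean_a <= 0 or mean_b <= 0:
        raise ValueError("group means must be positive to take a log ratio")
    if not 0 < p_value <= 1:
        raise ValueError("p_value must be in (0, 1]")
    log_fold_change = math.log2(mean_a / mean_b)   # x-axis coordinate
    neg_log_p = -math.log10(p_value)               # y-axis coordinate
    return log_fold_change, neg_log_p, p_value < threshold


if __name__ == "__main__":
    print(volcano_point(8.0, 2.0, 0.001))
```

With a cutoff of 0.01, the horizontal significance line sits at a height of 2 on the y-axis, so features above it have p-values below the cutoff. If the data are already log-transformed, the log fold change is the difference of group means rather than the log of their ratio.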

### **Contact**
For questions, suggestions, or feedback, please contact boris.minasenko@emory.edu

### **Citation**

If you use **Rodin** in your research, please cite the following paper:

Minasenko B, Wang D, Cirillo P, Krigbaum N, Cohn B, Jones DP, Collins JM, Hu X.  
*Rodin: a streamlined metabolomics data analysis and visualization tool.* **Bioinformatics Advances**. 2025; 5(1): vbaf088.  
https://doi.org/10.1093/bioadv/vbaf088

