Metadata-Version: 2.4
Name: chemfunc
Version: 1.0.12
Summary: Useful functions and scripts for working with small molecules.
Project-URL: Homepage, https://github.com/swansonk14/chemfunc
Project-URL: Issues, https://github.com/swansonk14/chemfunc/issues
Author-email: Kyle Swanson <swansonk.14@gmail.com>
Maintainer-email: Kyle Swanson <swansonk.14@gmail.com>
License-Expression: MIT
License-File: LICENSE.txt
Keywords: computational chemistry
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: descriptastorus>=2.7.0.5
Requires-Dist: matplotlib
Requires-Dist: numpy
Requires-Dist: pandas
Requires-Dist: rdkit
Requires-Dist: scikit-learn
Requires-Dist: tqdm>=4.66.3
Requires-Dist: typed-argument-parser>=1.10.1
Description-Content-Type: text/markdown

# Chem Func

[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/chemfunc)](https://badge.fury.io/py/chemfunc)
[![PyPI version](https://badge.fury.io/py/chemfunc.svg)](https://badge.fury.io/py/chemfunc)
[![Downloads](https://pepy.tech/badge/chemfunc)](https://pepy.tech/project/chemfunc)
[![license](https://img.shields.io/github/license/swansonk14/chemfunc.svg)](https://github.com/swansonk14/chemfunc/blob/main/src/LICENSE.txt)

Useful functions and scripts for working with small molecules.

## Installation

Optionally, create a conda environment.
```bash
conda create -y -n chemfunc python=3.13
conda activate chemfunc
```

Install the latest version of Chem Func using pip.
```bash
pip install chemfunc
```

Alternatively, clone the repository and install the local version of the package.
```bash
git clone https://github.com/swansonk14/chemfunc.git
cd chemfunc
pip install -e .
```

**Note:** If you encounter the error `ImportError: libXrender.so.1: cannot open shared object file: No such file or directory`, install the missing library by running `conda install -c conda-forge xorg-libxrender`.


## Features

Chem Func contains a variety of useful functions and scripts for working with small molecules.

Functions can be imported from the `chemfunc` package. For example:
```python
from pathlib import Path
from chemfunc.sdf_to_smiles import sdf_to_smiles

sdf_to_smiles(
    data_path=Path('molecules.sdf'),
    save_path=Path('molecules.csv')
)
```

Most modules can also be run as scripts from the command line using the `chemfunc` command along with the appropriate function name. For example:
```bash
chemfunc sdf_to_smiles \
    --data_path molecules.sdf \
    --save_path molecules.csv
```

To see a list of available scripts, run `chemfunc -h`.

For each script, run `chemfunc <script_name> -h` to see a description of the arguments for that script.


## Contents

Below is a list of the contents of the package.

[`canonicalize_smiles.py`](https://github.com/swansonk14/chemfunc/blob/main/src/chemfunc/canonicalize_smiles.py) (function, script)

Canonicalizes SMILES using RDKit canonicalization and optionally strips salts.

[`chemical_diversity.py`](https://github.com/swansonk14/chemfunc/blob/main/src/chemfunc/chemical_diversity.py) (function, script)

Computes the chemical diversity of a set of molecules in terms of Tanimoto distances.
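
To illustrate the idea of Tanimoto-based diversity, here is a minimal sketch using only the standard library. It is not the code in `chemical_diversity.py`: the function names are hypothetical, fingerprints are represented as sets of on-bit indices, and summarizing diversity as the mean pairwise distance is an assumption.

```python
from itertools import combinations


def tanimoto_distance(a: set[int], b: set[int]) -> float:
    """Return 1 - |a & b| / |a | b| for two sets of on-bit indices."""
    union_size = len(a | b)
    if union_size == 0:
        # Assumption: two empty fingerprints are treated as identical.
        return 0.0
    return 1.0 - len(a & b) / union_size


def mean_pairwise_distance(fingerprints: list[set[int]]) -> float:
    """Average the Tanimoto distance over all unordered pairs of fingerprints."""
    pairs = list(combinations(fingerprints, 2))
    if not pairs:
        raise ValueError("Need at least two fingerprints.")
    return sum(tanimoto_distance(a, b) for a, b in pairs) / len(pairs)


# Toy fingerprints (hypothetical data): each set lists the bits that are on.
fingerprints = [{1, 2, 3}, {2, 3, 4}, {7, 8}]
diversity = mean_pairwise_distance(fingerprints)
print(f"Mean pairwise Tanimoto distance: {diversity:.3f}")
```

A value near 1 means the molecules share few fingerprint bits, while a value near 0 means they are nearly identical.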

[`cluster_molecules.py`](https://github.com/swansonk14/chemfunc/blob/main/src/chemfunc/cluster_molecules.py) (function, script)

Performs k-means clustering to cluster molecules based on Morgan fingerprints.

[`compute_properties.py`](https://github.com/swansonk14/chemfunc/blob/main/src/chemfunc/compute_properties.py) (function, script)

Computes one or more molecular properties for a set of molecules.

[`convert_sdf.py`](https://github.com/swansonk14/chemfunc/blob/main/src/chemfunc/convert_sdf.py) (functions)

Functions to convert SDF files to SMILES or SMARTS. Used by `sdf_to_smiles` and `sdf_to_smarts`.

[`deduplicate_smiles.py`](https://github.com/swansonk14/chemfunc/blob/main/src/chemfunc/deduplicate_smiles.py) (function, script)

Deduplicates a CSV file by SMILES.
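
As a rough sketch of what deduplicating by SMILES involves, the example below keeps the first row for each distinct SMILES string. It is not the code in `deduplicate_smiles.py`: the function name, the `smiles` column name, and the keep-first policy are all assumptions.

```python
import csv
import io


def deduplicate_rows(rows: list[dict], smiles_column: str = "smiles") -> list[dict]:
    """Keep the first row seen for each distinct SMILES string."""
    seen = set()
    unique = []
    for row in rows:
        smiles = row[smiles_column]
        if smiles not in seen:
            seen.add(smiles)
            unique.append(row)
    return unique


# In-memory CSV standing in for a file on disk (hypothetical data).
csv_text = "smiles,activity\nCCO,1\nc1ccccc1,0\nCCO,1\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))
unique_rows = deduplicate_rows(rows)
print([row["smiles"] for row in unique_rows])
```

This sketch compares strings exactly, so two different SMILES strings for the same molecule (for example `CCO` and `OCC`) would both be kept unless the SMILES are canonicalized first.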

[`filter_molecules.py`](https://github.com/swansonk14/chemfunc/blob/main/src/chemfunc/filter_molecules.py) (function, script)

Filters molecules to those with values in a certain range.

[`measure_experimental_reproducibility.py`](https://github.com/swansonk14/chemfunc/blob/main/src/chemfunc/measure_experimental_reproducibility.py) (function, script)

Measures the experimental reproducibility of two biological replicates by using one replicate to predict the other.

[`molecular_fingerprints.py`](https://github.com/swansonk14/chemfunc/blob/main/src/chemfunc/molecular_fingerprints.py) (functions, script)

Contains functions to compute fingerprints for molecules. Parallelized for speed. The function `save_fingerprints` can be used as a script to compute fingerprints from a CSV file and save them as an NPZ file.

[`molecular_properties.py`](https://github.com/swansonk14/chemfunc/blob/main/src/chemfunc/molecular_properties.py) (functions)

Contains functions to compute molecular properties. Parallelized for speed.

[`molecular_similarities.py`](https://github.com/swansonk14/chemfunc/blob/main/src/chemfunc/molecular_similarities.py) (functions)

Contains functions to compute similarities between molecules. Parallelized for speed.

[`nearest_neighbor.py`](https://github.com/swansonk14/chemfunc/blob/main/src/chemfunc/nearest_neighbor.py) (function, script)

Given a dataset of molecules, computes the nearest neighbor molecule in a second dataset using one of several similarity metrics.

[`plot_property_distribution.py`](https://github.com/swansonk14/chemfunc/blob/main/src/chemfunc/plot_property_distribution.py) (function, script)

Plots the distribution of molecular properties of a set of molecules.

[`plot_tsne.py`](https://github.com/swansonk14/chemfunc/blob/main/src/chemfunc/plot_tsne.py) (function, script)

Runs t-SNE on molecular fingerprints from one or more chemical libraries.

[`regression_to_classification.py`](https://github.com/swansonk14/chemfunc/blob/main/src/chemfunc/regression_to_classification.py) (function, script)

Converts regression data to classification data using given thresholds.
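
The thresholding idea can be sketched as follows. This is not the code in `regression_to_classification.py`: the function name is hypothetical, and both the inclusive boundary and the convention that larger values map to higher class indices are assumptions.

```python
from bisect import bisect_right


def to_class(value: float, thresholds: list[float]) -> int:
    """Return the number of thresholds that the value meets or exceeds."""
    # With sorted thresholds [t1, t2], values below t1 map to class 0,
    # values in [t1, t2) map to class 1, and values >= t2 map to class 2.
    return bisect_right(sorted(thresholds), value)


# Hypothetical regression values and thresholds.
values = [0.1, 0.5, 0.9, 2.0]
thresholds = [0.5, 1.0]
labels = [to_class(value, thresholds) for value in values]
print(labels)
```

A single threshold yields binary labels, and each additional threshold adds one more class.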

[`sample_molecules.py`](https://github.com/swansonk14/chemfunc/blob/main/src/chemfunc/sample_molecules.py) (function, script)

Samples molecules from a CSV file, either uniformly at random across the entire dataset or uniformly at random from each cluster within the data.
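
The two sampling modes can be contrasted with a short sketch. It is not the code in `sample_molecules.py`: the function names, arguments, and fixed seed are hypothetical, and molecules are plain SMILES strings with a parallel list of cluster labels.

```python
import random
from collections import defaultdict


def sample_uniform(molecules: list[str], num: int, seed: int = 0) -> list[str]:
    """Sample `num` molecules uniformly at random from the whole dataset."""
    rng = random.Random(seed)
    return rng.sample(molecules, num)


def sample_per_cluster(
    molecules: list[str], clusters: list[int], num_per_cluster: int, seed: int = 0
) -> list[str]:
    """Sample up to `num_per_cluster` molecules uniformly at random from each cluster."""
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for molecule, cluster in zip(molecules, clusters):
        by_cluster[cluster].append(molecule)

    sampled = []
    for cluster in sorted(by_cluster):
        members = by_cluster[cluster]
        # Small clusters contribute all of their members.
        sampled.extend(rng.sample(members, min(num_per_cluster, len(members))))
    return sampled


# Hypothetical data: three molecules in cluster 0 and two in cluster 1.
molecules = ["CCO", "CCN", "CCC", "c1ccccc1", "c1ccncc1"]
clusters = [0, 0, 0, 1, 1]

uniform = sample_uniform(molecules, num=2)
per_cluster = sample_per_cluster(molecules, clusters, num_per_cluster=1)
print(uniform, per_cluster)
```

Sampling per cluster guarantees that every cluster is represented, whereas uniform sampling favors large clusters in proportion to their size.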

[`sdf_to_smarts.py`](https://github.com/swansonk14/chemfunc/blob/main/src/chemfunc/sdf_to_smarts.py) (function, script)

Converts an SDF file to a CSV file with SMARTS.

[`sdf_to_smiles.py`](https://github.com/swansonk14/chemfunc/blob/main/src/chemfunc/sdf_to_smiles.py) (function, script)

Converts an SDF file to a CSV file with SMILES.

[`select_from_clusters.py`](https://github.com/swansonk14/chemfunc/blob/main/src/chemfunc/select_from_clusters.py) (function, script)

Selects the best molecule from each cluster.
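
One way to picture this selection is shown below. It is not the code in `select_from_clusters.py`: the function name and record layout are hypothetical, and "best" is assumed to mean the highest (or, optionally, lowest) value of a single score.

```python
def select_best_per_cluster(
    records: list[tuple[str, int, float]], higher_is_better: bool = True
) -> dict[int, str]:
    """Map each cluster label to the SMILES of its best-scoring molecule."""
    best: dict[int, tuple[str, float]] = {}
    for smiles, cluster, score in records:
        current = best.get(cluster)
        if current is None:
            best[cluster] = (smiles, score)
            continue
        improved = score > current[1] if higher_is_better else score < current[1]
        if improved:
            best[cluster] = (smiles, score)
    return {cluster: smiles for cluster, (smiles, _) in best.items()}


# Hypothetical records: (SMILES, cluster label, score).
records = [
    ("CCO", 0, 0.2),
    ("CCN", 0, 0.9),
    ("c1ccccc1", 1, 0.5),
    ("c1ccncc1", 1, 0.4),
]
selected = select_best_per_cluster(records)
print(selected)
```

Ties are resolved in favor of the molecule that appears first, since a later molecule replaces the current best only when its score is strictly better.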

[`smiles_to_svg.py`](https://github.com/swansonk14/chemfunc/blob/main/src/chemfunc/smiles_to_svg.py) (function, script)

Converts a SMILES string to an SVG image of the molecule.

[`visualize_molecules.py`](https://github.com/swansonk14/chemfunc/blob/main/src/chemfunc/visualize_molecules.py) (function, script)

Converts a file of SMILES to images of molecular structures.

[`visualize_reactions.py`](https://github.com/swansonk14/chemfunc/blob/main/src/chemfunc/visualize_reactions.py) (function, script)

Converts a file of reaction SMARTS to images of chemical reactions.
