Metadata-Version: 2.4
Name: checkamg
Version: 0.5.0
Summary: Automated identification and curation of Auxiliary Metabolic Genes (AMGs), Auxiliary Regulatory Genes (AReGs), and Auxiliary Physiology Genes (APGs) in viral genomes.
Author-email: "James C. Kosmopoulos" <kosmopoulos@wisc.edu>
License: GPL-3.0-or-later
Project-URL: Homepage, https://github.com/AnantharamanLab/CheckAMG
Keywords: bioinformatics,metagenomics,viromics,genomics,AMG,phage
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: GNU General Public License v3 or later (GPLv3+)
Classifier: Operating System :: POSIX :: Linux
Classifier: Operating System :: MacOS
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Requires-Python: <3.13,>=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: joblib>=1.5.1
Requires-Dist: lightgbm>=4.5.0
Requires-Dist: metapyrodigal>=1.4.1
Requires-Dist: numba>=0.61.2
Requires-Dist: numpy<2.3,>=1.24
Requires-Dist: pandas>=2.3.0
Requires-Dist: polars-u64-idx>=1.30.0
Requires-Dist: psutil>=7.0.0
Requires-Dist: pyarrow>=20.0.0
Requires-Dist: pyfastatools==2.5.0
Requires-Dist: pyhmmer==0.11.1
Requires-Dist: pyrodigal>=3.5.2
Requires-Dist: pyrodigal-gv>=0.3.2
Requires-Dist: pyyaml>=6.0
Requires-Dist: requests>=2.32
Requires-Dist: snakemake==8.23.2
Requires-Dist: tqdm>=4.67.1
Dynamic: license-file

# CheckAMG

**Automated curation of Auxiliary Metabolic Genes (AMGs), Auxiliary Regulatory Genes (AReGs), and Auxiliary Physiology Genes (APGs) in viral genomes.**

> ⚠️ **This tool is in active development and has not yet been peer-reviewed.**

## Quick Usage

```bash
checkamg download -d /path/to/db/destination

checkamg annotate \
  -d /path/to/db/destination \
  -g examples/example_data/single_contig_viruses.fasta \
  -vg examples/example_data/multi_contig_vMAGs \
  -o CheckAMG_example_out
```
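For batch runs it can be convenient to assemble the same `annotate` call from Python. The following is a minimal sketch and not part of CheckAMG: `build_annotate_cmd` and `run_annotate` are hypothetical helpers, and only the flags shown above (`-d`, `-g`, `-vg`, `-o`) are used. It assumes that `checkamg` is on your `PATH` and that either input type may be given on its own.

```python
import shlex
import subprocess


def build_annotate_cmd(db_dir, out_dir, genomes=None, vmags=None):
    """Build the argv list for `checkamg annotate` (hypothetical helper).

    Uses only the flags shown in Quick Usage: -d, -g, -vg, -o.
    Assumption: at least one of `genomes` or `vmags` must be supplied.
    """
    if genomes is None and vmags is None:
        raise ValueError("provide genomes, vmags, or both")
    cmd = ["checkamg", "annotate", "-d", str(db_dir)]
    if genomes is not None:
        cmd += ["-g", str(genomes)]    # single-contig viral genomes (FASTA)
    if vmags is not None:
        cmd += ["-vg", str(vmags)]     # directory of multi-contig vMAGs
    cmd += ["-o", str(out_dir)]
    return cmd


def run_annotate(cmd):
    """Run the assembled command; raises CalledProcessError on failure."""
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    cmd = build_annotate_cmd(
        "/path/to/db/destination",
        "CheckAMG_example_out",
        genomes="examples/example_data/single_contig_viruses.fasta",
        vmags="examples/example_data/multi_contig_vMAGs",
    )
    # Print the command for inspection; call run_annotate(cmd) to execute it.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.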

## Features

* Input: nucleotide or protein sequences
* Handles single-contig viral genomes and multi-contig vMAGs
* Functional annotation followed by curation based on viral genome context
* Outputs curated lists and amino-acid sequences of AMGs, AReGs, and APGs

## Command-line Modules

```bash
checkamg -h
```

* `download`: Get required databases
* `annotate`: Predict and curate auxiliary viral genes (AVGs), i.e. AMGs, AReGs, and APGs
* `de-novo`, `aggregate`, `end-to-end`: Not yet available (coming soon)

## Example Output

* FASTA files of predicted AVGs, organized by confidence and function class
* Tabular summary of predictions (`final_results.tsv`, `gene_annotations.tsv`)
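For a first look at the tabular outputs without assuming any column names, a minimal Python sketch follows. `read_tsv` is a hypothetical helper, and the location of `final_results.tsv` directly under the output directory is an assumption; adjust the path to wherever your run writes it.

```python
import csv
from pathlib import Path


def read_tsv(path):
    """Return (header, rows) from a tab-separated file (hypothetical helper)."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader, [])      # empty list if the file has no lines
        rows = [row for row in reader]
    return header, rows


if __name__ == "__main__":
    # Assumed location; the actual layout of the output directory may differ.
    results = Path("CheckAMG_example_out") / "final_results.tsv"
    if results.exists():
        header, rows = read_tsv(results)
        print(f"{len(rows)} rows; columns: {', '.join(header)}")
    else:
        print(f"{results} not found; check the output directory layout")
```

The same function can be pointed at `gene_annotations.tsv`; listing the header first is a safe way to discover which columns are available before filtering.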

## License

GPL-3.0-or-later

**Example data and full documentation:**
[https://github.com/AnantharamanLab/CheckAMG](https://github.com/AnantharamanLab/CheckAMG)
