Metadata-Version: 2.4
Name: netmedex
Version: 0.3.0
Summary: A tool to extract BioConcept entities (e.g., genes, diseases, chemicals, and species) from Pubtator3 and generate a co-mention network for interactive use.
Author-email: Zheng-Xiang Ye <r12b48005@ntu.edu.tw>
Classifier: Development Status :: 5 - Production/Stable
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.11
Description-Content-Type: text/markdown
Requires-Dist: requests
Requires-Dist: aiohttp
Requires-Dist: aiometer
Requires-Dist: tenacity
Requires-Dist: tqdm
Requires-Dist: networkx[default]~=3.3
Requires-Dist: lxml
Requires-Dist: python-dotenv
Requires-Dist: dash[diskcache]~=2.17
Requires-Dist: dash-cytoscape~=1.0.2
Requires-Dist: dash-bootstrap_components~=1.7.1
Provides-Extra: dev
Requires-Dist: pytest~=8.3.2; extra == "dev"
Requires-Dist: pytest-xdist; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: requests-mock; extra == "dev"
Requires-Dist: pytest-mock; extra == "dev"
Requires-Dist: ruff==0.8.0; extra == "dev"
Requires-Dist: pre-commit; extra == "dev"
Requires-Dist: mkdocs; extra == "dev"

# NetMedEx

[![Python package](https://img.shields.io/pypi/v/netmedex)](https://pypi.org/project/netmedex/)
[![Doc](https://img.shields.io/badge/Doc-online)](https://yehzx.github.io/NetMedEx/)

NetMedEx is a Python-based tool that extracts BioConcept entities (e.g., genes, diseases, chemicals, and species) from PubTator files generated by [PubTator3](https://www.ncbi.nlm.nih.gov/research/pubtator3/). It calculates the frequency of BioConcept pairs (e.g., gene-gene, gene-chemical, chemical-disease) based on co-mentions in publications and generates a co-mention interaction network. These networks can be viewed in a browser or imported into [Cytoscape](https://cytoscape.org/) for advanced visualization and analysis.
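
To make the idea of co-mention frequency concrete, here is a minimal sketch that counts how many articles mention each pair of entities. It is an illustration only, not NetMedEx's implementation: the function name `count_co_mentions`, the input layout (one list of entity names per article), and the choice to count each pair at most once per article are all assumptions.

```python
from collections import Counter
from itertools import combinations


def count_co_mentions(articles):
    """Count the number of articles that mention each unordered entity pair.

    Hypothetical helper for illustration; not part of the NetMedEx API.
    """
    pair_counts = Counter()
    for entities in articles:
        # Deduplicate and sort so each pair is counted once per article
        # and always appears in the same order.
        unique_entities = sorted(set(entities))
        for first, second in combinations(unique_entities, 2):
            pair_counts[(first, second)] += 1
    return pair_counts


# Made-up annotations: one list of entity names per article.
articles = [
    ["metformin", "PON1", "COVID-19"],
    ["metformin", "COVID-19"],
    ["PON1", "metformin", "metformin"],
]
co_mentions = count_co_mentions(articles)
```

Each key of `co_mentions` corresponds to a candidate edge in the network, and its count is one possible edge weight.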

## Getting Started

NetMedEx offers four ways for users to interact with the tool:

1. [Web Application (via Docker)](#web-application-via-docker)
2. [Web Application (Local)](#web-application-local)
3. [Command-Line Interface (CLI)](#command-line-interface-cli)
4. [Python API](#package-api)

For additional details, refer to the [Documentation](https://yehzx.github.io/NetMedEx/).

## Web Application (via Docker)

If you have [Docker](https://www.docker.com/) installed, run the following command to launch the web application, then open `localhost:8050` in your browser:

```bash
docker run -p 8050:8050 --rm lsbnb/netmedex
```

## Installation

Install NetMedEx from PyPI to use the web application locally or access the CLI:

```bash
pip install netmedex
```

_NetMedEx requires Python 3.11 or later._

## Web Application (Local)

After installing NetMedEx, run the following command and open `localhost:8050` in your browser:

```bash
netmedex run
```

The sidebar parameters are detailed in the [Available Commands](#available-commands) section and [Documentation](https://yehzx.github.io/NetMedEx/).

## Command-Line Interface (CLI)

To generate a network, run `netmedex search` first to retrieve relevant articles and then run `netmedex network` to generate the network.
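
The two-step workflow can also be scripted. The sketch below builds the argument lists for both commands and, if desired, runs them in order with `subprocess`. It assumes `netmedex` is installed and on your `PATH`; the helper names `build_pipeline` and `run_pipeline` and the file names are hypothetical, and only the `-q`, `-i`, `-o`, and `-w` flags documented in this README are used.

```python
import subprocess


def build_pipeline(query, pubtator_path, html_path, cut_weight=2):
    """Return the argument lists for `netmedex search` then `netmedex network`.

    Hypothetical helper; it only assembles commands and does not run them.
    """
    search_cmd = ["netmedex", "search", "-q", query, "-o", pubtator_path]
    network_cmd = [
        "netmedex", "network",
        "-i", pubtator_path,
        "-o", html_path,
        "-w", str(cut_weight),
    ]
    return [search_cmd, network_cmd]


def run_pipeline(commands):
    """Run each command in order and stop at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)


commands = build_pipeline(
    '"N-dimethylnitrosamine" AND "Metformin"',
    "output.pubtator",
    "output.html",
)
# Call run_pipeline(commands) to execute both steps (requires network access).
```

Passing the arguments as a list avoids shell quoting issues with queries that contain double quotes.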

#### Search PubMed Articles

Use the CLI to search for articles containing specific biological concepts via the [PubTator3 API](https://www.ncbi.nlm.nih.gov/research/pubtator3/api):

```bash
# Query with keywords and sort articles by relevance (default: recency)
netmedex search -q '"N-dimethylnitrosamine" AND "Metformin"' [-o OUTPUT_FILEPATH] --sort score

# Query with article PMIDs
netmedex search -p 34895069,35883435,34205807 [-o OUTPUT_FILEPATH]

# Query with article PMIDs (from file)
netmedex search -f examples/pmids.txt [-o OUTPUT_FILEPATH]

# Query with PubTator3 Entity ID and limit the number of articles to 100
netmedex search -q '"@DISEASE_COVID_19" AND "@GENE_PON1"' [-o OUTPUT_FILEPATH] --max_articles 100
```

_Note: Enclose keywords that contain spaces in double quotes, and combine keywords with logical operators (e.g., AND/OR)._

Available options are detailed in [Search Command](#search-command).
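
If your PMIDs come from another script, the following sketch prepares the contents of a file for the `-f` option, which expects one PMID per line. The helper name `pmid_file_text` is hypothetical, and the validation and de-duplication steps are assumptions about what a clean input file should look like, not requirements stated by NetMedEx.

```python
def pmid_file_text(pmids):
    """Return file contents with one PMID per line, without duplicates.

    Hypothetical helper for illustration; not part of the NetMedEx API.
    """
    seen = set()
    lines = []
    for pmid in pmids:
        text = str(pmid).strip()
        if not text.isdigit():
            raise ValueError(f"Not a numeric PMID: {pmid!r}")
        if text not in seen:
            # Keep the first occurrence and preserve the input order.
            seen.add(text)
            lines.append(text)
    return "\n".join(lines) + "\n"


text = pmid_file_text([34895069, "35883435", 34205807, 34895069])
# Write it out with, e.g., pathlib.Path("pmids.txt").write_text(text)
```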

#### Generate Co-Mention Networks

The PubTator file produced by `netmedex search` is used to generate the network.

```bash
# Use default parameters and set edge weight cutoff to 1
netmedex network -i examples/pmids_output.pubtator -o pmids_output.html -w 1

# Keep MeSH terms and discard non-MeSH terms
netmedex network -i examples/pmids_output.pubtator -o pmids_output.html -w 1 --node_type mesh

# Keep confident relations between entities
netmedex network -i examples/pmids_output.pubtator -o pmids_output.html -w 1 --node_type relation

# Save the result in XGMML format for Cytoscape
netmedex network -i examples/pmids_output.pubtator -o pmids_output.xgmml -w 1 -f xgmml

# Use normalized pointwise mutual information (NPMI) to weight edges
netmedex network -i examples/pmids_output.pubtator -o pmids_output.html -w 5 --weighting_method npmi
```

Available options are detailed in [Network Command](#network-command).
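
For readers unfamiliar with NPMI, the sketch below computes the textbook definition from article counts: the pointwise mutual information of two entities divided by the negative log of their joint probability, giving a value in [-1, 1]. The function name `npmi` and its parameters are hypothetical. NetMedEx may scale or compute the value differently; the example above uses a cutoff of 5, which suggests its edge weights are not on the raw [-1, 1] scale.

```python
import math


def npmi(n_x, n_y, n_xy, n_total):
    """Textbook normalized pointwise mutual information from article counts.

    n_x, n_y: articles mentioning each entity; n_xy: articles mentioning both;
    n_total: all articles. Hypothetical helper, not NetMedEx's implementation.
    """
    if n_xy == 0:
        # The pair never co-occurs: NPMI reaches its lower bound.
        return -1.0
    p_x = n_x / n_total
    p_y = n_y / n_total
    p_xy = n_xy / n_total
    if p_xy == 1.0:
        # Both entities appear in every article; avoid dividing by log(1) = 0.
        return 1.0
    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / -math.log(p_xy)


always_together = npmi(n_x=10, n_y=10, n_xy=10, n_total=100)
independent = npmi(n_x=50, n_y=50, n_xy=25, n_total=100)
```

Unlike raw frequency, this measure does not automatically favor entities that are simply mentioned in many articles.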

#### View the Network

- **HTML Output**: Open in a browser to view the network.
- **XGMML Output**: Import into Cytoscape for further analysis.

Refer to the [Documentation](https://yehzx.github.io/NetMedEx/) for more details.

## Available Commands

### General

```bash
usage: netmedex [-h] {search,network,run} ...

positional arguments:
  {search,network,run}
    search              Search PubMed articles and obtain annotations
    network             Build a network from annotations
    run                 Run NetMedEx app

options:
  -h, --help            Show this help message and exit
```

### Search Command

```bash
usage: netmedex search [-h] [-q QUERY] [-o OUTPUT] [-p PMIDS] [-f PMID_FILE] [-s {score,date}] [--max_articles MAX_ARTICLES] [--full_text]
                       [--use_mesh] [--debug]

options:
  -h, --help            show this help message and exit
  -q QUERY, --query QUERY
                        Query string
  -o OUTPUT, --output OUTPUT
                        Output path (default: [CURRENT_DIR].pubtator)
  -p PMIDS, --pmids PMIDS
                        PMIDs for the articles (comma-separated)
  -f PMID_FILE, --pmid_file PMID_FILE
                        Filepath to load PMIDs (one per line)
  -s {score,date}, --sort {score,date}
                        Sort articles in descending order by (default: date)
  --max_articles MAX_ARTICLES
                        Maximal articles to request from the searching result (default: 1000)
  --full_text           Collect full-text annotations if available
  --use_mesh            Use MeSH vocabulary instead of the most commonly used original text in articles
  --debug               Print debug information
```

### Network Command

```bash
usage: netmedex network [-h] [-i INPUT] [-o OUTPUT] [-w CUT_WEIGHT] [-f {xgmml,html,json,pickle}] [--node_type {all,mesh,relation}]
                        [--weighting_method {freq,npmi}] [--pmid_weight PMID_WEIGHT] [--debug] [--community] [--max_edges MAX_EDGES]

options:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        Path to the pubtator file
  -o OUTPUT, --output OUTPUT
                        Output path (default: [INPUT_DIR].[FORMAT_EXT])
  -w CUT_WEIGHT, --cut_weight CUT_WEIGHT
                        Discard the edges with weight smaller than the specified value (default: 2)
  -f {xgmml,html,json,pickle}, --format {xgmml,html,json,pickle}
                        Output format (default: html)
  --node_type {all,mesh,relation}
                        Keep specific types of nodes (default: all)
  --weighting_method {freq,npmi}
                        Weighting method for network edge (default: freq)
  --pmid_weight PMID_WEIGHT
                        CSV file for the weight of the edge from a PMID (default: 1)
  --debug               Print debug information
  --community           Divide nodes into distinct communities by the Louvain method
  --max_edges MAX_EDGES
                        Maximum number of edges to display (default: 0, no limit)
```
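
As a rough illustration of how `--cut_weight` and `--max_edges` interact, the sketch below filters a list of weighted edges. It is not NetMedEx's code: the helper name `filter_edges` and the edge layout are hypothetical, and keeping the heaviest edges when `max_edges` is set is an assumption about how the limit is applied.

```python
def filter_edges(edges, cut_weight=2, max_edges=0):
    """Drop edges lighter than cut_weight, then optionally cap the edge count.

    edges: iterable of (source, target, weight) tuples.
    Hypothetical helper for illustration; not part of the NetMedEx API.
    """
    # Discard edges whose weight is smaller than the cutoff.
    kept = [edge for edge in edges if edge[2] >= cut_weight]
    # Heaviest edges first (assumed priority when a limit is applied).
    kept.sort(key=lambda edge: edge[2], reverse=True)
    if max_edges > 0:
        # A value of 0 means no limit, matching the documented default.
        kept = kept[:max_edges]
    return kept


edges = [("A", "B", 5), ("A", "C", 1), ("B", "C", 3), ("C", "D", 2)]
default_result = filter_edges(edges)
capped_result = filter_edges(edges, max_edges=2)
```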

## Package API

In addition to the web interface and CLI, NetMedEx can be used programmatically as a Python library. This allows for more flexible integration into custom pipelines and analysis workflows.

Example usage is available in `notebooks/netmedex_usage.ipynb`.
