Metadata-Version: 2.1
Name: dmt_learn
Version: 0.0.67
Summary: dmt learn package
Home-page: https://github.com/yourusername/my_package
Author: Zelin Zang
Author-email: zangzelin@westlake.edu.cn
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: numpy>=1.21.0
Requires-Dist: plotly
Requires-Dist: scikit-learn
Requires-Dist: pandas
Requires-Dist: lightning
Requires-Dist: wandb
Requires-Dist: munkres
Requires-Dist: umap-learn
Requires-Dist: scanpy
Requires-Dist: kaleido
Requires-Dist: transformers

# DMT-HI: MOE-based Hyperbolic Interpretable Deep Manifold Transformation for Unsupervised Dimensional Reduction

[Our Paper](https://arxiv.org/abs/2410.19504)

The code includes the following modules:
* Datasets (MNIST, CIFAR-10, HCL, 20newsgroups)
* Training for DMT-HI
* Evaluation metrics 
* Visualisation
* Explainable Analyses

## Configuring the Python environment

We recommend using conda to set up the environment. The `install_env.sh` script installs the dependencies.

```bash
conda create -n nml python=3.10
conda activate nml
bash install_env.sh
```
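
After the environment is set up, a quick sanity check can catch a missing dependency before a long training run. The sketch below is not part of the repository. It assumes the `Requires-Python: >=3.10` bound and the `Requires-Dist` list from the package metadata, and it assumes the usual import names for those packages (for example `sklearn` for scikit-learn and `umap` for umap-learn). The helper names are hypothetical.

```python
import importlib.util
import sys

# Import names assumed for the packages listed under Requires-Dist
# (scikit-learn is imported as sklearn, umap-learn as umap).
REQUIRED_MODULES = [
    "numpy", "plotly", "sklearn", "pandas", "lightning", "wandb",
    "munkres", "umap", "scanpy", "kaleido", "transformers",
]


def python_is_supported(version_info=sys.version_info, minimum=(3, 10)):
    """Return True if the interpreter meets the Requires-Python bound."""
    return tuple(version_info[:2]) >= minimum


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python version supported:", python_is_supported())
    print("Missing modules:", missing_modules(REQUIRED_MODULES))
```

An empty "Missing modules" list suggests the dependencies are importable; it does not verify their versions.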

## Dataset

This project uses four datasets: `20NG`, `HCL`, `MNIST`, and `CIFAR-10`. The sections below explain how to obtain each one.

### 1. 20NG Dataset
The `20NG` dataset is already included in this GitHub repository.

### 2. HCL Dataset
The `HCL` dataset must be manually downloaded from the following link: [Download HCL Dataset](https://gofile.me/7794C/rSolqImMJ). Once downloaded, please place the file `HCL60kafter-elis-all.h5ad` into the `data_path/` directory.
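
Because the HCL file is placed by hand, it can help to confirm it is where the code expects it before training. The following is a minimal sketch, not code from the repository: the file name and the `data_path/` directory come from the instructions above, while the helper `find_hcl_file` is hypothetical.

```python
from pathlib import Path

# File and directory names are taken from the instructions above;
# the helper itself is illustrative and not part of the repository.
HCL_FILENAME = "HCL60kafter-elis-all.h5ad"


def find_hcl_file(data_dir="data_path"):
    """Return the path to the HCL file, or raise if it is not in place."""
    path = Path(data_dir) / HCL_FILENAME
    if not path.is_file():
        raise FileNotFoundError(
            f"Expected {HCL_FILENAME} in {Path(data_dir).resolve()}; "
            "download it manually and place it there."
        )
    return path
```

With scanpy installed, the returned path can be passed to `scanpy.read_h5ad` to inspect the data as an `AnnData` object.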

### 3. MNIST and CIFAR-10 Datasets
The `MNIST` and `CIFAR-10` datasets do not require manual download. They are downloaded automatically the first time the project runs, so make sure you have a stable internet connection for that first run.

## Run DMT-HI

You can run DMT-HI with a single command to obtain the latent embedding.

### Minimum replication

Run the minimal replication with the following command:

```bash
python main.py fit -c=conf_new/nml4/mnist.yaml
```
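
If you prefer to launch the same run from Python (for example from a notebook or a batch script), the command above can be wrapped as in the sketch below. It only reproduces the command line shown above; `build_fit_command` and `run_fit` are hypothetical helpers, not functions provided by the repository.

```python
import subprocess
import sys
from pathlib import Path


def build_fit_command(config_path, script="main.py"):
    """Build the argument list for `python main.py fit -c=<config>`."""
    return [sys.executable, script, "fit", f"-c={config_path}"]


def run_fit(config_path, script="main.py"):
    """Run the fit command, failing early if the config file is missing."""
    if not Path(config_path).is_file():
        raise FileNotFoundError(f"Config file not found: {config_path}")
    # check=True raises CalledProcessError if the run exits with an error.
    return subprocess.run(build_fit_command(config_path, script), check=True)
```

Calling `run_fit("conf_new/nml4/mnist.yaml")` from the project root is intended to be equivalent to the shell command above, using the current Python interpreter.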

## Analyses

After successfully running DMT-HI for the first time, you can use the built-in analyzer to further explore the results. Follow the steps below to configure and run the analyzer.

### Steps to Use the Analyzer

1. **Open the `dash_main.py` file**:  
   Navigate to the file `dash_main.py` in the project directory.

2. **Modify the Model and Image Paths**:  
   In `dash_main.py`, update the following lines to match the model and image outputs generated by DMT-HI:
   - **Line 14**: Set the path to the saved model generated by DMT-HI.
   - **Line 17**: Set the path to the saved images generated during the model run.

3. **Run the Analyzer**:  
   After making the necessary changes, run the following command to start the analyzer:

   ```bash
   python dash_main.py
   ```

4. **Access the Analyzer**:  
   Once the script is running, it will return a local URL. Open the URL in your web browser to access the DMT-HI Analyzer.

Through this web-based interface, you can visualize latent embeddings.
