Metadata-Version: 2.3
Name: magnify
Version: 0.12.4
Summary: A microscopy image processing toolkit.
Author: Karl Krauth, Xinxian Tian
Author-email: Karl Krauth <karl.krauth@gmail.com>, Xinxian Tian <cicitian887@gmail.com>
Requires-Dist: opencv-python-headless>=4.0
Requires-Dist: numpy>=1.22.0
Requires-Dist: scipy>=1.9.0
Requires-Dist: pandas>=2.3.3
Requires-Dist: plotly>=5.18.0
Requires-Dist: tifffile>=2021.11.2
Requires-Dist: dask-image>=2024.5.3
Requires-Dist: tqdm>=4.64
Requires-Dist: types-tqdm>=4.64
Requires-Dist: xarray[io]>=2025.10.0
Requires-Dist: dask[complete]>=2025.2.0
Requires-Dist: catalogue>=2.0.8
Requires-Dist: beautifulsoup4>=4.10.0
Requires-Dist: lxml>=5.0.0
Requires-Dist: confection>=0.0.4
Requires-Dist: scikit-learn>=1.2.0
Requires-Dist: numba>=0.58.1
Requires-Dist: matplotlib>=3.9.0
Requires-Dist: seaborn>=0.13.0
Requires-Dist: napari>=0.5.0
Requires-Dist: zarr>=3.0.0
Requires-Python: >=3.11
Description-Content-Type: text/markdown

# Magnify
A Python toolkit for processing microscopy images. Magnify makes it easy to work with terabyte-scale imaging datasets on your laptop. It provides a unified interface for any task that involves finding and processing markers of interest such as [**beads**](https://www.nature.com/articles/s41378-020-00220-3), [**droplets**](https://pubs.acs.org/doi/pdf/10.1021/acs.analchem.0c02499), **cells**, and [**microfluidic device components**](https://www.science.org/doi/full/10.1126/science.abf8761).

Magnify comes with predefined [processing pipelines](https://github.com/FordyceLab/magnify/blob/main/src/magnify/registry.py) that support **file-parsing**, **image stitching**, **flat-field correction**, **image segmentation**, **tag identification**, and **marker filtering** across many different marker types. Magnify's pipelines allow you to process your images in just a few lines of code, while also being easy to extend with your own custom pipeline components.

## Setup
```sh
pip install magnify
```

## Usage
Here's a minimal example of how to use magnify to find, analyze, and visualize lanthanide-encoded beads given a microscopy image.
```python
import magnify as mg
import magnify.plot as mp

# Process the experiment and get the output as an xarray dataset.
xp = mg.mrbles("example.ome.tif", search_channel="620", min_bead_radius=10)

# Get the mean bead area and intensity.
print("Mean Bead Area:", xp.fg.sum(dim=["roi_x", "roi_y"]).mean())
print("Mean Bead Intensity:", xp.where(xp.fg).roi.mean())

# Show all the beads and how they were segmented.
mp.imshow(xp)
```
![](static/imshow.gif)

## Core Concepts
### Output Format
Magnify outputs its results as xarray datasets. If you are unfamiliar with xarray, you might want to read [this quick overview](https://docs.xarray.dev/en/stable/getting-started-guide/quick-overview.html) once you're set up with magnify. An xarray dataset is essentially a dictionary of arrays with named dimensions. Let's look at the Jupyter notebook output for a simple example where we've only used magnify to segment beads.

![](static/xarray.png)

In most cases the actual data consists only of the processed images and the regions of interest (ROIs) around segmented markers. We also have coordinates, arrays that represent metadata in our dataset, such as the location of the foreground (fg) and background (bg) in each ROI. The image below gives a graphical illustration of these concepts.
![](static/xarray-components.png)

In this example the image array was 2-dimensional (`image_height x image_width`) and the ROI array was 3-dimensional (`num_markers x roi_height x roi_width`). However, magnify can also process stacks of images collected across multiple timepoints and color channels, so the image array can have up to 4 dimensions (`num_timepoints x num_channels x image_height x image_width`) and the ROI array can have up to 5 dimensions (`num_markers x num_timepoints x num_channels x roi_height x roi_width`).
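
As a shape-only sketch of the layouts above (the sizes here are made up for illustration):

```python
import numpy as np

# Hypothetical sizes, chosen purely for illustration.
num_markers, num_timepoints, num_channels = 50, 3, 2
image_height, image_width = 1024, 1024
roi_height, roi_width = 64, 64

# A single 2D image and its per-marker ROIs.
image_2d = np.zeros((image_height, image_width))
roi_3d = np.zeros((num_markers, roi_height, roi_width))

# A multi-timepoint, multi-channel stack and its ROIs.
image_4d = np.zeros((num_timepoints, num_channels, image_height, image_width))
roi_5d = np.zeros((num_markers, num_timepoints, num_channels, roi_height, roi_width))

print(image_4d.shape)  # (3, 2, 1024, 1024)
print(roi_5d.shape)    # (50, 3, 2, 64, 64)
```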

Also important for large datasets is how the data is stored. The `fg`, `bg`, `roi`, and `image` arrays are stored on your hard drive rather than in RAM using [Dask](https://docs.dask.org/en/stable/presentations.html). This lets you interact with much larger datasets at the cost of slower access times. You usually don't need to worry about this since Dask operates on subsets of the array in an intelligent way. But if you find that your analysis takes too long, you might want to compute some summary information (e.g. the mean intensity of each marker) that fits in RAM, load that array into memory with [`compute`](https://docs.xarray.dev/en/stable/generated/xarray.DataArray.compute.html), and interact primarily with that array moving forward.
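
The "reduce on disk, keep the summary in RAM" pattern can be sketched with NumPy's memory-mapped arrays as a stand-in for magnify's Dask-backed arrays (the file name and sizes below are made up):

```python
import os
import tempfile
import numpy as np

# A large-ish on-disk array, standing in for a Dask-backed roi array.
path = os.path.join(tempfile.mkdtemp(), "roi.dat")
num_markers, roi_height, roi_width = 100, 64, 64
roi = np.memmap(path, dtype="float32", mode="w+",
                shape=(num_markers, roi_height, roi_width))
roi[:] = np.random.default_rng(0).random(roi.shape)

# Reduce to one number per marker: small enough to hold in memory,
# analogous to calling .compute() on a summarized xarray DataArray.
mean_intensity = np.asarray(roi.mean(axis=(1, 2)))
print(mean_intensity.shape)  # (100,)
```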

### File Parsing
Since a single experiment can consist of many files spread out across many folders, magnify allows you to retrieve many files using a single string. For example, let's say you've acquired an image across multiple channels stored in the following folder structure:
```text
.
├── egfp/
│   └── image1.tif
├── cy5/
│   └── image2.tif
└── dapi/
    └── image3.tif
```

You can load all these images into magnify by passing a single search string to a pipeline (here `pipe` is any magnify pipeline, such as one built as shown in the Pipelines section below):
```python
xp = pipe("(channel)/*.tif")
```
The search string supports [globs](https://en.wikipedia.org/wiki/Glob_(programming)), so `*` matches any sequence of characters. `(channel)` expands like `*`, but additionally saves the segment of the file path it matches in the resulting dataset as the channel name. The specifiers that let you read metadata from the file path are:
- `(assay)`: The name of a distinct experiment; if this is provided, magnify returns a list of datasets (rather than a single dataset).
- `(time)`: The time at which the image was acquired, in the format YYYYmmDD-HHMMSS. If your files specify acquisition times in a different format, you can write `(time|FORMAT)` where `FORMAT` is a [legal format code for Python's strptime](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes) (e.g. `(time|%H:%M:%S)`).
- `(channel)`: The channel in which the image was acquired.
- `(row)` and `(col)`: In the case of a tiled image these two specifiers indicate the row and column of the subimages. Magnify will stitch all these tiles into one large image.
- Alternate coordinates: You can also attach additional information to each coordinate using a specifier of the form `(INFO_COORD)`, where `COORD` is the name of the original coordinate and `INFO` is the name of the attached information; for example, `(concentration_time)`. By default magnify encodes the information as strings, but you can specify alternate formats using `(INFO_COORD|FORMAT)`, where `FORMAT` can be `time`, `int`, or `float`.
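
To build intuition for how a specifier like `(channel)` behaves, here is a rough stdlib-only sketch that turns such a pattern into a regular expression and pulls the matched segment out of a path. This is an illustration of the idea, not magnify's actual parser:

```python
import re

def pattern_to_regex(pattern):
    """Convert a magnify-style search string into a compiled regex.

    `*` matches any run of characters within a path segment; `(name)`
    does the same but captures the matched segment under `name`.
    """
    regex = ""
    for token in re.split(r"(\(\w+\)|\*)", pattern):
        if token == "*":
            regex += "[^/]*"
        elif re.fullmatch(r"\(\w+\)", token):
            regex += f"(?P<{token[1:-1]}>[^/]*)"
        else:
            regex += re.escape(token)
    return re.compile(regex)

match = pattern_to_regex("(channel)/*.tif").fullmatch("egfp/image1.tif")
print(match.group("channel"))  # egfp
```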

Magnify can read any [TIFF](https://en.wikipedia.org/wiki/TIFF) image file. It can also read [OME-TIFF](https://docs.openmicroscopy.org/ome-model/5.6.3/ome-tiff/) files generated by [Micro-Manager](https://micro-manager.org/). We plan to add support for other input formats as needed.
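
As a quick sanity check that a file is readable outside magnify, you can round-trip it with the [tifffile](https://pypi.org/project/tifffile/) library (one of magnify's dependencies); the file written below is synthetic:

```python
import os
import tempfile
import numpy as np
import tifffile

# Write a small synthetic 16-bit image and read it back.
path = os.path.join(tempfile.mkdtemp(), "example.tif")
image = np.random.default_rng(0).integers(0, 2**16, size=(64, 64), dtype=np.uint16)
tifffile.imwrite(path, image)

loaded = tifffile.imread(path)
print(loaded.shape, loaded.dtype)  # (64, 64) uint16
```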

### Pipelines
Magnify uses a component-based pipeline architecture inspired by [spaCy](https://spacy.io/). Each component receives an xarray dataset, performs its operation, and passes the result to the next component in the sequence.

For common tasks, use the predefined pipelines (`mg.microfluidic_chip()`, `mg.mrbles()`, `mg.beads()`) shown in the usage section. You can find all predefined pipelines and their implementations in the [registry.py](https://github.com/FordyceLab/magnify/blob/main/src/magnify/registry.py) file. For custom workflows, build your own:

```python
from magnify import Pipeline

pipe = Pipeline("read")
pipe.add_pipe("standardize_format")
pipe.add_pipe("flatfield_correct", flatfield=1.2, darkfield=0.1)
pipe.add_pipe("stitch", overlap=102)
pipe.add_pipe("find_beads", min_bead_diameter=15, max_bead_diameter=40)

result = pipe("my_images/*.tif")
```

Note that some components depend on others; for example, most components assume `standardize_format` has already run.

Create custom components as functions:
```python
def enhance_contrast(xp, factor=2.0):
    xp["image"] = xp["image"] * factor
    return xp

pipe.add_pipe(enhance_contrast, factor=1.5)
```

Or register them for reuse:
```python
@mg.component("threshold")
def threshold(xp, value=0.5):
    xp["mask"] = xp["image"] > value
    return xp
```

Detection components add new variables to the dataset like `roi` (regions of interest), `fg` (foreground masks), and `bg` (background masks), which downstream components can use for further processing.
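
As a sketch of how a downstream step might use these variables, the snippet below computes a per-marker background-subtracted intensity from hypothetical `roi`, `fg`, and `bg` arrays. It uses plain NumPy on made-up data, not magnify's API:

```python
import numpy as np

rng = np.random.default_rng(0)
num_markers, h, w = 10, 32, 32

# Hypothetical arrays with the shapes a detection component would produce.
roi = rng.random((num_markers, h, w))
fg = np.zeros((num_markers, h, w), dtype=bool)
bg = np.zeros((num_markers, h, w), dtype=bool)
fg[:, 8:24, 8:24] = True   # foreground: center of each ROI
bg[:, :4, :] = True        # background: top strip of each ROI

# Mean foreground intensity minus mean background intensity, per marker.
fg_only = np.where(fg, roi, np.nan)
bg_only = np.where(bg, roi, np.nan)
signal = np.nanmean(fg_only, axis=(1, 2)) - np.nanmean(bg_only, axis=(1, 2))
print(signal.shape)  # (10,)
```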

### Plotting
Magnify includes a plotting sublibrary, which you can import with `import magnify.plot as mp`. It is designed for rapid prototyping of interactive visualizations, primarily for troubleshooting experiments rather than for creating publication-ready figures. The plotting library is still under development and is not yet stable.
