Metadata-Version: 2.1
Name: omnibusx-sdk
Version: 1.1.0
Summary: 
Author: Huy Nguyen
Author-email: huy@omnibusx.com
Requires-Python: >=3.9
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Dist: httpx (>=0.28.1,<0.29.0)
Requires-Dist: pandas (>=2.3.3,<3.0.0)
Requires-Dist: pydantic (>=2.11.7,<3.0.0)
Requires-Dist: tqdm (>=4.66.0,<5.0.0)
Description-Content-Type: text/markdown

# OmnibusX SDK

OmnibusX SDK is a Python package for submitting data programmatically to the OmnibusX Enterprise platform.

## Features

- Seamless integration with OmnibusX Enterprise APIs
- OAuth2 device flow authentication with token caching
- Chunked file upload with automatic retry logic
- Progress tracking for uploads
- Type-safe interfaces

## Installation

```bash
pip install omnibusx-sdk
```

## Quick Start

### Authentication

```python
from omnibusx_sdk import SDKClient

# Initialize the client
client = SDKClient(server_url="https://api-prod.omnibusx.com")

# Authenticate (opens browser for login)
client.authenticate()

# Test connection
client.test_connection()
```

### File Upload

Upload files to OmnibusX with automatic chunking and built-in progress tracking:

```python
from omnibusx_sdk import SDKClient

# Initialize and authenticate
client = SDKClient(server_url="https://api-prod.omnibusx.com")
client.authenticate()

# Get available groups to find your group_id
groups = client.get_available_groups()
group_id = groups[0].user_group_id  # or specify your group ID directly

# Upload files; a progress bar is displayed automatically
response = client.upload_files(
    file_paths=["/path/to/file1.h5", "/path/to/file2.csv"],
    group_id=group_id
)

# Output (live updating progress bar):
# [1/2] file1.h5:  67%|████████████████          | 67.5M/100M [00:05<00:02, 10.2MB/s]
# ✓ Upload complete! All 2 file(s) uploaded successfully.

print(f"Folder ID: {response.folder_id}")
```

**Features:**
- Automatic 5 MB chunking for large files
- Clean progress bar showing uploaded size, total size, speed (MB/s), and ETA
- Retry logic with exponential backoff (up to 5 retries)
- Optional custom progress callback for additional handling
- Multiple file upload to the same folder
- Automatic inclusion of user email and group ID headers
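
The chunking and retry behavior listed above can be illustrated with a minimal standard-library sketch. This is not the SDK's implementation: `iter_chunks`, `with_backoff`, and the 1-second `base_delay` are hypothetical names and values chosen for illustration. Only the 5 MB chunk size and the limit of 5 retries come from the list above.

```python
import time
from pathlib import Path
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")

CHUNK_SIZE = 5 * 1024 * 1024  # 5 MB, the chunk size stated above
MAX_RETRIES = 5               # the retry limit stated above


def iter_chunks(path: Path, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield the file's contents in pieces of at most chunk_size bytes."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk


def with_backoff(
    operation: Callable[[], T],
    max_retries: int = MAX_RETRIES,
    base_delay: float = 1.0,  # assumed value; the SDK's real delay is not documented here
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, waiting base_delay * 2**attempt seconds between retries."""
    for attempt in range(max_retries + 1):  # one initial attempt plus max_retries retries
        try:
            return operation()
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable when max_retries >= 0")
```

In a sketch like this, each chunk would be sent inside `with_backoff`, so a transient failure re-sends only the affected chunk instead of restarting the whole file.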

**Advanced Usage:**

```python
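# Assumes `client` and `group_id` from the previous example
file_paths = ["/path/to/file1.h5", "/path/to/file2.csv"]

# Silent upload (no progress display)
response = client.upload_files(file_paths, group_id=group_id, show_progress=False)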

# Custom progress callback for additional handling
def log_progress(progress):
    # Log to file, update database, etc.
    if progress.done_chunks % 10 == 0:
        print(f"Checkpoint: {progress.done_chunks} chunks uploaded")

response = client.upload_files(file_paths, group_id=group_id, progress_callback=log_progress)
```

### Preprocessing Datasets

Preprocess datasets with type-safe, validated parameters:

#### Option 1: Server-side files (files already on server)

```python
from omnibusx_sdk import (
    SDKClient, PreprocessDatasetParams, BatchInfo,
    Species, SequencingTechnology, SequencingPlatform, DataFormat
)

# Initialize and authenticate
client = SDKClient(server_url="https://api-prod.omnibusx.com")
client.authenticate()

# Get available groups
groups = client.get_available_groups()
group_id = groups[0].user_group_id

# Create preprocessing parameters with SERVER paths
params = PreprocessDatasetParams(
    name="My Dataset",
    description="Dataset description",
    batches=[
        BatchInfo(
            file_path="/data/server/path/data.h5ad",  # Server path
            batch_name="Batch 1"
        )
    ],
    gene_reference_version=111,
    gene_reference_id=Species.HUMAN,  # or Species.MOUSE
    technology=SequencingTechnology.SC_RNA_SEQ,
    platform=SequencingPlatform.ScRnaSeq.CHROMIUM_10X,
    data_format=DataFormat.SCANPY,  # or DataFormat.SEURAT
)

# Submit preprocessing task
task_id = client.preprocess_dataset(params, group_id=group_id)

# Monitor task progress
client.get_task_info(task_id)
```

#### Option 2: Local files (upload + preprocess in one step)

```python
# Reuses the imports, `client`, and `group_id` from Option 1
# Create preprocessing parameters with LOCAL paths
params = PreprocessDatasetParams(
    name="My Local Dataset",
    description="Dataset from local files",
    batches=[
        BatchInfo(
            file_path="/Users/me/data/sample1.h5ad",  # Local path!
            batch_name="Sample 1"
        ),
        BatchInfo(
            file_path="/Users/me/data/sample2.h5ad",  # Local path!
            batch_name="Sample 2"
        )
    ],
    gene_reference_version=111,
    gene_reference_id=Species.HUMAN,
    technology=SequencingTechnology.SC_RNA_SEQ,
    platform=SequencingPlatform.ScRnaSeq.CHROMIUM_10X,
    data_format=DataFormat.SCANPY,
)

# Upload files and preprocess in one step!
task_id = client.upload_and_preprocess_dataset(params, group_id=group_id)

# Output:
# Uploading 2 file(s)...
# [1/2] sample1.h5ad: 100%|████| 50.0M/50.0M [00:10<00:00, 5.0MB/s]
# ✓ Upload complete! All 2 file(s) uploaded successfully.
# Upload complete! Files uploaded to: /tmp/abc123/
#
# Submitting preprocessing task...
# Preprocessing task submitted! Task ID: task_xyz

# Monitor preprocessing progress
client.get_task_info(task_id)
```

**Supported Values:**
- **Species**: `Species.HUMAN`, `Species.MOUSE`
- **Technology**: `SequencingTechnology.SC_RNA_SEQ` (currently the only supported technology)
- **Platforms** (accessed as in `SequencingPlatform.ScRnaSeq.CHROMIUM_10X`): `CHROMIUM_10X`, `CITE_SEQ`, `SMART_SEQ_2`, `DROP_SEQ`, `OTHERS`
- **Data Formats**: `DataFormat.SCANPY`, `DataFormat.SEURAT`

The SDK validates all parameters and provides clear error messages for invalid configurations.
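
To illustrate what enum-based validation with a clear error message can look like, here is a minimal standard-library sketch. It is not the SDK's code: the SDK depends on pydantic and its actual checks and messages may differ. `DataFormatSketch`, `parse_data_format`, and the string values are hypothetical; only the member names `SCANPY` and `SEURAT` come from the list above.

```python
from enum import Enum


class DataFormatSketch(Enum):
    # String values are hypothetical; only the member names appear above.
    SCANPY = "scanpy"
    SEURAT = "seurat"


def parse_data_format(raw: str) -> DataFormatSketch:
    """Convert user input to an enum member or raise a descriptive error."""
    try:
        return DataFormatSketch[raw.strip().upper()]
    except KeyError:
        allowed = ", ".join(member.name for member in DataFormatSketch)
        raise ValueError(
            f"Unsupported data format {raw!r}; expected one of: {allowed}"
        ) from None
```

Rejecting unknown values before a task is submitted means a typo fails immediately on the client rather than after files have been uploaded.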

### Working with Tasks

```python
# Get available user groups
groups = client.get_available_groups()
for group in groups:
    print(f"{group.name}: {group.description}")

# Import OmnibusX file
task_id = client.import_omnibusx_file(
    omnibusx_file_path="/path/to/file.omnibusx",
    group_id="your-group-id"
)

# Monitor task progress
task_info = client.get_task_info(task_id)
```
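
The `interval` parameter of `get_task_info` suggests that task status is checked periodically. How the SDK does this internally is not documented here, so the following is only a generic polling sketch under that assumption. `poll_until_done`, `fetch_status`, `max_polls`, and the status names are all hypothetical.

```python
import time
from typing import Callable

# Hypothetical terminal states; the platform's real status values may differ.
TERMINAL_STATES = {"SUCCESS", "FAILED"}


def poll_until_done(
    fetch_status: Callable[[], str],
    interval: float = 5.0,
    max_polls: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch_status every `interval` seconds until it returns a terminal state."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        sleep(interval)
    raise TimeoutError(f"task still running after {max_polls} polls")
```

Capping the number of polls keeps a stuck task from blocking a script indefinitely.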

## Examples

See the example files for detailed usage:
- `example_simple_upload.py` - Quick start for file uploads
- `example_file_upload.py` - Advanced file upload scenarios with custom callbacks
- `example_preprocess_dataset.py` - Dataset preprocessing with type-safe parameters

## API Reference

### SDKClient

**Methods:**
- `authenticate(cache_token=True)` - Authenticate with OAuth2 device flow and extract user email
- `test_connection()` - Test API connectivity
- `upload_files(file_paths, group_id, progress_callback=None, show_progress=True)` - Upload files with chunking, automatic progress display, and required headers
- `get_available_groups()` - Get list of user groups
- `import_omnibusx_file(omnibusx_file_path, group_id)` - Import OmnibusX file
- `preprocess_dataset(params: PreprocessDatasetParams, group_id)` - Preprocess a dataset with server-side file paths
- `upload_and_preprocess_dataset(params: PreprocessDatasetParams, group_id, progress_callback=None, show_progress=True)` - Upload local files and preprocess in one step
- `get_task_info(task_id, interval=5)` - Monitor task progress

**Note:** The SDK automatically includes `OmnibusX-Email` (from Auth0 authentication) and `OmnibusX-GroupId` headers in all API requests.

### UploadProgress

**Fields:**
- `total_files` - Total number of files to upload
- `total_chunks` - Total number of chunks across all files
- `done_files` - Number of files completed
- `done_chunks` - Number of chunks completed
- `current_file` - Name of the file currently being uploaded
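
As an illustration of how these fields might be combined in a progress callback, the sketch below computes a completion percentage from the chunk counters. `UploadProgressStub` is a hypothetical stand-in that mirrors the field names above so the example runs without the SDK. The field types are assumptions, and `describe_progress` is not part of the SDK.

```python
from dataclasses import dataclass


@dataclass
class UploadProgressStub:
    """Hypothetical stand-in mirroring the UploadProgress fields listed above."""
    total_files: int
    total_chunks: int
    done_files: int
    done_chunks: int
    current_file: str


def describe_progress(progress) -> str:
    """Summarize progress as the percentage of chunks completed."""
    if progress.total_chunks == 0:
        percent = 0.0  # avoid division by zero before any chunks are counted
    else:
        percent = 100.0 * progress.done_chunks / progress.total_chunks
    return (
        f"{progress.current_file}: {percent:.1f}% "
        f"({progress.done_files}/{progress.total_files} files done)"
    )


snapshot = UploadProgressStub(
    total_files=2, total_chunks=40, done_files=1, done_chunks=30,
    current_file="file2.csv",
)
print(describe_progress(snapshot))  # file2.csv: 75.0% (1/2 files done)
```

A function like this could be passed to `upload_files` as `progress_callback=lambda p: print(describe_progress(p))`, assuming the callback receives an object with these fields.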

