Metadata-Version: 2.4
Name: lobster-ai
Version: 0.3.2.1
Summary: Multi-Agent Bioinformatics Analysis System powered by LangGraph
Home-page: https://github.com/the-omics-os/lobster-ai
Author: Omics-OS
Author-email: Omics-OS <info@omics-os.com>
License-Expression: AGPL-3.0-or-later
Project-URL: Homepage, https://www.omics-os.com
Project-URL: Bug Tracker, https://github.com/the-omics-os/lobster/issues
Project-URL: Documentation, https://github.com/the-omics-os/lobster.wiki
Project-URL: Source Code, https://github.com/the-omics-os/lobster
Keywords: bioinformatics,RNA-seq,single-cell,AI,machine-learning,data-analysis,genomics
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Healthcare Industry
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Environment :: Console
Classifier: Framework :: FastAPI
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: pandas>=1.5.0
Requires-Dist: numpy>=1.23.0
Requires-Dist: plotly>=5.0.0
Requires-Dist: rich>=12.0.0
Requires-Dist: typer>=0.7.0
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: langchain>=0.1.0
Requires-Dist: langchain-community>=0.0.10
Requires-Dist: langchain-openai>=0.0.5
Requires-Dist: langchain-aws>=0.1.0
Requires-Dist: langgraph==0.6.7
Requires-Dist: openai>=1.0.0
Requires-Dist: llvmlite>=0.39.0
Requires-Dist: scipy>=1.10.0
Requires-Dist: scikit-learn>=1.3.0
Requires-Dist: anndata>=0.9.0
Requires-Dist: mudata>=0.2.0
Requires-Dist: biopython>=1.81
Requires-Dist: leidenalg>=0.9.0
Requires-Dist: igraph>=0.10.4
Requires-Dist: scrublet>=0.2.3
Requires-Dist: h5py>=3.9.0
Requires-Dist: tables>=3.8.0
Requires-Dist: statsmodels>=0.14.0
Requires-Dist: seaborn>=0.12.0
Requires-Dist: matplotlib>=3.7.0
Requires-Dist: kaleido>=0.2.0
Requires-Dist: openpyxl>=3.1.0
Requires-Dist: pyarrow>=12.0.0
Requires-Dist: pyreadr>=0.4.0
Requires-Dist: requests>=2.31.0
Requires-Dist: aiofiles>=23.0.0
Requires-Dist: xmltodict>=0.13.0
Requires-Dist: boto3>=1.26.0
Requires-Dist: pypdf2>=3.0.0
Requires-Dist: beautifulsoup4>=4.12.0
Requires-Dist: lxml>=4.9.0
Requires-Dist: fastapi>=0.100.0
Requires-Dist: uvicorn>=0.23.0
Requires-Dist: scikit-misc>=0.5.1
Requires-Dist: python-multipart>=0.0.20
Requires-Dist: GEOparse
Requires-Dist: packaging>=25.0
Requires-Dist: tabulate>=0.9.0
Requires-Dist: langfuse>=3.2.6
Requires-Dist: polars>=1.32.3
Requires-Dist: psutil>=7.0.0
Requires-Dist: scanpy>=1.11.4
Requires-Dist: pydeseq2>=0.5.2
Requires-Dist: prompt-toolkit>=3.0.52
Requires-Dist: scvi-tools>=1.4.0
Requires-Dist: torch>=2.0.0
Requires-Dist: langchain-anthropic>=0.3.20
Requires-Dist: responses>=0.25.8
Requires-Dist: nbformat>=5.9.0
Requires-Dist: papermill>=2.4.0
Requires-Dist: nbconvert>=7.0.0
Requires-Dist: jupytext>=1.15.0
Requires-Dist: docling>=2.60.0
Requires-Dist: docling-core>=2.50.0
Requires-Dist: redis>=6.4.0
Requires-Dist: pysradb>=2.5.1
Requires-Dist: claude-agent-sdk>=0.1.0
Requires-Dist: linkup>=0.1.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: pytest-xdist>=3.0.0; extra == "dev"
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
Requires-Dist: pytest-mock>=3.10.0; extra == "dev"
Requires-Dist: pytest-benchmark>=4.0.0; extra == "dev"
Requires-Dist: pytest-timeout>=2.1.0; extra == "dev"
Requires-Dist: pytest-html>=3.2.0; extra == "dev"
Requires-Dist: pytest-json-report>=1.5.0; extra == "dev"
Requires-Dist: factory-boy>=3.2.0; extra == "dev"
Requires-Dist: responses>=0.23.0; extra == "dev"
Requires-Dist: httpretty>=1.1.4; extra == "dev"
Requires-Dist: freezegun>=1.2.0; extra == "dev"
Requires-Dist: faker>=19.0.0; extra == "dev"
Requires-Dist: moto>=4.1.0; extra == "dev"
Requires-Dist: fakeredis>=2.10.0; extra == "dev"
Requires-Dist: ftputil>=5.0.4; extra == "dev"
Requires-Dist: memory-profiler>=0.60.0; extra == "dev"
Requires-Dist: psutil>=5.9.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: isort>=5.12.0; extra == "dev"
Requires-Dist: flake8>=6.0.0; extra == "dev"
Requires-Dist: pylint>=2.17.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Requires-Dist: bandit>=1.7.0; extra == "dev"
Requires-Dist: ruff>=0.0.300; extra == "dev"
Requires-Dist: responses>=0.25.8; extra == "dev"
Requires-Dist: pre-commit>=3.0.0; extra == "dev"
Requires-Dist: bumpversion>=0.6.0; extra == "dev"
Requires-Dist: twine>=4.0.0; extra == "dev"
Requires-Dist: build>=0.10.0; extra == "dev"
Requires-Dist: mkdocs>=1.5.0; extra == "dev"
Requires-Dist: mkdocs-material>=9.0.0; extra == "dev"
Requires-Dist: langfuse>=2.0.0; extra == "dev"
Requires-Dist: tabulate>=0.9.0; extra == "dev"
Provides-Extra: all
Requires-Dist: pytest>=7.0.0; extra == "all"
Requires-Dist: pytest-cov>=4.0.0; extra == "all"
Requires-Dist: pytest-xdist>=3.0.0; extra == "all"
Requires-Dist: pytest-asyncio>=0.20.0; extra == "all"
Requires-Dist: black>=23.0.0; extra == "all"
Requires-Dist: isort>=5.12.0; extra == "all"
Requires-Dist: flake8>=6.0.0; extra == "all"
Requires-Dist: pylint>=2.17.0; extra == "all"
Requires-Dist: mypy>=1.0.0; extra == "all"
Requires-Dist: bandit>=1.7.0; extra == "all"
Requires-Dist: pre-commit>=3.0.0; extra == "all"
Requires-Dist: bumpversion>=0.6.0; extra == "all"
Requires-Dist: twine>=4.0.0; extra == "all"
Requires-Dist: build>=0.10.0; extra == "all"
Requires-Dist: mkdocs>=1.5.0; extra == "all"
Requires-Dist: mkdocs-material>=9.0.0; extra == "all"
Requires-Dist: langfuse>=2.0.0; extra == "all"
Requires-Dist: tabulate>=0.9.0; extra == "all"
Dynamic: author
Dynamic: home-page
Dynamic: license-file
Dynamic: requires-python

# 🦞 Lobster AI

[![License: AGPL-3.0-or-later](https://img.shields.io/badge/License-AGPL%203.0--or--later-blue.svg)](https://www.gnu.org/licenses/agpl-3.0)
[![Documentation: CC BY 4.0](https://img.shields.io/badge/Documentation-CC%20BY%204.0-lightgrey.svg)](https://creativecommons.org/licenses/by/4.0/)
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)

**Bioinformatics co-pilot to automate redundant tasks so you can focus on science**

## 📋 Table of Contents

- [✨ What is Lobster AI?](#-what-is-lobster-ai)
- [⚡ Quick Start](#-quick-start)
- [💡 Example Usage](#-example-usage)
- [🧬 Features](#-features)
- [🚀 Installation](#-installation)
- [🔬 Literature Mining & Metadata](#-literature-mining--metadata)
- [🔧 Configuration](#-configuration)
- [🗓️ Roadmap](#-roadmap)
- [📚 Documentation](#-documentation)
- [🤝 Community & Support](#-community--support)
- [📄 License](#-license)

## ✨ What is Lobster AI?

Lobster AI is a bioinformatics platform that combines specialized AI agents with open-source tools to analyze complex multi-omics data, discover relevant literature, and manage metadata across datasets. Describe your analysis needs in natural language; no coding is required.

**Perfect for:**
- Bioinformatics researchers analyzing RNA-seq data
- Computational biologists seeking intelligent analysis workflows
- Life science teams requiring reproducible, publication-ready results
- Students learning modern bioinformatics approaches

## ⚡ Quick Start

### Option 1: Global Installation (Recommended for CLI Use)

```bash
# Install uv if not already installed
# macOS/Linux: curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows: powershell -c "irm https://astral.sh/uv/install.ps1 | iex"

# Install Lobster globally
uv tool install lobster-ai

# Configure API keys
lobster init

# Start using Lobster
lobster chat
```

**Benefits**: Accessible from anywhere, clean uninstall, isolated environment.

### Option 2: Local Installation (For Projects/Development)

```bash
# Create and activate virtual environment
python3 -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# Install Lobster in virtual environment
uv pip install lobster-ai
# or: pip install lobster-ai

# Configure API keys
lobster init

# Start using Lobster
lobster chat
```

**Benefits**: Project-specific installation, doesn't affect system Python.

---

**Get API keys:** [Claude API](https://console.anthropic.com/) | [AWS Bedrock](https://aws.amazon.com/bedrock/)

**Setup wizard:** Run `lobster init` to launch the interactive configuration wizard. It guides you through API key setup and saves the configuration to a `.env` file in your working directory.

**First analysis:**
```bash
lobster query "Download GSE109564 and perform clustering"
```

[See detailed installation options](#-installation) | [Configuration guide](https://github.com/the-omics-os/lobster-local/wiki/03-configuration)

## 💡 Example Usage

### Interactive Chat Mode

```bash
lobster chat

Welcome to Lobster AI - Your bioinformatics analysis assistant

🦞 You: Download GSE109564, run QC and all preprocessing steps, and perform single-cell clustering analysis

🦞 Lobster: I'll download and analyze this single-cell dataset for you...

✓ Downloaded 5,000 cells × 20,000 genes
✓ Quality control: filtered to 4,477 high-quality cells
✓ Identified 12 distinct cell clusters
✓ Generated UMAP visualization and marker gene analysis

Analysis complete! Results saved to workspace.

🦞 You: Now fetch the methods from the original publication.
```

### Single Query Mode

For non-interactive analysis and automation:

```bash
# Basic usage
lobster query "download GSE109564 and perform quality control"

# With workspace context
lobster query --workspace ~/my_analysis "cluster the loaded dataset"

# Show reasoning process
lobster query --reasoning "differential expression between conditions"
```
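
For automation, the single-query form can also be driven from a script. The sketch below only assembles the argument lists for the `lobster query` invocations shown above and prints them as a dry run. `build_query_command` and `run_queries` are hypothetical helper names, not part of Lobster, and actually running them assumes `lobster` is on your `PATH`.

```python
import os
import shlex
import subprocess
from typing import Optional


def build_query_command(prompt: str,
                        workspace: Optional[str] = None,
                        reasoning: bool = False) -> list[str]:
    """Assemble argv for one `lobster query` call (hypothetical helper).

    Only the flags shown in this README (--workspace, --reasoning) are used.
    """
    cmd = ["lobster", "query"]
    if workspace is not None:
        # No shell is involved, so expand "~" ourselves.
        cmd += ["--workspace", os.path.expanduser(workspace)]
    if reasoning:
        cmd.append("--reasoning")
    cmd.append(prompt)
    return cmd


def run_queries(prompts: list[str], workspace: Optional[str] = None) -> None:
    """Run each prompt in order, stopping at the first failure."""
    for prompt in prompts:
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(build_query_command(prompt, workspace), check=True)


# Dry run: print the commands instead of executing them.
for p in ["download GSE109564 and perform quality control",
          "cluster the loaded dataset"]:
    print(shlex.join(build_query_command(p, workspace="~/my_analysis")))
```

Passing the prompt as a single list element avoids shell quoting problems when a prompt contains quotes or other special characters.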

### Natural Language Examples

```bash
# Download and analyze GEO datasets
🦞 You: "Download GSE12345 and perform quality control"

# Analyze your own data
🦞 You: "Load my_data.csv and identify differentially expressed genes"

# Generate visualizations
🦞 You: "Create a UMAP plot colored by cell type"

# Complex analyses
🦞 You: "Run pseudobulk aggregation and differential expression"
```

## 🧬 Features

### Current Capabilities

#### **Single-Cell RNA-seq Analysis**
- Quality control and filtering
- Normalization and scaling
- Clustering and UMAP visualization
- Cell type annotation
- Marker gene identification
- Pseudobulk aggregation
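
To illustrate what pseudobulk aggregation means, the sketch below sums per-cell counts within each (sample, cell type) group, giving one bulk-like profile per group. It is a toy example in plain Python with made-up data, not Lobster's implementation, and all function and variable names are hypothetical.

```python
from collections import defaultdict


def pseudobulk(cells):
    """Sum gene counts over all cells sharing a (sample, cell_type) pair.

    `cells` is an iterable of (sample, cell_type, {gene: count}) tuples.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for sample, cell_type, counts in cells:
        for gene, count in counts.items():
            totals[(sample, cell_type)][gene] += count
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {group: dict(genes) for group, genes in totals.items()}


# Made-up toy data: three cells from two samples.
toy_cells = [
    ("s1", "T cell", {"CD3E": 4, "MS4A1": 0}),
    ("s1", "T cell", {"CD3E": 6, "MS4A1": 1}),
    ("s2", "B cell", {"CD3E": 0, "MS4A1": 9}),
]
profiles = pseudobulk(toy_cells)
print(profiles[("s1", "T cell")])  # {'CD3E': 10, 'MS4A1': 1}
```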

#### **Bulk RNA-seq Analysis**
- Differential expression with pyDESeq2
- R-style formula-based statistics
- Complex experimental designs
- Batch effect correction

#### **Data Management**
- Support for CSV, Excel, H5AD, and 10X formats
- Multi-source dataset discovery (GEO, SRA, PRIDE, ENA)
- Literature mining and full-text retrieval
- Cross-dataset metadata harmonization
- Sample ID mapping and validation
- Automatic visualization generation

## 🚀 Installation

### Primary Method: PyPI (Recommended)

Install Lobster AI with a single command:

```bash
# Recommended: Use uv for faster installation
# Install uv: https://docs.astral.sh/uv/getting-started/installation/
uv pip install lobster-ai

# Alternative: pip install lobster-ai
```
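
Lobster AI requires Python 3.11 or newer, and installers will generally refuse to install it on older interpreters. A quick check like the following sketch can be run first; `meets_requirement` is a hypothetical helper name, not something Lobster provides.

```python
import sys

MIN_PYTHON = (3, 11)  # from the package's Requires-Python: >=3.11


def meets_requirement(version=None) -> bool:
    """Return True if `version` (default: the running interpreter) is new enough."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= MIN_PYTHON


if not meets_requirement():
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is too old; "
          f"lobster-ai needs {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+")
```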

**Configure API Keys:**

Run the configuration wizard to set up your API keys:

```bash
# Launch interactive configuration wizard
lobster init
```

The wizard will:
- Prompt you to choose between Claude API or AWS Bedrock
- Securely collect your API keys (input is masked)
- Optionally configure NCBI API key for enhanced literature search
- Create a `.env` file in your working directory

**Additional configuration commands:**
```bash
lobster config test   # Test API connectivity
lobster config show   # Display current configuration (secrets masked)
```

**Get API Keys:**
- **Claude API**: https://console.anthropic.com/
- **AWS Bedrock**: https://aws.amazon.com/bedrock/
- **NCBI API** (optional): https://ncbiinsights.ncbi.nlm.nih.gov/2017/11/02/new-api-keys-for-the-e-utilities/

**Advanced: Manual Configuration**

If you prefer, you can manually create a `.env` file in your working directory:

```bash
# Required: Choose ONE LLM provider

# Option 1: Claude API (Quick testing)
ANTHROPIC_API_KEY=sk-ant-api03-your-key-here

# Option 2: AWS Bedrock (Production)
AWS_BEDROCK_ACCESS_KEY=your-access-key
AWS_BEDROCK_SECRET_ACCESS_KEY=your-secret-key

# Optional: Enhanced literature search
NCBI_API_KEY=your-ncbi-key
NCBI_EMAIL=your.email@example.com
```
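
If you write the `.env` file by hand, it can help to confirm that one provider is fully configured before starting Lobster. The sketch below is a deliberately simplified `KEY=VALUE` parser, written for illustration only: it is not how Lobster reads its configuration, it ignores features such as `export` prefixes and inline comments, and the helper names are hypothetical. The variable names are the ones listed above.

```python
from pathlib import Path

# Variable names as documented in this README.
CLAUDE_KEYS = ("ANTHROPIC_API_KEY",)
BEDROCK_KEYS = ("AWS_BEDROCK_ACCESS_KEY", "AWS_BEDROCK_SECRET_ACCESS_KEY")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def configured_providers(env: dict[str, str]) -> list[str]:
    """List the providers whose required variables are all non-empty."""
    providers = []
    if all(env.get(k) for k in CLAUDE_KEYS):
        providers.append("claude")
    if all(env.get(k) for k in BEDROCK_KEYS):
        providers.append("bedrock")
    return providers


env_file = Path(".env")
if env_file.exists():
    print(configured_providers(parse_env(env_file.read_text())))
```

An empty list means neither provider has all of its variables set. For an authoritative check, use `lobster config test`.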

---

### Platform-Specific Installation

For native installation (development, advanced users):

- **macOS**: [Native Installation Guide](https://github.com/the-omics-os/lobster-local/wiki/02-installation#macos)
- **Linux**: [Ubuntu/Debian Guide](https://github.com/the-omics-os/lobster-local/wiki/02-installation#linux-ubuntudebian)
- **Windows**: [WSL Guide (Recommended)](https://github.com/the-omics-os/lobster-local/wiki/02-installation#windows)

**Complete installation guide:** [wiki/02-installation.md](https://github.com/the-omics-os/lobster-local/wiki/02-installation)

---

### ⚠️ Important: API Rate Limits

**Claude API:**
- ⚠️ Conservative rate limits for new accounts
- ✅ Best for: Testing, development, small datasets
- 📈 Upgrade: [Request limit increase](https://docs.anthropic.com/en/api/rate-limits)

**AWS Bedrock:**
- ✅ Enterprise-grade rate limits (recommended for production)
- ✅ Best for: Large-scale analysis, production deployments
- 🔗 Setup: [AWS Bedrock Guide](https://github.com/the-omics-os/lobster-local/wiki/02-installation#aws-bedrock-enhanced-setup)

If you encounter rate limit errors: [Troubleshooting Guide](https://github.com/the-omics-os/lobster-local/wiki/28-troubleshooting)

---

### Uninstalling Lobster AI

#### Remove Package

**If installed globally with uv tool:**
```bash
uv tool uninstall lobster-ai
```

**If installed locally in virtual environment:**
```bash
# Activate the virtual environment first
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# Uninstall
pip uninstall lobster-ai

# Remove virtual environment (optional)
deactivate
rm -rf .venv
```

**If installed with make (developers):**
```bash
cd /path/to/lobster
make uninstall-global  # Remove global symlink
make uninstall         # Remove virtual environment
```

#### Remove User Data (Optional)

⚠️ **Warning**: This deletes all your analysis data, notebooks, and workspaces!

```bash
# Remove all user data
rm -rf ~/.lobster
rm -rf ~/.lobster_workspace

# Remove project configuration
rm .env  # In your project directory
```

#### Verify Complete Removal

```bash
# Check command removed
which lobster  # Should output nothing

# Check tool not listed (if using uv tool)
uv tool list | grep lobster  # Should output nothing
```
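
The same check can be scripted in Python using only the standard library. This is a minimal sketch; `installation_status` is a hypothetical helper name. Note that `importlib.metadata` only inspects the environment of the interpreter running the script, so a `uv tool` installation (which lives in its own isolated environment) may report no version while the `lobster` command is still found on `PATH`.

```python
import shutil
from importlib import metadata


def installation_status(command: str = "lobster",
                        distribution: str = "lobster-ai") -> dict:
    """Report where the CLI is found and which version is installed, if any."""
    try:
        version = metadata.version(distribution)
    except metadata.PackageNotFoundError:
        version = None
    return {"command_path": shutil.which(command), "version": version}


# Both values are None after a complete removal.
print(installation_status())
```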

## 🔬 Literature Mining & Metadata

Lobster AI automatically searches scientific literature and extracts key information to inform your analyses:

- **Search across databases** - Find relevant papers from PubMed, bioRxiv, and other repositories
- **Full-text retrieval** - Automatically access complete articles when available
- **Methods extraction** - Extract experimental protocols, software parameters, and statistical approaches
- **Dataset discovery** - Search across GEO, SRA, PRIDE, and ENA databases
- **Metadata harmonization** - Convert diverse metadata formats to common schemas
- **Sample ID mapping** - Match samples between different omics datasets
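
As a rough illustration of the idea behind sample ID mapping, the sketch below matches IDs from two datasets after normalizing case and separators. It is a toy example with made-up IDs and hypothetical function names, not Lobster's matching logic; a real implementation would also need to handle collisions and ambiguous matches.

```python
import re


def normalize_id(sample_id: str) -> str:
    """Lowercase and drop separators so 'Sample_01' and 'sample-01' compare equal."""
    return re.sub(r"[\s_\-.]+", "", sample_id).lower()


def map_sample_ids(left: list[str], right: list[str]) -> dict[str, str]:
    """Map each ID in `left` to the ID in `right` with the same normalized form."""
    lookup = {normalize_id(r): r for r in right}
    return {l: lookup[normalize_id(l)] for l in left if normalize_id(l) in lookup}


rna = ["Patient_01", "Patient_02", "Patient_03"]
protein = ["patient-01", "PATIENT.03", "patient-07"]
print(map_sample_ids(rna, protein))
# {'Patient_01': 'patient-01', 'Patient_03': 'PATIENT.03'}
```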

### Natural Language Examples

```bash
# Literature discovery
🦞 You: "Find recent papers about CRISPR screens in cancer"

# Dataset search
🦞 You: "Search GEO for single-cell datasets of pancreatic beta cells"

# Cross-dataset operations
🦞 You: "Concatenate multiple single-cell RNA-seq batches and correct for batch effects"

# Automated extraction
🦞 You: "What analysis parameters did the authors use in PMID:35042229?"
```

## 🔧 Configuration

Lobster AI is configured via the `.env` file in your working directory.

**Works for both global and local installations:**

```bash
# Interactive configuration wizard
lobster init

# Test configuration
lobster config test

# View current configuration
lobster config show
```

**Manual configuration** (advanced users - edit `.env` file):

```bash
# Option A: Claude API
ANTHROPIC_API_KEY=sk-ant-api03-your-key-here

# Option B: AWS Bedrock
AWS_BEDROCK_ACCESS_KEY=your-access-key
AWS_BEDROCK_SECRET_ACCESS_KEY=your-secret-key

# Optional: Enhanced literature search
NCBI_API_KEY=your-ncbi-api-key
NCBI_EMAIL=your.email@example.com

# Optional: Performance tuning
LOBSTER_PROFILE=production
LOBSTER_MAX_FILE_SIZE_MB=500
```

**CI/CD and automation:**
```bash
# Non-interactive mode for scripts and CI/CD
lobster init --non-interactive --anthropic-key=sk-ant-xxx
lobster init --non-interactive --bedrock-access-key=xxx --bedrock-secret-key=yyy
```
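
In a CI job the keys usually come from environment variables rather than being typed into the command. The sketch below assembles the non-interactive `lobster init` invocation from such variables, using only the flags shown above. `build_init_command` is a hypothetical helper, and reusing the `.env` variable names as CI secret names is an assumption.

```python
import os
from typing import Mapping


def build_init_command(env: Mapping[str, str]) -> list[str]:
    """Assemble argv for `lobster init --non-interactive` from environment values."""
    cmd = ["lobster", "init", "--non-interactive"]
    if env.get("ANTHROPIC_API_KEY"):
        cmd.append(f"--anthropic-key={env['ANTHROPIC_API_KEY']}")
    elif env.get("AWS_BEDROCK_ACCESS_KEY") and env.get("AWS_BEDROCK_SECRET_ACCESS_KEY"):
        cmd.append(f"--bedrock-access-key={env['AWS_BEDROCK_ACCESS_KEY']}")
        cmd.append(f"--bedrock-secret-key={env['AWS_BEDROCK_SECRET_ACCESS_KEY']}")
    else:
        raise ValueError("no LLM provider credentials found in the environment")
    return cmd


try:
    command = build_init_command(os.environ)
    # Do not print the full command: its arguments contain secrets.
    print("would run:", " ".join(command[:3]), "...")
except ValueError as exc:
    print(f"cannot configure Lobster: {exc}")
```

Command-line arguments can be visible to other users in process listings on shared machines, so prefer an isolated runner for this step.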

**Complete configuration guide:** [wiki/03-configuration.md](https://github.com/the-omics-os/lobster-local/wiki/03-configuration)

## 🗓️ Roadmap

Lobster follows an **open-core model**: core transcriptomics is open source, while advanced features are offered in premium tiers.

**Open Source (lobster-local):**
- ✅ Single-cell & bulk RNA-seq analysis
- ✅ Literature mining & dataset discovery
- ✅ Protein structure visualization

**Premium Features:**
- Q1 2025: Proteomics platform (DDA/DIA workflows)
- Q2 2025: AI agent toolkit & custom feature generation
- Q3 2025: Lobster Cloud (SaaS, $6K-$30K/year)

**Target:** 50 paying customers, $810K ARR by Month 18

[Full roadmap & pricing](https://github.com/the-omics-os/lobster-local/wiki) | [Contact for enterprise access](mailto:info@omics-os.com)

## 📚 Documentation

- [Full Documentation](https://github.com/the-omics-os/lobster-local/wiki) - Guides and tutorials
- [Example Analyses](https://github.com/the-omics-os/lobster-local/wiki/27-examples-cookbook) - Real-world use cases
- [Architecture Overview](https://github.com/the-omics-os/lobster-local/wiki/18-architecture-overview) - Technical details
- [API Reference](https://github.com/the-omics-os/lobster-local/wiki/13-api-overview) - Complete API documentation

## 🤝 Community & Support

- 🐛 [Report Issues](https://github.com/the-omics-os/lobster-local/issues) - Bug reports and feature requests
- 📧 [Email Support](mailto:info@omics-os.com) - Direct help from our team

### Enterprise Solutions

Need custom integrations or dedicated support? [Contact us](mailto:info@omics-os.com)

## 📄 License

Lobster AI is open source under the GNU Affero General Public License v3.0 or later (AGPL-3.0-or-later). This license ensures that all users, including those accessing the software over a network, receive the freedoms to use, study, share, and modify the software. The AGPL-3.0 license is compatible with GPL-licensed dependencies used in our bioinformatics toolchain.

For commercial licensing options or questions about license compatibility, please contact us at info@omics-os.com.

Documentation is licensed CC-BY-4.0. Contributions are accepted under a Contributor License Agreement to preserve future licensing flexibility.

---

<div align="center">

**Transform Your Bioinformatics Research Today**

[Get Started](#-quick-start) • [Documentation](https://github.com/the-omics-os/lobster-local/wiki)

*Made with ❤️ by [Omics-OS](https://omics-os.com)*

</div>
