Metadata-Version: 2.2
Name: titans-unofficial
Version: 1.0.6
Summary: Unofficial PyTorch implementation of Titans: Learning to Memorize at Test Time
Home-page: https://github.com/Shehryar718/titans-unofficial
Author: Shehryar Sohail
Author-email: hafizshehryar@gmail.com
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: torch==2.5.1
Requires-Dist: numpy==1.22.0
Requires-Dist: einops==0.8.0
Requires-Dist: transformers==4.47.1
Requires-Dist: matplotlib==3.8.4
Requires-Dist: tqdm==4.67.1
Requires-Dist: pytest==8.3.4
Provides-Extra: dev
Requires-Dist: pytest>=8.3.4; extra == "dev"
Requires-Dist: black>=22.0.0; extra == "dev"
Requires-Dist: isort>=5.10.0; extra == "dev"
Requires-Dist: flake8>=4.0.0; extra == "dev"
Dynamic: author
Dynamic: author-email
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: provides-extra
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary

# Unofficial Implementation of Titans: Learning to Memorize at Test Time

[![PyPI version](https://img.shields.io/pypi/v/titans-unofficial.svg)](https://pypi.org/project/titans-unofficial/)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/release/python-3100/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

This is an unofficial PyTorch implementation of the paper ["Titans: Learning to Memorize at Test Time"](https://arxiv.org/abs/2501.00663) by Ali Behrouz, Peilin Zhong, and Vahab Mirrokni.

## Overview

Titans is a neural architecture that combines attention-based short-term memory with a neural long-term memory module. It targets the limitations of both recurrent models, which compress the entire history into a fixed-size state, and attention mechanisms, whose cost grows quadratically with sequence length.
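
To make the cost argument concrete, the sketch below counts how many query/key pairs a causal attention layer scores over a full sequence versus a fixed sliding window. The function names are hypothetical and are not part of this package; the numbers are plain arithmetic, not measurements.

```python
def full_causal_pairs(n: int) -> int:
    """Pairs scored when every position attends to itself and all earlier positions."""
    return n * (n + 1) // 2


def sliding_window_pairs(n: int, window: int) -> int:
    """Pairs scored when each position attends to at most `window` recent positions."""
    return sum(min(i + 1, window) for i in range(n))


n, window = 4096, 128
full = full_causal_pairs(n)                  # grows roughly as n^2 / 2
windowed = sliding_window_pairs(n, window)   # grows roughly as n * window
```

The windowed count grows linearly in the sequence length, which is why a bounded attention window is paired with a separate long-term memory for older context.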

### Key Features

- **Neural Long-term Memory**: A module that learns to memorize historical context
- **Persistent Memory**: Learnable tokens that encode task-specific knowledge
- **Three Architectural Variants**:
  - MAC (Memory as Context): Uses memory as context for attention
  - MAG (Memory as Gate): Combines memory with core branch using gating
  - MAL (Memory as Layer): Integrates memory as a separate layer

## Installation

### From PyPI (Recommended)

```bash
pip install titans-unofficial
```

### From Source (Development)

```bash
# Clone the repository
git clone https://github.com/Shehryar718/titans-unofficial.git
cd titans-unofficial

# Install in editable (development) mode with the dev extras (pytest, black, isort, flake8)
pip install -e ".[dev]"
```

## Requirements

- Python 3.10+
- PyTorch 2.0+ (the PyPI release pins `torch==2.5.1`)
- transformers (for tokenization)
- numpy
- einops
- matplotlib
- tqdm
- pytest (for running tests)

## Project Structure

```
titans-unofficial/
├── titans/
│   ├── __init__.py
│   ├── models/
│   │   ├── titans_base.py    # Base class for all variants
│   │   ├── titans_mac.py     # Memory as Context implementation
│   │   ├── titans_mag.py     # Memory as Gate implementation
│   │   └── titans_mal.py     # Memory as Layer implementation
│   └── utils/
│       ├── memory.py         # Neural Memory Module
│       ├── attention.py      # Attention mechanisms
│       └── persistent_memory.py  # Persistent Memory implementation
├── examples/
│   ├── text_classification.py  # Text classification example
│   ├── language_modeling.py    # Language modeling example
│   └── fine_tuning.py         # Fine-tuning example
├── pytests/
│   └── test_memory.py         # Tests for memory module
├── requirements.txt
├── LICENSE
└── README.md
```

## Usage

### Text Classification

```python
from titans import TitansMAC, TitansMAG, TitansMAL
# `examples/` sits at the repository root, so run this from a source checkout
from examples.text_classification import TitansForClassification

# Initialize model
model = TitansForClassification(
    vocab_size=30000,
    d_model=128,
    n_layers=2,
    n_heads=4,
    num_classes=2,
    memory_depth=2,
    persistent_tokens=8,
    window_size=16,
    model_type="mal"  # Choose from: "mac", "mag", "mal"
)
```

```bash
# Train and evaluate
python examples/text_classification.py
```

### Language Modeling

```python
from titans import TitansMAC, TitansMAG, TitansMAL
# `examples/` sits at the repository root, so run this from a source checkout
from examples.language_modeling import TitansForLanguageModeling

# Initialize model
model = TitansForLanguageModeling(
    vocab_size=30000,
    d_model=128,
    n_layers=2,
    n_heads=4,
    memory_depth=2,
    persistent_tokens=16,
    window_size=128,
    model_type="mac"
)
```

```bash
# Train and generate text
python examples/language_modeling.py
```

## Architecture Details

### Neural Memory Module

The neural memory module consists of:
- Key/Value/Query projections for memory access
- Multi-layer perceptron for memory processing
- Momentum-based update mechanism with configurable parameters
- Weight decay as a forgetting mechanism
- Gradient scaling for numerical stability
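
The momentum and weight-decay bullets above can be illustrated with a minimal, dependency-free sketch. It assumes a linear memory matrix `M`, a squared recall error as the loss, and fixed scalar hyperparameters; the function names and default values are hypothetical, and this package's actual module (an MLP with learned projections) will differ.

```python
def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]


def recall_error(M, k, v):
    """Squared error between the value recalled for key k and the target v."""
    return sum((p - t) ** 2 for p, t in zip(matvec(M, k), v))


def memory_step(M, S, k, v, lr=0.1, momentum=0.5, decay=0.01):
    """One test-time update of the memory on a single (key, value) pair.

    S_new = momentum * S - lr * d(recall_error)/dM   (momentum term)
    M_new = (1 - decay) * M + S_new                  (weight decay = forgetting)
    """
    err = [p - t for p, t in zip(matvec(M, k), v)]
    grad = [[2.0 * e * kj for kj in k] for e in err]  # gradient is 2 * err * k^T
    S_new = [[momentum * s - lr * g for s, g in zip(s_row, g_row)]
             for s_row, g_row in zip(S, grad)]
    M_new = [[(1.0 - decay) * m + s for m, s in zip(m_row, s_row)]
             for m_row, s_row in zip(M, S_new)]
    return M_new, S_new


# Repeatedly presenting one association drives its recall error down.
k, v = [1.0, 0.0], [0.5, -0.5]
M = [[0.0, 0.0], [0.0, 0.0]]
S = [[0.0, 0.0], [0.0, 0.0]]
before = recall_error(M, k, v)
for _ in range(50):
    M, S = memory_step(M, S, k, v)
after = recall_error(M, k, v)
```

Because of the decay term, the memory settles slightly short of a perfect fit; that residual is the price of being able to forget stale associations.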

### Variants

1. **MAC (Memory as Context)**
   - Memory output serves as additional context
   - Efficient for tasks requiring long-range dependencies
   - Chunked computation with a configurable chunk size
   - Optional parallel processing of chunks

2. **MAG (Memory as Gate)**
   - Gating mechanism to combine memory with core processing
   - Adaptive balance between short and long-term memory
   - Enhanced numerical stability
   - Improved gradient flow through gating

3. **MAL (Memory as Layer)**
   - Memory integrated as a separate layer
   - Direct memory access at each layer
   - Sliding window attention for efficiency
   - Layer-wise memory updates
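
As a rough illustration of how the three variants differ, the sketch below shows one toy operation per variant using plain Python lists. The function names are hypothetical, the sigmoid gate is a generic choice rather than necessarily the gating used in this package, and real implementations operate on batched tensors.

```python
import math


def mac_context(persistent, retrieved, segment):
    """MAC: attention reads [persistent tokens | retrieved memory | current segment]."""
    return persistent + retrieved + segment


def mag_combine(attn_out, mem_out, gate_logits):
    """MAG: an element-wise sigmoid gate blends the attention and memory branches."""
    out = []
    for a, m, z in zip(attn_out, mem_out, gate_logits):
        g = 1.0 / (1.0 + math.exp(-z))  # gate in (0, 1)
        out.append(g * a + (1.0 - g) * m)
    return out


def sliding_window_mask(seq_len, window):
    """MAL: causal mask where position i sees itself and the previous window-1 positions."""
    return [[(j <= i) and (i - j < window) for j in range(seq_len)]
            for i in range(seq_len)]


context = mac_context(["p0", "p1"], ["m0"], ["x0", "x1"])
blended = mag_combine([1.0], [3.0], [0.0])   # a gate logit of 0 weights both branches equally
mask = sliding_window_mask(4, 2)
```

In MAL the memory layer would run before the windowed attention, so the mask only constrains the attention half of each block.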

## Example Tasks

The repository includes implementations for:
- Text Classification (Binary and multi-class)
- Language Modeling with test-time adaptation
- Fine-tuning with early stopping

Each example demonstrates different aspects of the Titans architecture:
- Memory reset between epochs for fresh adaptation
- Efficient batch processing with dynamic batching
- Gradient scaling for numerical stability
- Early stopping and model checkpointing
- Proper memory state management
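
The early-stopping and checkpointing pattern listed above can be sketched independently of any model. The `EarlyStopping` class and the validation losses below are hypothetical and exist only for illustration; the example scripts in this repository may structure their loops differently.

```python
class EarlyStopping:
    """Stop training once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch; return True when it is a new best (time to checkpoint)."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
            return True
        self.bad_epochs += 1
        return False

    @property
    def should_stop(self):
        return self.bad_epochs >= self.patience


val_losses = [0.90, 0.70, 0.72, 0.71, 0.73, 0.60]  # made-up values for illustration
stopper = EarlyStopping(patience=3)
checkpoints = []
for epoch, loss in enumerate(val_losses):
    # A real loop would reset the model's memory state here, then train and validate.
    if stopper.step(loss):
        checkpoints.append(epoch)  # placeholder for saving model weights
    if stopper.should_stop:
        break
```

With a patience of 3, the loop stops before reaching the final, lower loss; a larger patience trades extra training time for a chance to recover from plateaus.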

## Testing

```bash
# Run all tests
pytest pytests/

# Run specific test file
pytest pytests/test_memory.py
```

## Citation

This repository provides an **unofficial implementation** of the Titans architecture.  
If you reference this work, please **cite the original paper**:

```bibtex
@article{behrouz2024titans,
  title={Titans: Learning to Memorize at Test Time},
  author={Behrouz, Ali and Zhong, Peilin and Mirrokni, Vahab},
  journal={arXiv preprint arXiv:2501.00663},
  year={2024}
}
```

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

