Metadata-Version: 2.4
Name: cosmos_rl
Version: 0.1.9
Summary: Cosmos-RL is a flexible and scalable Reinforcement Learning framework specialized for Physical AI applications.
Project-URL: Homepage, https://github.com/nvidia-cosmos/cosmos-rl
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: fastapi
Requires-Dist: uvicorn
Requires-Dist: transformers!=4.52.*,!=4.53.*,>=4.51.1
Requires-Dist: datasets
Requires-Dist: pydantic
Requires-Dist: math_verify
Requires-Dist: pybind11[global]
Requires-Dist: vllm>=0.8.5
Requires-Dist: torch>=2.6.0
Requires-Dist: torchvision>=0.21.0
Requires-Dist: torchao>=0.10.0
Requires-Dist: tensordict==0.7.2
Requires-Dist: toml
Requires-Dist: StrEnum
Requires-Dist: redis
Requires-Dist: msgpack
Requires-Dist: qwen_vl_utils
Requires-Dist: flash-attn>=2.7.4.post1
Requires-Dist: pynvml
Requires-Dist: boto3
Requires-Dist: modelscope
Requires-Dist: cloudpickle
Requires-Dist: click
Requires-Dist: rich
Requires-Dist: pyyaml
Requires-Dist: blobfile
Requires-Dist: nvidia-nccl-cu12>=2.26.2
Dynamic: license-file

<p align="center">
    <img src="https://raw.githubusercontent.com/nvidia-cosmos/cosmos-rl/main/assets/nvidia-cosmos-header.png" alt="NVIDIA Cosmos Header">
</p>


## Getting Started

Cosmos-RL is a flexible and scalable Reinforcement Learning framework specialized for Physical AI applications.

[Documentation](https://nvidia-cosmos.github.io/cosmos-rl).

## System Architecture
Cosmos-RL provides toolchain to enable large scale RL training workload with following features:
1. **HuggingFace Integration**
    - Llama-2
    - Llama-3
    - Qwen-2.5
    - Qwen-2.5-VL
    - Qwen-3
    - Qwen-3-MoE
    - Moonlight-MoE
    - All HF LLMs
2. **Parallelism**
    - Tensor Parallelism
    - Sequence Parallelism
    - Context Parallelism
    - FSDP Parallelism
    - Pipeline Parallelism
3. **Fully asynchronous (replicas specialization)**
    - Policy (Consumer): Replicas of training instances
    - Rollout (Producer): Replicas of generation engines
    - Low-precision training (FP8) and rollout (FP8 & FP4) support
4. **Single-Controller Architecture**
    - Efficient messaging system (e.g., `weight-sync`, `rollout`, `evaluate`) to coordinate policy and rollout replicas
    - Dynamic NCCL Process Groups for on-the-fly GPU [un]registration to enable fault-tolerant and elastic large-scale RL training

![Policy-Rollout-Controller Decoupled Architecture](https://raw.githubusercontent.com/nvidia-cosmos/cosmos-rl/main/assets/rl_infra.svg)

## License and Contact

This project will download and install additional third-party open source software projects. Review the license terms of these open source projects before use.

NVIDIA Cosmos source code is released under the [Apache 2 License](https://www.apache.org/licenses/LICENSE-2.0).

NVIDIA Cosmos models are released under the [NVIDIA Open Model License](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license). For a custom license, please contact [cosmos-license@nvidia.com](mailto:cosmos-license@nvidia.com).
