Metadata-Version: 2.4
Name: noetl
Version: 0.1.27
Summary: A framework to build and run data pipelines and workflows.
Author-email: Kadyapam <182583029+kadyapam@users.noreply.github.com>
License-Expression: MIT
Project-URL: Homepage, https://noetl.io
Project-URL: Repository, https://github.com/noetl/noetl
Project-URL: Issues, https://github.com/noetl/noetl/issues
Keywords: etl,data,pipeline,workflow,automation
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: fastapi>=0.115.6
Requires-Dist: pydantic>=2.11.4
Requires-Dist: aiofiles==24.1.0
Requires-Dist: psycopg[binary,pool]>=3.2.7
Requires-Dist: connectorx>=0.4.3
Requires-Dist: greenlet>=3.2.1
Requires-Dist: uvicorn>=0.34.0
Requires-Dist: requests>=2.32.3
Requires-Dist: httpx>=0.28.1
Requires-Dist: google-auth>=2.27.0
Requires-Dist: python-multipart==0.0.20
Requires-Dist: PyYAML>=6.0.1
Requires-Dist: Jinja2>=3.1.6
Requires-Dist: pycryptodome>=3.21
Requires-Dist: urllib3>=2.3
Requires-Dist: Authlib>=1.5.1
Requires-Dist: typer>=0.15.3
Requires-Dist: click<8.2.1,>=8.1.0
Requires-Dist: psutil>=7.0.0
Requires-Dist: memray>=1.17.2
Requires-Dist: deepdiff>=8.4.2
Requires-Dist: pandas>=2.2.3
Requires-Dist: lark>=1.2.2
Requires-Dist: duckdb>=1.3.0
Requires-Dist: duckdb-engine>=0.17.0
Requires-Dist: polars[pyarrow]>=1.30.0
Requires-Dist: matplotlib>=3.10.3
Requires-Dist: networkx>=3.5
Requires-Dist: pydot>=4.0.1
Requires-Dist: jupysql>=0.11.1
Requires-Dist: jupyterlab>=4.4.3
Requires-Dist: fsspec>=2025.5.1
Requires-Dist: gcsfs>=2025.5.1
Requires-Dist: boto3>=1.38.45
Requires-Dist: azure-identity>=1.23.0
Requires-Dist: azure-keyvault-secrets>=4.8.0
Provides-Extra: dev
Requires-Dist: pytest>=8.1.1; extra == "dev"
Requires-Dist: pytest-asyncio>=0.23.6; extra == "dev"
Requires-Dist: pytest-cov>=4.1.0; extra == "dev"
Requires-Dist: pytest-mock>=3.14.0; extra == "dev"
Provides-Extra: publish
Requires-Dist: build>=1.2.2.post1; extra == "publish"
Requires-Dist: twine>=6.1.0; extra == "publish"
Dynamic: license-file

# Not Only ETL

__NoETL__ is an automation framework for data processing and MLOps orchestration.

[![PyPI version](https://badge.fury.io/py/noetl.svg)](https://badge.fury.io/py/noetl)
[![Python Version](https://img.shields.io/pypi/pyversions/noetl.svg)](https://pypi.org/project/noetl/)
[![License](https://img.shields.io/pypi/l/noetl.svg)](https://github.com/noetl/noetl/blob/main/LICENSE)

## Quick Start

### Installation

- Install NoETL from PyPI:
  ```bash
  pip install noetl
  ```

For development or specific versions:
- Install in a virtual environment
  ```bash
  python -m venv .venv
  source .venv/bin/activate
  pip install noetl
  ```
- For Windows users (in PowerShell)
  ```powershell
  python -m venv .venv
  Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
  .venv\Scripts\Activate.ps1
  pip install noetl
  ```
- Install a specific version
  ```bash
  pip install noetl==0.1.24
  ```

### Prerequisites

- Python 3.12+
- For full functionality:
  - Postgres database (required for persistent event log storage and NoETL system metadata)
  - Docker (optional, for containerized development and deployment)
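
The package metadata declares `Requires-Python: >=3.12`, so it can help to check the interpreter before installing. The following is a minimal sketch using only the standard library; the helper names `python_ok` and `installed_version` are illustrative and not part of NoETL.

```python
import sys
from importlib import metadata

# Minimum interpreter version, taken from "Requires-Python: >=3.12".
MIN_PYTHON = (3, 12)


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= tuple(minimum)


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("Installed noetl version:", installed_version("noetl"))
```

Run it inside the virtual environment you intend to use; `installed_version("noetl")` is `None` until `pip install noetl` has completed there.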

## Basic Usage

After installing NoETL:

### 1. Run the NoETL Server

Start the NoETL server to access the web UI and REST API:

```bash
noetl server
```

This starts the server on http://localhost:8082 by default.
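
To confirm that something is listening on the default port, a plain TCP connection check is enough. This sketch assumes the default host and port mentioned above and deliberately avoids calling any REST endpoint, since none is documented here; `is_listening` is a hypothetical helper name.

```python
import socket


def is_listening(host="localhost", port=8082, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError on refusal or timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("NoETL server reachable:", is_listening())
```

A `True` result only shows that the port accepts connections, not that the process behind it is NoETL.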

### 2. Using the Command Line

NoETL has a command-line interface for registering and executing playbooks:

- Register a playbook in the catalog
  ```bash
  noetl playbooks --register ./<path to playbooks folder>/playbooks.yaml
  ```
- Execute a playbook from the catalog
  ```bash
  noetl playbooks --execute --path "workflows/example/playbook"
  ```
- Execute a playbook directly
  ```bash
  noetl agent -f ./<path to playbooks folder>/playbooks.yaml
  ```
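
When these commands are driven from a script, it is safer to build them as argument lists than as shell strings. The sketch below uses only the subcommands and flags shown above; the function names and the file path `playbooks/example.yaml` are hypothetical.

```python
import shlex


def register_cmd(playbook_file):
    """Argument list for registering a playbook file in the catalog."""
    return ["noetl", "playbooks", "--register", str(playbook_file)]


def execute_cmd(catalog_path):
    """Argument list for executing a playbook by its catalog path."""
    return ["noetl", "playbooks", "--execute", "--path", str(catalog_path)]


def agent_cmd(playbook_file):
    """Argument list for executing a playbook file directly."""
    return ["noetl", "agent", "-f", str(playbook_file)]


if __name__ == "__main__":
    commands = (
        register_cmd("playbooks/example.yaml"),  # hypothetical file path
        execute_cmd("workflows/example/playbook"),
        agent_cmd("playbooks/example.yaml"),
    )
    for argv in commands:
        # Print the shell-quoted form; to run one for real, pass the list
        # to subprocess.run(argv, check=True).
        print(shlex.join(argv))
```

Passing a list to `subprocess.run` avoids shell quoting problems when a playbook path contains spaces.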

### 3. Docker Deployment

For containerized deployment:

```bash
docker pull noetl/noetl:latest
docker run -p 8082:8082 noetl/noetl:latest
```

## Workflow DSL Structure

NoETL uses a declarative YAML-based Domain Specific Language (DSL) for defining workflows. The key components of a NoETL playbook include:

- **Metadata**: Version, path, and description of the playbook
- **Workload**: Input data and parameters for the workflow
- **Workflow**: A list of steps, each declared with `step: step_name`. It includes:
  - **Steps**: Individual operations in the workflow
  - **Tasks**: Actions performed at each step (HTTP requests, database operations, Python code)
  - **Transitions**: Rules for moving between steps
  - **Conditions**: Logic for branching the workflow
- **Workbook**: Reusable task definitions that can be called from workflow steps, including:
  - **Task Types**: Python, HTTP, DuckDB, PostgreSQL, Secret
  - **Parameters**: Input parameters for the tasks
  - **Code**: Implementation of the tasks
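
The outline above can be pictured as a nested mapping. The sketch below is an illustration only: the lowercase key names `workload`, `workflow`, and `workbook`, the sample values, and both helper functions are assumptions rather than the actual NoETL schema. Only the `step` key comes from the description above.

```python
# Assumed top-level key names; check the Playbook Structure guide for the real schema.
REQUIRED_SECTIONS = ("workload", "workflow", "workbook")


def missing_sections(playbook):
    """Return the assumed top-level sections that the playbook lacks."""
    return [key for key in REQUIRED_SECTIONS if key not in playbook]


def step_names(playbook):
    """Return the names of all workflow entries that declare a step."""
    return [item["step"] for item in playbook.get("workflow", []) if "step" in item]


# A hypothetical playbook, as it might look after loading the YAML file.
playbook = {
    "path": "workflows/example/playbook",
    "workload": {"city": "Paris"},                      # input parameters
    "workflow": [{"step": "start"}, {"step": "end"}],   # each entry names a step
    "workbook": [],                                     # reusable task definitions
}

if __name__ == "__main__":
    print("Missing sections:", missing_sections(playbook))
    print("Steps:", step_names(playbook))
```

A check like this can catch a misspelled section name before a playbook is registered, but it is no substitute for the validation NoETL itself performs.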

For examples of NoETL playbooks and detailed explanations, see the [Examples Guide](https://github.com/noetl/noetl/blob/master/docs/examples.md).

To run a playbook:

```bash
noetl agent -f path/to/playbooks.yaml
```

## Documentation

For more detailed information, please refer to the following documentation:

> **Note:**  
> When installed from PyPI, the `docs` folder is included in your local package.  
> You can find all documentation files in the `docs/` directory of your installed package.

### Getting Started
- [Installation Guide](https://github.com/noetl/noetl/blob/master/docs/installation.md) - Installation instructions
- [CLI Usage Guide](https://github.com/noetl/noetl/blob/master/docs/cli_usage.md) - Command-line interface usage
- [API Usage Guide](https://github.com/noetl/noetl/blob/master/docs/api_usage.md) - REST API usage
- [Docker Usage Guide](https://github.com/noetl/noetl/blob/master/docs/docker_usage.md) - Docker deployment

### Core Concepts
- [Playbook Structure](https://github.com/noetl/noetl/blob/master/docs/playbook_structure.md) - Structure of NoETL playbooks
- [Workflow Tasks](https://github.com/noetl/noetl/blob/master/docs/action_type.md) - Action types and parameters
- [Environment Configuration](https://github.com/noetl/noetl/blob/master/docs/environment_variables.md) - Setting up environment variables


### Advanced Examples

NoETL includes several example playbooks that demonstrate more advanced capabilities:

- **Weather API Integration** - Fetches and processes weather data from external APIs
- **Database Operations** - Demonstrates Postgres and DuckDB integration
- **Google Cloud Storage** - Shows secure cloud storage operations with Google Cloud
- **Secrets Management** - Illustrates secure handling of credentials and sensitive data
- **Multi-Playbook Workflows** - Demonstrates complex workflow orchestration

For detailed examples, see the [Examples Guide](https://github.com/noetl/noetl/blob/master/docs/examples.md).

## Development

For information about contributing to NoETL or building from source:

- [Development Guide](https://github.com/noetl/noetl/blob/master/docs/development.md) - Setting up a development environment
- [PyPI Publishing Guide](https://github.com/noetl/noetl/blob/master/docs/pypi_manual.md) - Building and publishing to PyPI

## Community & Support

- **GitHub Issues**: [Report bugs or request features](https://github.com/noetl/noetl/issues)
- **Documentation**: [Full documentation](https://noetl.io/docs)
- **Website**: [https://noetl.io](https://noetl.io)

## License

NoETL is released under the MIT License. See the [LICENSE](LICENSE) file for details.
