Metadata-Version: 2.4
Name: dxtrx
Version: 0.0.5
Summary: Dagster extra utilities for data processing
License: MIT
Requires-Python: >=3.11.8
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: polars>=0.20.0
Requires-Dist: google-cloud-storage>=1.0.0
Requires-Dist: requests>=2.0.0
Requires-Dist: msal>=1.0.0
Requires-Dist: pyyaml>=5.0.0
Requires-Dist: pandas>=1.0.0
Requires-Dist: python-dotenv>=0.19.0
Requires-Dist: pyarrow>=10.0.0
Requires-Dist: sqlalchemy>=1.4.0
Requires-Dist: psycopg2-binary>=2.8.0
Requires-Dist: duckdb-engine>=0.9.0
Requires-Dist: fastexcel>=0.10.0
Requires-Dist: orjson>=3.0.0
Requires-Dist: sqlglot>=20.0.0
Requires-Dist: aiohttp>=3.8.0
Requires-Dist: openpyxl>=3.0.0
Requires-Dist: fsspec>=2022.1.0
Requires-Dist: gcsfs>=2022.1.0
Provides-Extra: dev
Requires-Dist: pytest; extra == "dev"
Requires-Dist: pytest-mock; extra == "dev"
Requires-Dist: pytest-cov; extra == "dev"
Requires-Dist: pytest-asyncio; extra == "dev"
Requires-Dist: pytest-watch; extra == "dev"
Dynamic: license-file

# Dxtr Dagster Library

A Python library that provides utilities and components for data engineering workflows built on Dagster. It focuses on data processing tasks: downloading files from SharePoint, loading data into PostgreSQL, and performing data transformations.

## Project Structure

The library is organized into the following components:

```
dxtr/
├── dxtr/                 # Main library package
│   ├── dagster/          # Dagster-specific components and resources
│   └── utils/            # Utility functions
├── pyproject.toml        # Project configuration and dependencies
└── README.md             # This file
```

## Features

- SharePoint data file downloading
- SQLAlchemy data loading
- Data transformation capabilities
- Integration with Dagster for workflow orchestration

## Dependencies

The library requires Python 3.11.8 or higher and includes key dependencies such as:
- polars
- google-cloud-storage
- requests
- msal
- pandas
- sqlalchemy
- psycopg2-binary
- and more (see pyproject.toml for the complete list)
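
As a rough illustration of the constraints above, the following sketch checks the interpreter version against the stated minimum and lists the requirements declared by an installed distribution. It uses only the standard library; the helper names `python_is_supported` and `declared_requirements` are hypothetical and are not part of this library.

```python
import sys
from importlib import metadata

# Minimum interpreter version stated in this README (3.11.8 or higher).
MIN_PYTHON = (3, 11, 8)


def python_is_supported(version=None):
    """Return True when a (major, minor, micro) tuple meets the minimum.

    Defaults to the running interpreter when no version is given.
    """
    current = tuple(sys.version_info[:3]) if version is None else tuple(version)
    return current >= MIN_PYTHON


def declared_requirements(dist_name="dxtrx"):
    """Return the requirement strings of an installed distribution.

    Returns an empty list when the distribution is not installed or
    declares no requirements.
    """
    try:
        return metadata.requires(dist_name) or []
    except metadata.PackageNotFoundError:
        return []
```

Calling `declared_requirements()` in an environment where the package is installed should list the same dependencies that pyproject.toml declares.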

## Development

### Installation

For development purposes, install the package in editable mode:
```bash
pip install -e ".[dev]" --config-settings editable_mode=compat
```

The library requires several environment variables to be set:
- SharePoint credentials
- Database credentials
- Other configuration variables

A more convenient way of working with this code is to use `./dxtrx.sh`, which sets up the environment and starts the Dagster code server. Please refer to the Wiki for detailed setup instructions.
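
Before starting the code server by hand, it can help to confirm that the required environment variables are present. The sketch below is a minimal, standard-library check. The variable names in `REQUIRED_VARS` are placeholders invented for illustration; the real names are documented in the Wiki, not here.

```python
import os

# Hypothetical variable names, for illustration only. Replace them with
# the names documented in the project Wiki.
REQUIRED_VARS = (
    "SHAREPOINT_CLIENT_ID",
    "SHAREPOINT_CLIENT_SECRET",
    "DATABASE_URL",
)


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


def check_env(names=REQUIRED_VARS, environ=None):
    """Raise RuntimeError listing any missing variables; otherwise do nothing."""
    missing = missing_env_vars(names, environ)
    if missing:
        raise RuntimeError(
            "Missing environment variables: " + ", ".join(missing)
        )
```

Calling `check_env()` at the top of a local script fails fast with a readable message instead of surfacing later as an authentication or connection error.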

### Contributing Guidelines

When contributing to this library:
1. Follow the existing code structure and naming conventions
2. Add new components in the appropriate directories
3. Update documentation as needed
4. Test changes locally
5. Submit PRs with evidence of testing and team review

#### Running tests

To run the tests, use the following command:
```bash
pytest
```

Alternatively, run them in watch mode:
```bash
ptw
```

