Metadata-Version: 2.4
Name: oect-infra
Version: 1.0.5
Summary: OECT (Organic Electrochemical Transistor) data processing infrastructure for experiment management, feature engineering, and analysis
Author-email: lidonghao <lidonghao100@outlook.com>
License: MIT
Project-URL: Homepage, https://github.com/Durian-leader/oect-infra-package
Project-URL: Documentation, https://github.com/Durian-leader/oect-infra-package/blob/main/README.md
Project-URL: Repository, https://github.com/Durian-leader/oect-infra-package
Project-URL: Issues, https://github.com/Durian-leader/oect-infra-package/issues
Keywords: OECT,data-processing,scientific-computing,hdf5,feature-engineering,experiment-management,materials-science,electrochemistry
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Chemistry
Classifier: Topic :: Scientific/Engineering :: Physics
Requires-Python: >=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: h5py>=3.8.0
Requires-Dist: pandas>=2.0.0
Requires-Dist: numpy>=1.24.0
Requires-Dist: matplotlib>=3.7.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: scipy>=1.10.0
Requires-Dist: scikit-learn>=1.3.0
Requires-Dist: PyYAML>=6.0
Requires-Dist: Pillow>=10.0.0
Requires-Dist: tabulate>=0.9.0
Requires-Dist: python-pptx>=0.6.21
Requires-Dist: numba>=0.57.0
Requires-Dist: opencv-python>=4.8.0
Requires-Dist: pyarrow>=14.0.0
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
Requires-Dist: black>=23.0.0; extra == "dev"
Requires-Dist: isort>=5.12.0; extra == "dev"
Requires-Dist: mypy>=1.0.0; extra == "dev"
Dynamic: license-file

# OECT-Infra

**A comprehensive data processing infrastructure for OECT (Organic Electrochemical Transistor) experiments**

[![PyPI version](https://badge.fury.io/py/oect-infra.svg)](https://badge.fury.io/py/oect-infra)
[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

## Overview

OECT-Infra is an end-to-end platform that transforms raw experimental data into high-performance structured formats, providing standardized feature engineering, visualization, and reporting capabilities for OECT research.

### Key Features

- **🔄 Data Conversion**: Parallel batch conversion from CSV/JSON to standardized HDF5 format
- **📊 Lazy-Loading API**: Efficient access to experimental metadata and measurement data with intelligent caching
- **🔧 Feature Engineering**:
  - **V1**: Extract transfer characteristics (gm, Von, |I|, etc.) in columnar HDF5 format
  - **V2**: Advanced DAG-based extraction with YAML configs, Parquet storage, and HuggingFace-style API
- **📁 Unified Data Catalog**: SQLite-based indexing with bidirectional file↔database sync
- **📈 Visualization**: High-performance plotting with animation/video export
- **📄 Automated Reporting**: Configurable PowerPoint generation for stability analysis
- **📉 Degradation Analysis**: 17+ power law models with multi-metric comparison framework

## Installation

```bash
pip install oect-infra
```

### Requirements

- Python 3.11 or higher
- Core dependencies: h5py, pandas, numpy, matplotlib, pydantic, scipy, scikit-learn, PyYAML

## Quick Start

### Using the Unified Interface

```python
from infra.catalog import UnifiedExperimentManager

# Initialize manager
manager = UnifiedExperimentManager('catalog_config.yaml')

# Get an experiment
exp = manager.get_experiment(chip_id="#20250804008", device_id="3")

# Access data
transfer_data = exp.get_transfer_data()
features = exp.get_features(['gm_max_forward', 'Von_forward'])

# Visualization
fig = exp.plot_transfer_evolution()
```

### Using the Command-Line Interface

```bash
# Initialize catalog system
catalog init --auto-config

# Scan and index HDF5 files
catalog scan --path data/raw --recursive

# Synchronize data
catalog sync --direction both

# Query experiments
catalog query --chip "#20250804008" --output table

# Extract Features V2
catalog v2 extract-batch --feature-config v2_ml_ready --workers 4
```

### Features V2 Extraction

```python
# Single experiment with V2
exp = manager.get_experiment(chip_id="#20250804008", device_id="3")
result_df = exp.extract_features_v2('v2_transfer_basic', output_format='dataframe')

# Batch extraction
experiments = manager.search(chip_id="#20250804008")
result = manager.batch_extract_features_v2(
    experiments=experiments,
    feature_config='v2_ml_ready',
    save_format='parquet',
    n_workers=4
)
```

## Architecture

### Layered Design

**Core Foundation (L0)**
- `csv2hdf`: Data conversion
- `experiment`: Data access
- `oect_transfer`: Transfer characteristics analysis
- `features`: Feature storage

**Business Application (L1)**
- `features_version`: Feature workflows V1
- `features_v2`: Feature engineering V2 system
- `visualization`: Plotting tools

**Application Integration (L2)**
- `catalog`: Unified management
- `stability_report`: Report generation

### Data Flow Pipeline

```
CSV/JSON → csv2hdf → Raw HDF5 → experiment (lazy-loading)
         → [V1] oect_transfer & features_version → Feature HDF5
         → [V2] features_v2 (DAG compute graph) → Feature Parquet
         → catalog (indexing + workflow metadata) → visualization/stability_report
```

## Configuration

OECT-Infra uses YAML configuration files. Create a `catalog_config.yaml`:

```yaml
roots:
  raw_data: "data/raw"
  features_v1: "data/features"
  features_v2: "data/features_v2"

database:
  path: "catalog.db"

sync:
  conflict_strategy: "keep_newer"
```

## Documentation

- [Complete Documentation](https://github.com/Durian-leader/oect-infra-package/blob/main/README.md)
- Package documentation included in the installed package
- See `infra/` subdirectory for detailed module documentation

## Examples

Check out example notebooks in the source [repository](https://github.com/Durian-leader/oect-infra-package):
- Example notebooks and scripts included in package
- Comprehensive API documentation in module docstrings

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Citation

If you use OECT-Infra in your research, please cite:

```bibtex
@software{oect_infra,
  author = {lidonghao},
  title = {OECT-Infra: Data Processing Infrastructure for OECT Experiments},
  year = {2025},
  url = {https://github.com/Durian-leader/oect-infra-package}
}
```

## Support

For issues and questions:
- GitHub Issues: https://github.com/Durian-leader/oect-infra-package/issues
- Email: lidonghao100@outlook.com

## Acknowledgments

This project was developed for OECT (Organic Electrochemical Transistor) research, providing tools for efficient data management, analysis, and visualization in materials science and electrochemistry research.
