Metadata-Version: 2.4
Name: NL2SQLEvaluator
Version: 1.1.9
Summary: Add your description here
License-File: LICENSE
Requires-Python: >=3.10
Requires-Dist: datasets>=4.0.0
Requires-Dist: fastparquet>=2024.11.0
Requires-Dist: func-timeout>=4.3.5
Requires-Dist: langgraph>=0.4.0
Requires-Dist: loguru>=0.7.3
Requires-Dist: numpy>=1.9.0
Requires-Dist: pandas>=2.3.2
Requires-Dist: pyarrow>=21.0.0
Requires-Dist: pydantic>=2.11.7
Requires-Dist: pymysql>=1.1.1
Requires-Dist: python-dotenv>=1.1.1
Requires-Dist: sqlalchemy>=2.0.41
Requires-Dist: sqlglot>=27.11.0
Requires-Dist: tqdm>=4.67.1
Requires-Dist: wandb>=0.21.3
Description-Content-Type: text/markdown

# NL2SQLEvaluator

# Roadmap
- [ ] Add MySQL database executor 
- [ ] Add Precision, Recall, and F1 metrics for ambiguous Text2SQL datasets

👷🏼‍♂️ Work in progress 

# Configuration Guide

## 🔧 Use a YAML config (with CLI overrides)

Run your experiment with a config file:

```bash
nl2sql_eval --config path/to/config.yaml
```

### Example `config.yaml`

```yaml
# Core
output_dir: ./outputs
seed: 42

# Dataset
relative_db_base_path: data/bird_dev/dev_databases
dataset_path: simone-papicchio/bird
dataset_name: bird-dev

# Model
model_name: Qwen3-Coder-30B
model: Qwen/Qwen3-Coder-30B-A3B-Instruct
temperature: 0.7
top_p: 0.8
top_k: 20
repetition_penalty: 1.05
max_tokens: 32000

# Weights & Biases
project: text2sql-eval
entity: spapicchio-politecnico-di-torino   # or your team
group: evals
mode: online                               # or "offline" on clusters without network access
tags: [eval, seg]
notes: ""
job_type: eval
```

### Override any value from the CLI

Command-line flags take precedence over the YAML:

```bash
nl2sql_eval --config config.yaml \
  --output_dir ./outputs/run-42 \
  --mode offline \
  --temperature 0.2 \
  --max_tokens 4096 \
  --tags eval --tags ablation
```

### Notes

* The config is **flat**: every key sits at the top level, which is the layout the parser expects.
* Lists (e.g., `tags`) can be given as a YAML list or by repeating the flag on the CLI (`--tags ...` multiple times).
* Booleans accept `true/false` in YAML and `--flag true/false` on the CLI.
* This package uses TRL's [`TrlParser`](https://huggingface.co/docs/trl/main/en/script_utils), which builds on HF's `HfArgumentParser`, under the hood, so the same configuration behaviors and options apply.
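
As a rough illustration of the flatness note, the sketch below checks an already-loaded config mapping for nested sections before it is passed to the CLI. The `find_nested_keys` helper is hypothetical and is not provided by this package. The example dictionaries stand in for parsed YAML.

```python
def find_nested_keys(config: dict) -> list[str]:
    """Return the top-level keys whose values are mappings, i.e. keys that break flatness."""
    return [key for key, value in config.items() if isinstance(value, dict)]


# Flat layout: scalars and lists directly at the top level.
flat_config = {"output_dir": "./outputs", "seed": 42, "tags": ["eval", "seg"]}

# Nested layout: model settings grouped under a "model" section.
nested_config = {"model": {"name": "Qwen3-Coder-30B", "temperature": 0.7}, "seed": 42}

print(find_nested_keys(flat_config))
print(find_nested_keys(nested_config))
```

Lists such as `tags` are not flagged. Only nested mappings are reported.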