Metadata-Version: 2.1
Name: KAFYTraj
Version: 0.1.16
Summary: This library includes an extensible system for building various trajectory operations.
Author: Youssef Hussein
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: h3>=3.7.0
Requires-Dist: transformers
Requires-Dist: datasets
Requires-Dist: tokenizers
Requires-Dist: pyarrow<16.0.0,>=14.0.1

# KafyTraj [KAFY] Library

The KafyTraj library (KAFY for short) provides an extensible system for various trajectory operations and includes a versatile command-line interface (CLI) for managing these operations with SQL-like commands. It functions as a toolkit for researchers, facilitating the construction, management, and execution of diverse trajectory operations.

### The Meaning Behind the Name “KAFYTraj”

- **KAFY**: means “sufficient” in Arabic.
- **Traj**: short for “trajectory,” emphasizing the library’s focus on trajectory data processing and analysis.

## Features

- **Extensible System**: Designed to accommodate various trajectory operations with flexibility and ease.
- **User-Friendly**: Simplifies the construction and management of trajectory operations, making it accessible to researchers.
- **SQL-like Command Interface**: Allows users to execute trajectory operations through intuitive, SQL-like commands.

## Installation

You can install the KAFY library using pip:

```bash
pip install KAFYTraj
```

## Usage

To use the library, import the `TrajectoryPipeline` class from `KAFY`:

```python
from KAFY import TrajectoryPipeline

# Initialize the pipeline
my_pipeline = TrajectoryPipeline(
    mode="pretraining",
    operation_type="generation",
    use_tokenization=True,
    use_detokenization=True,
    use_spatial_constraints=True,
    modify_spatial_constraints=True,
    use_predefined_spatial_constraints=True,
    project_path="/content/"
)
```

## Defining Project Location

Before starting, you need to define the project location where all project-related data will be saved. This includes directories such as `modelsRepo` and `TrajectoryStore`. By default, the project location is set to `/KafyProject/`.

If you wish to change the project location permanently, follow these steps:

**Step-by-Step Guide to Setting the Project Location in bash**

1. Open your terminal.
2. Open the `.bashrc` file in a text editor:
   ```bash
   nano ~/.bashrc
   ```
3. Add the following line to the end of the file, replacing `/new/project/location` with your desired directory:
   ```bash
   export KAFY_PROJECT_LOCATION=/new/project/location
   ```
4. Save and close the file: in nano, press `CTRL + O`, hit `Enter` to confirm, then press `CTRL + X` to exit.
5. Apply the changes to your current terminal session:
   ```bash
   source ~/.bashrc
   ```
### Verifying the Change

To verify that the environment variable has been set correctly, run:
```bash
echo $KAFY_PROJECT_LOCATION
```
If everything is correct, this command will output the new project location you set.
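For reference, here is a minimal Python sketch of how the project location could be resolved from this environment variable. The fallback default `/KafyProject/` comes from the note above; the helper name `get_project_location` is illustrative and not part of the KAFY API:

```python
import os

# Illustrative helper (not part of the KAFY API): resolve the project
# location from the KAFY_PROJECT_LOCATION environment variable,
# falling back to the documented default of /KafyProject/.
def get_project_location(default="/KafyProject/"):
    return os.environ.get("KAFY_PROJECT_LOCATION", default)

print(get_project_location())
```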

## SQL-like Command Interface [CLI-Based]

To use the SQL-like command interface, you can execute commands directly from the terminal. The CLI allows you to perform operations such as pretraining, fine-tuning, and summarizing data.
The CLI is installed automatically when you run `pip install KAFYTraj` for the first time.

There are three main stages to using KAFY:

1. **PreTraining**
2. **FineTuning**
3. **Execution**
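As an illustration of the command grammar only (not the library's actual parser), the SQL-like statements shown below can be split into keywords and arguments with Python's standard `shlex` module, which respects quoted arguments such as file names:

```python
import shlex

# Illustrative only: tokenize a KAFY SQL-like command the way a simple
# parser might, honoring quoted arguments like 'pretraining_data.csv'.
def tokenize_command(command):
    return shlex.split(command)

tokens = tokenize_command("kafy ADD DATA FROM 'pretraining_data.csv'")
print(tokens)  # ['kafy', 'ADD', 'DATA', 'FROM', 'pretraining_data.csv']
```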

## Pretraining:

In this stage, the researcher can:

1. **Add a New Dataset for Pretraining**

You can add a new dataset to the `TrajectoryStore` to be used for pretraining with all current and future models.

```bash
kafy ADD DATA FROM data_source
```

*Example:*

```bash
kafy ADD DATA FROM 'pretraining_data.csv'
```
2. **Add a New Model or Modify Model Configurations**

Use available models from HuggingFace’s (HF) Transformers repository by specifying custom configurations.

```bash
kafy ADD MODEL transformer_family FROM HF USING model_config_file.json AS model_name_to_be_saved_as
```

*Example:*

To add a BERT-large model using its configurations:

```bash
kafy ADD MODEL bert FROM HF USING bert_large_configs.json AS bert_large
```

The `bert_large` model will be added to the `TransformersPlugin` and will be pretrained on all available pretraining datasets in the `TrajectoryStore`.
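The exact schema of `model_config_file.json` is not specified here. As a rough illustration, a HuggingFace-style BERT-large configuration typically carries hyperparameters such as the following (field names follow HF's `BertConfig` convention, which KAFY's expected schema may or may not mirror):

```python
import json

# Illustrative contents for a bert_large_configs.json file, using the
# standard BERT-large hyperparameters as they appear in HuggingFace's
# BertConfig; KAFY's expected schema may differ.
bert_large_config = {
    "hidden_size": 1024,
    "num_hidden_layers": 24,
    "num_attention_heads": 16,
    "intermediate_size": 4096,
    "vocab_size": 30522,
}

with open("bert_large_configs.json", "w") as f:
    json.dump(bert_large_config, f, indent=2)
```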

3. **Add a Custom Model** [**Not Implemented For Now**]

If the model is not available in HuggingFace, you can define your own.

**Important Note:** Custom models should follow the same structure as HuggingFace models. Refer to the examples in the repository for guidance.

```bash
kafy ADD MODEL transformer_family FROM model_source USING model_config_file.json AS model_name_to_be_saved_as
```

*Example:*

To add a model from a custom model definition:

```bash
kafy ADD MODEL new_family FROM new_model.py USING new_model_configs.json AS new_model
```

The `new_model` model will be added to the `TransformersPlugin` and will be pretrained on all available pretraining datasets in the `TrajectoryStore`.

## Fine-Tune a Model:

```bash
kafy FINETUNE bert FOR summarization USING my_pretrained_model.pkl WITH finetune_config.json AS my_finetuned_model.pkl
```
## Summarize Data:

```bash
kafy SUMMARIZE FROM requested_data_to_summarize.csv USING my_finetuned_model.pkl
```


## License
This project is licensed under the MIT License. See the LICENSE file for details.

## Contact
For any questions or support, please contact husse408@umn.edu.
