## RAGENTools: Retrieval-Augmented Generation (RAG) and Agent tools

<div align="center">
  <a href="https://pypi.python.org/pypi/ragentools"><img src="https://img.shields.io/pypi/v/ragentools.svg"></a><br>
  <a href="https://pypi.org/project/ragentools"><img src="https://img.shields.io/pypi/pyversions/ragentools.svg"></a><br>
  GitHub: <a href="https://github.com/bnbsking/ragentools">source</a> <a href="https://github.com/bnbsking/ragentools"><img src="https://github.com/bnbsking/ragentools/blob/main/pics/github-mark-white.png" width="20" height="20"></a>
  <!--<a href="https://github.com/bnbsking/ragentools"><img src="https://img.shields.io/github/v/release/bnbsking/ragentools"></a><br>-->
</div>

## Motivation

1. **Extended LLM call**
    + Built on the official `Gemini` and `OpenAI` APIs, with additional features:
    
    | Chat API  | Async      | Retry              | Get token/price | Formatted response | Img Input |
    | -         | -          | -                  | -               | -                  | - |
    | Official  | ⚠️ Wrapper | ❌ Not supported  | ⚠️ Hassle       | ✅ Server-side (strong) | ✅ |
    | LangChain | ⚠️ Wrapper | ⚠️ conn. only     | ⚠️ Hassle       | ⚠️ Client-side (medium) | ✅ |
    | Ours      | ✅ Call    | ✅ conn. & format | ✅ .get_price() | ✅ Server-side (strong) | ✅ |
    
    + Automatic batching for the embedding API:

    | Emb API   | Async      | Retry           | Get token/price | Batching           |
    | -         | -          | -               | -               | -                  |
    | Official  | ⚠️ Wrapper | ❌ Not supported | ⚠️ Hassle       | ⚠️ Overflow error |
    | Ours      | ✅ Call    | ✅ connection  | ✅ .get_price() | ✅ Handled automatically |

    See the implementation details for [Gemini](https://github.com/bnbsking/ragentools/blob/main/ragentools/api_calls/google_gemini.py) and [GPT](https://github.com/bnbsking/ragentools/blob/main/ragentools/api_calls/openai_gpt.py).

2. **Agents** <br>
    + Built on **Extended LLM call** and **LangChain Runnable**, so complex agents can be assembled efficiently with `LangGraph`.

    | Method      | Node                                   | Pattern | 
    | -           | -                                      | -       |
    | Traditional | Functions                              | Messy and hard to scale up |
    | LangGraph   | Extended LLM call + LangChain Runnable | Clean code due to [Blackboard Design Pattern](https://en.wikipedia.org/wiki/Blackboard_(design_pattern)) |
    
    + Structure Design <br>
        ![Flow](https://github.com/bnbsking/ragentools/blob/main/pics/agent_design.png)
    
    + Example: Text2Chart agent
        + graph (generated by code) <br>
            ![graph](https://github.com/bnbsking/ragentools/blob/main/pics/agent_graph.png)
        + [Agent Implementation Example](https://github.com/bnbsking/ragentools/blob/main/agents/text2chart/v1/main.py)
        + output folder: `agents/text2chart/v1/save/matplotbench_easy`

3. **RAG**
    + Core
        + Scalability and flexibility across various types of parsers, indexers, retrievers, and evaluators
        + Example: embedding for LangChain <br>
            ![langchain_emb](https://github.com/bnbsking/ragentools/blob/main/pics/emb_design.png)

    + Parsers
        + Tuning chunk size
            + Rule of thumb: chunk_size=500~1500 characters, chunk_overlap=10%~20%
            + Too fragmented -> low k@recall or low context recall -> increase chunk size
            + Too much irrelevant content -> low k@precision -> decrease chunk size
        + Supported Parsers
            + PDFParser
            + TextParser
    + Indexers
        + Tuning Embedding Dimension
            + Rule of thumb: ~1k docs -> 2048~3072; ~1M docs -> 1024~2048
            + Reduce the dimension gradually while monitoring downstream task performance
        + Supported indexers
            + Two-level indexing by FAISS
                + fine-level: chunk 
                + coarse-level: file
    + Retrievers
        + Supported retrievers
            + Two-level retriever from FAISS
                + fine-level: chunk 
                + coarse-level: file
                + rerank by difference score and concatenate the chunks into a single prompt segment
    + Evaluators
        + Supported evaluators
            + RAGAs <br>
                ![ragas](https://github.com/bnbsking/ragentools/blob/main/pics/ragas.png)
            + Tuning strategy
        
| Metrics        | Method       | Used | Target | Meaning | Tuning |
| -              | -            | -    | -      | -       | -      |
| k@precision    | LLM as judge | Y | Retrieved-Query    | How many retrieved chunks are closely related to the query | Reduce chunk size |
| k@recall       | Precompute   | N | Retrieved-Query    | How many of the chunks used to generate this QA pair are retrieved | - |
| context recall | LLM as judge | Y | Retrieved-GT       | How many chunks are helpful to the GT | Increase chunk size, top-k |
| faithfulness   | LLM as judge | Y | Retrieved-Response | How many chunks are helpful to the response | Low context recall: increase chunk size, overlap, top-k<br> High context recall: LLM hallucinates or does not need RAG |
| relevancy      | LLM as judge | Y | Query-Response     | Whether the LLM understands the query | Enhance LLM or prompt |
| correctness    | LLM as judge | Y | Response-GT        | Overall score | - |

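The tuning column of the table can be read as a small rule set. The sketch below is one hypothetical encoding of it, not part of the package: the function name `suggest_tuning`, the metric keys (borrowed from the evaluation output shown later in this README, with `context_precision` assumed to correspond to k@precision), and the `low` threshold of 3.0 on an assumed 1-5 scale are all assumptions.

```python
from typing import Dict, List


def suggest_tuning(avg_scores: Dict[str, float], low: float = 3.0) -> List[str]:
    """Map average metric scores to the tuning hints from the table above.

    `low` is an assumed cut-off; metrics missing from `avg_scores` are
    treated as "not low" and produce no hint.
    """
    def is_low(name: str) -> bool:
        return avg_scores.get(name, float("inf")) < low

    hints: List[str] = []
    if is_low("context_precision"):
        hints.append("reduce chunk size")
    if is_low("context_recall"):
        hints.append("increase chunk size or top-k")
    if is_low("faithfulness"):
        # The table splits this case on whether context recall is also low.
        if is_low("context_recall"):
            hints.append("increase chunk size, overlap, or top-k")
        else:
            hints.append("LLM may be hallucinating, or RAG may not be needed")
    if is_low("answer_relevancy"):
        hints.append("improve the LLM or the prompt")
    return hints
```

For example, `suggest_tuning({"context_recall": 2.0})` returns `['increase chunk size or top-k']`.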

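To make the chunk-size rule of thumb from the Parsers section concrete, here is a minimal character-based splitter. It is an illustrative sketch, not the `PDFParser` implementation; `split_with_overlap` is a hypothetical name, and its defaults (1000 characters, 15% overlap) are values picked from within the suggested ranges.

```python
from typing import List


def split_with_overlap(
    text: str, chunk_size: int = 1000, overlap_ratio: float = 0.15
) -> List[str]:
    """Split `text` into chunks of at most `chunk_size` characters.

    Consecutive chunks share `int(chunk_size * overlap_ratio)` characters.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap_ratio < 1:
        raise ValueError("overlap_ratio must be in [0, 1)")
    overlap = int(chunk_size * overlap_ratio)
    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

With `chunk_size=4` and `overlap_ratio=0.5`, the string `"abcdefghij"` becomes `['abcd', 'cdef', 'efgh', 'ghij']`.
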
## Installation
```bash
pip install -e .
```

## Example 1 - Call API

The following code demonstrates these features:
+ 1 formatted response: a dictionary that follows Google's official configuration.
+ 2 image input: follows Google's official configuration.
+ 3 async: call `amain_wrapper(async_func: Callable, arg_list: List[Dict])` to run async calls easily.
+ 4 retry: based on tenacity; customize the retry times and intervals.
+ 5 get price: call `.get_price()`. Update the price table [here](https://github.com/bnbsking/ragentools/blob/main/ragentools/api_calls/prices.csv)

```python
import yaml

from ragentools.api_calls.google_gemini import GoogleGeminiChatAPI
from ragentools.api_calls.langchain_runnable import ChatRunnable
from ragentools.common.async_main import amain_wrapper
from ragentools.common.formatting import get_response_model
from ragentools.prompts import get_prompt_and_response_format


api_key = yaml.safe_load(open("/app/tests/api_keys.yaml"))["GOOGLE_API_KEY"]
runnable = ChatRunnable(
    api=GoogleGeminiChatAPI,
    api_key=api_key,
    model_name="gemini-2.0-flash-lite"
)


response_format = {"description": {"type": "string"}}  # 1 formatted response
parts = [
    {"text": "What's in this picture?"},
    {"inline_data": {
        "mime_type": "image/jpeg",
        "data": open("/app/tests/api_calls/dog.jpg", "rb").read()
    }}
]  # 2 image input
results = amain_wrapper(  # 3 async
    runnable.arun,
    [
        {
            "input": {
                "prompt": [{"role": "user", "parts": parts}],
                "response_format": response_format,
                "retry_times": 3,  # 4 retry
                "retry_sec": 5
            }
        }
    ]
)


# Validate the first result against the expected response schema
expect_response_format = get_response_model(response_format)
expect_response_format(**results[0])
print(results)               # 1 formatted response
print(runnable.api.get_price())  # 5 get price
```

The output is:
```
[{'description': 'A black and white Border Collie is sitting and looking at the camera.'}]
0
```

## Example 2 - Text2Chart agent
+ code
    + Each node
        + Inherits "LangChain Runnable" for graph scalability
        + Has attribute "Extended LLM Call" for api benefits.
```bash
python agents/text2chart/v1/main.py
```
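The node design above follows the blackboard pattern: every node reads from and writes to one shared state. The sketch below shows that idea in plain Python, without LangGraph or an LLM. The node names mirror the prompt files (`gen`, `eval`, `refine`), but the node bodies, the state keys, and the stopping rule are placeholders assumed for illustration.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]


def gen_node(state: State) -> State:
    # Placeholder for an LLM call that writes plotting code.
    return {**state, "code": f"# plot for: {state['query']}", "rounds": 0}


def eval_node(state: State) -> State:
    # Placeholder score: pretend the chart is acceptable after one refinement.
    return {**state, "score": 2 if state["rounds"] >= 1 else 1}


def refine_node(state: State) -> State:
    # Placeholder for an LLM call that revises the code.
    return {**state, "code": state["code"] + "  # refined", "rounds": state["rounds"] + 1}


def route_after_eval(state: State) -> str:
    # Conditional edge: stop on a full score or after too many rounds.
    return "end" if state["score"] == 2 or state["rounds"] >= 3 else "refine"


NODES: Dict[str, Callable[[State], State]] = {
    "gen": gen_node,
    "eval": eval_node,
    "refine": refine_node,
}
# Fixed edges; the edge leaving "eval" is decided by route_after_eval.
EDGES = {"gen": "eval", "refine": "eval"}


def run_graph(query: str) -> State:
    """Walk the graph from "gen" until the router returns "end"."""
    state: State = {"query": query}
    current = "gen"
    while current != "end":
        state = NODES[current](state)
        current = route_after_eval(state) if current == "eval" else EDGES[current]
    return state
```

Because each node only takes and returns the shared state, adding a node means adding one function and one edge; no other node needs to change.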

+ graph <br>
    ![graph](https://github.com/bnbsking/ragentools/blob/main/pics/agent_graph.png)

+ Prompts are [here](https://github.com/bnbsking/ragentools/blob/main/ragentools/prompts/text2chart).
For instance, the eval prompt:
```yaml
prompt: |
  **Task:** You are an expert in evaluating a diagram generated from code written by a LLM, in response to a user's query.
  Your goal is to assess how accurately the diagram fulfills the user's intent.

  **Evaluation Scope:**
  Focus only on aspects that directly relate to the accuracy and informativeness of the diagram, as determined by the user's query.
  Ignore stylistic features such as color schemes, font styles, line thickness, or point markers, etc.

  **Evaluation Criteria:**
  Assess the diagram using the following four criteria. For each, select a score from the scale provided.
  1. Representative:
      Is the chosen diagram type (e.g., bar chart, line chart, scatter plot, pie chart) appropriate for visualizing the data and answering the user's query?
      **Scale**
      - 0 (Barely representative)
      - 1 (Partially representative)
      - 2 (Mostly representative)
  2. Data consistency:
      Do the data values shown in the diagram match what is implied or explicitly described in the user's query?
      If the query does not mention specific data values or ranges, consider it consistent.
      **Scale**
      - 0 (Barely consistent)
      - 1 (Partially consistent)
      - 2 (Mostly consistent)
  3. Scale correctness:
      Are the axes' scales (e.g. range, units, intervals) appropriate and correct based on the user's query?
      If the query does not specify scales or if the diagram does not require them (e.g. pie charts), consider it correct.
      - 0 (Barely correct)
      - 1 (Partially correct)
      - 2 (Mostly correct)
  4. Label accuracy:
      Are the diagram title, axes labels, legends, and other textual annotations accurate with respect to the variables or categories specified in the query?
      Are any key components missing?
      - 0 (Barely accurate)
      - 1 (Partially accurate)
      - 2 (Mostly accurate)

  **Query:**  {{ query }}

  **Response:** Provide a structured JSON.

default_replacements: {}

response_format:
  representative:
    type: integer
  data_consistency:
    type: integer
  scale_correctness:
    type: integer
  label_accuracy:
    type: integer
  explanation:
    type: string
```
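Placeholders such as `{{ query }}` can be filled with plain string replacement, which is how the advisor script later in this README substitutes `{{ avg_score_dict }}`. The helper below is a sketch under that assumption; `fill_prompt` is a hypothetical name, and treating `default_replacements` as fallback values is a guess at its purpose.

```python
from typing import Dict, Optional


def fill_prompt(
    template: str,
    replacements: Dict[str, str],
    default_replacements: Optional[Dict[str, str]] = None,
) -> str:
    """Replace each `{{ key }}` in `template`; explicit values override defaults."""
    merged = {**(default_replacements or {}), **replacements}
    for key, value in merged.items():
        template = template.replace("{{ " + key + " }}", str(value))
    return template
```

For example, `fill_prompt("**Query:**  {{ query }}", {"query": "Create a pie chart"})` returns `"**Query:**  Create a pie chart"`.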

+ The config is [here](https://github.com/bnbsking/ragentools/blob/main/agents/text2chart/v1/agents_text2chart_v1.yaml)
```yaml
api:
  api_key_path: /app/tests/api_keys.yaml
  api_key_env: GOOGLE_API_KEY
  model_name: gemini-2.0-flash-lite

mode: PLOT  # PLOT or RUN
data_path: /app/agents/text2chart/data/matplotbench_easy/data.json
save_folder: /app/agents/text2chart/v1/save/matplotbench_easy/

prompts:
  gen_path: /app/ragentools/prompts/text2chart/gen.yaml
  fix_path: /app/ragentools/prompts/text2chart/fix.yaml
  eval_path: /app/ragentools/prompts/text2chart/eval.yaml
  refine_path: /app/ragentools/prompts/text2chart/refine.yaml
```

+ The dataset is [here](https://github.com/bnbsking/ragentools/blob/main/agents/text2chart/data/matplotbench_easy/data.json).<br>
```json
[
    {
        "instruction": "Create a pie chart:\n\nThe pie chart represents the distribution of fruits in a basket, with the proportions being 35% apples, 45% oranges, and 20% bananas",
        "id": 5
    },
    {
        "instruction": "Generate a Python script using matplotlib to create a 4x4 inch figure that plots a line based on array 'x' from 0.0 to 10.0 (step 0.02) against 'y' which is sine(3pix). Set the x-axis limit from -2 to 10 and the y-axis limit from -6 to 6.",
        "id": 9
    },
    {
        "instruction": "Could you assist me in creating a Python script that generates a plot with the following specifications?\n\n1. The plot should contain three lines. The first line should represent the square of a numerical sequence ranging from 0.0 to 3.0 in increments of 0.02. The second line should represent the cosine of '3*pi' times the same sequence. The third line should represent the product of the square of the sequence and the cosine of '3*pi' times the sequence.\n\n2. The plot should have a legend, labeling the first line as 'square', second line as 'oscillatory' and the third line as 'damped'.\n\n3. The x-axis should be labeled as 'time' and the y-axis as 'amplitude'. The title of the plot should be 'Damped oscillation'.\n\nCould you help me with this?\"",
        "id": 10
    }
]
```

+ output folder: `agents/text2chart/v1/save/matplotbench_easy`
    + example of "id=5" data
        + plot: <br>
            ![id5](https://github.com/bnbsking/ragentools/blob/main/pics/agent_result.png)
        + eval:
```json
{
    "representative": 2,
    "data_consistency": 2,
    "scale_correctness": 2,
    "label_accuracy": 2,
    "explanation": "The pie chart accurately represents the fruit distribution with correct proportions and labels."
}
```

## Example 3 - RAG

+ Overview flow <br>
![two_level_rag](https://github.com/bnbsking/ragentools/blob/main/pics/two_level_rag.png)
+ Full example of [parsing with indexing](https://github.com/bnbsking/ragentools/blob/main/rags/papers/v1/indexing.py) and [retrieving](https://github.com/bnbsking/ragentools/blob/main/rags/papers/v1/retrieving.py)

#### Parsing and Indexing
```python
import glob
import os

import yaml

from ragentools.api_calls.google_gemini import (
    GoogleGeminiEmbeddingAPI,
    GoogleGeminiChatAPI,
)
from ragentools.indexers.embedding import CustomEmbedding
from ragentools.indexers.indexers import two_level_indexing
from ragentools.parsers.pdf_parser import PDFParser


if __name__ == "__main__":
    cfg = yaml.safe_load(open("/app/rags/papers/v1/rags_papers_v1.yaml"))
    cfg_api = cfg["api"]
    cfg_ind = cfg["indexing"]

    api_key = yaml.safe_load(open(cfg_api["api_key_path"]))[cfg_api["api_key_env"]]
    api_emb = GoogleGeminiEmbeddingAPI(api_key=api_key, model_name=cfg_api["emb_model_name"])
    api_chat = GoogleGeminiChatAPI(api_key=api_key, model_name=cfg_api["chat_model_name"])
    embed_model = CustomEmbedding(api=api_emb, dim=3072)

    # Parsing
    parser = PDFParser(
        input_path_list=glob.glob(cfg_ind["data_folder"] + "*.pdf"),
        output_folder=os.path.join(cfg_ind["parsed_save_folder"])
    )
    parser.parse()

    # Indexing
    two_level_indexing(
        parsed_csv_folder=cfg_ind["parsed_save_folder"],
        indices_save_folder=cfg_ind["indices_save_folder"],
        embed_model=embed_model,
        api_chat=api_chat
    )
```
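Indexing sends every chunk through the embedding API, which is where the auto batching mentioned in Motivation matters. The sketch below shows one way such batching could work; it is not the package's code. `embed_in_batches`, `fake_embed`, and the batch limit of 100 are assumptions made for the example.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def embed_in_batches(
    texts: Sequence[str],
    embed_batch: Callable[[List[str]], List[Vector]],
    max_batch_size: int = 100,
) -> List[Vector]:
    """Embed `texts` in slices of at most `max_batch_size`, preserving order."""
    if max_batch_size <= 0:
        raise ValueError("max_batch_size must be positive")
    vectors: List[Vector] = []
    for start in range(0, len(texts), max_batch_size):
        batch = list(texts[start:start + max_batch_size])
        vectors.extend(embed_batch(batch))  # one API call per slice
    return vectors


def fake_embed(batch: List[str]) -> List[Vector]:
    """Stand-in for a real embedding call: one 1-d vector per text."""
    return [[float(len(text))] for text in batch]
```

With `max_batch_size=2`, five texts are sent as three calls of sizes 2, 2, and 1, and the returned vectors stay in input order.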

+ input: list of PDF paths
+ output:
    + one CSV per PDF, for example:
    
    | chunk                      | source_path      | page |
    | -                          | -                | -    |
    | Hi There! Nice to meet you | /path/to/doc.pdf | 7    |

    + faiss indices
        + *.faiss

#### GenQA

+ code
```python
import glob
import json
import os

import pandas as pd
import yaml

from ragentools.api_calls.google_gemini import GoogleGeminiChatAPI
from ragentools.genqa.genqa import generate_qa_pairs


if __name__ == "__main__":
    cfg = yaml.safe_load(open("/app/rags/papers/v1/rags_papers_v1.yaml"))
    cfg_api = cfg["api"]
    cfg_ind = cfg["indexing"]
    cfg_qa = cfg["gen_qa"]

    api_key = yaml.safe_load(open(cfg_api["api_key_path"]))[cfg_api["api_key_env"]]
    api_chat = GoogleGeminiChatAPI(api_key=api_key, model_name=cfg_api["chat_model_name"])
    
    generate_qa_pairs(
        prompt_path=cfg_qa["prompt_path"],
        csv_folder=cfg_ind["parsed_save_folder"],
        sample_each_csv=cfg_qa["sample_each_csv"],
        api_chat=api_chat,
        save_path=cfg_qa["save_path"],
    )
```

+ output format:
```json
[
    {
        "question": "What did Elara say the carvings on the archway were?",
        "answer": "Elara said the carvings were wards.",
        "source_path": "/app/rags/papers/data/story.pdf",
        "page": 1
    },
    ...
]
```


#### Retrieving and Answering
```python
import json
import os
import yaml

from ragentools.api_calls.google_gemini import (
    GoogleGeminiEmbeddingAPI,
    GoogleGeminiChatAPI
)
from ragentools.indexers.embedding import CustomEmbedding
from ragentools.retrievers.retrievers import TwoLevelRetriever


if __name__ == "__main__":
    cfg = yaml.safe_load(open("/app/rags/papers/v1/rags_papers_v1.yaml"))
    cfg_api = cfg["api"]
    cfg_ind = cfg["indexing"]
    cfg_qa = cfg["gen_qa"]
    cfg_ans = cfg["answering"]

    # Init API
    api_key = yaml.safe_load(open(cfg_api["api_key_path"]))[cfg_api["api_key_env"]]
    api_emb = GoogleGeminiEmbeddingAPI(api_key=api_key, model_name=cfg_api["emb_model_name"])
    api_chat = GoogleGeminiChatAPI(api_key=api_key, model_name=cfg_api["chat_model_name"])
    embed_model = CustomEmbedding(api=api_emb, dim=3072)

    # Load two-level retriever
    retriever = TwoLevelRetriever(
        embed_model=embed_model,
        fine_index_folder=cfg_ind["indices_save_folder"],
        coarse_index_path=os.path.join(cfg_ind["indices_save_folder"], "coarse_grained_index.faiss")
    )

    # Query
    data_list = json.load(open(cfg_qa["save_path"], 'r', encoding='utf-8'))
    for i, data in enumerate(data_list):
        question = data["question"]
        retrieved_chunks = retriever.query(question)
        retrieved_text = retriever.chunks_concat(retrieved_chunks)
        answer = api_chat.run(
            prompt=f"""Use the following RAG retrieved chunks to answer the question.
                Chunks: {retrieved_text}
                Question: {question}
            """,
            retry_sec=20,
        )
        data_list[i]["llm_response"] = answer
        data_list[i]["retrieved_chunks"] = retrieved_chunks
    os.makedirs(os.path.dirname(cfg_ans["save_path"]), exist_ok=True)
    json.dump(data_list, open(cfg_ans["save_path"], 'w', encoding='utf-8'), ensure_ascii=False, indent=4)

```

+ The output format is
```json
[
    {
        "question": "What are the key areas that medicine focuses on to ensure well-being?",
        "answer": "Medicine focuses on diagnosing, treating, and preventing disease and injury, as well as maintaining and promoting overall health.",
        "source_path": "/app/rags/papers/data/medicine.pdf",
        "page": 1,
        "llm_response": "Based on the retrieved chunks, medicine focuses on the following key areas to ensure well-being:\n\n*   Diagnosing, treating, and preventing disease and injury.\n*   Maintaining and promoting overall health.\n*   Continuous learning, research, and clinical practice to improve the quality and longevity of life and alleviate suffering.\n*   Ethical considerations, including respect for patient autonomy, beneficence, non-maleficence, and justice.\n*   Preventive measures through vaccination, health education, lifestyle interventions, and public health initiatives."
    },
    ...
]
```

+ The `retrieved_text` looks like:
```txt
Chunk 1 with score 0.0963:
jungle seemed to grow silent. Even the rain softened as if the forest itself was 
holding its breath. Through the mist, they saw it—a massive stone archway, carved with 
symbols that glowed faintly under the stormy sky. The entrance to Seraphel. 
Elara stepped forward, tracing her fingers over the carvings. “These are wards,” she 
whispered. “To keep intruders out… or to trap them in.” 
Ryn grunted. “Well, we’re already in. Might as well see what’s inside.” 
They entered cautiously. Inside, the air was thick with the scent of moss and decay, and 
shadows danced along walls that seemed impossibly tall. The city stretched before them: 
towering spires of stone, intricate bridges over chasms, and waterfalls cascading from cliffs 
into misty abysses. 
Kael’s eyes were drawn to the center of th

==========
Chunk 2 with score 0.097:
nd waterfalls cascading from cliffs 
into misty abysses. 
Kael’s eyes were drawn to the center of the city, where a massive temple rose, its roof 
adorned with a symbol of a sun encircled by serpents. “That’s our destination,” he said. “The 
Heart of Seraphel. Whatever is there… it’s what we came for.” 
The streets of the city were eerily empty, save for the occasional echo of footsteps that were 
not theirs. Strange creatures lurked in the shadows: serpentine beings with glowing eyes, 
and birds with feathers like shards of crystal. They seemed harmless at first, but the sense of 
being watched never left. 
As they approached the temple, a low rumble shook the ground. The doors of the temple, 
carved from obsidian, slowly began to open as if acknowledging their arrival. Inside, a vast 
ha
...
```
+ Each score is the difference (distance) between the chunk and the query.

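To illustrate the two-level idea without FAISS, the sketch below first picks the closest files by a coarse vector, then ranks the chunks of those files by distance to the query, and finally joins them in the layout shown above. Everything here is assumed for illustration: the squared L2 distance, the toy 2-d vectors in the usage note, and the names `two_level_query` and `chunks_concat_sketch`. The actual `TwoLevelRetriever` may differ.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def sq_l2(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance; smaller means more similar."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_level_query(
    query: Vector,
    coarse: Dict[str, Vector],
    fine: Dict[str, List[Tuple[str, Vector]]],
    top_files: int = 1,
    top_chunks: int = 2,
) -> List[Tuple[float, str]]:
    """Return (distance, chunk) pairs, closest first."""
    # Coarse level: keep the files whose vector is closest to the query.
    files = sorted(coarse, key=lambda name: sq_l2(query, coarse[name]))[:top_files]
    # Fine level: score every chunk of the selected files, then rerank by distance.
    scored = [(sq_l2(query, vec), text) for name in files for text, vec in fine[name]]
    return sorted(scored)[:top_chunks]


def chunks_concat_sketch(scored: List[Tuple[float, str]]) -> str:
    """Join the ranked chunks into one prompt-ready string."""
    parts = [
        f"Chunk {i} with score {round(score, 4)}:\n{text}\n"
        for i, (score, text) in enumerate(scored, start=1)
    ]
    return "\n==========\n".join(parts)
```

Given `coarse = {"a.pdf": [0.0, 0.0], "b.pdf": [10.0, 10.0]}` and `fine = {"a.pdf": [("far", [3.0, 0.0]), ("near", [1.0, 0.0])], "b.pdf": [("other", [10.0, 10.0])]}`, the query `[0.0, 0.0]` selects `a.pdf` and returns `[(1.0, 'near'), (9.0, 'far')]`.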

#### Evaluation
```python
import yaml
from ragentools.api_calls.google_gemini import GoogleGeminiChatAPI
from ragentools.evaluators.evaluators import RAGAsEvaluator

if __name__ == "__main__":
    cfg = yaml.safe_load(open("/app/rags/papers/v1/rags_papers_v1.yaml"))
    cfg_api = cfg["api"]
    cfg_ans = cfg["answering"]
    cfg_eval = cfg["eval"]

    api_key = yaml.safe_load(open(cfg_api["api_key_path"]))[cfg_api["api_key_env"]]
    api_chat = GoogleGeminiChatAPI(api_key=api_key, model_name=cfg_api["chat_model_name"])
    
    evaluator = RAGAsEvaluator(
        load_path=cfg_ans["save_path"],
        save_folder=cfg_eval["save_folder"],
        api=api_chat,
    )
    evaluator.evaluate()
```

+ Output format

Per-sample results:

```json
[
    {
        "question": "What are the key areas that medicine focuses on to ensure well-being?",
        "answer": "Medicine focuses on diagnosing, treating, and preventing disease and injury, as well as maintaining and promoting overall health.",
        "source_path": "/app/rags/papers/data/medicine.pdf",
        "page": 1,
        "llm_response": "Medicine focuses on diagnosing, treating, and preventing disease and injury, as well as maintaining and promoting overall health. It aims to improve the quality and longevity of life and alleviate suffering through continuous learning, research, and clinical practice. Key areas include clinical medicine, preventive medicine, pharmacology, surgery, and pathology.\n",
        "retrieved_text": ...,
        "eval": {
            "answer_correctness": {
                "score": 5,
                "reason": "The response is fully correct and semantically equivalent to the ground truth. The additional information is consistent and does not contradict the ground truth."
            },
            "answer_relevancy": {
                "score": 5,
                "reason": "The response directly answers the question by listing key areas of medicine that ensure well-being, such as diagnosing, treating, and preventing disease."
            },
            "context_precision": {
                "score": 5,
                "reason": "The retrieved text focuses specifically on the key areas of medicine related to ensuring well-being, such as disease prevention, treatment, and health promotion."
            },
            "context_recall": {
                "score": 5,
                "reason": "The retrieved text fully encompasses the ground truth answer, covering diagnosis, treatment, prevention, and health maintenance."
            },
            "faithfulness": {
                "score": 5,
                "reason": "The response accurately summarizes the retrieved text, focusing on the definition, goals, and key areas of medicine without introducing any unsupported information or contradictions."
            }
        }
    },
    ...
]
```
and the scores averaged over all samples:
```json
{
    "answer_correctness": 5.0,
    "answer_relevancy": 5.0,
    "context_precision": 5.0,
    "context_recall": 3.0,
    "faithfulness": 5.0
}
```
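The aggregate file is presumably the per-metric mean of the per-sample scores. A minimal sketch of that aggregation, assuming the per-sample layout shown above (`average_scores` is a hypothetical helper, not the `RAGAsEvaluator` code):

```python
from collections import defaultdict
from typing import Dict, List


def average_scores(samples: List[dict]) -> Dict[str, float]:
    """Average `sample["eval"][metric]["score"]` over all samples, per metric."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for sample in samples:
        # Samples without an "eval" entry are skipped.
        for metric, result in sample.get("eval", {}).items():
            totals[metric] += result["score"]
            counts[metric] += 1
    return {metric: totals[metric] / counts[metric] for metric in sorted(totals)}
```

Two samples with `context_recall` scores of 5 and 1 would average to `3.0` for that metric.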

#### LLM advice
```python
import json

import yaml

from ragentools.api_calls.google_gemini import GoogleGeminiChatAPI
from ragentools.prompts import get_prompt_and_response_format


if __name__ == "__main__":
    cfg = yaml.safe_load(open("/app/rags/papers/v1/rags_papers_v1.yaml"))
    cfg_api = cfg["api"]
    cfg_eval = cfg["eval"]

    api_key = yaml.safe_load(open(cfg_api["api_key_path"]))[cfg_api["api_key_env"]]
    api_chat = GoogleGeminiChatAPI(api_key=api_key, model_name=cfg_api["chat_model_name"])
    
    prompt, response_format = get_prompt_and_response_format(
        "/app/ragentools/prompts/ragas/advisor.yaml"
    )
    avg_score_dict = json.load(open("/app/rags/papers/v1/eval/avg_score.json"))
    response = api_chat.run(
        prompt=prompt.replace("{{ avg_score_dict }}", str(avg_score_dict)),
        response_format=response_format
    )
    with open(f"{cfg_eval['save_folder']}/advises.txt", 'w', encoding='utf-8') as f:
        f.write(response)
    print(response)
```


#### Put everything together
+ See the [folder](https://github.com/bnbsking/ragentools/blob/main/rags/papers/v1)
```bash
python /app/rags/papers/v1/indexing.py  # in: data; out: parsed + indices
python /app/rags/papers/v1/genqa.py  # in: parsed; out: genqa
python /app/rags/papers/v1/answering.py  # in: genqa + indices; out: answers
python /app/rags/papers/v1/eval.py  # in: answers; out: eval
python /app/rags/papers/v1/advisor.py  # in: eval; out: eval
```