Metadata-Version: 2.4
Name: intelligence-toolkit
Version: 0.1.2
Summary: Interactive workflows for generating AI intelligence reports from real-world data sources using GPT models
Author-email: Dayenne Souza <ddesouza@microsoft.com>, Ha Trinh <trinhha@microsoft.com>, Darren Edge <daedge@microsoft.com>
License: MIT
Project-URL: source, https://github.com/microsoft/intelligence-toolkit
Project-URL: issues, https://github.com/microsoft/intelligence-toolkit/issues
Keywords: AI,data analysis,reports,workflows
Requires-Python: <3.13,>=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: altair==4.2.2
Requires-Dist: networkx==3.5
Requires-Dist: numpy==1.26.4
Requires-Dist: openai<2.0.0,>=1.37.1
Requires-Dist: pac-synth==0.0.8
Requires-Dist: plotly==5.22.0
Requires-Dist: plotly-express==0.4.1
Requires-Dist: polars==0.20.10
Requires-Dist: pyarrow==15.0.0
Requires-Dist: pydantic==2.8.2
Requires-Dist: pydantic_core==2.20.1
Requires-Dist: scikit-learn==1.5.1
Requires-Dist: scipy==1.12.0
Requires-Dist: streamlit==1.31.1
Requires-Dist: streamlit-aggrid==0.3.4.post3
Requires-Dist: streamlit-javascript==0.1.5
Requires-Dist: streamlit-agraph==0.0.45
Requires-Dist: tiktoken==0.7.0
Requires-Dist: pdfkit==1.0.0
Requires-Dist: markdown2==2.5.4
Requires-Dist: azure-identity==1.17.1
Requires-Dist: azure-core==1.35.1
Requires-Dist: semchunk==2.2.0
Requires-Dist: lancedb==0.12.0
Requires-Dist: duckdb==1.0.0
Requires-Dist: seaborn==0.13.2
Requires-Dist: textblob==0.19.0
Requires-Dist: jsonschema<5.0.0,>=4.23.0
Requires-Dist: nest-asyncio<2.0.0,>=1.6.0
Requires-Dist: altair-viewer<1.0.0,>=0.4.0
Requires-Dist: torch==2.4.1; sys_platform != "darwin"
Requires-Dist: torch==2.5.1; sys_platform == "darwin"
Requires-Dist: sentence-transformers<4.0.0,>=3.1.1
Requires-Dist: graspologic<4.0.0,>=3.4.1
Requires-Dist: future<2.0.0,>=1.0.0
Requires-Dist: pypdf<6.0.0,>=5.1.0
Requires-Dist: poethepoet<1.0.0,>=0.27.0
Provides-Extra: dev
Requires-Dist: coverage<8.0.0,>=7.6.0; extra == "dev"
Requires-Dist: ruff<1.0.0,>=0.4.7; extra == "dev"
Requires-Dist: pyright<2.0.0,>=1.1.371; extra == "dev"
Requires-Dist: ipykernel<7.0.0,>=6.29.5; extra == "dev"
Requires-Dist: pytest-cov<6.0.0,>=5.0.0; extra == "dev"
Requires-Dist: pytest<9.0.0,>=8.2.2; extra == "dev"
Requires-Dist: pytest-asyncio<1.0.0,>=0.23.7; extra == "dev"
Requires-Dist: pytest-mock<4.0.0,>=3.14.0; extra == "dev"
Requires-Dist: faker<29.0.0,>=28.0.0; extra == "dev"
Requires-Dist: nbformat<6.0.0,>=5.10.4; extra == "dev"
Requires-Dist: setuptools<76.0.0,>=75.3.0; extra == "dev"
Requires-Dist: wheel<1.0.0,>=0.44.0; extra == "dev"
Requires-Dist: twine<6.0.0,>=5.1.1; extra == "dev"
Dynamic: license-file

# Developing 

## Requirements

- Python 3.11 or 3.12 ([Download](https://www.python.org/downloads/))
- uv ([Download](https://docs.astral.sh/uv/getting-started/installation/))
- wkhtmltopdf (used to generate PDF reports)

    - Windows: ([Download](https://wkhtmltopdf.org/downloads.html))

    - Linux: `sudo apt-get install wkhtmltopdf`

    - macOS: `brew install homebrew/cask/wkhtmltopdf`
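
The requirements above can be checked before installing anything else. The following is a minimal sketch, not part of the toolkit: `python_supported` and `missing_tools` are hypothetical helper names, and the sketch assumes that `uv` and `wkhtmltopdf` are expected to be on your `PATH`.

```python
import shutil
import sys

# Tools named in the requirements list; both are assumed to be on PATH.
REQUIRED_TOOLS = ("uv", "wkhtmltopdf")


def python_supported(version=None):
    """Return True for Python 3.11 or 3.12, the versions listed above."""
    major, minor = (version or sys.version_info)[:2]
    return (3, 11) <= (major, minor) < (3, 13)


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    if not python_supported():
        print("Unsupported Python version; use 3.11 or 3.12")
    for tool in missing_tools():
        print(f"Not found on PATH: {tool}")
```

If the script prints nothing, the interpreter version is in range and both tools were found.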


## Running the app

## GPT settings

You can configure OpenAI access either on the app's `Settings` page or with environment variables.

#### Default values: 
```
OPENAI_API_MODEL="gpt-4.1-mini"
OPENAI_TYPE="OpenAI" # Other option available: "Azure OpenAI"
AZURE_AUTH_TYPE="Azure Key" # Only used when OPENAI_TYPE="Azure OpenAI"
DEFAULT_EMBEDDING_MODEL="text-embedding-3-small"
```
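
To see which values would apply in your shell, you can compare your environment against these defaults. This is an illustrative sketch only: `resolve_settings` is a hypothetical helper, the source does not show how the toolkit reads these variables internally, and treating `DEFAULT_EMBEDDING_MODEL` as an environment variable is an assumption.

```python
import os

# Defaults copied from the "Default values" block above.
DEFAULTS = {
    "OPENAI_API_MODEL": "gpt-4.1-mini",
    "OPENAI_TYPE": "OpenAI",
    "AZURE_AUTH_TYPE": "Azure Key",
    "DEFAULT_EMBEDDING_MODEL": "text-embedding-3-small",
}


def resolve_settings(env=None):
    """Return each setting, preferring the environment over the documented default."""
    env = os.environ if env is None else env
    return {name: env.get(name, default) for name, default in DEFAULTS.items()}


if __name__ == "__main__":
    for name, value in resolve_settings().items():
        print(f"{name}={value}")
```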

### OpenAI
`OPENAI_API_KEY=<OPENAI_API_KEY>`

### Azure OpenAI
```
OPENAI_TYPE="Azure OpenAI"
AZURE_OPENAI_VERSION=2023-12-01-preview
AZURE_OPENAI_ENDPOINT="https://<ENDPOINT>.azure.com/"
OPENAI_API_KEY=<AZURE_OPENAI_API_KEY>

# If using Azure OpenAI with Managed Identity:
AZURE_AUTH_TYPE="Managed Identity"
```
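
Before launching the app, it can help to confirm that the variables for your chosen provider are set. The sketch below is one possible check, not the toolkit's own validation: `missing_openai_variables` is a hypothetical helper, and it assumes that Managed Identity removes the need for `OPENAI_API_KEY` and that both `AZURE_OPENAI_VERSION` and `AZURE_OPENAI_ENDPOINT` must be set explicitly for Azure OpenAI.

```python
import os


def missing_openai_variables(env=None):
    """List the variables from this section that are still unset."""
    env = os.environ if env is None else env
    required = []
    if env.get("OPENAI_TYPE", "OpenAI") == "Azure OpenAI":
        # Assumption: neither Azure variable has a usable default.
        required += ["AZURE_OPENAI_VERSION", "AZURE_OPENAI_ENDPOINT"]
        # Assumption: Managed Identity replaces the API key.
        if env.get("AZURE_AUTH_TYPE", "Azure Key") != "Managed Identity":
            required.append("OPENAI_API_KEY")
    else:
        required.append("OPENAI_API_KEY")
    return [name for name in required if not env.get(name)]
```

Calling `missing_openai_variables()` with no arguments inspects the current process environment; an empty list means nothing is missing under these assumptions.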

### Running locally

Windows: Search for `Windows PowerShell` in the Start menu and open it.

Linux and macOS: Open `Terminal`.

On any OS:

Navigate to the folder where you cloned this repo using `cd` followed by the path to the folder. For example:

`cd C:\Users\user01\projects\intelligence-toolkit`

Run `uv sync --extra dev` and wait for the packages to install.

#### Run the app:

Run `uv run poe run_streamlit`. The app opens automatically in your default browser at `localhost:8081`.
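
If the browser does not open, you can check whether anything is listening on the port. This is a small sketch using only the standard library: `is_listening` is a hypothetical helper, and a successful TCP connection only shows that some process accepts connections on that port, not that it is the toolkit.

```python
import socket


def is_listening(host="localhost", port=8081, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    state = "reachable" if is_listening() else "not reachable"
    print(f"localhost:8081 is {state}")
```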

#### Use the API

You can also replicate the examples in your own environment by running `pip install intelligence-toolkit` or `uv add intelligence-toolkit`.

Each workflow below links to its documentation and to an example notebook that shows how to run the code on your own data without the UI.
- [Anonymize Case Data](./app/workflows/anonymize_case_data/README.md)

    - [Example](./example_notebooks/anonymize_case_data.ipynb)

- [Compare Case Groups](./app/workflows/compare_case_groups/README.md)

    - [Example](./example_notebooks/compare_case_groups.ipynb)

- [Detect Case Patterns](./app/workflows/detect_case_patterns/README.md)

    - [Example](./example_notebooks/detect_case_patterns.ipynb)

- [Detect Entity Networks](./app/workflows/detect_entity_networks/README.md)

    - [Example](./example_notebooks/detect_entity_networks.ipynb)

- [Extract Record Data](./app/workflows/extract_record_data/README.md)

    - [Example](./example_notebooks/extract_record_data.ipynb)

- [Generate Mock Data](./app/workflows/generate_mock_data/README.md)

    - [Example](./example_notebooks/generate_mock_data.ipynb)

- [Match Entity Records](./app/workflows/match_entity_records/README.md)

    - [Example](./example_notebooks/match_entity_records.ipynb)
    
- [Query Text Data](./app/workflows/query_text_data/README.md)

    - [Example](./example_notebooks/query_text_data.ipynb)


### Running with Docker

##### Recommended configuration:

- *Minimum disk space*: 8GB 
- *Minimum memory*: 4GB

Download, install, and then open Docker Desktop: https://www.docker.com/products/docker-desktop/

Then, open a terminal:

Windows: Search for `Windows PowerShell` in the Start menu and open it.

Linux and macOS: Open `Terminal`.

On any OS:

Navigate to the folder where you cloned this repo using `cd` followed by the path to the folder. For example:

`cd C:\Users\user01\projects\intelligence-toolkit`

Build the container:

`docker build . -t intelligence-toolkit`

Once the build finishes, run the Docker container:

- via terminal:

    `docker run -d --name intelligence-toolkit -p 80:80 intelligence-toolkit`

Open [localhost:80](http://localhost:80) in your browser.

  **Note: Docker might go to sleep, in which case you need to start the container again. Open Docker Desktop, click Containers in the left menu, and press play on intelligence-toolkit.**
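
The restart described in the note can also be done from a script instead of Docker Desktop. The sketch below is an assumption-laden alternative, not something the project ships: `running_containers` and `ensure_started` are hypothetical helpers that wrap the standard `docker ps` and `docker start` commands, and they assume the container was created with the name `intelligence-toolkit` as in the `docker run` command above.

```python
import subprocess

# The --name used in the docker run command above.
CONTAINER = "intelligence-toolkit"


def running_containers(run=subprocess.run):
    """Return the names of running containers, as reported by `docker ps`."""
    result = run(
        ["docker", "ps", "--format", "{{.Names}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.split()


def ensure_started(name=CONTAINER, run=subprocess.run):
    """Start the named container if it is not running; return True if it was started."""
    if name in running_containers(run):
        return False
    run(["docker", "start", name], check=True)
    return True
```

Calling `ensure_started()` requires the Docker daemon to be running; the `run` parameter exists only so the logic can be exercised without Docker.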

# Lifecycle Scripts

The project uses [uv](https://docs.astral.sh/uv/) and [poethepoet](https://pypi.org/project/poethepoet/) to manage its lifecycle scripts.

Available scripts are:

- `uv run poe test_unit` - This will execute unit tests on the API.
- `uv run poe test_smoke` - This will execute smoke tests on the API.
- `uv run poe check` - This will perform a suite of static checks across the package, including:
  - formatting
  - documentation formatting
  - linting
  - security patterns
  - type-checking
- `uv run poe fix` - This will apply any available auto-fixes to the package. Usually this is just formatting fixes.
- `uv run poe fix_unsafe` - This will apply any available auto-fixes to the package, including those that may be unsafe.
- `uv run poe format` - Explicitly run the formatter across the package.
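
Several of these scripts can be chained, for example before opening a pull request. The following is a sketch of one way to do that: `run_steps` is a hypothetical helper, and the choice and order of steps (`check`, then `test_unit`, then `test_smoke`) is an illustrative assumption rather than a documented project workflow.

```python
import subprocess

# Script names taken from the list above; the ordering is an illustrative choice.
STEPS = ("check", "test_unit", "test_smoke")


def run_steps(steps=STEPS, run=subprocess.run):
    """Run each poe script via uv, stopping at the first failure.

    Returns the name of the failing step, or None if every step passed.
    """
    for step in steps:
        result = run(["uv", "run", "poe", step])
        if result.returncode != 0:
            return step
    return None
```

Calling `run_steps()` from the repo root runs the three scripts in sequence; a non-`None` return value names the first script that exited with a non-zero status.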

