Metadata-Version: 2.3
Name: airbyte-cdk
Version: 6.26.0
Summary: A framework for writing Airbyte Connectors.
License: MIT
Keywords: airbyte,connector-development-kit,cdk
Author: Airbyte
Author-email: contact@airbyte.io
Requires-Python: >=3.10,<3.13
Classifier: Development Status :: 3 - Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Provides-Extra: file-based
Provides-Extra: sql
Provides-Extra: vector-db-based
Requires-Dist: Jinja2 (>=3.1.2,<3.2.0)
Requires-Dist: PyYAML (>=6.0.1,<7.0.0)
Requires-Dist: Unidecode (>=1.3,<2.0)
Requires-Dist: airbyte-protocol-models-dataclasses (>=0.14,<0.15)
Requires-Dist: avro (>=1.11.2,<1.13.0) ; extra == "file-based"
Requires-Dist: backoff
Requires-Dist: cachetools
Requires-Dist: cohere (==4.21) ; extra == "vector-db-based"
Requires-Dist: cryptography (>=42.0.5,<44.0.0)
Requires-Dist: dpath (>=2.1.6,<3.0.0)
Requires-Dist: dunamai (>=1.22.0,<2.0.0)
Requires-Dist: fastavro (>=1.8.0,<1.9.0) ; extra == "file-based"
Requires-Dist: genson (==1.3.0)
Requires-Dist: isodate (>=0.6.1,<0.7.0)
Requires-Dist: jsonref (>=0.2,<0.3)
Requires-Dist: jsonschema (>=4.17.3,<4.18.0)
Requires-Dist: langchain (==0.1.16) ; extra == "vector-db-based"
Requires-Dist: langchain_core (==0.1.42)
Requires-Dist: markdown ; extra == "file-based"
Requires-Dist: nltk (==3.9.1)
Requires-Dist: numpy (<2)
Requires-Dist: openai[embeddings] (==0.27.9) ; extra == "vector-db-based"
Requires-Dist: orjson (>=3.10.7,<4.0.0)
Requires-Dist: pandas (==2.2.2)
Requires-Dist: pdf2image (==1.16.3) ; extra == "file-based"
Requires-Dist: pdfminer.six (==20221105) ; extra == "file-based"
Requires-Dist: pendulum (<3.0.0)
Requires-Dist: psutil (==6.1.0)
Requires-Dist: pyarrow (>=15.0.0,<15.1.0) ; extra == "file-based"
Requires-Dist: pydantic (>=2.7,<3.0)
Requires-Dist: pyjwt (>=2.8.0,<3.0.0)
Requires-Dist: pyrate-limiter (>=3.1.0,<3.2.0)
Requires-Dist: pytesseract (==0.3.10) ; extra == "file-based"
Requires-Dist: python-calamine (==0.2.3) ; extra == "file-based"
Requires-Dist: python-dateutil
Requires-Dist: python-snappy (==0.7.3) ; extra == "file-based"
Requires-Dist: python-ulid (>=3.0.0,<4.0.0)
Requires-Dist: pytz (==2024.2)
Requires-Dist: rapidfuzz (>=3.10.1,<4.0.0)
Requires-Dist: requests
Requires-Dist: requests_cache
Requires-Dist: serpyco-rs (>=1.10.2,<2.0.0)
Requires-Dist: sqlalchemy (>=2.0,<3.0,!=2.0.36) ; extra == "sql"
Requires-Dist: tiktoken (==0.8.0) ; extra == "vector-db-based"
Requires-Dist: unstructured.pytesseract (>=0.3.12) ; extra == "file-based"
Requires-Dist: unstructured[docx,pptx] (==0.10.27) ; extra == "file-based"
Requires-Dist: wcmatch (==10.0)
Requires-Dist: xmltodict (>=0.13,<0.15)
Project-URL: Documentation, https://docs.airbyte.io/
Project-URL: Homepage, https://airbyte.com
Project-URL: Repository, https://github.com/airbytehq/airbyte-python-cdk
Description-Content-Type: text/markdown

# Airbyte Python CDK and Low-Code CDK

The Airbyte Python CDK is a framework for building Airbyte API source connectors. It provides a set of
classes and helpers that make it easy to build a connector against an HTTP API (REST, GraphQL, etc.)
or a generic Python source connector.

## Building Connectors with the CDK

If you're looking to build a connector, we highly recommend that you first
[start with the Connector Builder](https://docs.airbyte.com/connector-development/connector-builder-ui/overview).
It should be enough for 90% of connectors out there. For more flexible and complex connectors, use the
[low-code CDK and `SourceDeclarativeManifest`](https://docs.airbyte.com/connector-development/config-based/low-code-cdk-overview).

For more information on building connectors, please see the [Connector Development](https://docs.airbyte.com/connector-development/) guide on [docs.airbyte.com](https://docs.airbyte.com).

## Python CDK Overview

The Airbyte CDK code lives in the `airbyte_cdk` directory. Here's a high-level overview of what's inside:

- `airbyte_cdk/connector_builder`. Internal wrapper that helps the Connector Builder platform run a declarative manifest (low-code connector). You should not use this code directly. If you need to run a `SourceDeclarativeManifest`, take a look at the [`source-declarative-manifest`](https://github.com/airbytehq/airbyte/tree/master/airbyte-integrations/connectors/source-declarative-manifest) connector implementation instead.
- `airbyte_cdk/cli/source_declarative_manifest`. This module defines the `source-declarative-manifest` (aka "SDM") connector execution logic and associated CLI.
- `airbyte_cdk/destinations`. Basic Destination connector support. If you're building a Destination connector in Python, start here. Some of our vector DB destinations, such as `destination-pinecone`, use this code.
- `airbyte_cdk/models` exposes `airbyte_protocol.models` as part of the `airbyte_cdk` package.
- `airbyte_cdk/sources/concurrent_source` is the Concurrent CDK implementation. It supports reading data from streams concurrently per slice / partition, which is useful for connectors with high throughput and a high number of records.
- `airbyte_cdk/sources/declarative` is the low-code CDK. It works on top of Airbyte Python CDK, but provides a declarative manifest language to define streams, operations, etc. This makes it easier to build connectors without writing Python code.
- `airbyte_cdk/sources/file_based` is the CDK for file-based sources. Examples include S3, Azure, GCS, etc.
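
To make the "concurrently per slice / partition" idea from the list above more concrete, here is a minimal sketch that uses only the Python standard library. It is not the Concurrent CDK's API: `PARTITIONS`, `read_partition`, and `read_concurrently` are hypothetical names invented for illustration, and the records are generated in memory rather than fetched from a real source.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator

# Hypothetical partitions: each one stands in for a slice of a stream,
# such as a date range or a page of IDs. Not a CDK data structure.
PARTITIONS = [
    {"start": 0, "end": 3},
    {"start": 3, "end": 6},
    {"start": 6, "end": 9},
]


def read_partition(partition: dict) -> list[dict]:
    """Pretend to fetch the records of one partition (assumed helper)."""
    return [{"id": i} for i in range(partition["start"], partition["end"])]


def read_concurrently(partitions: list[dict], max_workers: int = 3) -> Iterator[dict]:
    """Read every partition on a thread pool and yield the records."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map runs the reads in parallel but returns results
        # in the same order as the input partitions.
        for records in pool.map(read_partition, partitions):
            yield from records


records = list(read_concurrently(PARTITIONS))
print(len(records))
```

Splitting a stream into independent partitions is what makes this kind of parallelism possible: each worker handles one partition without needing state from the others. The real implementation lives in `airbyte_cdk/sources/concurrent_source` and should be consulted for the actual classes and behavior.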

## Contributing

For instructions on how to contribute, please see our [Contributing Guide](docs/CONTRIBUTING.md).

## Release Management

Please see the [Release Management](docs/RELEASES.md) guide for information on how to perform releases and pre-releases.

