Metadata-Version: 2.4
Name: swh.datasets
Version: 2.0.1
Summary: Tooling to generate various datasets from the Software Heritage archive, based on swh-graph
Author-email: Software Heritage developers <swh-devel@inria.fr>
Project-URL: Homepage, https://gitlab.softwareheritage.org/swh/devel/swh-datasets
Project-URL: Bug Reports, https://gitlab.softwareheritage.org/swh/devel/swh-datasets/-/issues
Project-URL: Funding, https://www.softwareheritage.org/donate
Project-URL: Documentation, https://docs.softwareheritage.org/devel/swh-datasets/
Project-URL: Source, https://gitlab.softwareheritage.org/swh/devel/swh-datasets.git
Classifier: Programming Language :: Python :: 3
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
Classifier: Operating System :: OS Independent
Classifier: Development Status :: 3 - Alpha
Requires-Python: >=3.9
Description-Content-Type: text/x-rst
License-File: LICENSE
License-File: AUTHORS
Requires-Dist: boto3
Requires-Dist: click
Requires-Dist: swh.core[http]>=0.3
Requires-Dist: swh.model>=6.13.0
Requires-Dist: swh.export
Requires-Dist: swh.graph>=7.0.0
Requires-Dist: swh.provenance>=0.3.1
Provides-Extra: luigi
Requires-Dist: datafusion<43.0.0; extra == "luigi"
Requires-Dist: luigi!=3.5.2; extra == "luigi"
Requires-Dist: pyarrow<19.0.0; extra == "luigi"
Requires-Dist: python-magic; extra == "luigi"
Requires-Dist: pyzstd; extra == "luigi"
Requires-Dist: tqdm; extra == "luigi"
Requires-Dist: scancode-toolkit==32.2.1; extra == "luigi"
Requires-Dist: swh.export[luigi]>=v1.2.0; extra == "luigi"
Requires-Dist: swh.graph[luigi]>=7.0.0; extra == "luigi"
Requires-Dist: swh.indexer; extra == "luigi"
Requires-Dist: swh.provenance[luigi]>=0.3.0; extra == "luigi"
Requires-Dist: swh.scheduler; extra == "luigi"
Provides-Extra: testing
Requires-Dist: moto[s3,server]; extra == "testing"
Requires-Dist: pytest>=8.1; extra == "testing"
Requires-Dist: pytest-mock; extra == "testing"
Requires-Dist: pytest-postgresql; extra == "testing"
Requires-Dist: swh.core[testing]>=3.0.0; extra == "testing"
Requires-Dist: boto3-stubs; extra == "testing"
Requires-Dist: botocore-stubs; extra == "testing"
Requires-Dist: pyarrow-stubs; extra == "testing"
Requires-Dist: types-boto3[s3]; extra == "testing"
Requires-Dist: types-psutil; extra == "testing"
Requires-Dist: types-pyyaml; extra == "testing"
Requires-Dist: types-requests; extra == "testing"
Requires-Dist: types-protobuf; extra == "testing"
Requires-Dist: types-tqdm; extra == "testing"
Requires-Dist: grpc-stubs; extra == "testing"
Requires-Dist: datafusion<43.0.0; extra == "testing"
Requires-Dist: luigi!=3.5.2; extra == "testing"
Requires-Dist: pyarrow<19.0.0; extra == "testing"
Requires-Dist: python-magic; extra == "testing"
Requires-Dist: pyzstd; extra == "testing"
Requires-Dist: tqdm; extra == "testing"
Requires-Dist: scancode-toolkit==32.2.1; extra == "testing"
Requires-Dist: swh.export[luigi]>=v1.2.0; extra == "testing"
Requires-Dist: swh.graph[luigi]>=7.0.0; extra == "testing"
Requires-Dist: swh.indexer; extra == "testing"
Requires-Dist: swh.provenance[luigi]>=0.3.0; extra == "testing"
Requires-Dist: swh.scheduler; extra == "testing"
Requires-Dist: swh.storage[testing]; extra == "testing"
Dynamic: license-file

Software Heritage - Derived Datasets
====================================

Tooling to generate various datasets from the `Software Heritage
<https://www.softwareheritage.org/>`_
`archive <https://archive.softwareheritage.org/>`_,
based on the `in-memory compressed graph representation <https://docs.softwareheritage.org/devel/swh-graph/>`_.
