Metadata-Version: 2.1
Name: lemonpdf
Version: 1.0rc6
Summary: Python 3 library to extract URLs from PDF files.
Author: zudefoque
Author-email: Juan Bindez <juanbindez780@gmail.com>
License: MIT license
Project-URL: Homepage, https://github.com/juanbindez/pdfurls
Project-URL: Bug Reports, https://github.com/juanbindez/pdfurls/issues
Project-URL: Read the Docs, http://pdfurls.readthedocs.io/
Keywords: PDF,Extractor,cli,tools
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python
Classifier: Topic :: Internet
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Terminals
Classifier: Topic :: Utilities
Requires-Python: >=3.7
Description-Content-Type: text/markdown
Requires-Dist: packaging
Requires-Dist: pdf2image
Requires-Dist: pillow
Requires-Dist: PyMuPDF
Requires-Dist: PyMuPDFb
Requires-Dist: pytesseract

# lemonpdf

![PyPI - Downloads](https://img.shields.io/pypi/dm/lemonpdf)
![PyPI - License](https://img.shields.io/pypi/l/lemonpdf)
![GitHub Tag](https://img.shields.io/github/v/tag/JuanBindez/lemonpdf?include_prereleases)
<a href="https://pypi.org/project/lemonpdf/"><img src="https://img.shields.io/pypi/v/lemonpdf" /></a>

### Python 3 library to extract URLs from PDF files.


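### Install

On Debian or Ubuntu, install the Tesseract OCR and Poppler system packages first, then install lemonpdf with pip:

    sudo apt install tesseract-ocr poppler-utils
    pip install lemonpdf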
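
If extraction fails right after installation, a missing system tool is a likely cause. The sketch below checks whether the executables shipped by the two system packages are on the `PATH`. It assumes that `tesseract` (from `tesseract-ocr`) and `pdftoppm` (from `poppler-utils`) are the programs needed at runtime; `REQUIRED_TOOLS` and `missing_tools` are hypothetical names used only for illustration and are not part of lemonpdf.

```python
import shutil

# Assumption: these are the executables the OCR and PDF-to-image
# dependencies look for. Adjust the tuple if your setup differs.
REQUIRED_TOOLS = ("tesseract", "pdftoppm")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = missing_tools()
if missing:
    print("Missing system tools:", ", ".join(missing))
else:
    print("All system tools found.")
```

On other platforms the package names differ, but the same check applies as long as the executables keep these names.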

### Quickstart


### Command-line interface (CLI)

    lemonpdf file.pdf

#### Save the URLs to a file

    lemonpdf file.pdf --output urls.txt --save

#### Use in scripts

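```python
from lemonpdf import Extractor

# Input PDF and the text file that receives the extracted URLs
pdf_path = 'file.pdf'
output_txt_path = 'out_file.txt'

extractor = Extractor(pdf_path=pdf_path, output_txt_path=output_txt_path)

# save=True requests that the results also go to output_txt_path
urls = extractor.extract_urls_from_pdf(save=True)

print(urls)
```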
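
Extracted links often contain repeats or non-web schemes, so a small post-processing step can be useful. The following is a minimal sketch, assuming `extract_urls_from_pdf` returns a list of URL strings (the example above suggests this, but it is not confirmed here). `clean_urls` is a hypothetical helper, not part of lemonpdf, and the sample list stands in for real extractor output.

```python
from urllib.parse import urlparse


def clean_urls(urls):
    """Return unique http(s) URLs in first-seen order."""
    seen = set()
    cleaned = []
    for url in urls:
        url = url.strip()
        # Keep only web links; skip mailto:, ftp:, and similar schemes
        if urlparse(url).scheme.lower() not in ("http", "https"):
            continue
        if url in seen:
            continue
        seen.add(url)
        cleaned.append(url)
    return cleaned


# Stand-in for the list returned by the extractor (assumed shape)
sample = [
    "https://example.com/a",
    "https://example.com/a",
    " http://example.org ",
    "mailto:someone@example.com",
]
print(clean_urls(sample))
```

Keeping first-seen order preserves the order in which links appear in the input list, which a plain `set()` would not.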
