Base (Private) Module: parsers/_pptxbaseparser.py

Purpose:

This module provides generalised base functionality for parsing PPTX documents.

Platform:

Linux/Windows | Python 3.10+

Developer:

J Berendt

Email:

development@s3dev.uk

Attention

This module is not designed to be interacted with directly, only via the appropriate interface class(es).

Rather, please create an instance of a PPTX document parsing object using the following:

class _PPTXBaseParser(path: str)

Bases: object

Base class containing generalised PPTX parsing functionality.

property doc: DocPPTX

Accessor to the document object.

_open() → None

Open the PPTX document for reading.

Before opening the file, a test is performed to ensure the PPTX is valid. The file must:

  • exist

  • be a ZIP archive, per the file signature

  • have a .pptx file extension
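The three checks above can be sketched with the standard library alone. This is an illustration rather than the module's actual code: the function name validate_pptx_path, the ZIP_SIGNATURE constant, the case-insensitive extension comparison, and the choice to raise TypeError for every failed check (including a missing file) are assumptions.

```python
from pathlib import Path

# Local file header signature found at the start of a non-empty ZIP archive.
ZIP_SIGNATURE = b"PK\x03\x04"


def validate_pptx_path(path: str) -> None:
    """Raise TypeError unless ``path`` passes the three documented checks.

    Hypothetical helper; the real checks live inside ``_open``.
    """
    p = Path(path)
    # Check 1: the file must exist.
    if not p.is_file():
        raise TypeError(f"File does not exist: {path}")
    # Check 2: the first four bytes must match the ZIP signature.
    with p.open("rb") as fh:
        if fh.read(4) != ZIP_SIGNATURE:
            raise TypeError(f"Not a ZIP archive: {path}")
    # Check 3: the extension must be .pptx (case-insensitivity is assumed).
    if p.suffix.lower() != ".pptx":
        raise TypeError(f"Not a .pptx file: {path}")
```

Reading the signature rather than trusting the extension alone catches files that were merely renamed to .pptx.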

Other Operations:
  • Store the parser object returned by pptx.Presentation() in the self._doc._parser attribute.

  • Store the number of pages in the self._doc._npages attribute.

  • Store the document’s metadata in the self._doc._meta attribute.

Raises:
  • TypeError – Raised if the file type criteria above are not met.

_set_paths() → None

Set the document’s file path attributes.