Module gamslib.objectcsv.utils
Utility functions for the objectcsv module.
Provides helpers for finding object folders, extracting titles from TEI and LIDO files, and splitting CSV entries into lists.
Functions
def extract_title_from_lido(lido_file)
Source code
    def extract_title_from_lido(lido_file):
        """
        Extract the title from a LIDO file.

        Args:
            lido_file (Path or str): Path to the LIDO XML file.

        Returns:
            str: Title extracted from the LIDO file, or an empty string if not found.
        """
        lido = ET.parse(lido_file)
        # pylint: disable=line-too-long
        title_node = lido.find(
            "lido:descriptiveMetadata/lido:objectIdentificationWrap/lido:titleWrap/lido:titleSet/lido:appellationValue",
            namespaces=NAMESPACES,
        )
        return title_node.text if title_node is not None else ""
Extract the title from a LIDO file.
Args
lido_file (Path or str): Path to the LIDO XML file.
Returns
str: Title extracted from the LIDO file, or an empty string if not found.
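The lookup path used by extract_title_from_lido implies a specific nesting of LIDO elements. The following is a minimal sketch of that lookup, not the module's own code: the helper name title_from_lido_sketch, the sample document, and the content of NAMESPACES are assumptions made for illustration (the real mapping is defined elsewhere in gamslib and is not shown on this page).

```python
import io
import xml.etree.ElementTree as ET

# Assumption: the module's NAMESPACES maps "lido" to the LIDO schema namespace.
NAMESPACES = {"lido": "http://www.lido-schema.org"}

# Hypothetical sample document with the nesting the lookup path expects.
LIDO_XML = b"""<lido:lido xmlns:lido="http://www.lido-schema.org">
  <lido:descriptiveMetadata>
    <lido:objectIdentificationWrap>
      <lido:titleWrap>
        <lido:titleSet>
          <lido:appellationValue>Portrait of a Lady</lido:appellationValue>
        </lido:titleSet>
      </lido:titleWrap>
    </lido:objectIdentificationWrap>
  </lido:descriptiveMetadata>
</lido:lido>"""

LIDO_TITLE_PATH = (
    "lido:descriptiveMetadata/lido:objectIdentificationWrap/"
    "lido:titleWrap/lido:titleSet/lido:appellationValue"
)


def title_from_lido_sketch(source):
    """Hypothetical stand-in that mirrors the documented lookup."""
    tree = ET.parse(source)  # accepts a path or a file-like object
    node = tree.find(LIDO_TITLE_PATH, namespaces=NAMESPACES)
    return node.text if node is not None else ""


lido_title = title_from_lido_sketch(io.BytesIO(LIDO_XML))

# A document without a title element falls back to the empty string.
EMPTY_LIDO = b'<lido:lido xmlns:lido="http://www.lido-schema.org"/>'
missing_title = title_from_lido_sketch(io.BytesIO(EMPTY_LIDO))

print(repr(lido_title), repr(missing_title))
```

The path is resolved relative to the root element, so the root itself (lido:lido in this sample) does not appear in it.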
def extract_title_from_tei(tei_file)
Source code
    def extract_title_from_tei(tei_file):
        """
        Extract the title from a TEI file.

        Args:
            tei_file (Path or str): Path to the TEI XML file.

        Returns:
            str: Title extracted from the TEI file, or an empty string if not found.
        """
        tei = ET.parse(tei_file)
        title_node = tei.find(
            "tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title", namespaces=NAMESPACES
        )
        return title_node.text if title_node is not None else ""
Extract the title from a TEI file.
Args
tei_file (Path or str): Path to the TEI XML file.
Returns
str: Title extracted from the TEI file, or an empty string if not found.
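A small sketch of the TEI lookup may help show which documents yield a title. It is illustrative only: title_from_tei_sketch, the sample documents, and the NAMESPACES mapping are assumptions, and the parser is assumed to behave like the standard library's xml.etree.ElementTree. The second sample shows that the text attribute holds only the text before the first child element.

```python
import io
import xml.etree.ElementTree as ET

# Assumption: the module's NAMESPACES maps "tei" to the TEI P5 namespace.
NAMESPACES = {"tei": "http://www.tei-c.org/ns/1.0"}

TEI_TEMPLATE = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"><teiHeader><fileDesc>'
    "<titleStmt>{title}</titleStmt>"
    "</fileDesc></teiHeader></TEI>"
)


def title_from_tei_sketch(source):
    """Hypothetical stand-in that mirrors the documented lookup."""
    tree = ET.parse(source)
    node = tree.find(
        "tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title",
        namespaces=NAMESPACES,
    )
    return node.text if node is not None else ""


def parse_sample(title_markup):
    """Build a sample TEI document in memory and extract its title."""
    xml_bytes = TEI_TEMPLATE.format(title=title_markup).encode("utf-8")
    return title_from_tei_sketch(io.BytesIO(xml_bytes))


plain_title = parse_sample("<title>Letters from Graz</title>")

# Mixed content: only the text before the first child element is returned.
mixed_title = parse_sample("<title>Letters of <persName>Anna</persName></title>")

# No title element at all: the fallback is an empty string.
no_title = parse_sample("")

print(repr(plain_title), repr(mixed_title), repr(no_title))
```

If titles in a collection contain inline markup, joining the result of itertext() on the title element would be one way to recover the full text; that is a possible variation, not something the documented function does.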
def find_object_folders(root_directory: pathlib.Path) -> Generator[pathlib.Path, None, None]
Source code
    def find_object_folders(root_directory: Path) -> Generator[Path, None, None]:
        """
        Find all object folders below root_directory that contain a DC.xml file.

        Args:
            root_directory (Path): Root directory to search for object folders.

        Yields:
            Path: Path to each object folder containing a DC.xml file.

        Notes:
            - Skips directories that do not contain a DC.xml file and issues a warning.
        """
        for directory in root_directory.rglob("*"):
            if directory.is_dir():
                if "DC.xml" in [f.name for f in directory.iterdir()]:
                    yield directory
                else:
                    warnings.warn(
                        f"Skipping '{directory}' as folder does not contain a DC.xml file.",
                        UserWarning,
                    )
Find all object folders below root_directory that contain a DC.xml file.
Args
root_directory (Path): Root directory to search for object folders.
Yields
Path: Path to each object folder containing a DC.xml file.
Notes
- Skips every directory below root_directory that does not contain a DC.xml file and issues a UserWarning for each one.
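The traversal and warning behaviour can be tried on a throwaway directory tree. The sketch below re-implements the documented loop under the hypothetical name find_object_folders_sketch so that it runs without gamslib installed; the folder names obj1, obj2, and no_metadata are invented for the example.

```python
import tempfile
import warnings
from pathlib import Path
from typing import Generator


def find_object_folders_sketch(root_directory: Path) -> Generator[Path, None, None]:
    """Hypothetical stand-in that mirrors the documented traversal."""
    for directory in root_directory.rglob("*"):
        if directory.is_dir():
            if "DC.xml" in [f.name for f in directory.iterdir()]:
                yield directory
            else:
                warnings.warn(
                    f"Skipping '{directory}' as folder does not contain a DC.xml file.",
                    UserWarning,
                )


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Two object folders with a DC.xml file, one folder without.
    for name in ("obj1", "obj2"):
        (root / name).mkdir()
        (root / name / "DC.xml").write_text("<dc/>", encoding="utf-8")
    (root / "no_metadata").mkdir()

    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        # The generator is lazy, so consume it while the tree still exists.
        found_names = sorted(p.name for p in find_object_folders_sketch(root))
    warning_messages = [str(w.message) for w in caught]

print(found_names)
print(warning_messages)
```

Because the function is a generator, neither the folders nor the warnings are produced until it is iterated. Intermediate directories that merely group object folders also trigger the warning, since every directory found by rglob is checked.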
def split_entry(entry: str) -> list[str]
Source code
    def split_entry(entry: str) -> list[str]:
        """
        Split a string of CSV entries into a list using semicolon as delimiter.

        Args:
            entry (str): String containing CSV entries separated by semicolons.

        Returns:
            list[str]: List of trimmed entries. Returns an empty list if entry is empty.

        Notes:
            - Leading and trailing whitespace is removed from each entry.
            - Only non-empty entries are included in the result.
        """
        values = entry.split(";") if entry else []
        return [value.strip() for value in values if value.strip()]
Split a string of CSV entries into a list using semicolon as delimiter.
Args
entry (str): String containing CSV entries separated by semicolons.
Returns
list[str]: List of trimmed entries. Returns an empty list if entry is empty.
Notes
- Leading and trailing whitespace is removed from each entry.
- Only non-empty entries are included in the result.
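A few inputs illustrate the trimming and filtering rules described in the notes. The function is re-declared here as split_entry_sketch, a hypothetical copy of the documented body, so that the example runs on its own; the sample values are invented.

```python
def split_entry_sketch(entry: str) -> list[str]:
    """Hypothetical stand-in that mirrors the documented body."""
    values = entry.split(";") if entry else []
    return [value.strip() for value in values if value.strip()]


# Whitespace around entries is trimmed and the empty slot between ";;" is dropped.
places = split_entry_sketch(" Graz ; Vienna;; Linz ")

# An empty string and a string of only delimiters and spaces both give [].
from_empty = split_entry_sketch("")
from_blank = split_entry_sketch(" ; ;")

print(places, from_empty, from_blank)
```

Note that the delimiter is fixed: commas inside an entry are kept as part of that entry.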