Module gamslib.formatdetect.xmltypes

Functions and data for detecting XML file types and subtypes.

Provides utilities to identify XML formats based on MIME types and XML namespaces. Maps supported subtypes to MIME types and offers helpers for format detection.

Functions

def get_format_info(filepath: pathlib._local.Path, mime_type: str) ‑> tuple[str, enum.StrEnum | None]
Expand source code
def get_format_info(filepath: Path, mime_type: str) -> tuple[str, StrEnum | None]:
    """
    Get the format info for an XML file, including fixed MIME type and detected subtype.

    Args:
        filepath (Path): Path to the XML file.
        mime_type (str): MIME type detected by another tool.

    Returns:
        tuple[str, StrEnum | None]: (MIME type, detected subtype) for the file.

    Notes:
        - If the subtype cannot be detected, returns the original MIME type and None.
        - If detected, returns the mapped MIME type and subtype.
    """
    xmltype = guess_xml_subtype(filepath)
    if xmltype is None:
        subtype = None
    else:
        subtype = xmltype
        mime_type = MIMETYPES.get(xmltype, mime_type)
    return mime_type, subtype

Get the format info for an XML file, including fixed MIME type and detected subtype.

Args

filepath : Path
Path to the XML file.
mime_type : str
MIME type detected by another tool.

Returns

tuple[str, StrEnum | None]
(MIME type, detected subtype) for the file.

Notes

  • If the subtype cannot be detected, returns the original MIME type and None.
  • If detected, returns the mapped MIME type and subtype.
def guess_xml_subtype(filepath: pathlib._local.Path) ‑> str
Expand source code
def guess_xml_subtype(filepath: Path) -> str:
    """
    Guess the XML subtype of a file by inspecting its namespaces.

    Iterates through the XML file and checks for known namespaces to determine the subtype.
    If the namespace is not recognized, a warning is issued and None is returned.

    Args:
        filepath (Path): Path to the XML file.

    Returns:
        str: SubType value if detected, otherwise None.

    Notes:
        - Useful for simple detectors or exotic formats.
        - Tools like FITS may also detect subtypes, but this function is for custom logic.
    """
    for _, elem in ET.iterparse(filepath, events=["start-ns"]):
        namespace = elem[1]
        try:
            return NAMESPACES[namespace]
        except KeyError:
            warnings.warn(
                f"XML format detection failed due to unknown namespace: {namespace}"
            )
    return None

Guess the XML subtype of a file by inspecting its namespaces.

Iterates through the XML file and checks for known namespaces to determine the subtype. If the namespace is not recognized, a warning is issued and None is returned.

Args

filepath : Path
Path to the XML file.

Returns

str
SubType value if detected, otherwise None.

Notes

  • Useful for simple detectors or exotic formats.
  • Tools like FITS may also detect subtypes, but this function is for custom logic.
def is_xml_type(mimetype: str) ‑> enum.StrEnum | None
Expand source code
def is_xml_type(mimetype: str) -> StrEnum | None:
    """
    Check if a MIME type is recognized as an XML type.

    Args:
        mimetype (str): MIME type to check.

    Returns:
        StrEnum | None: True if the MIME type is a known XML type, otherwise None.
    """
    return mimetype in MIMETYPES.values() or mimetype in XML_MIME_TYPES

Check if a MIME type is recognized as an XML type.

Args

mimetype : str
MIME type to check.

Returns

StrEnum | None
True if the MIME type is a known XML type, otherwise None.