Metrics

For more information on the underlying TorchMetrics metrics (for example, binary hinge loss), see https://lightning.ai/docs/torchmetrics/v1.5.2/classification/hinge_loss.html#binary-hinge-loss.

mlcpl.metrics.partial_multilabel_average_precision(preds, target, average: Literal['macro', 'micro', 'weighted', 'none'] = 'macro', thresholds=None)

Compute the average precision (AP) score.

Args:
preds: Tensor with predictions

target: Tensor with true labels

average:

Defines the reduction that is applied over labels. Should be one of the following:

  • micro: Sum the score over all labels

  • macro: Calculate the score for each label and average the results

  • weighted: Calculate the score for each label and compute a weighted average using their support

  • "none" or None: Calculate the score for each label and apply no reduction

thresholds:

Can be one of:

  • If set to None, will use a non-binned approach where thresholds are dynamically calculated from all the data. This is the most accurate but also the most memory-consuming approach.

  • If set to an int (larger than 1), will use that number of thresholds linearly spaced from 0 to 1 as bins for the calculation.

  • If set to a list of floats, will use the indicated thresholds in the list as bins for the calculation.

  • If set to a 1D tensor of floats, will use the indicated thresholds in the tensor as bins for the calculation.
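As a reference for what the non-binned computation does for a single label, here is a minimal pure-Python sketch. It is not the library implementation, which operates on tensors and, presumably, masks out unannotated entries before scoring (the encoding of unknown labels is not shown here):

```python
def average_precision(scores, labels):
    # AP = sum over positives of the precision at that positive's rank,
    # divided by the total number of positives; higher score = more confident.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = fp = 0
    ap = 0.0
    for i in order:
        if labels[i]:
            tp += 1
            ap += tp / (tp + fp)  # precision at this recall step
        else:
            fp += 1
    return ap / tp  # tp now equals the total number of positives

# A perfect ranking yields AP = 1.0
assert average_precision([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]) == 1.0
```

The binned variants replace this per-example ranking with the fixed thresholds described above, trading a little accuracy for bounded memory.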

mlcpl.metrics.partial_multilabel_auroc(preds, target, average: Literal['macro', 'micro', 'weighted', 'none'] = 'macro', thresholds=None)

Compute the area under the receiver operating characteristic curve (AUROC).

Args:

preds: Tensor with predictions

target: Tensor with true labels

average:

Defines the reduction that is applied over labels. Should be one of the following:

  • micro: Sum the score over all labels

  • macro: Calculate the score for each label and average the results

  • weighted: Calculate the score for each label and compute a weighted average using their support

  • "none" or None: Calculate the score for each label and apply no reduction

thresholds:

Can be one of:

  • If set to None, will use a non-binned approach where thresholds are dynamically calculated from all the data. This is the most accurate but also the most memory-consuming approach.

  • If set to an int (larger than 1), will use that number of thresholds linearly spaced from 0 to 1 as bins for the calculation.

  • If set to a list of floats, will use the indicated thresholds in the list as bins for the calculation.

  • If set to a 1D tensor of floats, will use the indicated thresholds in the tensor as bins for the calculation.
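Conceptually, the per-label AUROC equals the probability that a randomly chosen positive is scored above a randomly chosen negative. A small pure-Python sketch of that rank-based (Mann-Whitney) formulation, ignoring the tensor and partial-label machinery of the actual function:

```python
def auroc(scores, labels):
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Every positive outranks every negative
assert auroc([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]) == 1.0
```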

mlcpl.metrics.partial_multilabel_fbeta_score(preds, target, beta: float = 1.0, threshold: float = 0.5, average: Literal['macro', 'micro', 'weighted', 'none'] = 'macro')

Compute the F-beta score.

Args:

preds: Tensor with predictions

target: Tensor with true labels

beta: Weighting between precision and recall in the calculation. Setting beta to 1 corresponds to equal weight (the F1 score).

threshold: Threshold for transforming probability to binary (0,1) predictions

average:

Defines the reduction that is applied over labels. Should be one of the following:

  • micro: Sum statistics over all labels

  • macro: Calculate statistics for each label and average the results

  • weighted: Calculate statistics for each label and compute a weighted average using their support

  • "none" or None: Calculate the statistic for each label and apply no reduction
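For a single label, the computation reduces to thresholding followed by the standard F-beta formula. A pure-Python sketch (the library presumably applies this per label and then the chosen reduction):

```python
def fbeta_single_label(preds, target, beta=1.0, threshold=0.5):
    yhat = [int(p >= threshold) for p in preds]
    tp = sum(1 for h, y in zip(yhat, target) if h and y)
    fp = sum(1 for h, y in zip(yhat, target) if h and not y)
    fn = sum(1 for h, y in zip(yhat, target) if not h and y)
    b2 = beta ** 2  # beta > 1 favors recall, beta < 1 favors precision
    return (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp)

# With beta = 1 this is the F1 score
assert fbeta_single_label([0.9, 0.6, 0.4, 0.2], [1, 0, 1, 0]) == 0.5
```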

mlcpl.metrics.partial_multilabel_f1_score(preds, target, threshold: float = 0.5, average: Literal['macro', 'micro', 'weighted', 'none'] = 'macro')

Compute the F1 score.

Args:

preds: Tensor with predictions

target: Tensor with true labels

threshold: Threshold for transforming probability to binary (0,1) predictions

average:

Defines the reduction that is applied over labels. Should be one of the following:

  • micro: Sum statistics over all labels

  • macro: Calculate statistics for each label and average the results

  • weighted: Calculate statistics for each label and compute a weighted average using their support

  • "none" or None: Calculate the statistic for each label and apply no reduction

mlcpl.metrics.partial_multilabel_precision(preds, target, threshold: float = 0.5, average: Literal['macro', 'micro', 'weighted', 'none'] = 'macro')

Compute precision.

Args:

preds: Tensor with predictions

target: Tensor with true labels

threshold: Threshold for transforming probability to binary (0,1) predictions

average:

Defines the reduction that is applied over labels. Should be one of the following:

  • micro: Sum statistics over all labels

  • macro: Calculate statistics for each label and average the results

  • weighted: Calculate statistics for each label and compute a weighted average using their support

  • "none" or None: Calculate the statistic for each label and apply no reduction

mlcpl.metrics.partial_multilabel_recall(preds, target, threshold: float = 0.5, average: Literal['macro', 'micro', 'weighted', 'none'] = 'macro')

Compute recall.

Args:

preds: Tensor with predictions

target: Tensor with true labels

threshold: Threshold for transforming probability to binary (0,1) predictions

average:

Defines the reduction that is applied over labels. Should be one of the following:

  • micro: Sum statistics over all labels

  • macro: Calculate statistics for each label and average the results

  • weighted: Calculate statistics for each label and compute a weighted average using their support

  • "none" or None: Calculate the statistic for each label and apply no reduction

mlcpl.metrics.partial_multilabel_sensitivity(preds, target, threshold: float = 0.5, average: Literal['macro', 'micro', 'weighted', 'none'] = 'macro')

Compute sensitivity.

Args:

preds: Tensor with predictions

target: Tensor with true labels

threshold: Threshold for transforming probability to binary (0,1) predictions

average:

Defines the reduction that is applied over labels. Should be one of the following:

  • micro: Sum statistics over all labels

  • macro: Calculate statistics for each label and average the results

  • weighted: Calculate statistics for each label and compute a weighted average using their support

  • "none" or None: Calculate the statistic for each label and apply no reduction

mlcpl.metrics.partial_multilabel_specificity(preds, target, threshold: float = 0.5, average: Literal['macro', 'micro', 'weighted', 'none'] = 'macro')

Compute specificity.

Args:

preds: Tensor with predictions

target: Tensor with true labels

threshold: Threshold for transforming probability to binary (0,1) predictions

average:

Defines the reduction that is applied over labels. Should be one of the following:

  • micro: Sum statistics over all labels

  • macro: Calculate statistics for each label and average the results

  • weighted: Calculate statistics for each label and compute a weighted average using their support

  • "none" or None: Calculate the statistic for each label and apply no reduction
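precision, recall, sensitivity, and specificity above all derive from the same per-label confusion counts. A pure-Python sketch of those counts for one label (macro averaging repeats this per label and averages; micro pools the counts across labels first; masking of unannotated entries is assumed to happen beforehand):

```python
def binary_stats(preds, target, threshold=0.5):
    yhat = [int(p >= threshold) for p in preds]
    tp = sum(1 for h, y in zip(yhat, target) if h and y)
    fp = sum(1 for h, y in zip(yhat, target) if h and not y)
    tn = sum(1 for h, y in zip(yhat, target) if not h and not y)
    fn = sum(1 for h, y in zip(yhat, target) if not h and y)
    return tp, fp, tn, fn

tp, fp, tn, fn = binary_stats([0.9, 0.6, 0.4, 0.2], [1, 0, 0, 0])
precision = tp / (tp + fp)    # 1 / 2
recall = tp / (tp + fn)       # sensitivity is the same quantity
specificity = tn / (tn + fp)  # 2 / 3
```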

mlcpl.metrics.partial_multilabel_precision_at_fixed_recall(preds, target, min_recall: float, thresholds=None)

Compute the highest possible precision given a minimum recall threshold.

Args:

preds: Tensor with predictions

target: Tensor with true labels

min_recall: float value specifying minimum recall threshold.

thresholds:

Can be one of:

  • If set to None, will use a non-binned approach where thresholds are dynamically calculated from all the data. This is the most accurate but also the most memory-consuming approach.

  • If set to an int (larger than 1), will use that number of thresholds linearly spaced from 0 to 1 as bins for the calculation.

  • If set to a list of floats, will use the indicated thresholds in the list as bins for the calculation.

  • If set to a 1D tensor of floats, will use the indicated thresholds in the tensor as bins for the calculation.
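The `*_at_fixed_*` family shares one pattern: sweep candidate thresholds and keep the best value of one metric among thresholds whose companion metric clears the floor. A pure-Python sketch for this function in the non-binned case, with thresholds taken from the scores themselves (whether the library also returns the chosen threshold is an assumption borrowed from the TorchMetrics convention):

```python
def precision_at_fixed_recall(scores, labels, min_recall):
    best_precision, best_threshold = 0.0, None
    total_pos = sum(labels)
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        if tp / total_pos >= min_recall and tp + fp > 0:
            precision = tp / (tp + fp)
            if precision > best_precision:
                best_precision, best_threshold = precision, t
    return best_precision, best_threshold

# At full recall, the best achievable precision here is 2/3 (threshold 0.3)
p, t = precision_at_fixed_recall([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0], 1.0)
```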

mlcpl.metrics.partial_multilabel_recall_at_fixed_precision(preds, target, min_precision: float, thresholds=None)

Compute the highest possible recall given a minimum precision threshold.

Args:

preds: Tensor with predictions

target: Tensor with true labels

min_precision: float value specifying minimum precision threshold.

thresholds:

Can be one of:

  • If set to None, will use a non-binned approach where thresholds are dynamically calculated from all the data. This is the most accurate but also the most memory-consuming approach.

  • If set to an int (larger than 1), will use that number of thresholds linearly spaced from 0 to 1 as bins for the calculation.

  • If set to a list of floats, will use the indicated thresholds in the list as bins for the calculation.

  • If set to a 1D tensor of floats, will use the indicated thresholds in the tensor as bins for the calculation.

mlcpl.metrics.partial_multilabel_sensitivity_at_specificity(preds, target, min_specificity: float, thresholds=None)

Compute the highest possible sensitivity given a minimum specificity threshold.

Args:

preds: Tensor with predictions

target: Tensor with true labels

min_specificity: float value specifying minimum specificity threshold.

thresholds:

Can be one of:

  • If set to None, will use a non-binned approach where thresholds are dynamically calculated from all the data. This is the most accurate but also the most memory-consuming approach.

  • If set to an int (larger than 1), will use that number of thresholds linearly spaced from 0 to 1 as bins for the calculation.

  • If set to a list of floats, will use the indicated thresholds in the list as bins for the calculation.

  • If set to a 1D tensor of floats, will use the indicated thresholds in the tensor as bins for the calculation.

mlcpl.metrics.partial_multilabel_specificity_at_sensitivity(preds, target, min_sensitivity: float, thresholds=None)

Compute the highest possible specificity given a minimum sensitivity threshold.

Args:

preds: Tensor with predictions

target: Tensor with true labels

min_sensitivity: float value specifying minimum sensitivity threshold.

thresholds:

Can be one of:

  • If set to None, will use a non-binned approach where thresholds are dynamically calculated from all the data. This is the most accurate but also the most memory-consuming approach.

  • If set to an int (larger than 1), will use that number of thresholds linearly spaced from 0 to 1 as bins for the calculation.

  • If set to a list of floats, will use the indicated thresholds in the list as bins for the calculation.

  • If set to a 1D tensor of floats, will use the indicated thresholds in the tensor as bins for the calculation.

mlcpl.metrics.partial_multilabel_roc(preds, target, thresholds=None)

Compute the receiver operating characteristic (ROC) curve for each label.

Args:

preds: Tensor with predictions

target: Tensor with true labels

thresholds:

Can be one of:

  • If set to None, will use a non-binned approach where thresholds are dynamically calculated from all the data. This is the most accurate but also the most memory-consuming approach.

  • If set to an int (larger than 1), will use that number of thresholds linearly spaced from 0 to 1 as bins for the calculation.

  • If set to a list of floats, will use the indicated thresholds in the list as bins for the calculation.

  • If set to a 1D tensor of floats, will use the indicated thresholds in the tensor as bins for the calculation.

mlcpl.metrics.partial_multilabel_precision_recall_curve(preds, target, thresholds=None)

Compute the precision-recall curve for each label.

Args:

preds: Tensor with predictions

target: Tensor with true labels

thresholds:

Can be one of:

  • If set to None, will use a non-binned approach where thresholds are dynamically calculated from all the data. This is the most accurate but also the most memory-consuming approach.

  • If set to an int (larger than 1), will use that number of thresholds linearly spaced from 0 to 1 as bins for the calculation.

  • If set to a list of floats, will use the indicated thresholds in the list as bins for the calculation.

  • If set to a 1D tensor of floats, will use the indicated thresholds in the tensor as bins for the calculation.
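To illustrate the binned evaluation that an integer `thresholds` selects, here is a pure-Python sketch tracing a single label's precision-recall curve at linearly spaced thresholds (the convention of reporting precision 1.0 when nothing is predicted positive is an assumption):

```python
def pr_curve_binned(scores, labels, num_thresholds):
    total_pos = sum(labels)
    curve = []
    for k in range(num_thresholds):
        t = k / (num_thresholds - 1)  # linearly spaced on [0, 1]
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and not y)
        precision = tp / (tp + fp) if tp + fp else 1.0
        curve.append((t, precision, tp / total_pos))
    return curve

# Three bins: thresholds 0.0, 0.5, 1.0
curve = pr_curve_binned([0.9, 0.1], [1, 0], 3)
```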

mlcpl.metrics.partial_multilabel_accuracy(preds, target, threshold: float = 0.5, average: Literal['macro', 'micro', 'weighted', 'none'] = 'macro')

Compute accuracy.

Args:

preds: Tensor with predictions

target: Tensor with true labels

threshold: Threshold for transforming probability to binary (0,1) predictions

average:

Defines the reduction that is applied over labels. Should be one of the following:

  • micro: Sum statistics over all labels

  • macro: Calculate statistics for each label and average the results

  • weighted: Calculate statistics for each label and compute a weighted average using their support

  • "none" or None: Calculate the statistic for each label and apply no reduction

mlcpl.metrics.partial_multilabel_calibration_error(preds, target, n_bins=15, average: Literal['macro', 'micro', 'weighted', 'none'] = 'macro', norm: Literal['l1', 'l2', 'max', 'ECE', 'ACE', 'MCE'] = 'l1')

Compute calibration error.

Args:

preds: Tensor with predictions

target: Tensor with true labels

n_bins: Number of bins to use when computing the metric.

average:

Defines the reduction that is applied over labels. Should be one of the following:

  • micro: Sum statistics over all labels

  • macro: Calculate statistics for each label and average the results

  • weighted: Calculate statistics for each label and compute a weighted average using their support

  • "none" or None: Calculate the statistic for each label and apply no reduction

norm: Norm used to compare empirical and expected probability bins.
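The 'l1' norm corresponds to the familiar expected calibration error: bin predictions by confidence, then take the bin-size-weighted mean of |accuracy − mean confidence|. A pure-Python sketch for one label (assumes probabilities in [0, 1]; the 'max' norm would take the worst bin instead of the weighted mean):

```python
def expected_calibration_error(probs, labels, n_bins=15):
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # Clamp p == 1.0 into the last bin
        bins[min(int(p * n_bins), n_bins - 1)].append((p, y))
    ece, n = 0.0, len(probs)
    for b in bins:
        if b:
            confidence = sum(p for p, _ in b) / len(b)
            accuracy = sum(y for _, y in b) / len(b)
            ece += len(b) / n * abs(accuracy - confidence)
    return ece
```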

mlcpl.metrics.partial_multilabel_expected_calibration_error(preds, target, n_bins=15, average: Literal['macro', 'micro', 'weighted', 'none'] = 'macro')

Compute expected calibration error.

Args:

preds: Tensor with predictions

target: Tensor with true labels

n_bins: Number of bins to use when computing the metric.

average:

Defines the reduction that is applied over labels. Should be one of the following:

  • micro: Sum statistics over all labels

  • macro: Calculate statistics for each label and average the results

  • weighted: Calculate statistics for each label and compute a weighted average using their support

  • "none" or None: Calculate the statistic for each label and apply no reduction

mlcpl.metrics.partial_multilabel_average_calibration_error(preds, target, n_bins=15, average: Literal['macro', 'micro', 'weighted', 'none'] = 'macro')

Compute average calibration error.

Args:

preds: Tensor with predictions

target: Tensor with true labels

n_bins: Number of bins to use when computing the metric.

average:

Defines the reduction that is applied over labels. Should be one of the following:

  • micro: Sum statistics over all labels

  • macro: Calculate statistics for each label and average the results

  • weighted: Calculate statistics for each label and compute a weighted average using their support

  • "none" or None: Calculate the statistic for each label and apply no reduction

mlcpl.metrics.partial_multilabel_maximum_calibration_error(preds, target, n_bins=15, average: Literal['macro', 'micro', 'weighted', 'none'] = 'macro')

Compute maximum calibration error.

Args:

preds: Tensor with predictions

target: Tensor with true labels

n_bins: Number of bins to use when computing the metric.

average:

Defines the reduction that is applied over labels. Should be one of the following:

  • micro: Sum statistics over all labels

  • macro: Calculate statistics for each label and average the results

  • weighted: Calculate statistics for each label and compute a weighted average using their support

  • "none" or None: Calculate the statistic for each label and apply no reduction

mlcpl.metrics.partial_multilabel_cohen_kappa(preds, target, threshold: float = 0.5, average: Literal['macro', 'micro', 'weighted', 'none'] = 'macro', weights: Literal['linear', 'quadratic', 'none'] = 'none')

Calculate Cohen’s kappa score.

Args:

preds: Tensor with predictions

target: Tensor with true labels

threshold: Threshold for transforming probability to binary (0,1) predictions

average:

Defines the reduction that is applied over labels. Should be one of the following:

  • micro: Sum statistics over all labels

  • macro: Calculate statistics for each label and average the results

  • weighted: Calculate statistics for each label and compute a weighted average using their support

  • "none" or None: Calculate the statistic for each label and apply no reduction

weights: Weighting type to calculate the score. Choose from:

  • None or 'none': no weighting

  • 'linear': linear weighting

  • 'quadratic': quadratic weighting
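For the unweighted case on one label, Cohen's kappa compares observed agreement against the agreement expected from the marginal positive rates alone. A pure-Python sketch:

```python
def cohen_kappa_binary(preds, target, threshold=0.5):
    yhat = [int(p >= threshold) for p in preds]
    n = len(target)
    observed = sum(1 for h, y in zip(yhat, target) if h == y) / n
    # Chance agreement from each side's marginal positive rate
    rate_pred, rate_true = sum(yhat) / n, sum(target) / n
    expected = rate_pred * rate_true + (1 - rate_pred) * (1 - rate_true)
    return (observed - expected) / (1 - expected)

# 3/4 observed agreement against 1/2 chance agreement gives kappa = 0.5
assert cohen_kappa_binary([0.9, 0.9, 0.1, 0.1], [1, 0, 0, 0]) == 0.5
```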

mlcpl.metrics.partial_multilabel_confusion_matrix(preds, target, threshold: float = 0.5, normalize: Literal['none', 'true', 'pred', 'all'] = 'none')

Compute the confusion matrix.

Args:

preds: Tensor with predictions

target: Tensor with true labels

threshold: Threshold for transforming probability to binary (0,1) predictions

normalize: Normalization mode for confusion matrix. Choose from:

  • None or 'none': no normalization (default)

  • 'true': normalization over the targets (most commonly used)

  • 'pred': normalization over the predictions

  • 'all': normalization over the whole matrix

Returns:

A tensor of shape [num_labels, 2, 2], holding one 2x2 confusion matrix per label
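A pure-Python sketch of that return layout, with the TorchMetrics ordering [[tn, fp], [fn, tp]] assumed for each per-label matrix (verify the ordering against the library before relying on it):

```python
def per_label_confusion(preds, target, threshold=0.5):
    num_labels = len(preds[0])
    matrices = []
    for j in range(num_labels):
        tn = fp = fn = tp = 0
        for row, truth in zip(preds, target):
            h, y = int(row[j] >= threshold), truth[j]
            if h and y:
                tp += 1
            elif h:
                fp += 1
            elif y:
                fn += 1
            else:
                tn += 1
        matrices.append([[tn, fp], [fn, tp]])  # one 2x2 block per label
    return matrices
```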

mlcpl.metrics.partial_multilabel_dice(preds, target, threshold: float = 0.5, average: Literal['macro', 'micro', 'weighted', 'samples', 'none'] = 'micro')

Compute the Dice score.

Args:

preds: Predictions from model (probabilities, logits or labels)

target: Ground truth values

average:

Defines the reduction that is applied. Should be one of the following:

  • 'micro' [default]: Calculate the metric globally, across all samples and classes.

  • 'macro': Calculate the metric for each class separately, and average the metrics across classes (with equal weights for each class).

  • 'weighted': Calculate the metric for each class separately, and average the metrics across classes, weighting each class by its support (tp + fn).

  • 'none' or None: Calculate the metric for each class separately, and return the metric for every class.

  • 'samples': Calculate the metric for each sample, and average the metrics across samples (with equal weights for each sample).
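A pure-Python sketch of the default micro reduction, which pools counts across every sample and label before applying Dice = 2·TP / (2·TP + FP + FN), the same quantity as micro-F1:

```python
def dice_micro(preds, target, threshold=0.5):
    tp = fp = fn = 0
    for row, truth in zip(preds, target):
        for p, y in zip(row, truth):
            h = int(p >= threshold)
            tp += 1 if h and y else 0
            fp += 1 if h and not y else 0
            fn += 1 if not h and y else 0
    return 2 * tp / (2 * tp + fp + fn)

# Pooled counts: tp=1, fp=1, fn=1 -> 2/4
assert dice_micro([[0.9, 0.6], [0.4, 0.2]], [[1, 0], [1, 0]]) == 0.5
```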

mlcpl.metrics.partial_multilabel_exact_match(preds, target, threshold: float = 0.5)

Compute the exact match ratio (also known as subset accuracy).

Args:

preds: Tensor with predictions

target: Tensor with true labels

threshold: Threshold for transforming probability to binary (0,1) predictions
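A pure-Python sketch of the definition: a sample counts as correct only when every one of its labels is predicted correctly, which makes this a strict metric for large label sets:

```python
def exact_match(preds, target, threshold=0.5):
    hits = sum(
        1
        for row, truth in zip(preds, target)
        if [int(p >= threshold) for p in row] == list(truth)
    )
    return hits / len(target)

# Second sample misses one label, so only 1 of 2 samples matches
assert exact_match([[0.9, 0.1], [0.9, 0.9]], [[1, 0], [1, 0]]) == 0.5
```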

mlcpl.metrics.partial_multilabel_hamming_distance(preds, target, threshold: float = 0.5, average: Literal['macro', 'micro', 'weighted', 'none'] = 'macro')

Compute the Hamming distance.

Args:

preds: Tensor with predictions

target: Tensor with true labels

threshold: Threshold for transforming probability to binary (0,1) predictions

average:

Defines the reduction that is applied over labels. Should be one of the following:

  • micro: Sum statistics over all labels

  • macro: Calculate statistics for each label and average the results

  • weighted: Calculate statistics for each label and compute a weighted average using their support

  • "none" or None: Calculate the statistic for each label and apply no reduction
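Hamming distance is simply the fraction of individual label predictions that disagree with the target (so 0.0 is perfect). A pure-Python sketch:

```python
def hamming_distance(preds, target, threshold=0.5):
    errors = total = 0
    for row, truth in zip(preds, target):
        for p, y in zip(row, truth):
            errors += int(int(p >= threshold) != y)
            total += 1
    return errors / total

# 1 wrong prediction out of 4 label slots
assert hamming_distance([[0.9, 0.1], [0.9, 0.9]], [[1, 0], [1, 0]]) == 0.25
```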

mlcpl.metrics.partial_multilabel_hinge_loss(preds, target, squared: bool = False, average: Literal['macro', 'micro', 'weighted', 'none'] = 'macro')

Compute the mean hinge loss, typically used for support vector machines.

Args:

preds: Tensor with predictions

target: Tensor with true labels

squared:

If True, this will compute the squared hinge loss. Otherwise, computes the regular hinge loss.

average:

Defines the reduction that is applied over labels. Should be one of the following:

  • micro: Sum statistics over all labels

  • macro: Calculate statistics for each label and average the results

  • weighted: Calculate statistics for each label and compute a weighted average using their support

  • "none" or None: Calculate the statistic for each label and apply no reduction
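For the hinge loss, `preds` are unthresholded scores (margins), and {0, 1} targets are mapped to {-1, +1} before the loss is taken. A per-label pure-Python sketch:

```python
def hinge_loss(scores, target, squared=False):
    total = 0.0
    for s, y in zip(scores, target):
        margin = max(0.0, 1.0 - (2 * y - 1) * s)  # zero once the margin exceeds 1
        total += margin ** 2 if squared else margin
    return total / len(scores)

# Confident, correct scores incur no loss
assert hinge_loss([2.0, -2.0], [1, 0]) == 0.0
```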

mlcpl.metrics.partial_multilabel_jaccard_index(preds, target, threshold: float = 0.5, average: Literal['macro', 'micro', 'weighted', 'none'] = 'macro')

Compute the Jaccard index.

Args:

preds: Tensor with predictions

target: Tensor with true labels

threshold: Threshold for transforming probability to binary (0,1) predictions

average:

Defines the reduction that is applied over labels. Should be one of the following:

  • micro: Sum statistics over all labels

  • macro: Calculate statistics for each label and average the results

  • weighted: Calculate statistics for each label and compute a weighted average using their support

  • "none" or None: Calculate the statistic for each label and apply no reduction
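The Jaccard index (intersection over union) for one label is TP / (TP + FP + FN). A pure-Python sketch:

```python
def jaccard_index(preds, target, threshold=0.5):
    yhat = [int(p >= threshold) for p in preds]
    tp = sum(1 for h, y in zip(yhat, target) if h and y)
    fp = sum(1 for h, y in zip(yhat, target) if h and not y)
    fn = sum(1 for h, y in zip(yhat, target) if not h and y)
    return tp / (tp + fp + fn)

# tp=1, fp=1, fn=1 -> 1/3
assert abs(jaccard_index([0.9, 0.6, 0.4], [1, 0, 1]) - 1 / 3) < 1e-9
```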

mlcpl.metrics.partial_multilabel_ranking_average_precision(preds, target)

Compute the label ranking average precision.

Args:

preds: Tensor with predictions

target: Tensor with true labels

mlcpl.metrics.partial_multilabel_ranking_loss(preds, target)

Compute multilabel ranking loss.

Args:

preds: Tensor with predictions

target: Tensor with true labels
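Ranking loss works on scores rather than thresholded predictions: per sample, it is the fraction of (relevant, irrelevant) label pairs ordered wrongly, averaged over samples. A pure-Python sketch (charging full cost to ties is an assumption; check the library's tie handling):

```python
def ranking_loss(preds, target):
    total = 0.0
    for row, truth in zip(preds, target):
        pos = [s for s, y in zip(row, truth) if y]
        neg = [s for s, y in zip(row, truth) if not y]
        bad = sum(1 for p in pos for n in neg if n >= p)
        total += bad / (len(pos) * len(neg))
    return total / len(preds)

# First sample ranks its labels correctly, second inverts them
assert ranking_loss([[0.9, 0.1], [0.1, 0.9]], [[1, 0], [1, 0]]) == 0.5
```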

mlcpl.metrics.partial_multilabel_matthews_corrcoef(preds, target, threshold: float = 0.5)

Compute the Matthews correlation coefficient.

Args:

preds: Tensor with predictions

target: Tensor with true labels

threshold: Threshold for transforming probability to binary (0,1) predictions
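For one label, the Matthews correlation coefficient is the correlation between the thresholded predictions and the targets, ranging from -1 to +1. A pure-Python sketch:

```python
def matthews_corrcoef(preds, target, threshold=0.5):
    yhat = [int(p >= threshold) for p in preds]
    tp = sum(1 for h, y in zip(yhat, target) if h and y)
    tn = sum(1 for h, y in zip(yhat, target) if not h and not y)
    fp = sum(1 for h, y in zip(yhat, target) if h and not y)
    fn = sum(1 for h, y in zip(yhat, target) if not h and y)
    denom = ((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)) ** 0.5
    return (tp * tn - fp * fn) / denom

# Perfect predictions give +1
assert matthews_corrcoef([0.9, 0.1, 0.9, 0.1], [1, 0, 1, 0]) == 1.0
```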