Datasets

The MLCPLDataset class

mlcpl.datasets.MLCPLDataset(*args, **kwds)

A subclass of the torch Dataset for partially labeled multi-label datasets.

mlcpl.datasets.MLCPLDataset.__init__(self, name: str, dataset_path: str, records: ~typing.List[~typing.Tuple], num_categories: int, transform: ~typing.Callable = ToTensor(), categories: ~typing.List[str] | None = None, read_func: ~typing.Callable = <function read_jpg>)

Construct an MLCPLDataset object.

Args:

name (str): Dataset name

dataset_path (str): The absolute/relative path of the dataset folder.

records (List[tuple]): Information about the samples. Each tuple stores one sample’s (id, img_path, list of positive categories, list of negative categories).

num_categories (int): The total number of categories.

transform (Callable): The transform function applied to images.

categories (List[str], optional): The category names. Defaults to None.

read_func (Callable, optional): The function used to read an image into a Pillow Image instance. Defaults to read_jpg.
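The layout of the records argument can be illustrated with a short sketch. It builds a few tuples in the documented (id, img_path, positive categories, negative categories) order and checks them against num_categories. The helper validate_records, the integer ids, the file paths, and the use of zero-based category indices are assumptions made for illustration and are not part of mlcpl.

```python
from typing import List, Tuple

# One record per sample: (id, img_path, positive categories, negative categories).
Record = Tuple[int, str, List[int], List[int]]


def validate_records(records: List[Record], num_categories: int) -> None:
    """Raise ValueError if a record uses an invalid or contradictory category index."""
    for sample_id, img_path, pos, neg in records:
        for c in list(pos) + list(neg):
            if not 0 <= c < num_categories:
                raise ValueError(f"sample {sample_id}: category {c} out of range")
        overlap = set(pos) & set(neg)
        if overlap:
            raise ValueError(
                f"sample {sample_id}: {sorted(overlap)} marked both positive and negative"
            )


# Hypothetical records for a 4-category dataset; categories not listed are unknown.
records: List[Record] = [
    (0, "images/0001.jpg", [0, 2], [1]),  # category 3 is unlabeled
    (1, "images/0002.jpg", [], [0, 3]),   # no known positives
    (2, "images/0003.jpg", [1], []),      # a single positive label
]

validate_records(records, num_categories=4)
```

A list shaped like this is what the constructor's records parameter is documented to expect; any category that appears in neither list is treated here as unlabeled.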

mlcpl.datasets.MLCPLDataset.summary(self)

Print general information about the dataset.

mlcpl.datasets.MLCPLDataset.__getitem__(self, idx)

Loads and returns the sample at the given index idx. See https://pytorch.org/tutorials/beginner/basics/data_tutorial.html.

mlcpl.datasets.MLCPLDataset.__to_one_hot(self, pos_categories: List[int], neg_categories: List[int]) -> Tensor

Produce a one-hot vector from the given categories.

Args:

pos_categories (List[int]): Positive categories

neg_categories (List[int]): Negative categories

Returns:

Tensor: A one-hot tensor with shape (N).
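The encoding step can be sketched without any dependencies. The version below returns a plain Python list rather than a Tensor, and it marks unlabeled categories as NaN; both choices, the function name to_label_vector, and the explicit num_categories argument are assumptions, since the documentation does not state how unknown labels are represented.

```python
import math
from typing import List


def to_label_vector(pos_categories: List[int], neg_categories: List[int],
                    num_categories: int) -> List[float]:
    """Encode positives as 1.0, negatives as 0.0, and unlabeled categories as NaN."""
    vector = [math.nan] * num_categories  # start with every category unknown
    for c in neg_categories:
        vector[c] = 0.0
    for c in pos_categories:
        vector[c] = 1.0
    return vector


labels = to_label_vector([0, 2], [1], num_categories=4)
# labels[0] and labels[2] are 1.0, labels[1] is 0.0, labels[3] is NaN (unknown)
```

Keeping a third state for unknown labels is what separates a partially labeled target from an ordinary multi-label target, where every missing entry would simply be 0.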

mlcpl.datasets.MLCPLDataset.get_statistics(self)

Return statistics computed from the dataset’s information.

Args:

records (List[Tuple]): Information about the samples. Each tuple stores one sample’s (id, img_path, list of positive categories, list of negative categories).

num_categories (int): The total number of categories

Returns:

Dict: A dict consisting of statistics
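As an illustration of the kind of statistics that can be derived from records and num_categories, the sketch below counts known labels and computes the fraction of (sample, category) pairs that are labeled. The function name basic_statistics and the dictionary keys are hypothetical; the keys actually returned by get_statistics are not documented here.

```python
from typing import Dict, List, Tuple

Record = Tuple[int, str, List[int], List[int]]


def basic_statistics(records: List[Record], num_categories: int) -> Dict[str, float]:
    """Count known labels and relate them to the num_samples * num_categories total."""
    num_samples = len(records)
    num_pos = sum(len(pos) for _, _, pos, _ in records)
    num_neg = sum(len(neg) for _, _, _, neg in records)
    total = num_samples * num_categories
    return {
        "num_samples": num_samples,
        "num_positive_labels": num_pos,
        "num_negative_labels": num_neg,
        # Fraction of (sample, category) pairs with a known label.
        "label_proportion": (num_pos + num_neg) / total if total else 0.0,
    }


records = [
    (0, "images/0001.jpg", [0, 2], [1]),
    (1, "images/0002.jpg", [], [0, 3]),
]
stats = basic_statistics(records, num_categories=4)
```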

mlcpl.datasets.MLCPLDataset.drop_labels_uniform(self, target_label_proportion: float, seed: int = 526)

mlcpl.datasets.MLCPLDataset.drop_labels_single_positive(self, seed: int = 526) -> List[Tuple]

Only one positive label is retained and all other labels are dropped for each sample.

Args:

seed (int, optional): The random seed. Defaults to 526.

Returns:

Self: self
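The single-positive scheme described above can be sketched on a plain list of records. This version returns a new list instead of modifying a dataset in place, and it leaves samples without any positive label fully unlabeled; both behaviors, and the name keep_single_positive, are assumptions rather than the library's confirmed implementation.

```python
import random
from typing import List, Tuple

Record = Tuple[int, str, List[int], List[int]]


def keep_single_positive(records: List[Record], seed: int = 526) -> List[Record]:
    """Keep one randomly chosen positive label per sample and drop every other label."""
    rng = random.Random(seed)  # local generator so the result is reproducible
    reduced = []
    for sample_id, img_path, pos, _neg in records:
        # Samples without positives end up with no labels at all.
        kept = [rng.choice(pos)] if pos else []
        reduced.append((sample_id, img_path, kept, []))
    return reduced


records = [
    (0, "images/0001.jpg", [0, 2], [1]),
    (1, "images/0002.jpg", [], [0, 3]),
    (2, "images/0003.jpg", [1], [3]),
]
reduced = keep_single_positive(records)
```

Using a seeded random.Random instance means that calling the function twice with the same seed selects the same positives.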

mlcpl.datasets.MLCPLDataset.drop_labels_fix_per_category(self, max_num_labels_per_category: int, seed: int = 526)

Drop labels with the FPC method. See https://openaccess.thecvf.com/content/CVPR2022/papers/Ben-Baruch_Multi-Label_Classification_With_Partial_Annotations_Using_Class-Aware_Selective_Loss_CVPR_2022_paper.pdf

Args:

max_num_labels_per_category (int): The maximum number of labels for each category

seed (int, optional): The random seed. Defaults to 526.

Returns:

Self: self
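A per-category cap can be sketched as follows. The sketch applies one cap to the combined positive and negative labels of each category, which is an assumption about how max_num_labels_per_category is interpreted; the cited paper and the library may instead sample positives and negatives separately. The name cap_labels_per_category is hypothetical, and the function returns a new list rather than modifying a dataset.

```python
import random
from typing import List, Tuple

Record = Tuple[int, str, List[int], List[int]]


def cap_labels_per_category(records: List[Record], num_categories: int,
                            max_num_labels_per_category: int,
                            seed: int = 526) -> List[Record]:
    """Keep at most max_num_labels_per_category randomly chosen labels per category."""
    rng = random.Random(seed)
    # Collect every known label as (record index, is_positive), grouped by category.
    by_category = [[] for _ in range(num_categories)]
    for i, (_, _, pos, neg) in enumerate(records):
        for c in pos:
            by_category[c].append((i, True))
        for c in neg:
            by_category[c].append((i, False))

    kept_pos = [[] for _ in records]
    kept_neg = [[] for _ in records]
    for c, labels in enumerate(by_category):
        rng.shuffle(labels)  # random order, then keep only the first few labels
        for i, is_positive in labels[:max_num_labels_per_category]:
            (kept_pos if is_positive else kept_neg)[i].append(c)

    return [(sid, path, sorted(kept_pos[i]), sorted(kept_neg[i]))
            for i, (sid, path, _, _) in enumerate(records)]


records = [
    (0, "images/0001.jpg", [0], [1]),
    (1, "images/0002.jpg", [0, 1], []),
    (2, "images/0003.jpg", [1], [0]),
]
capped = cap_labels_per_category(records, num_categories=2,
                                 max_num_labels_per_category=2)
```

In this toy input each category starts with three labels, so the cap of two removes exactly one label per category.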

Dataset Loaders

mlcpl.datasets.MSCOCO(dataset_path, year='2014', split='train', transform=ToTensor())

Load the MS-COCO dataset.

Args:
dataset_path:

Path to the dataset folder.

year:

The year of the split. Defaults to ‘2014’.

split:

The sub-split of the dataset. Defaults to ‘train’.

transform:

Transformation applied to images. Defaults to transforms.ToTensor().

Returns:

An MLCPLDataset object.

mlcpl.datasets.Pascal_VOC_2007(dataset_path, split='train', transform=ToTensor())

Load the Pascal VOC 2007 dataset.

Args:
dataset_path:

Path to the dataset folder.

split:

The sub-split of the dataset. Defaults to ‘train’.

transform:

Transformation applied to images. Defaults to transforms.ToTensor().

Returns:

An MLCPLDataset object.

mlcpl.datasets.LVIS(dataset_path, split='train', transform=ToTensor())

Load the LVIS dataset.

Args:
dataset_path:

Path to the dataset folder.

split:

The sub-split of the dataset. Defaults to ‘train’.

transform:

Transformation applied to images. Defaults to transforms.ToTensor().

Returns:

An MLCPLDataset object.

mlcpl.datasets.Open_Images_V6(dataset_path, split=None, transform=ToTensor(), use_cache=True, cache_dir='output/dataset')

Load the Open Images v6 dataset.

Args:
dataset_path:

Path to the dataset folder.

split:

The sub-split of the dataset. Defaults to None.

transform:

Transformation applied to images. Defaults to transforms.ToTensor().

use_cache:

Whether to save the loaded metadata to the cache. Defaults to True.

cache_dir:

The path to the cache. Defaults to ‘output/dataset’.

Returns:

An MLCPLDataset object.

mlcpl.datasets.Open_Images_V3(dataset_path, split='train', transform=ToTensor(), use_cache=True, cache_dir='output/dataset', check_images=True)

Load the Open Images v3 dataset.

Args:
dataset_path:

Path to the dataset folder.

split:

The sub-split of the dataset. Defaults to ‘train’.

transform:

Transformation applied to images. Defaults to transforms.ToTensor().

use_cache:

Whether to save the loaded metadata to the cache. Defaults to True.

cache_dir:

The path to the cache. Defaults to ‘output/dataset’.

check_images:

Whether to check that each image file listed in the metadata exists. Defaults to True.

Returns:

An MLCPLDataset object.

mlcpl.datasets.CheXpert(dataset_path, split='train', competition_categories=False, transform=ToTensor())

Load the CheXpert dataset.

Args:
dataset_path:

Path to the dataset folder.

split:

The sub-split of the dataset. Defaults to ‘train’.

competition_categories:

If True, the returned dataset contains only the 5 categories of the CheXpert competition: Atelectasis, Cardiomegaly, Consolidation, Edema, and Pleural Effusion. Defaults to False.

transform:

Transformation applied to images. Defaults to transforms.ToTensor().

Returns:

An MLCPLDataset object.

mlcpl.datasets.VAW(dataset_path, vg_dataset_path, split='train', use_cache=True, cache_dir='output/dataset', transform=ToTensor())

Load the VAW dataset.

Args:
dataset_path:

Path to the dataset folder.

vg_dataset_path:

Path to the Visual Genome dataset folder.

split:

The sub-split of the dataset. Defaults to ‘train’.

transform:

Transformation applied to images. Defaults to transforms.ToTensor().

use_cache:

Whether to save the loaded metadata to the cache. Defaults to True.

cache_dir:

The path to the cache. Defaults to ‘output/dataset’.

Returns:

An MLCPLDataset object.

mlcpl.datasets.NUS_WIDE(dataset_path, split='train', transform=ToTensor())

Load the NUS-WIDE dataset.

Args:
dataset_path:

Path to the dataset folder.

split:

The sub-split of the dataset. Defaults to ‘train’.

transform:

Transformation applied to images. Defaults to transforms.ToTensor().

Returns:

An MLCPLDataset object.

mlcpl.datasets.VISPR(dataset_path, split='train', transform=ToTensor())

Load the VISPR dataset.

Args:
dataset_path:

Path to the dataset folder.

split:

The sub-split of the dataset. Defaults to ‘train’.

transform:

Transformation applied to images. Defaults to transforms.ToTensor().

Returns:

An MLCPLDataset object.

mlcpl.datasets.Vireo_Food_172(dataset_path, split='train', transform=ToTensor())

Load the Vireo Food 172 dataset.

Args:
dataset_path:

Path to the dataset folder.

split:

The sub-split of the dataset. Defaults to ‘train’.

transform:

Transformation applied to images. Defaults to transforms.ToTensor().

Returns:

An MLCPLDataset object.

mlcpl.datasets.VG_200(dataset_path, metadata_path=None, split='train', transform=ToTensor())

Load the VG-200 dataset.

Args:
dataset_path:

Path to the dataset folder.

metadata_path:

Path to the folder of the metadata file.

split:

The sub-split of the dataset. Defaults to ‘train’.

transform:

Transformation applied to images. Defaults to transforms.ToTensor().

Returns:

An MLCPLDataset object.