Dataset Utils#

Dataset utilities provide file handlers, target (or label) transforms, a dataset wrapper, and a ZIP download manager.

File Handlers#

File handlers are used to load and save files and are specified using shorthand syntax in the dataset and preprocessing configurations.

Tip

To create custom file handlers, refer to the custom file handlers tutorial.

class autrainer.datasets.utils.AbstractFileHandler[source]#

Abstract file handler for loading files in the dataset and saving files during preprocessing.

Serves as the base for creating custom file handlers that handle loading and saving of different file types.

abstract load(file)[source]#

Load a file from a path.

Parameters:: file (str) – Path to file.
Return type:: Tensor
Returns:: Loaded file.

abstract save(file, data)[source]#

Save a file to a path.

Parameters:

file (str) – Path to file.
data (Tensor) – Data to save.

Return type:

None
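The load and save contract described above can be illustrated with a minimal, library-free sketch. `FileHandlerSketch` and `JsonListFileHandler` are hypothetical names introduced only for this example. A real custom handler would presumably subclass `autrainer.datasets.utils.AbstractFileHandler` and return a Tensor from `load`, whereas this sketch uses plain Python lists so that it runs without additional dependencies.

```python
import json
import os
import tempfile
from abc import ABC, abstractmethod
from typing import Any, List


class FileHandlerSketch(ABC):
    """Hypothetical stand-in mirroring the load/save interface (not the autrainer class)."""

    @abstractmethod
    def load(self, file: str) -> Any:
        """Load a file from a path."""

    @abstractmethod
    def save(self, file: str, data: Any) -> None:
        """Save data to a path."""


class JsonListFileHandler(FileHandlerSketch):
    """Hypothetical handler that stores a flat list of floats as JSON."""

    def load(self, file: str) -> List[float]:
        with open(file, "r", encoding="utf-8") as f:
            return [float(v) for v in json.load(f)]

    def save(self, file: str, data: List[float]) -> None:
        with open(file, "w", encoding="utf-8") as f:
            json.dump([float(v) for v in data], f)


def roundtrip(values: List[float]) -> List[float]:
    """Save values to a temporary file and load them back."""
    handler = JsonListFileHandler()
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.json")
        handler.save(path, values)
        return handler.load(path)
```

Because both methods are abstract, the base class cannot be instantiated directly; only subclasses that implement `load` and `save` can.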

class autrainer.datasets.utils.AudioFileHandler(target_sample_rate=None, **kwargs)[source]#

Audio file handler with optional resampling.

Parameters:

target_sample_rate (Optional[int]) – Target sample rate to resample audio files to during loading. Has to be specified to save audio files. If None, audio files are loaded with their original sample rate. Defaults to None.
**kwargs – Additional keyword arguments passed to torchaudio.transforms.Resample.

load(file)[source]#

Load an audio file and resample it if a target sample rate is specified.

Parameters:: file (str) – Path to audio file.
Return type:: Tensor
Returns:: Loaded audio file as a tensor.

save(file, data)[source]#

Save an audio tensor to a file.

Parameters:

file (str) – Path to audio file.
data (Tensor) – Audio data to save.

Raises:

ValueError – If target sample rate is not specified.

Return type:

None

class autrainer.datasets.utils.IdentityFileHandler[source]#

Identity file handler serving as a no-op. Both load and save methods return None.

load(file)[source]#

Identity operation.

Parameters:: file (str) – Path to file.
Return type:: None

save(file, data)[source]#

Identity operation.

Parameters:

file (str) – Path to file.
data (Tensor) – Data to save.

Return type:

None

class autrainer.datasets.utils.ImageFileHandler[source]#

Image file handler for loading and saving with torchvision. Torchvision supports the PNG, JPEG, and GIF image formats for loading and saving images.

load(file)[source]#

Load an image from a file as a uint8 tensor in the range [0, 255].

Parameters:: file (str) – Path to image file.
Return type:: Tensor
Returns:: Uint8 image tensor.

save(file, data)[source]#

Save an image tensor to a file.

If the tensor is of type uint8, it is assumed to be in the range [0, 255] and divided by 255 before saving.

Parameters:

file (str) – Path to image file.
data (Tensor) – Image tensor to save.

Return type:

None

class autrainer.datasets.utils.NumpyFileHandler[source]#

Numpy file handler for loading and saving numpy arrays.

load(file)[source]#

Load a numpy array from a file.

Parameters:: file (str) – Path to numpy file.
Return type:: Tensor
Returns:: Numpy array as a tensor.

save(file, data)[source]#

Save a tensor to a numpy file.

Parameters:

file (str) – Path to numpy file.
data (Tensor) – Tensor to save.

Return type:

None

Target Transforms#

Target transforms are specified using shorthand syntax in the dataset configuration and are used to encode and decode the targets (or labels) of the dataset. Additionally, target transforms provide functions for batch prediction and majority voting.

Tip

To create custom target transforms, refer to the custom target transforms tutorial.

class autrainer.datasets.utils.AbstractTargetTransform[source]#

Abstract target transform for handling target or label transformations in a dataset.

Serves as the base for creating custom target transforms that handle encoding and decoding targets or labels of a dataset, obtaining predictions from a batch of model outputs, and determining the majority vote from a list of targets.

abstract encode(x)[source]#

Encode a target or label.

Parameters:: x (Any) – Target or label.
Return type:: Any
Returns:: Encoded target or label.

abstract decode(x)[source]#

Decode a target or label. Serves as the inverse operation of encode.

Parameters:: x (Any) – Encoded target or label.
Return type:: Any
Returns:: Decoded target or label.

abstract probabilities_training(x)[source]#

Get the encoded probabilities from a batch of model outputs during training.

Parameters:: x (Tensor) – Batch of model outputs.
Return type:: Tensor
Returns:: Encoded probabilities.

abstract probabilities_inference(x)[source]#

Get the encoded probabilities from a batch of model outputs.

Parameters:: x (Tensor) – Batch of model outputs.
Return type:: Tensor
Returns:: Encoded probabilities.

abstract predict_inference(x)[source]#

Get the encoded predictions from a batch of model output probabilities.

Parameters:: x (Tensor) – Batch of model output probabilities.
Return type:: Union[List[Any], Any]
Returns:: Encoded predictions.

abstract majority_vote(x)[source]#

Get the majority vote from a list of decoded targets or labels.

The majority vote is defined by the subclasses and may be the most frequent target, the average of all targets, or any other operation that determines a majority vote based on the list of targets or labels.

Parameters:: x (List[Any]) – List of decoded targets or labels.
Return type:: Any
Returns:: Decoded majority vote.

abstract probabilities_to_dict(x)[source]#

Convert a tensor of probabilities to a dictionary of targets or labels and their probabilities.

Parameters:: x (Tensor) – Tensor of probabilities.
Return type:: Dict[str, float]
Returns:: Dictionary of targets or labels and their probabilities.

class autrainer.datasets.utils.LabelEncoder(labels)[source]#

Label encoder for single-label classification targets.

Parameters:: labels (List[str]) – List of target labels.

encode(x)[source]#

Encode a target label by mapping it to an integer.

Parameters:: x (str) – Target label.
Return type:: int
Returns:: Encoded target label.

decode(x)[source]#

Decode an encoded target label integer by mapping it to a label.

Parameters:: x (int) – Encoded target label.
Return type:: str
Returns:: Decoded target label.

probabilities_training(x)[source]#

Get the encoded probabilities from a batch of model outputs during training by returning the raw model outputs.

Parameters:: x (Tensor) – Batch of model outputs.
Return type:: Tensor
Returns:: Encoded probabilities.

probabilities_inference(x)[source]#

Get the encoded probabilities from a batch of model outputs by applying the softmax function.

Parameters:: x (Tensor) – Batch of model outputs.
Return type:: Tensor
Returns:: Encoded probabilities.

predict_inference(x)[source]#

Get the encoded predictions from a batch of model output probabilities by obtaining the index of the maximum value.

Parameters:: x (Tensor) – Batch of model output probabilities.
Return type:: Union[List[int], int]
Returns:: Encoded predictions.

majority_vote(x)[source]#

Get the majority vote from a list of decoded labels by determining the most frequently predicted label.

Parameters:: x (List[str]) – List of decoded labels.
Return type:: str
Returns:: Decoded majority vote.

probabilities_to_dict(x)[source]#

Convert a tensor of probabilities to a dictionary of labels and their probabilities.

Parameters:: x (Tensor) – Tensor of probabilities.
Return type:: Dict[str, float]
Returns:: Dictionary of labels and their probabilities.
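The single-label operations described above can be sketched in plain Python. `LabelEncoderSketch` is a hypothetical name and not the autrainer implementation. It operates on one sample at a time using lists instead of batched tensors, and it assumes that label indices follow the order of the list passed to the constructor, which may differ from the library's actual ordering.

```python
import math
from collections import Counter
from typing import Dict, List


class LabelEncoderSketch:
    """Hypothetical, list-based illustration of single-label target handling."""

    def __init__(self, labels: List[str]) -> None:
        # Assumption: indices follow the order of the given list.
        self.labels = list(labels)
        self._to_index = {label: i for i, label in enumerate(self.labels)}

    def encode(self, x: str) -> int:
        return self._to_index[x]

    def decode(self, x: int) -> str:
        return self.labels[x]

    def probabilities_inference(self, x: List[float]) -> List[float]:
        # Numerically stable softmax over the outputs of a single sample.
        peak = max(x)
        exps = [math.exp(v - peak) for v in x]
        total = sum(exps)
        return [e / total for e in exps]

    def predict_inference(self, x: List[float]) -> int:
        # Index of the maximum probability.
        return max(range(len(x)), key=x.__getitem__)

    def majority_vote(self, x: List[str]) -> str:
        # Most frequently predicted label.
        return Counter(x).most_common(1)[0][0]

    def probabilities_to_dict(self, x: List[float]) -> Dict[str, float]:
        return {label: float(p) for label, p in zip(self.labels, x)}
```

A prediction is obtained by chaining the steps: apply `probabilities_inference` to the raw outputs, pass the result to `predict_inference`, and map the resulting index back to a label with `decode`.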

class autrainer.datasets.utils.MinMaxScaler(target, minimum, maximum)[source]#

Minimum-Maximum Scaler for regression targets.

Parameters:

target (str) – Name of the target.
minimum (float) – Minimum value of all target values.
maximum (float) – Maximum value of all target values.

Raises:

ValueError – If minimum is not less than maximum.

encode(x)[source]#

Encode a target value by scaling it between the minimum and maximum.

Parameters:: x (float) – Target value.
Return type:: float
Returns:: Scaled target value.

decode(x)[source]#

Decode a target value by reversing the scaling between the minimum and maximum. Inverse operation of encode.

Parameters:: x (float) – Scaled target value.
Return type:: float
Returns:: Unscaled target value.

probabilities_training(x)[source]#

Get the encoded probabilities from a batch of model outputs during training by applying the sigmoid function.

Parameters:: x (Tensor) – Batch of model outputs.
Return type:: Tensor
Returns:: Encoded probabilities.

probabilities_inference(x)[source]#

Get the encoded probabilities from a batch of model outputs by applying the sigmoid function.

Parameters:: x (Tensor) – Batch of model outputs.
Return type:: Tensor
Returns:: Encoded probabilities.

predict_inference(x)[source]#

Get the encoded predictions from a batch of model output probabilities by returning the raw values.

Parameters:: x (Tensor) – Batch of model output probabilities.
Return type:: Union[List[float], float]
Returns:: Encoded predictions.

majority_vote(x)[source]#

Get the majority vote from a list of target values by averaging the predictions.

Parameters:: x (List[float]) – List of target values.
Return type:: float
Returns:: Average target value.

probabilities_to_dict(x)[source]#

Convert a tensor of probabilities to a dictionary of targets and their probabilities.

Parameters:: x (Tensor) – Tensor of probabilities.
Return type:: Dict[str, float]
Returns:: Dictionary of targets and their probabilities.
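The scaling and its inverse can be illustrated with a short sketch. `MinMaxScalerSketch` is a hypothetical name, and the formula assumes that encoding maps the range from minimum to maximum onto the interval from 0 to 1, which is consistent with the sigmoid applied to the model outputs but is not confirmed by the text above.

```python
from typing import List


class MinMaxScalerSketch:
    """Hypothetical illustration of min-max scaling for a regression target."""

    def __init__(self, target: str, minimum: float, maximum: float) -> None:
        if not minimum < maximum:
            raise ValueError("minimum must be less than maximum")
        self.target = target
        self.minimum = minimum
        self.maximum = maximum

    def encode(self, x: float) -> float:
        # Assumption: scale linearly so that minimum -> 0 and maximum -> 1.
        return (x - self.minimum) / (self.maximum - self.minimum)

    def decode(self, x: float) -> float:
        # Inverse of encode.
        return x * (self.maximum - self.minimum) + self.minimum

    def majority_vote(self, x: List[float]) -> float:
        # For regression, the "vote" is the average of the target values.
        return sum(x) / len(x)
```

Under this assumption, a target of 25 with a minimum of 0 and a maximum of 100 encodes to 0.25, and decoding 0.25 recovers 25.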

class autrainer.datasets.utils.MultiLabelEncoder(threshold, labels)[source]#

Multi-label encoder for multi-label classification targets.

Parameters:

threshold (float) – Class-wise prediction threshold.
labels (List[str]) – List of target labels.

Raises:

ValueError – If threshold is not between 0 and 1.

encode(x)[source]#

Encode a list of target labels by creating a binary tensor where each element is 1 if the label is in the list of target labels and 0 otherwise.

If the input is already a list of integers, it is not encoded.

Parameters:: x (Union[List[int], List[str]]) – List of target labels or list of integers.
Return type:: Tensor
Returns:: Binary tensor of encoded target labels.

decode(x)[source]#

Decode a binary tensor of encoded target labels to a list of target labels.

Parameters:: x (List[int]) – Binary tensor of encoded target labels.
Return type:: List[str]
Returns:: List of target labels.

probabilities_training(x)[source]#

Get the encoded probabilities from a batch of model outputs during training by returning the raw model outputs.

Parameters:: x (Tensor) – Batch of model outputs.
Return type:: Tensor
Returns:: Encoded probabilities.

probabilities_inference(x)[source]#

Get the encoded probabilities from a batch of model outputs by applying the sigmoid function.

Parameters:: x (Tensor) – Batch of model outputs.
Return type:: Tensor
Returns:: Encoded probabilities.

predict_inference(x)[source]#

Get the encoded predictions from a batch of model output probabilities by thresholding the probabilities.

Parameters:: x (Tensor) – Batch of model output probabilities.
Return type:: Union[List[List[int]], List[int]]
Returns:: Binary tensor of encoded predictions.

majority_vote(x)[source]#

Get the majority vote from a list of lists of decoded target labels for each label. If a label is predicted by at least half of the predictions, it is included in the majority vote.

Parameters:: x (List[List[str]]) – List of lists of decoded target labels.
Return type:: List[str]
Returns:: List of target labels in the majority vote.

probabilities_to_dict(x)[source]#

Convert a tensor of probabilities to a dictionary of labels and their probabilities.

Parameters:: x (Tensor) – Tensor of probabilities.
Return type:: Dict[str, float]
Returns:: Dictionary of labels and their probabilities.
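The multi-label behaviour described above can be sketched with plain lists. `MultiLabelEncoderSketch` is a hypothetical name and not the autrainer implementation. It handles one sample at a time instead of batched tensors, and it assumes that the threshold bounds are inclusive and that a probability must be strictly greater than the threshold to count as a positive prediction.

```python
from typing import List, Union


class MultiLabelEncoderSketch:
    """Hypothetical, list-based illustration of multi-label target handling."""

    def __init__(self, threshold: float, labels: List[str]) -> None:
        # Assumption: the bounds 0 and 1 are themselves allowed.
        if not 0 <= threshold <= 1:
            raise ValueError("threshold must be between 0 and 1")
        self.threshold = threshold
        self.labels = list(labels)

    def encode(self, x: Union[List[int], List[str]]) -> List[int]:
        # A non-empty list of integers is treated as already encoded.
        if x and all(isinstance(v, int) for v in x):
            return list(x)
        return [1 if label in x else 0 for label in self.labels]

    def decode(self, x: List[int]) -> List[str]:
        return [label for label, flag in zip(self.labels, x) if flag]

    def predict_inference(self, x: List[float]) -> List[int]:
        # Assumption: strictly greater than the threshold counts as positive.
        return [1 if p > self.threshold else 0 for p in x]

    def majority_vote(self, x: List[List[str]]) -> List[str]:
        # Keep a label if at least half of the predictions contain it.
        return [
            label
            for label in self.labels
            if sum(label in pred for pred in x) >= len(x) / 2
        ]
```

With four predictions, a label needs to appear in at least two of them to be part of the majority vote.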

Dataset Wrapper#

The DatasetWrapper wraps a torch.utils.data.Dataset and uses the file handlers and target transforms to load and transform the data. Each sample is returned with its data, target (or label), and index.

class autrainer.datasets.utils.DatasetWrapper(path, features_subdir, index_column, target_column, file_handler, df, file_type=None, transform=None, target_transform=None)[source]#

Wrapper around torch.utils.data.Dataset.

Parameters:

path (str) – Root path to the dataset.
features_subdir (str) – Subdirectory containing the features.
index_column (str) – Index column of the dataframe.
target_column (Union[str, List[str]]) – Target column of the dataframe.
file_handler (AbstractFileHandler) – File handler to load the data.
df (DataFrame) – Dataframe containing the index and target column(s).
file_type (Optional[str]) – File type of the features. If None, will not enforce a file_type. This can be useful in case the dataset contains audio files with different formats. Defaults to None.
transform (Optional[SmartCompose]) – Transform to apply to the features. Defaults to None.
target_transform (Optional[AbstractTargetTransform]) – Target transform to apply to the target. Defaults to None.

__getitem__(item)[source]#

Get the data item at the specified index.

Parameters:: item (int) – Index of the item.
Return type:: DataItem
Returns:: Data item at the specified index.

ZIP Downloader#

The ZipDownloadManager automatically downloads and extracts zip files.

class autrainer.datasets.utils.ZipDownloadManager(files, path, max_threads=4)[source]#

Download and extract zip files.

Parameters:

files (Dict[str, str]) – Dictionary of filenames and URLs.
path (str) – Path to download and extract the files to.
max_threads (int) – Maximum number of threads to use. Defaults to 4.

download(check_exist=None)[source]#

Download all files in files to path.

Skips files if they are already present or all files in check_exist are present.

Parameters:: check_exist (Optional[List[str]]) – List of filenames to check if they already exist. Defaults to None.
Return type:: None

extract(check_exist=None)[source]#

Extract all files in files to path.

Does not extract files if all files in check_exist are present.

Parameters:: check_exist (Optional[List[str]]) – List of filenames to check if they already exist. Defaults to None.
Return type:: None
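The effect of check_exist on extraction can be illustrated with the standard library alone. `extract_unless_present` and `demo` are hypothetical helpers written for this sketch and are not part of autrainer. The sketch assumes that the filenames in check_exist are resolved relative to path.

```python
import os
import tempfile
import zipfile
from typing import List, Optional, Tuple


def extract_unless_present(
    archive: str, path: str, check_exist: Optional[List[str]] = None
) -> bool:
    """Extract archive to path; return False if extraction was skipped."""
    # Assumption: check_exist entries are relative to the target path.
    if check_exist and all(
        os.path.exists(os.path.join(path, name)) for name in check_exist
    ):
        return False
    os.makedirs(path, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(path)
    return True


def demo() -> Tuple[bool, bool]:
    """Build a tiny archive and extract it twice to show the skip behaviour."""
    with tempfile.TemporaryDirectory() as tmp:
        archive = os.path.join(tmp, "data.zip")
        with zipfile.ZipFile(archive, "w") as zf:
            zf.writestr("train.csv", "id,label\n0,cat\n")
        out = os.path.join(tmp, "out")
        first = extract_unless_present(archive, out, ["train.csv"])
        second = extract_unless_present(archive, out, ["train.csv"])
        return first, second
```

The first call extracts the archive because `train.csv` is not yet present. The second call is skipped because every file listed in check_exist already exists.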
