Dataset Utils
Dataset utilities provide file handlers, target (or label) transforms, a dataset wrapper, and a ZIP downloader.
File Handlers
File handlers are used to load and save files and are specified using shorthand syntax in the dataset and preprocessing configurations.
Tip
To create custom file handlers, refer to the custom file handlers tutorial.
- class autrainer.datasets.utils.AbstractFileHandler
Abstract file handler for loading files in the dataset and saving files during preprocessing.
Serves as the base for creating custom file handlers that handle loading and saving of different file types.
- class autrainer.datasets.utils.AudioFileHandler(target_sample_rate=None, **kwargs)
Audio file handler with optional resampling.
- Parameters:
target_sample_rate (Optional[int]) – Target sample rate to resample audio files to during loading. Has to be specified to save audio files. If None, audio files are loaded with their original sample rate. Defaults to None.
**kwargs – Additional keyword arguments passed to torchaudio.transforms.Resample.
- class autrainer.datasets.utils.IdentityFileHandler
Identity file handler serving as a no-op. Both load and save methods return None.
- class autrainer.datasets.utils.ImageFileHandler
Image file handler for loading and saving with torchvision. Torchvision supports the PNG, JPEG, and GIF image formats for loading and saving images.
Target Transforms
Target transforms are specified using shorthand syntax in the dataset configuration and are used to encode and decode the targets (or labels) of the dataset. Additionally, target transforms provide functions for batch prediction and majority voting.
Tip
To create custom target transforms, refer to the custom target transforms tutorial.
- class autrainer.datasets.utils.AbstractTargetTransform
Abstract target transform for handling target or label transformations in a dataset.
Serves as the base for creating custom target transforms that handle encoding and decoding targets or labels of a dataset, obtaining predictions from a batch of model outputs, and determining the majority vote from a list of targets.
- abstract encode(x)
Encode a target or label.
- Parameters:
x (Any) – Target or label.
- Return type:
Any
- Returns:
Encoded target or label.
- abstract decode(x)
Decode a target or label. Serves as the inverse operation of encode.
- Parameters:
x (Any) – Encoded target or label.
- Return type:
Any
- Returns:
Decoded target or label.
- abstract probabilities_training(x)
Get the encoded probabilities from a batch of model outputs during training.
- Parameters:
x (Tensor) – Batch of model outputs.
- Return type:
Tensor
- Returns:
Encoded probabilities.
- abstract probabilities_inference(x)
Get the encoded probabilities from a batch of model outputs during inference.
- Parameters:
x (Tensor) – Batch of model outputs.
- Return type:
Tensor
- Returns:
Encoded probabilities.
- abstract predict_inference(x)
Get the encoded predictions from a batch of model output probabilities.
- Parameters:
x (Tensor) – Batch of model output probabilities.
- Return type:
Union[List[Any], Any]
- Returns:
Encoded predictions.
- abstract majority_vote(x)
Get the majority vote from a list of decoded targets or labels.
The majority vote is defined by the subclasses and may be the most frequent target, the average of all targets, or any other operation that determines a majority vote based on the list of targets or labels.
- Parameters:
x (List[Any]) – List of decoded targets or labels.
- Return type:
Any
- Returns:
Decoded majority vote.
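The two strategies named above, the most frequent target and the average of all targets, can be sketched in plain Python. The helper names `vote_most_frequent` and `vote_average` are hypothetical and not part of autrainer; they only illustrate what a subclass might do inside `majority_vote`.

```python
from collections import Counter
from statistics import mean
from typing import Any, List


def vote_most_frequent(targets: List[Any]) -> Any:
    """Return the most frequent decoded target (on ties, the first seen wins)."""
    return Counter(targets).most_common(1)[0][0]


def vote_average(targets: List[float]) -> float:
    """Return the average of all decoded targets."""
    return mean(targets)


# Hypothetical decoded targets collected for one recording
print(vote_most_frequent(["cat", "dog", "cat"]))
print(vote_average([1.0, 2.0, 3.0]))
```

The first strategy suits classification labels, while the second is a plausible choice for regression targets.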
- class autrainer.datasets.utils.LabelEncoder(labels)
Label encoder for single-label classification targets.
- Parameters:
labels (List[str]) – List of target labels.
- encode(x)
Encode a target label by mapping it to an integer.
- Parameters:
x (str) – Target label.
- Return type:
int
- Returns:
Encoded target label.
- decode(x)
Decode an encoded target label integer by mapping it to a label.
- Parameters:
x (int) – Encoded target label.
- Return type:
str
- Returns:
Decoded target label.
- probabilities_training(x)
Get the encoded probabilities from a batch of model outputs during training by returning the raw model outputs.
- Parameters:
x (Tensor) – Batch of model outputs.
- Return type:
Tensor
- Returns:
Encoded probabilities.
- probabilities_inference(x)
Get the encoded probabilities from a batch of model outputs by applying the softmax function.
- Parameters:
x (Tensor) – Batch of model outputs.
- Return type:
Tensor
- Returns:
Encoded probabilities.
- predict_inference(x)
Get the encoded predictions from a batch of model output probabilities by obtaining the index of the maximum value.
- Parameters:
x (Tensor) – Batch of model output probabilities.
- Return type:
Union[List[int], int]
- Returns:
Encoded predictions.
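The steps described for the label encoder (label to integer, softmax at inference, index of the maximum, integer back to label) can be sketched without any dependencies. This is a plain-Python illustration, not autrainer's implementation: the label set is hypothetical, plain lists stand in for tensors, and the assumption that each label maps to its position in the list may differ from what the library does.

```python
import math
from typing import List

labels = ["cat", "dog", "bird"]  # hypothetical label set

# Assumption: each label is mapped to its position in the list
label_to_index = {label: i for i, label in enumerate(labels)}
index_to_label = {i: label for label, i in label_to_index.items()}


def softmax(logits: List[float]) -> List[float]:
    """Turn raw model outputs into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(values: List[float]) -> int:
    """Return the index of the maximum value."""
    return max(range(len(values)), key=values.__getitem__)


logits = [0.5, 2.0, -1.0]  # raw model outputs for one sample
probs = softmax(logits)  # probabilities as used during inference
prediction = argmax(probs)  # encoded prediction
print(index_to_label[prediction])  # decoded label
```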
- class autrainer.datasets.utils.MinMaxScaler(target, minimum, maximum)
Minimum-Maximum Scaler for regression targets.
- Parameters:
target (str) – Name of the target.
minimum (float) – Minimum value of all target values.
maximum (float) – Maximum value of all target values.
- Raises:
ValueError – If minimum is not less than maximum.
- encode(x)
Encode a target value by scaling it between the minimum and maximum.
- Parameters:
x (float) – Target value.
- Return type:
float
- Returns:
Scaled target value.
- decode(x)
Decode a target value by reversing the scaling between the minimum and maximum. Inverse operation of encode.
- Parameters:
x (float) – Scaled target value.
- Return type:
float
- Returns:
Unscaled target value.
- probabilities_training(x)
Get the encoded probabilities from a batch of model outputs during training by applying the sigmoid function.
- Parameters:
x (Tensor) – Batch of model outputs.
- Return type:
Tensor
- Returns:
Encoded probabilities.
- probabilities_inference(x)
Get the encoded probabilities from a batch of model outputs by applying the sigmoid function.
- Parameters:
x (Tensor) – Batch of model outputs.
- Return type:
Tensor
- Returns:
Encoded probabilities.
- predict_inference(x)
Get the encoded predictions from a batch of model output probabilities by returning the raw values.
- Parameters:
x (Tensor) – Batch of model output probabilities.
- Return type:
Union[List[float], float]
- Returns:
Encoded predictions.
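A minimal sketch of the scaling round trip, assuming the conventional min-max formula that maps the minimum to 0 and the maximum to 1 (which matches the range of the sigmoid applied to the model outputs). The function names and example values are hypothetical; the exact formula used by autrainer is not stated above.

```python
import math


def check_range(minimum: float, maximum: float) -> None:
    """Mirror the documented ValueError for an invalid range."""
    if not minimum < maximum:
        raise ValueError("minimum must be less than maximum")


def encode(x: float, minimum: float, maximum: float) -> float:
    """Scale x so that minimum maps to 0 and maximum maps to 1 (assumed formula)."""
    return (x - minimum) / (maximum - minimum)


def decode(x: float, minimum: float, maximum: float) -> float:
    """Reverse the scaling performed by encode."""
    return x * (maximum - minimum) + minimum


def sigmoid(z: float) -> float:
    """Squash a raw model output into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


minimum, maximum = 10.0, 30.0  # hypothetical target range
check_range(minimum, maximum)

scaled = encode(25.0, minimum, maximum)  # 0.75
restored = decode(scaled, minimum, maximum)  # 25.0

# A raw model output of 0 becomes 0.5 after the sigmoid, which decodes
# to the midpoint of the target range.
midpoint = decode(sigmoid(0.0), minimum, maximum)  # 20.0
print(scaled, restored, midpoint)
```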
- class autrainer.datasets.utils.MultiLabelEncoder(threshold, labels)
Multi-label encoder for multi-label classification targets.
- Parameters:
threshold (float) – Class-wise prediction threshold.
labels (List[str]) – List of target labels.
- Raises:
ValueError – If threshold is not between 0 and 1.
- encode(x)
Encode a list of target labels by creating a binary tensor where each element is 1 if the label is in the list of target labels and 0 otherwise.
If the input is already a list of integers, it is not encoded.
- Parameters:
x (Union[List[int], List[str]]) – List of target labels or list of integers.
- Return type:
Tensor
- Returns:
Binary tensor of encoded target labels.
- decode(x)
Decode a binary tensor of encoded target labels to a list of target labels.
- Parameters:
x (List[int]) – Binary tensor of encoded target labels.
- Return type:
List[str]
- Returns:
List of target labels.
- probabilities_training(x)
Get the encoded probabilities from a batch of model outputs during training by returning the raw model outputs.
- Parameters:
x (Tensor) – Batch of model outputs.
- Return type:
Tensor
- Returns:
Encoded probabilities.
- probabilities_inference(x)
Get the encoded probabilities from a batch of model outputs by applying the sigmoid function.
- Parameters:
x (Tensor) – Batch of model outputs.
- Return type:
Tensor
- Returns:
Encoded probabilities.
- predict_inference(x)
Get the encoded predictions from a batch of model output probabilities by thresholding the probabilities.
- Parameters:
x (Tensor) – Batch of model output probabilities.
- Return type:
Union[List[List[int]], List[int]]
- Returns:
Binary tensor of encoded predictions.
- majority_vote(x)
Get the majority vote from a list of lists of decoded target labels for each label. If a label is predicted by at least half of the predictions, it is included in the majority vote.
- Parameters:
x (List[List[str]]) – List of lists of decoded target labels.
- Return type:
List[str]
- Returns:
List of target labels in the majority vote.
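The multi-label operations described above can be sketched with plain Python lists standing in for tensors. The label set and threshold are hypothetical, and whether the threshold comparison is strict is an assumption; the sketch illustrates the described behavior rather than autrainer's code.

```python
from typing import List

labels = ["bass", "drums", "vocals"]  # hypothetical label set
threshold = 0.5  # hypothetical class-wise threshold


def encode(targets: List[str]) -> List[int]:
    """Multi-hot encoding: 1 where the label is present, 0 otherwise."""
    return [1 if label in targets else 0 for label in labels]


def decode(encoded: List[int]) -> List[str]:
    """Map a multi-hot vector back to the list of active labels."""
    return [label for label, flag in zip(labels, encoded) if flag == 1]


def predict(probabilities: List[float]) -> List[int]:
    """Threshold the probabilities (strict comparison is an assumption)."""
    return [1 if p > threshold else 0 for p in probabilities]


def majority_vote(predictions: List[List[str]]) -> List[str]:
    """Keep every label predicted by at least half of the predictions."""
    return [
        label
        for label in labels
        if 2 * sum(label in p for p in predictions) >= len(predictions)
    ]


print(encode(["drums"]))
print(decode(predict([0.9, 0.2, 0.6])))
print(majority_vote([["bass"], ["bass", "drums"], ["vocals"], ["bass", "vocals"]]))
```

In the last call, "vocals" appears in exactly two of four predictions and is therefore kept, since the rule is "at least half".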
Dataset Wrapper
The DatasetWrapper wraps a torch.utils.data.Dataset and uses the file handlers and target transforms to load and transform the data, returning the data, target (or label), and index of each sample.
- class autrainer.datasets.utils.DatasetWrapper(path, features_subdir, index_column, target_column, file_type, file_handler, df, transform=None, target_transform=None)
Wrapper around torch.utils.data.Dataset.
- Parameters:
path (str) – Root path to the dataset.
features_subdir (str) – Subdirectory containing the features.
index_column (str) – Index column of the dataframe.
target_column (Union[str, List[str]]) – Target column of the dataframe.
file_type (str) – File type of the features.
file_handler (AbstractFileHandler) – File handler to load the data.
df (DataFrame) – Dataframe containing the index and target column(s).
transform (Optional[SmartCompose]) – Transform to apply to the features. Defaults to None.
target_transform (Optional[AbstractTargetTransform]) – Target transform to apply to the target. Defaults to None.
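To illustrate the (data, target, index) contract, here is a dependency-free stand-in. `SketchDatasetWrapper` is hypothetical and not autrainer's class: a list of dictionaries replaces the dataframe, a plain callable replaces the file handler, and both the way the file path is assembled and the use of the positional index are assumptions.

```python
import os
from typing import Any, Callable, Dict, List, Optional, Tuple


class SketchDatasetWrapper:
    """Hypothetical stand-in, not autrainer's DatasetWrapper."""

    def __init__(
        self,
        path: str,
        features_subdir: str,
        index_column: str,
        target_column: str,
        rows: List[Dict[str, Any]],  # stands in for the dataframe
        load: Callable[[str], Any],  # stands in for the file handler
        transform: Optional[Callable[[Any], Any]] = None,
        target_transform: Optional[Callable[[Any], Any]] = None,
    ) -> None:
        self.path = path
        self.features_subdir = features_subdir
        self.index_column = index_column
        self.target_column = target_column
        self.rows = rows
        self.load = load
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self) -> int:
        return len(self.rows)

    def __getitem__(self, item: int) -> Tuple[Any, Any, int]:
        row = self.rows[item]
        # Assumed layout: <path>/<features_subdir>/<value of the index column>
        file = os.path.join(self.path, self.features_subdir, row[self.index_column])
        data = self.load(file)
        if self.transform is not None:
            data = self.transform(data)
        target = row[self.target_column]
        if self.target_transform is not None:
            target = self.target_transform(target)
        return data, target, item


rows = [
    {"file": "a.npy", "label": "cat"},
    {"file": "b.npy", "label": "dog"},
]
dataset = SketchDatasetWrapper(
    path="data",
    features_subdir="features",
    index_column="file",
    target_column="label",
    rows=rows,
    load=lambda file: file,  # dummy loader that returns the path itself
    target_transform={"cat": 0, "dog": 1}.get,  # dummy label encoding
)
print(dataset[1])
```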
ZIP Downloader
To automatically download and extract zip files, the ZipDownloadManager can be used.
- class autrainer.datasets.utils.ZipDownloadManager(files, path, max_threads=4)
Download and extract zip files.
- Parameters:
files (Dict[str, str]) – Dictionary of filenames and URLs.
path (str) – Path to download and extract the files to.
max_threads (int) – Maximum number of threads to use. Defaults to 4.
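The general idea of downloading and extracting several archives in parallel can be sketched with the standard library alone. The function `download_and_extract` is hypothetical and does not reflect how ZipDownloadManager is implemented or called; it only mirrors the three documented parameters.

```python
import os
import urllib.request
import zipfile
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


def download_and_extract(
    files: Dict[str, str], path: str, max_threads: int = 4
) -> List[str]:
    """Download each URL to <path>/<filename> and extract it into <path>."""
    os.makedirs(path, exist_ok=True)

    def fetch(item: Tuple[str, str]) -> str:
        filename, url = item
        archive = os.path.join(path, filename)
        urllib.request.urlretrieve(url, archive)  # download the archive
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(path)  # extract next to the archive
        return archive

    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        # list() waits for all workers and re-raises any worker exception
        return list(pool.map(fetch, files.items()))
```

A call would pass a mapping of archive filenames to URLs together with a target directory; the function returns the local paths of the downloaded archives.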