Augmentations#
Augmentations are optional and not used by default.
This is indicated by the absence of the augmentation attribute in the sweeper configuration, which implicitly sets it to the None configuration file.
To use an augmentation, specify it in the sweeper configuration file (conf/config.yaml).
Tip
To create custom augmentations, refer to the custom augmentations tutorial.
Augmentations are specified analogously to transforms using shorthand syntax and have an order attribute that defines the order of the augmentations.
The augmentations are combined with the transform pipeline, and the combined pipeline is sorted by the order attributes of both the augmentations and the transforms.
In addition to the order, a seeded probability p of applying the augmentation can be specified.
The optional generator_seed attribute seeds the random number generator that draws this probability.
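The ordering rule described above can be illustrated with a minimal, library-free sketch. The `Step` type, the `merge_by_order` helper, and the transform names and order values are hypothetical stand-ins and not part of autrainer; how autrainer resolves ties between equal order values is not stated here, so the stable sort is an assumption.

```python
from typing import List, NamedTuple


class Step(NamedTuple):
    """Hypothetical pipeline entry; only the name and order matter here."""

    name: str
    order: int = 0


def merge_by_order(transforms: List[Step], augmentations: List[Step]) -> List[Step]:
    # sorted() is stable: entries with equal order keep their relative position.
    return sorted(transforms + augmentations, key=lambda step: step.order)


transforms = [Step("ToTensor", order=-10), Step("Normalize", order=100)]
augmentations = [Step("TimeMask", order=0), Step("GaussianNoise", order=50)]

pipeline = merge_by_order(transforms, augmentations)
print([step.name for step in pipeline])
# ['ToTensor', 'TimeMask', 'GaussianNoise', 'Normalize']
```

In this sketch, the augmentations end up between the two transforms purely because of their order values, not because of where they were declared.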
Default Configurations
None
This configuration file is used to indicate that no augmentation is used and serves as a no-op placeholder.
id: None
_target_: autrainer.augmentations.AugmentationPipeline
pipeline: []
Augmentation Pipelines#
The AugmentationManager is responsible for building the augmentation pipeline.
- class autrainer.augmentations.AugmentationManager(train_augmentation=None, dev_augmentation=None, test_augmentation=None)[source]#
Manage the creation of the augmentation pipelines for train, dev, and test sets.
- Parameters:
train_augmentation (Union[DictConfig, Dict, None]) – Train augmentation configuration.
dev_augmentation (Union[DictConfig, Dict, None]) – Dev augmentation configuration.
test_augmentation (Union[DictConfig, Dict, None]) – Test augmentation configuration.
- get_augmentations()[source]#
Get augmentation pipelines for train, dev, and test.
- Return type:
Tuple[SmartCompose, SmartCompose, SmartCompose]
- Returns:
Tuple of augmentation pipelines for train, dev, and test.
The AugmentationPipeline class is used to define the configuration and instantiate the augmentation pipeline.
- class autrainer.augmentations.AugmentationPipeline(pipeline, generator_seed=0)[source]#
Initialize an augmentation pipeline.
- Parameters:
pipeline (List[Union[str, Dict[str, Any]]]) – The list of augmentations to apply.
generator_seed (int) – Seed to pass to each augmentation for reproducibility if the augmentation does not have a seed. Defaults to 0.
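The seed fallback described for generator_seed can be sketched as follows. This is an illustrative, library-free approximation: the `resolve_pipeline` helper is hypothetical, and it assumes that a shorthand entry is either a class path string or a one-key dictionary mapping the class path to its keyword arguments, as in the default configurations on this page.

```python
from typing import Any, Dict, List, Tuple, Union

Spec = Union[str, Dict[str, Any]]


def resolve_pipeline(
    pipeline: List[Spec], generator_seed: int = 0
) -> List[Tuple[str, Dict[str, Any]]]:
    """Expand shorthand entries and fill in a missing generator_seed."""
    resolved = []
    for entry in pipeline:
        if isinstance(entry, str):
            # A bare string names a class that is built without extra arguments.
            name, kwargs = entry, {}
        else:
            # A one-key dict maps the class path to its keyword arguments.
            name, kwargs = next(iter(entry.items()))
            kwargs = dict(kwargs or {})
        # Only entries without their own seed inherit the pipeline seed.
        kwargs.setdefault("generator_seed", generator_seed)
        resolved.append((name, kwargs))
    return resolved


pipeline = [
    "autrainer.augmentations.GaussianNoise",
    {"autrainer.augmentations.TimeMask": {"time_mask": 80, "axis": 0, "generator_seed": 7}},
]
resolved = resolve_pipeline(pipeline, generator_seed=0)
for name, kwargs in resolved:
    print(name, kwargs)
```

In this sketch, the first entry receives the pipeline-level seed, while the second keeps its own seed of 7.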
Abstract Augmentation#
- class autrainer.augmentations.AbstractAugmentation(order=0, p=1.0, generator_seed=None, **kwargs)[source]#
Abstract class for an augmentation.
- Parameters:
order (int) – The order of the augmentation in the transformation pipeline. Defaults to 0.
p (float) – The probability of applying the augmentation. Defaults to 1.0.
generator_seed (Optional[int]) – The initial seed for the internal random number generator drawing the probability. If None, the generator is not seeded. Defaults to None.
kwargs – Additional keyword arguments to store in the object.
- Raises:
ValueError – If p is not in the range [0, 1].
- __call__(x, index=None)[source]#
Call the augmentation apply method with probability p.
- Parameters:
x (Tensor) – The input tensor.
index (Optional[int]) – The index of the input tensor in the dataset. Defaults to None.
- Return type:
Tensor
- Returns:
The augmented tensor if the drawn random value is less than p, otherwise the unchanged input tensor.
- abstract apply(x, index=None)[source]#
Apply the augmentation to the input tensor.
Apply is called with probability p.
- Parameters:
x (Tensor) – The input tensor.
index (Optional[int]) – The index of the input tensor in the dataset. Defaults to None.
- Return type:
Tensor
- Returns:
The augmented tensor.
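The relationship between `__call__` and `apply` documented above can be mimicked with a small standalone sketch. `ToyAugmentation` and `Negate` are hypothetical classes that operate on plain lists instead of tensors; they mirror the documented contract but are not the autrainer implementation.

```python
import random
from abc import ABC, abstractmethod
from typing import List, Optional


class ToyAugmentation(ABC):
    """Hypothetical stand-in mirroring the documented __call__/apply contract."""

    def __init__(
        self, order: int = 0, p: float = 1.0, generator_seed: Optional[int] = None
    ) -> None:
        if not 0 <= p <= 1:
            raise ValueError(f"p must be in the range [0, 1], got {p}.")
        self.order = order
        self.p = p
        # random.Random(None) seeds from system entropy, i.e. it is not seeded.
        self._rng = random.Random(generator_seed)

    def __call__(self, x: List[float], index: Optional[int] = None) -> List[float]:
        # Draw once per call; apply only if the draw falls below p.
        if self._rng.random() < self.p:
            return self.apply(x, index)
        return x

    @abstractmethod
    def apply(self, x: List[float], index: Optional[int] = None) -> List[float]:
        """Augment the input; only reached with probability p."""


class Negate(ToyAugmentation):
    def apply(self, x: List[float], index: Optional[int] = None) -> List[float]:
        return [-value for value in x]


always = Negate(p=1.0)
never = Negate(p=0.0)
print(always([1.0, 2.0]))  # [-1.0, -2.0]
print(never([1.0, 2.0]))   # [1.0, 2.0]
```

Subclasses only implement `apply`; the probability check lives in the base class, so every augmentation handles p and generator_seed the same way.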
Augmentation Wrappers#
For easier access to common augmentation libraries, autrainer provides wrappers for torchaudio, torchvision, audiomentations, torch-audiomentations, and albumentations augmentations.
The underlying augmentation is specified with the name attribute, representing the class name of the augmentation.
Any further attributes are passed as keyword arguments to the augmentation constructor.
Note
For each augmentation, the probability p of applying the augmentation is always available if the underlying augmentation supports it.
If not specified, p defaults to 1.0, overriding any default value of the underlying library.
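The name-based lookup and the handling of p can be sketched without any of the wrapped libraries. Everything in this example is hypothetical: `toy_library` plays the role of a module such as torchvision.transforms.v2, and detecting support for p by inspecting the constructor signature is an assumption, since the mechanism used by autrainer is not described here.

```python
import inspect
import types
from typing import Any


class Blur:
    """Toy transform whose own default for p is 0.5."""

    def __init__(self, kernel_size: int, p: float = 0.5) -> None:
        self.kernel_size = kernel_size
        self.p = p


class Invert:
    """Toy transform without a p argument."""


# Hypothetical namespace standing in for a wrapped augmentation library.
toy_library = types.SimpleNamespace(Blur=Blur, Invert=Invert)


def build(name: str, p: float = 1.0, **kwargs: Any) -> Any:
    # Look the class up by its name; unknown names raise AttributeError.
    cls = getattr(toy_library, name)
    # Forward p only if the constructor accepts it.
    if "p" in inspect.signature(cls).parameters:
        kwargs["p"] = p
    return cls(**kwargs)


blur = build("Blur", kernel_size=3)
invert = build("Invert")
print(blur.kernel_size, blur.p)  # 3 1.0
```

Because the wrapper default of 1.0 is always forwarded, the toy library's own default of 0.5 never takes effect unless p is set explicitly.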
Both torch-audiomentations and albumentations augmentations are optional and can be installed using the following commands.
pip install autrainer[albumentations]
pip install autrainer[torch-audiomentations]
- class autrainer.augmentations.TorchaudioAugmentation(name, order=0, p=1.0, generator_seed=None, **kwargs)[source]#
Wrapper around torchaudio.transforms transforms, which are specified by their class name and keyword arguments.
Important: While the probability of applying the augmentation is deterministic if the generator_seed is set, the actual augmentation applied is not deterministic. This is because the internal random number generator of the augmentation is not seeded.
- Parameters:
name (str) – Name of the torchaudio augmentation. Must be a valid torchaudio.transforms transform class name.
order (int) – The order of the augmentation in the transformation pipeline. Defaults to 0.
p (float) – The probability of applying the augmentation. Defaults to 1.0.
generator_seed (Optional[int]) – The initial seed for the internal random number generator drawing the probability. If None, the generator is not seeded. Defaults to None.
kwargs – Keyword arguments passed to the torchaudio augmentation.
Default Configurations
No default configurations are provided for torchaudio augmentations. To discover the available torchaudio augmentations, refer to the torchaudio documentation.
- class autrainer.augmentations.TorchvisionAugmentation(name, order=0, p=1.0, generator_seed=None, **kwargs)[source]#
Wrapper around torchvision.transforms.v2 transforms, which are specified by their class name and keyword arguments.
Functionals are currently not supported.
Important: While the probability of applying the augmentation is deterministic if the generator_seed is set, the actual augmentation applied is not deterministic. This is because the internal random number generator of the augmentation is not seeded.
- Parameters:
name (str) – Name of the torchvision augmentation. Must be a valid torchvision.transforms.v2 transform class name.
order (int) – The order of the augmentation in the transformation pipeline. Defaults to 0.
p (float) – The probability of applying the augmentation. Defaults to 1.0.
generator_seed (Optional[int]) – The initial seed for the internal random number generator drawing the probability. If None, the generator is not seeded. Defaults to None.
kwargs – Keyword arguments passed to the torchvision augmentation.
Default Configurations
AugMix
id: AugMix
_target_: autrainer.augmentations.AugmentationPipeline

generator_seed: 0

pipeline:
  - autrainer.augmentations.TorchvisionAugmentation:
      name: AugMix
GaussianBlur
id: GaussianBlur
_target_: autrainer.augmentations.AugmentationPipeline

generator_seed: 0

pipeline:
  - autrainer.augmentations.TorchvisionAugmentation:
      name: GaussianBlur
      kernel_size: 3
      sigma: [0.1, 2]
RandAugment
id: RandAugment-light
_target_: autrainer.augmentations.AugmentationPipeline

generator_seed: 0

pipeline:
  - autrainer.augmentations.TorchvisionAugmentation:
      name: RandAugment
      num_ops: 2
      magnitude: 0
id: RandAugment-medium
_target_: autrainer.augmentations.AugmentationPipeline

generator_seed: 0

pipeline:
  - autrainer.augmentations.TorchvisionAugmentation:
      name: RandAugment
      num_ops: 2
      magnitude: 15
id: RandAugment-strong
_target_: autrainer.augmentations.AugmentationPipeline

generator_seed: 0

pipeline:
  - autrainer.augmentations.TorchvisionAugmentation:
      name: RandAugment
      num_ops: 2
      magnitude: 20
RandGrayscale
id: RandGrayscale
_target_: autrainer.augmentations.AugmentationPipeline

generator_seed: 0

pipeline:
  - autrainer.augmentations.TorchvisionAugmentation:
      name: RandomGrayscale
      p: 0.5
- class autrainer.augmentations.AudiomentationsAugmentation(name, sample_rate=None, order=0, p=1.0, generator_seed=None, **kwargs)[source]#
Wrapper around audiomentations transforms, which are specified by their class name and keyword arguments.
Audiomentations operates on numpy arrays, so the input tensor is converted to a numpy array before applying the augmentation, and the output numpy array is converted back to a tensor.
Important: While the probability of applying the augmentation is deterministic if the generator_seed is set, the actual augmentation applied is not deterministic. This is because the internal random number generator of the augmentation is not seeded.
- Parameters:
name (str) – Name of the audiomentations augmentation. Must be a valid audiomentations transform class name.
sample_rate (Optional[int]) – The sample rate of the audio data. Should be specified for most audio augmentations. If None, the sample rate is not passed to the augmentation. Defaults to None.
order (int) – The order of the augmentation in the transformation pipeline. Defaults to 0.
p (float) – The probability of applying the augmentation. Defaults to 1.0.
generator_seed (Optional[int]) – The initial seed for the internal random number generator drawing the probability. If None, the generator is not seeded. Defaults to None.
kwargs – Keyword arguments passed to the audiomentations augmentation.
Default Configurations
No default configurations are provided for audiomentations augmentations. To discover the available audiomentations augmentations, refer to the audiomentations documentation.
- class autrainer.augmentations.TorchAudiomentationsAugmentation(name, sample_rate, order=0, p=1.0, generator_seed=None, **kwargs)[source]#
Wrapper around torch_audiomentations transforms, which are specified by their class name and keyword arguments.
Important: While the probability of applying the augmentation is deterministic if the generator_seed is set, the actual augmentation applied is not deterministic. This is because the internal random number generator of the augmentation is not seeded.
- Parameters:
name (str) – Name of the torch_audiomentations augmentation. Must be a valid torch_audiomentations transform class name.
sample_rate (int) – The sample rate of the audio data.
order (int) – The order of the augmentation in the transformation pipeline. Defaults to 0.
p (float) – The probability of applying the augmentation. Defaults to 1.0.
generator_seed (Optional[int]) – The initial seed for the internal random number generator drawing the probability. If None, the generator is not seeded. Defaults to None.
kwargs – Keyword arguments passed to the torch_audiomentations augmentation.
Default Configurations
No default configurations are provided for torch-audiomentations augmentations. To discover the available torch-audiomentations augmentations, refer to the torch-audiomentations documentation.
- class autrainer.augmentations.AlbumentationsAugmentation(name, order=0, p=1.0, generator_seed=None, **kwargs)[source]#
Wrapper around albumentations transforms, which are specified by their class name and keyword arguments.
Albumentations operates on numpy arrays, so the input tensor is converted to a numpy array before applying the augmentation, and the output numpy array is converted back to a tensor.
Important: While the probability of applying the augmentation is deterministic if the generator_seed is set, the actual augmentation applied is not deterministic. This is because the internal random number generator of the augmentation is not seeded.
- Parameters:
name (str) – Name of the albumentations augmentation. Must be a valid albumentations transform class name.
order (int) – The order of the augmentation in the transformation pipeline. Defaults to 0.
p (float) – The probability of applying the augmentation. Defaults to 1.0.
generator_seed (Optional[int]) – The initial seed for the internal random number generator drawing the probability. If None, the generator is not seeded. Defaults to None.
kwargs – Keyword arguments passed to the albumentations augmentation.
Default Configurations
No default configurations are provided for albumentations augmentations. To discover the available albumentations augmentations, refer to the albumentations documentation.
Augmentation Graphs#
To create more complex augmentation pipelines that resemble a graph structure, Sequential and Choice can be used.
Tip
To create custom augmentation graphs, refer to the custom augmentation graphs tutorial.
- class autrainer.augmentations.Sequential(sequence, order=0, p=1.0, generator_seed=None)[source]#
Create a fixed sequence of augmentations.
The order attributes of the augmentations in the list are ignored; the sequence as a whole is placed in the pipeline according to the order of the Sequential augmentation itself. The augmentations are therefore applied in the order in which they are defined in the list, without being interrupted by any other transform.
Augmentations in the list must not have a collate function.
- Parameters:
sequence (List[Dict]) – A list of (shorthand syntax) dictionaries defining the augmentation sequence.
order (int) – The order of the augmentation in the transformation pipeline. Defaults to 0.
p (float) – The probability of applying the augmentation. Defaults to 1.0.
generator_seed (Optional[int]) – The initial seed for the internal random number generator drawing the probability. If None, the generator is not seeded. Defaults to None.
- class autrainer.augmentations.Choice(choices, weights=None, order=0, p=1.0, generator_seed=None)[source]#
Choose one augmentation from a list of augmentations according to the given weights.
The order attributes of the augmentations in the list are ignored; the chosen augmentation is placed in the pipeline according to the order of the Choice augmentation itself.
Augmentations in the list must not have a collate function.
- Parameters:
choices (List[Dict]) – A list of (shorthand syntax) dictionaries defining the augmentations to choose from.
weights (Optional[List[float]]) – A list of weights for each choice. If None, all augmentations are assigned equal weights. Defaults to None.
order (int) – The order of the augmentation in the transformation pipeline. Defaults to 0.
p (float) – The probability of applying the augmentation. Defaults to 1.0.
generator_seed (Optional[int]) – The initial seed for the internal random number generator drawing the probability. If None, the generator is not seeded. Defaults to None.
- Raises:
ValueError – If choices and weights have different lengths.
ValueError – If any augmentation has a collate function.
Note
The order of Sequential and Choice can be defined in the configuration file by the order attribute.
However, order attributes of the augmentations within Sequential and Choice are ignored.
As these augmentations are applied in a scoped manner, their order is determined by their position in the configuration file.
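The behavior of the two graph nodes can be approximated with plain functions. The `sequential` and `choice` helpers below are hypothetical and operate on lists of floats; they only illustrate list-order application and weighted selection, not the autrainer classes themselves.

```python
import random
from typing import Callable, List, Optional

Aug = Callable[[List[float]], List[float]]


def sequential(sequence: List[Aug]) -> Aug:
    def run(x: List[float]) -> List[float]:
        # Apply every augmentation in list order, without interruption.
        for aug in sequence:
            x = aug(x)
        return x

    return run


def choice(
    choices: List[Aug], weights: Optional[List[float]] = None, seed: Optional[int] = None
) -> Aug:
    if weights is not None and len(weights) != len(choices):
        raise ValueError("choices and weights must have the same length.")
    rng = random.Random(seed)

    def run(x: List[float]) -> List[float]:
        # random.Random.choices treats weights=None as equal weights.
        (aug,) = rng.choices(choices, weights=weights, k=1)
        return aug(x)

    return run


def double(x: List[float]) -> List[float]:
    return [2 * value for value in x]


def shift(x: List[float]) -> List[float]:
    return [value + 1 for value in x]


print(sequential([double, shift])([1.0, 2.0]))  # [3.0, 5.0]
print(sequential([shift, double])([1.0, 2.0]))  # [4.0, 6.0]
pick = choice([double, shift], weights=[1.0, 0.0], seed=0)
print(pick([1.0, 2.0]))  # [2.0, 4.0], as shift has zero weight
```

Swapping the list entries changes the result of the sequence, which is why the position in the configuration file matters inside these nodes.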
Spectrogram Augmentations#
- class autrainer.augmentations.GaussianNoise(mean=0.0, std=1.0, order=0, p=1.0, generator_seed=None)[source]#
Add Gaussian noise to the input tensor with mean and standard deviation.
- Parameters:
mean (float) – The mean of the Gaussian noise. Defaults to 0.0.
std (float) – The standard deviation of the Gaussian noise. Defaults to 1.0.
order (int) – The order of the augmentation in the transformation pipeline. Defaults to 0.
p (float) – The probability of applying the augmentation. Defaults to 1.0.
generator_seed (Optional[int]) – The initial seed for the internal random number generator drawing the probability. If None, the generator is not seeded. Defaults to None.
Default Configurations
id: GaussianNoise
_target_: autrainer.augmentations.AugmentationPipeline

generator_seed: 0

pipeline:
  - autrainer.augmentations.GaussianNoise:
      mean: 0.0
      std: 0.001
- class autrainer.augmentations.TimeMask(time_mask, axis, replace_with_zero=True, order=0, p=1.0, generator_seed=None)[source]#
Mask a random number of time steps.
Important: While the probability of applying the augmentation is deterministic if the generator_seed is set, the actual augmentation applied is not deterministic. This is because the internal random number generator of the augmentation is not seeded.
- Parameters:
time_mask (int) – Maximum number of time steps that will be masked.
axis (int) – Time axis. If the input is a torch Tensor, it is expected to have shape [C, H, W], where H is axis 0 and W is axis 1.
replace_with_zero (bool) – If True, fill the mask with zeros; otherwise, fill it with the tensor mean. Defaults to True.
order (int) – The order of the augmentation in the transformation pipeline. Defaults to 0.
p (float) – The probability of applying the augmentation. Defaults to 1.0.
generator_seed (Optional[int]) – The initial seed for the internal random number generator drawing the probability. If None, the generator is not seeded. Defaults to None.
Default Configurations
id: TimeMask
_target_: autrainer.augmentations.AugmentationPipeline

generator_seed: 0

pipeline:
  - autrainer.augmentations.TimeMask:
      time_mask: 80
      axis: 0
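To make the roles of time_mask, axis, and replace_with_zero more concrete, the following sketch masks a contiguous block in a two-dimensional nested list. It is an illustration under stated assumptions rather than the autrainer implementation: the `mask_time_steps` helper is hypothetical, the channel dimension is dropped, and the mask width and start position are assumed to be drawn uniformly.

```python
import random
from typing import List, Optional

Matrix = List[List[float]]


def mask_time_steps(
    x: Matrix,
    time_mask: int,
    axis: int,
    replace_with_zero: bool = True,
    seed: Optional[int] = None,
) -> Matrix:
    rng = random.Random(seed)
    rows, cols = len(x), len(x[0])
    size = rows if axis == 0 else cols
    # The mask width is at most time_mask and never exceeds the axis length.
    width = rng.randint(0, min(time_mask, size))
    start = rng.randint(0, size - width)
    fill = 0.0 if replace_with_zero else sum(map(sum, x)) / (rows * cols)
    out = [row[:] for row in x]  # copy, so the input stays untouched
    for r in range(rows):
        for c in range(cols):
            position = r if axis == 0 else c
            if start <= position < start + width:
                out[r][c] = fill
    return out


spectrogram = [[1.0] * 6 for _ in range(4)]
masked = mask_time_steps(spectrogram, time_mask=3, axis=1, seed=0)
for row in masked:
    print(row)
```

With axis=1, the same columns are masked in every row; with axis=0, whole rows would be masked instead. The same pattern applies to frequency masking along the other axis.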
- class autrainer.augmentations.FrequencyMask(freq_mask, axis, replace_with_zero=True, order=0, p=1.0, generator_seed=None)[source]#
Mask a random number of frequency steps.
Important: While the probability of applying the augmentation is deterministic if the generator_seed is set, the actual augmentation applied is not deterministic. This is because the internal random number generator of the augmentation is not seeded.
- Parameters:
freq_mask (int) – Maximum number of frequency steps that will be masked.
axis (int) – Frequency axis. If the input is a torch Tensor, it is expected to have shape [C, H, W], where H is axis 0 and W is axis 1.
replace_with_zero (bool) – If True, fill the mask with zeros; otherwise, fill it with the tensor mean. Defaults to True.
order (int) – The order of the augmentation in the transformation pipeline. Defaults to 0.
p (float) – The probability of applying the augmentation. Defaults to 1.0.
generator_seed (Optional[int]) – The initial seed for the internal random number generator drawing the probability. If None, the generator is not seeded. Defaults to None.
Default Configurations
id: FrequencyMask
_target_: autrainer.augmentations.AugmentationPipeline

generator_seed: 0

pipeline:
  - autrainer.augmentations.FrequencyMask:
      freq_mask: 10
      axis: 1
- class autrainer.augmentations.TimeShift(axis, time_steps=0, order=0, p=1.0, generator_seed=None)[source]#
Shift the input tensor along the time axis.
- Parameters:
axis (int) – Time axis. If the input is a torch Tensor, it is expected to have shape [C, H, W], where H is axis 0 and W is axis 1.
time_steps (int) – Maximum number of time steps by which the tensor is shifted forward or backward. Defaults to 0.
order (int) – The order of the augmentation in the transformation pipeline. Defaults to 0.
p (float) – The probability of applying the augmentation. Defaults to 1.0.
generator_seed (Optional[int]) – The initial seed for the internal random number generator drawing the probability. If None, the generator is not seeded. Defaults to None.
Default Configurations
id: TimeShift
_target_: autrainer.augmentations.AugmentationPipeline

generator_seed: 0

pipeline:
  - autrainer.augmentations.TimeShift:
      time_steps: 20
      axis: 0
- class autrainer.augmentations.TimeWarp(axis, W=10, order=0, p=1.0, generator_seed=None)[source]#
Warp the input along the time axis. A random point on the line passing through the center of the image along the time axis, within the time steps (W, tau - W), is warped to the left or right by a distance w drawn from a uniform distribution over [0, W], where tau is the number of time steps and W is the time warp parameter.
- Parameters:
axis (int) – Time axis. If the input is a torch Tensor, it is expected to have shape [C, H, W], where H is axis 0 and W is axis 1.
W (int) – Bound for squishing/stretching. Defaults to 10.
order (int) – The order of the augmentation in the transformation pipeline. Defaults to 0.
p (float) – The probability of applying the augmentation. Defaults to 1.0.
generator_seed (Optional[int]) – The initial seed for the internal random number generator drawing the probability. If None, the generator is not seeded. Defaults to None.
Default Configurations
id: TimeWarp
_target_: autrainer.augmentations.AugmentationPipeline

generator_seed: 0

pipeline:
  - autrainer.augmentations.TimeWarp:
      W: 30
      axis: 0
- class autrainer.augmentations.SpecAugment(time_mask=10, freq_mask=10, W=50, order=0, p=1.0, generator_seed=None)[source]#
SpecAugment augmentation. A combination of time warp, frequency masking, and time masking.
Important: While the probability of applying the augmentation is deterministic if the generator_seed is set, the actual augmentation applied is not deterministic. This is because the internal random number generator of the augmentation is not seeded.
For more information, see: https://arxiv.org/abs/1904.08779
This implementation differs from the PyTorch tutorial, which applies TimeStretch instead of TimeWarp. For more information, see: https://pytorch.org/audio/master/tutorials/audio_feature_augmentation_tutorial.html#specaugment
- Parameters:
time_mask (int) – Maximum number of time steps that will be masked. Defaults to 10.
freq_mask (int) – Maximum number of frequency steps that will be masked. Defaults to 10.
W (int) – Bound for squishing/stretching the time axis. Defaults to 50.
order (int) – The order of the augmentation in the transformation pipeline. Defaults to 0.
p (float) – The probability of applying the augmentation. Defaults to 1.0.
generator_seed (Optional[int]) – The initial seed for the internal random number generator drawing the probability. If None, the generator is not seeded. Defaults to None.
Default Configurations
id: SpecAugment
_target_: autrainer.augmentations.AugmentationPipeline

generator_seed: 0

pipeline:
  - autrainer.augmentations.SpecAugment:
      freq_mask: 10
      time_mask: 80
      W: 30
      p: 0.5
Augmentations with Collate Functions#
For augmentations that require a collate function, an optional get_collate_fn method can be implemented.
This method is used to retrieve the collate function from the augmentation if it is present.
Tip
To create custom augmentations with collate functions, refer to the custom augmentations tutorial.
The signature of the get_collate_fn method should be as follows:
class ExampleCollateAugmentation(AbstractAugmentation):
    def get_collate_fn(self, data: "AbstractDataset") -> Callable:
        return torch.utils.data.default_collate
Note
Only one collate function can be used in each transform pipeline. If multiple collate functions are defined, the last one in the pipeline (defined by the order of the transforms) is used.
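The "last one wins" rule from the note can be sketched as a simple scan over the ordered pipeline. The classes and the `select_collate_fn` helper are hypothetical, and probing for the method with `hasattr` is an assumption about how an optional method might be detected.

```python
from typing import Callable, List, Optional


class Plain:
    """Hypothetical pipeline entry without a collate function."""

    def __init__(self, order: int = 0) -> None:
        self.order = order


class WithCollate(Plain):
    """Hypothetical pipeline entry that provides a collate function."""

    def __init__(self, tag: str, order: int = 0) -> None:
        super().__init__(order)
        self.tag = tag

    def get_collate_fn(self, data: object) -> Callable:
        # Toy collate function that labels the batch with its origin.
        return lambda batch: (self.tag, list(batch))


def select_collate_fn(pipeline: List[Plain], data: object) -> Optional[Callable]:
    selected = None
    for step in sorted(pipeline, key=lambda s: s.order):
        # get_collate_fn is optional, so probe for it.
        if hasattr(step, "get_collate_fn"):
            selected = step.get_collate_fn(data)  # later entries overwrite earlier ones
    return selected


pipeline = [WithCollate("mixup", order=5), Plain(order=0), WithCollate("cutmix", order=2)]
collate = select_collate_fn(pipeline, data=None)
print(collate([1, 2, 3]))  # ('mixup', [1, 2, 3])
```

Although the "mixup" entry is declared first, it is selected because its order places it last in the sorted pipeline.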
Both CutMix and MixUp augmentations require a collate function and operate on the batch level.
This means that the collate function is applied to the batch of samples rather than to individual samples, and the probability of applying the augmentation acts on the batch level as well.
- class autrainer.augmentations.CutMix(alpha=1.0, order=0, p=1.0, generator_seed=None)[source]#
CutMix augmentation. As CutMix utilizes a collate function, the probability of applying the augmentation is drawn for each batch.
- Parameters:
alpha (float) – Hyperparameter of the Beta distribution. Defaults to 1.0.
order (int) – The order of the augmentation in the transformation pipeline. Defaults to 0.
p (float) – The probability of applying the augmentation. Defaults to 1.0.
generator_seed (Optional[int]) – The initial seed for the internal random number generator drawing the probability. If None, the generator is not seeded. Defaults to None.
Default Configurations
id: CutMix
_target_: autrainer.augmentations.AugmentationPipeline

generator_seed: 0

pipeline:
  - autrainer.augmentations.CutMix:
      alpha: 0.8
- class autrainer.augmentations.MixUp(alpha=1.0, order=0, p=1.0, generator_seed=None)[source]#
MixUp augmentation. As MixUp utilizes a collate function, the probability of applying the augmentation is drawn for each batch.
- Parameters:
alpha (float) – Hyperparameter of the Beta distribution. Defaults to 1.0.
order (int) – The order of the augmentation in the transformation pipeline. Defaults to 0.
p (float) – The probability of applying the augmentation. Defaults to 1.0.
generator_seed (Optional[int]) – The initial seed for the internal random number generator drawing the probability. If None, the generator is not seeded. Defaults to None.
Default Configurations
id: MixUp
_target_: autrainer.augmentations.AugmentationPipeline

generator_seed: 0

pipeline:
  - autrainer.augmentations.MixUp:
      alpha: 0.8
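The batch-level behavior of MixUp can be illustrated with the commonly used formulation, in which a coefficient drawn from Beta(alpha, alpha) blends each sample with a randomly chosen partner from the same batch. This is a sketch on plain lists under that assumption; the `mixup_collate` helper and the (inputs, targets) batch layout are hypothetical and may differ from the autrainer implementation.

```python
import random
from typing import Callable, List, Optional, Tuple

Batch = Tuple[List[List[float]], List[List[float]]]  # (inputs, one-hot targets)


def mixup_collate(
    alpha: float = 1.0, p: float = 1.0, seed: Optional[int] = None
) -> Callable[[Batch], Batch]:
    rng = random.Random(seed)

    def blend(a: List[float], b: List[float], lam: float) -> List[float]:
        return [lam * u + (1 - lam) * v for u, v in zip(a, b)]

    def collate(batch: Batch) -> Batch:
        xs, ys = batch
        # One draw per batch decides whether the whole batch is mixed.
        if rng.random() >= p:
            return xs, ys
        lam = rng.betavariate(alpha, alpha)  # mixing coefficient in [0, 1]
        partners = list(range(len(xs)))
        rng.shuffle(partners)  # partner sample for each position
        mixed_x = [blend(xs[i], xs[j], lam) for i, j in enumerate(partners)]
        mixed_y = [blend(ys[i], ys[j], lam) for i, j in enumerate(partners)]
        return mixed_x, mixed_y

    return collate


inputs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
targets = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
collate = mixup_collate(alpha=0.8, p=1.0, seed=0)
mixed_inputs, mixed_targets = collate((inputs, targets))
print(mixed_targets)
```

Because inputs and targets are blended with the same coefficient, each mixed target remains a valid distribution over the classes.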
Miscellaneous Augmentations#
- class autrainer.augmentations.SampleGaussianWhiteNoise(snr_df, snr_col, sample_seed=None, order=101, p=1.0, generator_seed=None)[source]#
Sample-level Gaussian white noise augmentation based on SNR values.
- Parameters:
snr_df (str) – Path to a CSV file containing SNR values for each sample. The index of the CSV file must match the index of the dataset.
snr_col (str) – Name of the column containing the SNR values.
sample_seed (Optional[int]) – Seed for the random number generator used for sampling the noise. If a seed is provided, a consistent augmentation is applied to the same sample. Defaults to None.
order (int) – The order of the augmentation in the transformation pipeline. Defaults to 101.
p (float) – The probability of applying the augmentation. Defaults to 1.0.
generator_seed (Optional[int]) – The initial seed for the internal random number generator drawing the probability. If None, the generator is not seeded. Defaults to None.
Default Configurations
id: SampleGaussianWhiteNoise
_target_: autrainer.augmentations.AugmentationPipeline

generator_seed: 0

pipeline:
  - autrainer.augmentations.SampleGaussianWhiteNoise:
      snr_df: ???
      snr_col: ???
      sample_seed: ???
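The idea of sample-level noise at a given SNR can be sketched as follows. Several details are assumptions not confirmed by this page: the SNR values are taken to be in decibels, the per-sample generator is seeded with the sum of sample_seed and the sample index, and the `add_white_noise` helper itself is hypothetical.

```python
import math
import random
from typing import List, Optional


def add_white_noise(
    x: List[float], snr_db: float, index: int, sample_seed: Optional[int] = None
) -> List[float]:
    # Assumption: combining the seed with the index gives each sample its own
    # repeatable noise; without a seed, the noise differs on every call.
    rng = random.Random(None if sample_seed is None else sample_seed + index)
    signal_power = sum(value * value for value in x) / len(x)
    # SNR_dB = 10 * log10(P_signal / P_noise), so P_noise = P_signal / 10^(SNR_dB / 10).
    noise_std = math.sqrt(signal_power / 10 ** (snr_db / 10))
    return [value + rng.gauss(0.0, noise_std) for value in x]


signal = [math.sin(0.1 * n) for n in range(100)]
first = add_white_noise(signal, snr_db=20.0, index=3, sample_seed=0)
second = add_white_noise(signal, snr_db=20.0, index=3, sample_seed=0)
other = add_white_noise(signal, snr_db=20.0, index=4, sample_seed=0)
print(first == second, first == other)
```

In this sketch, the same sample index always receives the same noise when a seed is set, while different samples receive different noise realizations.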