Pacing Functions
Pacing functions control the introduction of new samples into the curriculum-based training process by determining the dataset size at each iteration.
Tip
To create custom pacing functions, refer to the custom pacing functions tutorial.
Based on the difficulty ordering provided by a scoring function, new samples are introduced into the training process in ascending (or descending) order of difficulty.
Note
aucurriculum introduces new samples into the training process after all currently available samples have been seen by the model at least once (i.e., once the training loader is exhausted).
While the order of introducing new samples is determined by the scoring function, all currently available samples are shuffled before being introduced into the training process.
Curriculum Pace Manager
CurriculumPaceManager manages the pacing of the curriculum by dynamically controlling the training loader, shuffling the currently available dataset samples, and introducing new samples according to the curriculum training configuration.
- class aucurriculum.curricula.CurriculumPaceManager
- shuffle_indices()
Shuffle the indices of the dataset based on the current size.
- Return type:
List[int]
- Returns:
Shuffled indices of the dataset.
- property train_loader: DataLoader
Create a DataLoader for the training dataset which samples based on the current dataset size and difficulty ordering.
- Returns:
The DataLoader for the training dataset.
- cb_on_loader_exhausted(trainer, iteration)
Callback to update the dataset size and weight after the loader is exhausted. The dataset size update is postponed to this point to avoid changing the dataset size during an iteration and to ensure that each available sample is used for training at least once.
- Parameters:
trainer (ModularTaskTrainer) – The trainer instance.
iteration (int) – Current iteration.
- Return type:
None
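The deferred update described above can be sketched in a few lines. This is a minimal illustration, not the aucurriculum implementation: ToyPaceManager, its attributes, and the on_loader_exhausted method are hypothetical names, and the pacing function is assumed to be a plain callable that maps an iteration to a dataset size.

```python
import random
from typing import Callable, List


class ToyPaceManager:
    """Hypothetical stand-in that illustrates deferred dataset growth."""

    def __init__(self, ordered_indices: List[int], pace: Callable[[int], int], seed: int = 0):
        # Indices sorted by difficulty, as a scoring function would provide them.
        self.ordered_indices = ordered_indices
        # Assumed interface: iteration -> number of available samples.
        self.pace = pace
        self.current_size = pace(0)
        self.rng = random.Random(seed)

    def shuffle_indices(self) -> List[int]:
        # Only the first `current_size` entries of the ordering are available.
        # Slicing copies the list, so the difficulty ordering stays intact.
        available = self.ordered_indices[: self.current_size]
        self.rng.shuffle(available)
        return available

    def on_loader_exhausted(self, iteration: int) -> None:
        # Grow only between passes, so the size never changes mid-pass.
        self.current_size = max(self.current_size, self.pace(iteration))


manager = ToyPaceManager(list(range(10)), pace=lambda it: min(10, 2 + it))
first_pass = manager.shuffle_indices()
manager.on_loader_exhausted(iteration=3)
second_pass = manager.shuffle_indices()
```

Here the first pass sees only the two easiest samples, and the next three become available only once that pass has finished.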
Abstract Pacing Function
All pacing functions inherit from the AbstractPace class and implement the get_dataset_size() method, which determines the size of the dataset to be used at a given iteration.
- class aucurriculum.curricula.pacing.AbstractPace(initial_size, final_iteration, total_iterations, dataset_size)
Abstract class for pacing functions.
- Parameters:
initial_size (float) – The initial fraction of the dataset to start training with.
final_iteration (float) – The fraction of training iterations at which the dataset size will be the full dataset size.
total_iterations (int) – The total number of training iterations.
dataset_size (int) – The size of the dataset.
- Raises:
ValueError – If the initial size is not in (0, 1] or if the final iteration is not in [0, 1].
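As a rough sketch of the contract described above, the following base class validates the two fractions and leaves get_dataset_size() abstract. SketchPace and FullDataset are hypothetical names, and the assumption that get_dataset_size() takes the current iteration as its only argument is inferred from the description rather than confirmed by the library.

```python
from abc import ABC, abstractmethod


class SketchPace(ABC):
    """Hypothetical base class mirroring the documented constructor arguments."""

    def __init__(self, initial_size: float, final_iteration: float,
                 total_iterations: int, dataset_size: int) -> None:
        # Ranges follow the documented ValueError conditions.
        if not 0 < initial_size <= 1:
            raise ValueError("initial_size must be in (0, 1]")
        if not 0 <= final_iteration <= 1:
            raise ValueError("final_iteration must be in [0, 1]")
        self.initial_size = initial_size
        self.final_iteration = final_iteration
        self.total_iterations = total_iterations
        self.dataset_size = dataset_size

    @abstractmethod
    def get_dataset_size(self, iteration: int) -> int:
        """Return the number of samples available at `iteration` (assumed signature)."""


class FullDataset(SketchPace):
    """Trivial subclass that always exposes the whole dataset."""

    def get_dataset_size(self, iteration: int) -> int:
        return self.dataset_size
```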
Continuous Pacing Functions
Continuous pacing functions introduce new samples continuously, adding more samples after each iteration.
- class aucurriculum.curricula.pacing.Exponential(initial_size, final_iteration, total_iterations, dataset_size)
Exponential pacing function adapted from: https://arxiv.org/abs/2012.03107
- Parameters:
initial_size (float) – The initial fraction of the dataset to start training with.
final_iteration (float) – The fraction of training iterations at which the dataset size will be the full dataset size.
total_iterations (int) – The total number of training iterations.
dataset_size (int) – The size of the dataset.
- Raises:
ValueError – If the initial size is not in (0, 1] or if the final iteration is not in [0, 1].
Default Configurations
id: Exponential
_target_: aucurriculum.curricula.pacing.Exponential
initial_size: ???
final_iteration: ???
- class aucurriculum.curricula.pacing.Logarithmic(initial_size, final_iteration, total_iterations, dataset_size)
Logarithmic pacing function adapted from: https://arxiv.org/abs/2012.03107
- Parameters:
initial_size (float) – The initial fraction of the dataset to start training with.
final_iteration (float) – The fraction of training iterations at which the dataset size will be the full dataset size.
total_iterations (int) – The total number of training iterations.
dataset_size (int) – The size of the dataset.
- Raises:
ValueError – If the initial size is not in (0, 1] or if the final iteration is not in [0, 1].
Default Configurations
id: Logarithmic
_target_: aucurriculum.curricula.pacing.Logarithmic
initial_size: ???
final_iteration: ???
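To make the difference between exponential and logarithmic pacing concrete, the sketch below evaluates both on the same schedule. The function names are hypothetical, and the constants (a factor of 10 in the exponent, 0.1 and an offset of e^-10 in the logarithm) are an assumption based on one reading of the referenced paper; the library's adapted formulas and its rounding may differ.

```python
import math


def _progress(iteration, final_iteration, total_iterations):
    # Fraction of the way to the iteration at which the full dataset is reached.
    end = final_iteration * total_iterations
    return 1.0 if end <= 0 else min(1.0, iteration / end)


def exponential_size(iteration, initial_size, final_iteration, total_iterations, dataset_size):
    # Slow start, rapid growth close to the final iteration (assumed form).
    p = _progress(iteration, final_iteration, total_iterations)
    growth = (math.exp(10 * p) - 1) / (math.exp(10) - 1)
    fraction = initial_size + (1 - initial_size) * growth
    return min(dataset_size, round(fraction * dataset_size))


def logarithmic_size(iteration, initial_size, final_iteration, total_iterations, dataset_size):
    # Rapid start, slow growth close to the final iteration (assumed form).
    p = _progress(iteration, final_iteration, total_iterations)
    growth = 1 + 0.1 * math.log(p + math.exp(-10))
    fraction = initial_size + (1 - initial_size) * growth
    return min(dataset_size, round(fraction * dataset_size))


# Halfway to the final iteration, the two schedules are far apart.
halfway_exp = exponential_size(25, 0.2, 0.5, 100, 1000)
halfway_log = logarithmic_size(25, 0.2, 0.5, 100, 1000)
```

Under these assumptions both schedules start at initial_size * dataset_size and reach the full dataset at final_iteration * total_iterations; they differ only in how quickly samples arrive in between.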
- class aucurriculum.curricula.pacing.Linear(initial_size, final_iteration, total_iterations, dataset_size)
Linear pacing function adapted from: https://arxiv.org/abs/2012.03107
- Parameters:
initial_size (float) – The initial fraction of the dataset to start training with.
final_iteration (float) – The fraction of training iterations at which the dataset size will be the full dataset size.
total_iterations (int) – The total number of training iterations.
dataset_size (int) – The size of the dataset.
- Raises:
ValueError – If the initial size is not in (0, 1] or if the final iteration is not in [0, 1].
Default Configurations
id: Linear
_target_: aucurriculum.curricula.pacing.Linear
initial_size: ???
final_iteration: ???
- class aucurriculum.curricula.pacing.Quadratic(initial_size, final_iteration, total_iterations, dataset_size)
Quadratic pacing function adapted from: https://arxiv.org/abs/2012.03107
- Parameters:
initial_size (float) – The initial fraction of the dataset to start training with.
final_iteration (float) – The fraction of training iterations at which the dataset size will be the full dataset size.
total_iterations (int) – The total number of training iterations.
dataset_size (int) – The size of the dataset.
- Raises:
ValueError – If the initial size is not in (0, 1] or if the final iteration is not in [0, 1].
Default Configurations
id: Quadratic
_target_: aucurriculum.curricula.pacing.Quadratic
initial_size: ???
final_iteration: ???
- class aucurriculum.curricula.pacing.Root(initial_size, final_iteration, total_iterations, dataset_size)
Root pacing function adapted from: https://arxiv.org/abs/2012.03107
- Parameters:
initial_size (float) – The initial fraction of the dataset to start training with.
final_iteration (float) – The fraction of training iterations at which the dataset size will be the full dataset size.
total_iterations (int) – The total number of training iterations.
dataset_size (int) – The size of the dataset.
- Raises:
ValueError – If the initial size is not in (0, 1] or if the final iteration is not in [0, 1].
Default Configurations
id: Root
_target_: aucurriculum.curricula.pacing.Root
initial_size: ???
final_iteration: ???
- class aucurriculum.curricula.pacing.Polynomial(initial_size, final_iteration, total_iterations, dataset_size, degree=1.0)
Polynomial pacing function adapted from: https://arxiv.org/abs/2012.03107
- Parameters:
initial_size (float) – The initial fraction of the dataset to start training with.
final_iteration (float) – The fraction of training iterations at which the dataset size will be the full dataset size.
total_iterations (int) – The total number of training iterations.
dataset_size (int) – The size of the dataset.
degree (float) – The degree of the polynomial. Defaults to 1.0 (linear).
- Raises:
ValueError – If the initial size is not in (0, 1] or if the final iteration is not in [0, 1].
Default Configurations
No default configurations are provided for Polynomial.
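The role of the degree parameter can be illustrated with a small standalone function. This is a sketch under assumptions, not the library's code: polynomial_size is a hypothetical name, the formula is a plausible form of polynomial pacing in which progress toward the final iteration is raised to the given degree, and rounding to the nearest integer is a choice made here.

```python
def polynomial_size(iteration, initial_size, final_iteration, total_iterations,
                    dataset_size, degree=1.0):
    # Fraction of the way to the iteration at which the full dataset is reached.
    end = final_iteration * total_iterations
    progress = 1.0 if end <= 0 else min(1.0, iteration / end)
    # degree < 1 front-loads new samples, degree > 1 delays them (assumed form).
    fraction = initial_size + (1 - initial_size) * progress ** degree
    return min(dataset_size, round(fraction * dataset_size))


# Same schedule, three degrees: root-like (0.5), linear (1.0), quadratic (2.0).
root_like = polynomial_size(25, 0.2, 0.5, 100, 1000, degree=0.5)
linear = polynomial_size(25, 0.2, 0.5, 100, 1000, degree=1.0)
quadratic = polynomial_size(25, 0.2, 0.5, 100, 1000, degree=2.0)
```

Under this form, a degree of 0.5, 1.0, or 2.0 would behave like a root, linear, or quadratic schedule respectively, with lower degrees exposing hard samples earlier.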
Discrete Pacing Functions
Discrete pacing functions introduce new samples at fixed intervals, such as after a set number of iterations.
- class aucurriculum.curricula.pacing.OneStep(initial_size, final_iteration, total_iterations, dataset_size)
(One-)step pacing function adapted from: https://arxiv.org/abs/2012.03107
Note: The formula given in the paper is incorrect; this implementation uses a corrected version.
- Parameters:
initial_size (float) – The initial fraction of the dataset to start training with.
final_iteration (float) – The fraction of training iterations at which the dataset size will be the full dataset size.
total_iterations (int) – The total number of training iterations.
dataset_size (int) – The size of the dataset.
- Raises:
ValueError – If the initial size is not in (0, 1] or if the final iteration is not in [0, 1].
Default Configurations
id: OneStep
_target_: aucurriculum.curricula.pacing.OneStep
initial_size: ???
final_iteration: ???
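A single-step schedule can be sketched as follows. The function name one_step_size is hypothetical, and the behaviour shown (the initial fraction until the final iteration, then the full dataset) is one plausible reading of a one-step pace; the corrected formula used by the library is not reproduced in this section and may differ.

```python
def one_step_size(iteration, initial_size, final_iteration, total_iterations, dataset_size):
    # Iteration at which the schedule jumps to the full dataset.
    end = final_iteration * total_iterations
    if iteration >= end:
        return dataset_size
    # Before the jump, only the initial fraction is available (assumed behaviour).
    return round(initial_size * dataset_size)


before_jump = one_step_size(49, 0.2, 0.5, 100, 1000)
at_jump = one_step_size(50, 0.2, 0.5, 100, 1000)
```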