Pacing Functions

Pacing functions control the introduction of new samples into the curriculum-based training process by determining the dataset size at each iteration.

Tip

To create custom pacing functions, refer to the custom pacing functions tutorial.

Based on the difficulty ordering provided by a scoring function, new samples are introduced into the training process in ascending (or descending) order of difficulty.

Note

aucurriculum introduces new samples into the training process after all currently available samples have been seen by the model at least once (i.e., once the training loader is exhausted).

While the scoring function determines the order in which new samples are introduced, all currently available samples are shuffled before being presented to the model.
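The interplay of difficulty ordering and shuffling described above can be illustrated with a small, self-contained simulation. This is only a sketch: `simulate_passes`, the example ordering, and the size schedule are hypothetical and do not use the aucurriculum API.

```python
import random


def simulate_passes(ordered_indices, sizes_per_pass, seed=0):
    """Return one shuffled list of sample indices per pass over the loader.

    ordered_indices: sample indices sorted by difficulty (as a scoring
        function would provide them); hypothetical example data.
    sizes_per_pass: number of available samples for each pass; a made-up
        schedule standing in for a pacing function.
    """
    rng = random.Random(seed)
    passes = []
    for size in sizes_per_pass:
        # The difficulty ordering decides WHICH samples are available ...
        available = list(ordered_indices[:size])
        # ... but the order WITHIN a pass is random.
        rng.shuffle(available)
        passes.append(available)
    return passes


ordered = [7, 2, 5, 0, 3, 1, 6, 4]  # easiest sample first
passes = simulate_passes(ordered, sizes_per_pass=[2, 4, 8])
for number, indices in enumerate(passes):
    print(f"pass {number}: {indices}")
```

Each pass contains exactly the easiest `size` samples, while their order within the pass varies with the random seed.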

Curriculum Pace Manager

CurriculumPaceManager manages the pacing of the curriculum by dynamically controlling the training loader, shuffling the currently available dataset samples, and introducing new samples according to the curriculum training configuration.

class aucurriculum.curricula.CurriculumPaceManager

shuffle_indices()

Shuffle the indices of the dataset based on the current size.

Return type:

List[int]

Returns:

Shuffled indices of the dataset.

property train_loader: DataLoader

Create a DataLoader for the training dataset which samples based on the current dataset size and difficulty ordering.

Returns:

The DataLoader for the training dataset.

cb_on_loader_exhausted(trainer, iteration)

Callback to update the dataset size and weight after the loader is exhausted. The dataset size update is postponed to this point to avoid changing the dataset size during an iteration and ensure that each available sample is used for training at least once.

Parameters:
  • trainer (ModularTaskTrainer) – The trainer instance.

  • iteration (int) – Current iteration.

Return type:

None
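To illustrate why the size update is postponed until the loader is exhausted, the following sketch tracks the dataset size that is actually applied at each iteration. It rests on several assumptions: one iteration consumes one batch, the pacing function is queried for the next iteration at the moment of exhaustion, and `run_schedule` and `linear_pace` are hypothetical helpers rather than part of aucurriculum.

```python
def run_schedule(pace, total_iterations, batch_size):
    """Return the dataset size in effect at every iteration.

    pace: callable mapping an iteration to a dataset size (hypothetical).
    """
    applied = pace(0)        # size used to build the first loader
    remaining = applied      # samples not yet seen in the current pass
    history = []
    for iteration in range(total_iterations):
        history.append(applied)
        remaining -= batch_size
        if remaining <= 0:
            # Loader exhausted: every available sample has been seen once,
            # so the new size can be adopted without skipping any sample.
            applied = pace(iteration + 1)
            remaining = applied
    return history


def linear_pace(iteration):
    # Made-up pacing function: 20 samples at the start, 10 more per iteration.
    return min(100, 20 + 10 * iteration)


history = run_schedule(linear_pace, total_iterations=8, batch_size=10)
print(history)
```

Although the pacing function changes its value at every iteration, the applied size only changes at pass boundaries.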

Abstract Pacing Function

All pacing functions inherit from the AbstractPace class and implement its get_dataset_size() method, which determines the size of the dataset to use at a given iteration.

class aucurriculum.curricula.pacing.AbstractPace(initial_size, final_iteration, total_iterations, dataset_size)

Abstract class for pacing functions.

Parameters:
  • initial_size (float) – The initial fraction of the dataset to start training with.

  • final_iteration (float) – The fraction of training iterations at which the dataset size will be the full dataset size.

  • total_iterations (int) – The total number of training iterations.

  • dataset_size (int) – The size of the dataset.

Raises:

ValueError – If the initial size is not in (0, 1] or if the final iteration is not in [0, 1].

abstract get_dataset_size(iteration)

Get the dataset size at a given iteration according to the pacing function.

Parameters:

iteration (int) – The iteration number.

Return type:

int

Returns:

The dataset size at the given iteration.
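The contract documented above (four constructor parameters, range validation, and a single abstract method) can be sketched without importing the library. `PaceSketch` is a hypothetical stand-in that mirrors the documented signature and error conditions; its attribute names are assumptions. `CosinePace` is an invented example and not a pacing function shipped with aucurriculum.

```python
import math
from abc import ABC, abstractmethod


class PaceSketch(ABC):
    """Hypothetical stand-in mirroring the documented AbstractPace signature."""

    def __init__(self, initial_size, final_iteration, total_iterations, dataset_size):
        if not 0 < initial_size <= 1:
            raise ValueError("initial_size must be in (0, 1]")
        if not 0 <= final_iteration <= 1:
            raise ValueError("final_iteration must be in [0, 1]")
        # Attribute names are assumptions made for this sketch.
        self.initial_size = initial_size
        self.final_iteration = final_iteration
        self.total_iterations = total_iterations
        self.dataset_size = dataset_size

    @abstractmethod
    def get_dataset_size(self, iteration: int) -> int:
        """Return the dataset size to use at the given iteration."""


class CosinePace(PaceSketch):
    """Invented example: half-cosine ramp from the initial to the full size."""

    def get_dataset_size(self, iteration: int) -> int:
        full_at = self.final_iteration * self.total_iterations
        if iteration >= full_at:
            return self.dataset_size
        progress = iteration / full_at  # in [0, 1)
        ramp = 0.5 * (1 - math.cos(math.pi * progress))
        fraction = self.initial_size + (1 - self.initial_size) * ramp
        return min(self.dataset_size, round(fraction * self.dataset_size))


pace = CosinePace(initial_size=0.2, final_iteration=0.5,
                  total_iterations=100, dataset_size=1000)
print([pace.get_dataset_size(t) for t in (0, 25, 50, 100)])
```

For a subclass of the real AbstractPace, refer to the custom pacing functions tutorial, since the base class may expose its parameters differently.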

Continuous Pacing Functions

Continuous pacing functions grow the dataset size smoothly as a function of the iteration, so the target size can change after every iteration.

class aucurriculum.curricula.pacing.Exponential(initial_size, final_iteration, total_iterations, dataset_size)

Exponential pacing function adapted from: https://arxiv.org/abs/2012.03107

Parameters:
  • initial_size (float) – The initial fraction of the dataset to start training with.

  • final_iteration (float) – The fraction of training iterations at which the dataset size will be the full dataset size.

  • total_iterations (int) – The total number of training iterations.

  • dataset_size (int) – The size of the dataset.

Raises:

ValueError – If the initial size is not in (0, 1] or if the final iteration is not in [0, 1].

Default Configurations
conf/curriculum/pacing/Exponential.yaml
id: Exponential
_target_: aucurriculum.curricula.pacing.Exponential
initial_size: ???
final_iteration: ???

class aucurriculum.curricula.pacing.Logarithmic(initial_size, final_iteration, total_iterations, dataset_size)

Logarithmic pacing function adapted from: https://arxiv.org/abs/2012.03107

Parameters:
  • initial_size (float) – The initial fraction of the dataset to start training with.

  • final_iteration (float) – The fraction of training iterations at which the dataset size will be the full dataset size.

  • total_iterations (int) – The total number of training iterations.

  • dataset_size (int) – The size of the dataset.

Raises:

ValueError – If the initial size is not in (0, 1] or if the final iteration is not in [0, 1].

Default Configurations
conf/curriculum/pacing/Logarithmic.yaml
id: Logarithmic
_target_: aucurriculum.curricula.pacing.Logarithmic
initial_size: ???
final_iteration: ???
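
To give an intuition for the Exponential and Logarithmic classes above, the following sketch implements one plausible form of each curve. It follows the general shape of the pacing functions in the referenced paper as best understood. The `scale` constant of 10, the rounding, and the function names are assumptions, and the library's adapted formulas may differ.

```python
import math


def exponential_size(iteration, initial_size, final_iteration,
                     total_iterations, dataset_size, scale=10.0):
    """Slow start, rapid growth towards the end of the ramp (sketch)."""
    full_at = final_iteration * total_iterations
    if iteration >= full_at:
        return dataset_size
    progress = iteration / full_at  # in [0, 1)
    ramp = (math.exp(scale * progress) - 1) / (math.exp(scale) - 1)
    fraction = initial_size + (1 - initial_size) * ramp
    return min(dataset_size, round(fraction * dataset_size))


def logarithmic_size(iteration, initial_size, final_iteration,
                     total_iterations, dataset_size, scale=10.0):
    """Rapid growth at the start, flattening towards the end (sketch)."""
    full_at = final_iteration * total_iterations
    if iteration >= full_at:
        return dataset_size
    progress = iteration / full_at  # in [0, 1)
    # exp(-scale) keeps the logarithm finite at progress == 0.
    ramp = 1 + math.log(progress + math.exp(-scale)) / scale
    fraction = initial_size + (1 - initial_size) * ramp
    return min(dataset_size, round(fraction * dataset_size))


args = dict(initial_size=0.2, final_iteration=0.5,
            total_iterations=100, dataset_size=1000)
for t in (0, 25, 50):
    print(t, exponential_size(t, **args), logarithmic_size(t, **args))
```

Both sketches start at the initial fraction and reach the full dataset at `final_iteration * total_iterations`. Halfway through the ramp, the exponential curve is still close to the initial size, whereas the logarithmic curve has already introduced most samples.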

class aucurriculum.curricula.pacing.Linear(initial_size, final_iteration, total_iterations, dataset_size)

Linear pacing function adapted from: https://arxiv.org/abs/2012.03107

Parameters:
  • initial_size (float) – The initial fraction of the dataset to start training with.

  • final_iteration (float) – The fraction of training iterations at which the dataset size will be the full dataset size.

  • total_iterations (int) – The total number of training iterations.

  • dataset_size (int) – The size of the dataset.

Raises:

ValueError – If the initial size is not in (0, 1] or if the final iteration is not in [0, 1].

Default Configurations
conf/curriculum/pacing/Linear.yaml
id: Linear
_target_: aucurriculum.curricula.pacing.Linear
initial_size: ???
final_iteration: ???

class aucurriculum.curricula.pacing.Quadratic(initial_size, final_iteration, total_iterations, dataset_size)

Quadratic pacing function adapted from: https://arxiv.org/abs/2012.03107

Parameters:
  • initial_size (float) – The initial fraction of the dataset to start training with.

  • final_iteration (float) – The fraction of training iterations at which the dataset size will be the full dataset size.

  • total_iterations (int) – The total number of training iterations.

  • dataset_size (int) – The size of the dataset.

Raises:

ValueError – If the initial size is not in (0, 1] or if the final iteration is not in [0, 1].

Default Configurations
conf/curriculum/pacing/Quadratic.yaml
id: Quadratic
_target_: aucurriculum.curricula.pacing.Quadratic
initial_size: ???
final_iteration: ???

class aucurriculum.curricula.pacing.Root(initial_size, final_iteration, total_iterations, dataset_size)

Root pacing function adapted from: https://arxiv.org/abs/2012.03107

Parameters:
  • initial_size (float) – The initial fraction of the dataset to start training with.

  • final_iteration (float) – The fraction of training iterations at which the dataset size will be the full dataset size.

  • total_iterations (int) – The total number of training iterations.

  • dataset_size (int) – The size of the dataset.

Raises:

ValueError – If the initial size is not in (0, 1] or if the final iteration is not in [0, 1].

Default Configurations
conf/curriculum/pacing/Root.yaml
id: Root
_target_: aucurriculum.curricula.pacing.Root
initial_size: ???
final_iteration: ???

class aucurriculum.curricula.pacing.Polynomial(initial_size, final_iteration, total_iterations, dataset_size, degree=1.0)

Polynomial pacing function adapted from: https://arxiv.org/abs/2012.03107

Parameters:
  • initial_size (float) – The initial fraction of the dataset to start training with.

  • final_iteration (float) – The fraction of training iterations at which the dataset size will be the full dataset size.

  • total_iterations (int) – The total number of training iterations.

  • dataset_size (int) – The size of the dataset.

  • degree (float) – The degree of the polynomial. Defaults to 1.0 (linear).

Raises:

ValueError – If the initial size is not in (0, 1] or if the final iteration is not in [0, 1].

Default Configurations

No default configurations are provided for Polynomial.
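
The Linear, Quadratic, Root, and Polynomial classes can be read as one family that differs only in the exponent applied to the training progress. The sketch below assumes that relationship (degree 1 for linear, 2 for quadratic, 0.5 for root) and uses a hypothetical helper, `polynomial_size`. The rounding and clipping are likewise assumptions and may not match the library.

```python
def polynomial_size(iteration, initial_size, final_iteration,
                    total_iterations, dataset_size, degree=1.0):
    """Polynomial ramp from the initial fraction to the full dataset (sketch)."""
    full_at = final_iteration * total_iterations
    if iteration >= full_at:
        return dataset_size
    progress = iteration / full_at  # in [0, 1)
    fraction = initial_size + (1 - initial_size) * progress ** degree
    return min(dataset_size, round(fraction * dataset_size))


args = dict(initial_size=0.2, final_iteration=0.5,
            total_iterations=100, dataset_size=1000)
linear = polynomial_size(25, degree=1.0, **args)
quadratic = polynomial_size(25, degree=2.0, **args)
root = polynomial_size(25, degree=0.5, **args)
print(linear, quadratic, root)
```

Degrees below 1 introduce samples faster than the linear schedule early on, while degrees above 1 introduce them more slowly.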

Discrete Pacing Functions

Discrete pacing functions introduce new samples at fixed intervals, such as after a set number of iterations.

class aucurriculum.curricula.pacing.OneStep(initial_size, final_iteration, total_iterations, dataset_size)

(One-)step pacing function adapted from: https://arxiv.org/abs/2012.03107

Note: The formula given in the paper is incorrect; this implementation uses a corrected version.

Parameters:
  • initial_size (float) – The initial fraction of the dataset to start training with.

  • final_iteration (float) – The fraction of training iterations at which the dataset size will be the full dataset size.

  • total_iterations (int) – The total number of training iterations.

  • dataset_size (int) – The size of the dataset.

Raises:

ValueError – If the initial size is not in (0, 1] or if the final iteration is not in [0, 1].

Default Configurations
conf/curriculum/pacing/OneStep.yaml
id: OneStep
_target_: aucurriculum.curricula.pacing.OneStep
initial_size: ???
final_iteration: ???
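
Based on the parameter descriptions above, a one-step schedule can be sketched as a function that keeps the initial fraction until `final_iteration * total_iterations` and then switches to the full dataset. The helper `one_step_size` is hypothetical, and the exact corrected formula used by the library is not reproduced here.

```python
def one_step_size(iteration, initial_size, final_iteration,
                  total_iterations, dataset_size):
    """Initial fraction before the step, full dataset afterwards (sketch)."""
    full_at = final_iteration * total_iterations
    if iteration >= full_at:
        return dataset_size
    # Never return an empty dataset, even for very small fractions.
    return max(1, round(initial_size * dataset_size))


args = dict(initial_size=0.2, final_iteration=0.5,
            total_iterations=100, dataset_size=1000)
print([one_step_size(t, **args) for t in (0, 49, 50, 99)])
```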