Hydra Configurations

Analogous to autrainer, aucurriculum uses Hydra to configure training experiments. All configurations are stored in the conf/ directory and are defined as YAML files.

Main Configuration

aucurriculum extends the autrainer main entry point for curriculum training and adds a curriculum scoring entry point for obtaining difficulty orderings from one or more scoring functions.

Curriculum Training

The main entry point for curriculum training is defined in the conf/config.yaml file.

conf/config.yaml

defaults:
  - _aucurriculum_train_
  - _self_

results_dir: results
experiment_id: default
iterations: 5

hydra:
  sweeper:
    params:
      +seed: 1
      +batch_size: 32
      +learning_rate: 0.001
      dataset: ToyTabular-C
      model: ToyFFNN
      optimizer: Adam
      curriculum: None
      curriculum/sampling: None
      curriculum/scoring: None
      curriculum/pacing: None
      curriculum.pacing.initial_size: 1
      curriculum.pacing.final_iteration: 0

In addition to the parameters of the autrainer main configuration file, the following hydra/sweeper/params parameters are defined:

curriculum: The curriculum configuration. The following options are available:
- None: No curriculum is used and training is performed on the full dataset. All other curriculum parameters are ignored.
- Curriculum: A curriculum is used to train the model. New samples are introduced in order of increasing difficulty.
- AntiCurriculum: An anti-curriculum is used to train the model. New samples are introduced in order of decreasing difficulty.
curriculum/sampling: The sampling strategy of introducing new samples into the training set. The following strategies are available:
- Unbalanced: Introduce new samples directly following the sample difficulty ordering provided by the scoring function.
- Balanced: Introduce new samples in a class-balanced manner. New samples are introduced in a way that the class distribution of the training set remains balanced as long as possible, conceptually applying the difficulty ordering to each class separately.
- Original: Introduce new samples with the same distribution as the original dataset. This is equivalent to introducing new samples in a class-balanced manner, but with the class distribution of the original dataset.
curriculum/scoring: The ID of a scoring function that has already been computed using the curriculum scoring main configuration file.
curriculum/pacing: The ID of a pacing function.
curriculum.pacing.initial_size: The fraction of the dataset to be used in the first iteration. The remaining fraction is gradually introduced in subsequent iterations.
curriculum.pacing.final_iteration: The fraction of iterations until the full dataset is used for training.
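
How these parameters interact can be sketched in plain Python. The snippet is an illustration only and does not use the aucurriculum API: the function names, the linear shape of the pacing function, the round-robin balancing, and the convention that a higher score means a harder sample are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import zip_longest


def difficulty_order(scores, anti=False):
    """Sample indices sorted easy-to-hard (Curriculum) or hard-to-easy (AntiCurriculum).

    Assumes that a higher score means a harder sample.
    """
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=anti)


def balanced_order(scores, labels):
    """Easy-to-hard order applied to each class separately, then interleaved round-robin."""
    per_class = defaultdict(list)
    for idx in difficulty_order(scores):
        per_class[labels[idx]].append(idx)
    order = []
    for group in zip_longest(*per_class.values()):
        # Exhausted classes yield None and are skipped.
        order.extend(i for i in group if i is not None)
    return order


def linear_pacing(iteration, iterations, initial_size, final_iteration):
    """Fraction of the dataset available at a 0-based iteration.

    Hypothetical linear pacing: starts at initial_size and reaches 1.0 after
    final_iteration * iterations iterations.
    """
    full_at = final_iteration * iterations
    if full_at <= 0 or iteration >= full_at:
        return 1.0
    return initial_size + (1.0 - initial_size) * iteration / full_at


scores = [0.9, 0.1, 0.5, 0.3]
order = difficulty_order(scores)
for it in range(5):
    fraction = linear_pacing(it, iterations=5, initial_size=0.5, final_iteration=0.8)
    subset = order[: max(1, round(fraction * len(order)))]
    print(it, fraction, subset)
```

With the values from conf/config.yaml (initial_size: 1 and final_iteration: 0), this sketch returns 1.0 for every iteration, so the full dataset is available from the start.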

For each parameter, one or more values can be specified to sweep over different configurations.
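
In Hydra's sweeper syntax, several values for one parameter are typically written as a comma-separated list (for example +seed: 1, 2, 3), and the sweep covers every combination of the listed values. The following sketch mimics that expansion in plain Python; expand_sweep is a hypothetical helper, not part of Hydra or aucurriculum, and it ignores other sweep syntaxes that Hydra supports.

```python
from itertools import product


def expand_sweep(params):
    """Expand comma-separated sweep values into one dict per run (grid search)."""
    keys = list(params)
    choices = [[v.strip() for v in str(params[k]).split(",")] for k in keys]
    return [dict(zip(keys, combo)) for combo in product(*choices)]


params = {
    "+seed": "1, 2, 3",
    "+learning_rate": "0.001, 0.01",
    "model": "ToyFFNN",
}
runs = expand_sweep(params)
print(len(runs))  # 3 seeds x 2 learning rates x 1 model = 6 runs
```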

If any of curriculum, curriculum/sampling, curriculum/scoring, or curriculum/pacing is set to None, the run is filtered out and not executed.

Curriculum Scoring

The main entry point for curriculum scoring is defined in the conf/curriculum.yaml file.

conf/curriculum.yaml

defaults:
  - _aucurriculum_score_
  - _self_

results_dir: results
experiment_id: default

hydra:
  sweeper:
    params:
      curriculum/scoring: None

correlation:
  correlation_matrix: all

The following parameters are defined (alongside common attributes such as the results_dir or experiment_id):

curriculum/scoring (under hydra/sweeper/params): The ID of the scoring function to be calculated.
correlation: A mapping from matrix names to lists of scoring function IDs between which the correlation is calculated. The value all serves as a placeholder to include all scoring functions in the correlation matrix.
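
To make the correlation mapping more concrete, the sketch below builds one matrix from per-sample difficulty scores. It is a simplified stand-in under stated assumptions: the source does not say which correlation measure aucurriculum uses, so Spearman rank correlation (without tie handling) is chosen here for illustration, and the function names and the scoring function IDs A, B, and C are hypothetical.

```python
def ranks(values):
    """Rank of each value (0 = smallest); assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def spearman(a, b):
    """Spearman rank correlation for two equally long, tie-free score lists."""
    n = len(a)
    d2 = sum((x - y) ** 2 for x, y in zip(ranks(a), ranks(b)))
    return 1.0 - 6.0 * d2 / (n * (n * n - 1))


def correlation_matrix(scores, ids):
    """Pairwise correlations for the given IDs; "all" selects every scoring function."""
    if ids == "all":
        ids = sorted(scores)
    return {(i, j): spearman(scores[i], scores[j]) for i in ids for j in ids}


# Hypothetical difficulty scores of four samples under three scoring functions.
scores = {
    "A": [0.1, 0.4, 0.2, 0.9],
    "B": [1, 3, 2, 4],
    "C": [4, 2, 3, 1],
}
matrix = correlation_matrix(scores, "all")
print(matrix[("A", "B")], matrix[("A", "C")])  # 1.0 -1.0
```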

Some parameters of the main configuration files are moved to the _aucurriculum_train_.yaml and _aucurriculum_score_.yaml defaults files in order to simplify the configurations.

For more information on configuring Hydra, see the Hydra documentation.

Tip

Analogous to autrainer configurations, different files can be used as the main entry point for training and scoring experiments using the -cn/--config-name argument for the aucurriculum train CLI command:

aucurriculum train -cn some_other_config

Alternatively, use the config_name parameter for the train() CLI wrapper function:

aucurriculum.cli.train(config_name="some_other_config")

For more information on command line flags, see Hydra’s command line flags documentation.

Configuration Directories

In addition to the autrainer configuration directories, the following configuration subdirectories are available:

conf/curriculum/
conf/curriculum/sampling/
conf/curriculum/scoring/
conf/curriculum/pacing/

Configurations

Analogous to autrainer, aucurriculum provides default configurations for scoring functions, pacing functions, etc. that can be used out of the box without creating custom configurations.

For more information on creating, discovering, and using configurations, or special syntaxes and overriding defaults, refer to the autrainer configuration creation documentation.

aucurriculum Defaults

Both the _aucurriculum_train_.yaml and _aucurriculum_score_.yaml files contain further default configurations to simplify the main configuration files.

Tip

Any global default parameter can be overridden in the main configuration file by redefining it.

conf/_aucurriculum_train_.yaml

defaults:
  - _self_
  - dataset: ???
  - model: ???
  - optimizer: ???
  - scheduler: None
  - augmentation: None
  - curriculum: None
  - plotting: Default
  - override hydra/sweeper: aucurriculum_train_sweeper

training_type: epoch
eval_frequency: 1
save_frequency: ${eval_frequency}
inference_batch_size: ${batch_size}
device: cuda:0

progress_bar: true
continue_training: true
remove_continued_runs: true
save_train_outputs: true
save_dev_outputs: true
save_test_outputs: true

hydra:
  output_subdir: null
  mode: MULTIRUN
  sweep:
    dir: ${results_dir}/${experiment_id}/training/
    subdir: "\
      ${dataset.id}_\
      ${model.id}_\
      ${optimizer.id}_\
      ${learning_rate}_\
      ${batch_size}_\
      ${training_type}_\
      ${iterations}_\
      ${scheduler.id}_\
      ${augmentation.id}_\
      ${curriculum.short}_\
      ${curriculum.sampling.short}_\
      ${curriculum.scoring.id}_\
      ${curriculum.pacing.id}_\
      ${curriculum.pacing.initial_size}_\
      ${curriculum.pacing.final_iteration}_\
      ${seed}"
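
The sweep directory and subdirectory above are built from ${...} interpolations, and the trailing backslashes inside the double-quoted YAML string join the continuation lines into a single name. The sketch below imitates the resolution step in plain Python to show the kind of path that results. It is a simplified stand-in for the interpolation that Hydra performs: resolve is a hypothetical helper, and the subdirectory template is shortened to a few fields.

```python
import re


def resolve(template, cfg):
    """Replace each ${a.b} reference with the value found at cfg["a"]["b"]."""

    def lookup(match):
        node = cfg
        for key in match.group(1).split("."):
            node = node[key]
        return str(node)

    return re.sub(r"\$\{([^}]+)\}", lookup, template)


# Values taken from conf/config.yaml; only a subset of the fields is shown.
cfg = {
    "results_dir": "results",
    "experiment_id": "default",
    "dataset": {"id": "ToyTabular-C"},
    "model": {"id": "ToyFFNN"},
    "optimizer": {"id": "Adam"},
    "learning_rate": 0.001,
    "batch_size": 32,
    "seed": 1,
}
sweep_dir = resolve("${results_dir}/${experiment_id}/training/", cfg)
subdir = resolve(
    "${dataset.id}_${model.id}_${optimizer.id}_${learning_rate}_${batch_size}_${seed}",
    cfg,
)
print(sweep_dir + subdir)
# results/default/training/ToyTabular-C_ToyFFNN_Adam_0.001_32_1
```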

conf/_aucurriculum_score_.yaml

defaults:
  - _self_
  - curriculum/scoring: ???
  - plotting: Default
  - override hydra/sweeper: aucurriculum_score_sweeper

batch_size: 32
progress_bar: true
device: cuda:0

hydra:
  mode: MULTIRUN
  sweep:
    dir: ${results_dir}/${experiment_id}/curriculum/
    subdir: ${curriculum.scoring.type}

Hydra Plugins

By default, aucurriculum uses the hydra-filter-sweeper plugin to sweep over hyperparameter configurations which are defined in the aucurriculum defaults and implemented in the core module. This plugin allows specifying a list of filters in the configuration file to exclude unwanted hyperparameter combinations.
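
Conceptually, filtering a sweep means dropping every hyperparameter combination that matches at least one filter. The sketch below shows this idea in plain Python. It does not reproduce the configuration syntax of hydra-filter-sweeper, and both apply_filters and the example filter are hypothetical.

```python
def apply_filters(runs, filters):
    """Keep only the runs for which no filter predicate returns True."""
    return [run for run in runs if not any(f(run) for f in filters)]


# Hypothetical filter: skip large batch sizes combined with a large learning rate.
def too_aggressive(run):
    return run["batch_size"] >= 128 and run["learning_rate"] >= 0.01


runs = [
    {"batch_size": 32, "learning_rate": 0.001},
    {"batch_size": 32, "learning_rate": 0.01},
    {"batch_size": 128, "learning_rate": 0.001},
    {"batch_size": 128, "learning_rate": 0.01},
]
kept = apply_filters(runs, [too_aggressive])
print(len(kept))  # 3 of the 4 combinations remain
```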

To perform more complex hyperparameter sweeps, different sweeper or launcher plugins can be used. For more information on plugins, refer to the autrainer plugins documentation.

Note

Custom sweeper and launcher plugins are not yet supported for curriculum scoring.
