Hydra Configurations#
Analogous to autrainer,
aucurriculum uses Hydra to configure training experiments.
All configurations are stored in the conf/
directory and are defined as YAML files.
Main Configuration#
aucurriculum extends the autrainer main entry point for curriculum training and adds an additional curriculum scoring entry point for obtaining difficulty orderings from one or more scoring functions.
Curriculum Training#
The main entry point for curriculum training is defined in the conf/config.yaml
file.
defaults:
  - _aucurriculum_train_
  - _self_

results_dir: results
experiment_id: default
iterations: 5

hydra:
  sweeper:
    params:
      +seed: 1
      +batch_size: 32
      +learning_rate: 0.001
      dataset: ToyTabular-C
      model: ToyFFNN
      optimizer: Adam
      curriculum: None
      curriculum/sampling: None
      curriculum/scoring: None
      curriculum/pacing: None
      curriculum.pacing.initial_size: 1
      curriculum.pacing.final_iteration: 0
In addition to the parameters of the autrainer main configuration file,
the following hydra/sweeper/params
parameters are defined:
curriculum
: The curriculum configuration. The following options are available:
  None
  : No curriculum is used and training is performed on the full dataset. All other curriculum parameters are ignored.
  Curriculum
  : A curriculum is used to train the model. New samples are introduced in order of increasing difficulty.
  AntiCurriculum
  : An anti-curriculum is used to train the model. New samples are introduced in order of decreasing difficulty.
curriculum/sampling
: The strategy for introducing new samples into the training set. The following strategies are available:
  Unbalanced
  : Introduce new samples directly following the sample difficulty ordering provided by the scoring function.
  Balanced
  : Introduce new samples in a class-balanced manner, so that the class distribution of the training set remains balanced for as long as possible. Conceptually, the difficulty ordering is applied to each class separately.
  Original
  : Introduce new samples so that the training set keeps the class distribution of the original dataset. This works like the balanced strategy, but targets the original class distribution instead of a uniform one.
curriculum/scoring
: The ID of a scoring function that has already been computed using the curriculum scoring main configuration file.
curriculum/pacing
: The ID of a pacing function.
curriculum.pacing.initial_size
: The fraction of the dataset to be used in the first iteration. The remaining fraction is gradually introduced in subsequent iterations.
curriculum.pacing.final_iteration
: The fraction of iterations after which the full dataset is used for training.
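The curriculum, sampling, and pacing options above can be pictured with a minimal Python sketch. It is not aucurriculum's implementation: the function names, the round-robin interleaving used to approximate balanced sampling, and the linear pacing schedule are all assumptions chosen purely for illustration.

```python
from collections import defaultdict
from itertools import zip_longest


def difficulty_order(scores, curriculum="Curriculum"):
    """Return sample indices sorted by difficulty score.

    "Curriculum" orders easy-to-hard, "AntiCurriculum" hard-to-easy.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order[::-1] if curriculum == "AntiCurriculum" else order


def balanced_order(order, labels):
    """Interleave a difficulty ordering round-robin across classes.

    Assumption: this is one simple way to keep the introduced samples
    class-balanced for as long as possible.
    """
    per_class = defaultdict(list)
    for i in order:
        per_class[labels[i]].append(i)
    rounds = zip_longest(*(per_class[c] for c in sorted(per_class)))
    return [i for group in rounds for i in group if i is not None]


def linear_pacing(progress, initial_size, final_iteration):
    """Fraction of the dataset available at training progress in [0, 1].

    Assumption: a linear schedule from initial_size up to the full
    dataset, which is reached at progress == final_iteration.
    """
    if final_iteration <= 0 or progress >= final_iteration:
        return 1.0
    return initial_size + (1.0 - initial_size) * progress / final_iteration
```

Under these assumptions, the default values initial_size: 1 and final_iteration: 0 make linear_pacing return 1.0 for any progress, so the full dataset is available from the first iteration.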
For each parameter, one or more values can be specified to sweep over different configurations.
If any of curriculum
, curriculum/sampling
, curriculum/scoring
, or curriculum/pacing
are set to None, the run is filtered out and not executed.
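Sweeping over several values per parameter can be thought of as expanding a Cartesian product of the specified values. The sketch below illustrates that idea in plain Python; expand_sweep and the example values are hypothetical and do not reproduce Hydra's or aucurriculum's sweeper code. The filtering of runs described above happens inside the sweeper and is not modeled here.

```python
from itertools import product


def expand_sweep(params):
    """Yield one run configuration per combination of the swept values."""
    keys = list(params)
    for values in product(*(params[key] for key in keys)):
        yield dict(zip(keys, values))


# Hypothetical sweep: two seeds and two curriculum settings on one dataset.
params = {
    "+seed": [1, 2],
    "dataset": ["ToyTabular-C"],
    "curriculum": ["Curriculum", "AntiCurriculum"],
}
runs = list(expand_sweep(params))
print(len(runs))  # 2 seeds x 1 dataset x 2 curriculum settings
```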
Curriculum Scoring#
The main entry point for curriculum scoring is defined in the conf/curriculum.yaml
file.
defaults:
  - _aucurriculum_score_
  - _self_

results_dir: results
experiment_id: default

hydra:
  sweeper:
    params:
      curriculum/scoring: None

correlation:
  correlation_matrix: all
The following parameters are defined (alongside common attributes such as the results_dir
or experiment_id
):
curriculum/scoring
(under hydra/sweeper/params
): The ID of the scoring function to be calculated.
correlation
: A mapping from matrix names to lists of scoring function IDs, used to calculate the correlation between scoring functions. all
serves as a placeholder to include all scoring functions in the correlation matrix.
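As a rough illustration of the correlation mapping, the sketch below resolves the all placeholder and builds a pairwise correlation matrix between the difficulty orderings of several scoring functions. The correlation measure used by aucurriculum is not stated here, so a rank (Spearman-style) correlation without tie handling is assumed; the helper names and the scoring function IDs A, B, and C are hypothetical.

```python
from math import sqrt


def resolve_ids(spec, available):
    """Expand the "all" placeholder to every available scoring function ID."""
    return list(available) if spec == "all" else list(spec)


def ranks(values):
    """Rank position of each value (0 = smallest); ties are not handled."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order):
        result[i] = rank
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(scores, ids):
    """Pairwise rank correlation between the selected scoring functions."""
    return {
        a: {b: pearson(ranks(scores[a]), ranks(scores[b])) for b in ids}
        for a in ids
    }


# Hypothetical per-sample difficulty scores of three scoring functions.
scores = {"A": [0.1, 0.5, 0.9], "B": [0.2, 0.4, 0.8], "C": [0.9, 0.5, 0.1]}
matrix = correlation_matrix(scores, resolve_ids("all", scores))
```

Here A and B rank the samples identically while C reverses the order, so the matrix entries for those pairs sit at the two extremes of the correlation range.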
Some parameters of the main configuration file are outsourced to the _aucurriculum_train_.yaml defaults and _aucurriculum_score_.yaml defaults files in order to simplify the configurations.
For more information on configuring Hydra, see the Hydra documentation.
Tip
Analogous to autrainer configurations,
different files can be used as the main entry point for training and scoring experiments using
the -cn/--config-name
argument for the aucurriculum train CLI command:
aucurriculum train -cn some_other_config
Alternatively, use the config_name
parameter for the train()
CLI wrapper function:
aucurriculum.cli.train(config_name="some_other_config")
For more information on command line flags, see Hydra’s command line flags documentation.
Configuration Directories#
In addition to the autrainer configuration directories, the following configuration subdirectories are available:
conf/curriculum/
conf/curriculum/sampling/
conf/curriculum/scoring/
conf/curriculum/pacing/
Configurations#
Analogous to autrainer, aucurriculum provides default configurations for scoring functions, pacing functions, etc. that can be used out of the box without creating custom configurations.
For more information on creating, discovering, and using configurations, or special syntaxes and overriding defaults, refer to the autrainer configuration creation documentation.
aucurriculum Defaults#
Both the _aucurriculum_train_.yaml
and _aucurriculum_score_.yaml
files contain further default configurations
to simplify the main configuration files.
Tip
Any global default parameter can be overridden in the main configuration file by redefining it.
# _aucurriculum_train_.yaml
defaults:
  - _self_
  - dataset: ???
  - model: ???
  - optimizer: ???
  - scheduler: None
  - augmentation: None
  - curriculum: None
  - plotting: Default
  - override hydra/sweeper: aucurriculum_train_sweeper

training_type: epoch
eval_frequency: 1
save_frequency: ${eval_frequency}
inference_batch_size: ${batch_size}
device: cuda:0

progress_bar: true
continue_training: true
remove_continued_runs: true
save_train_outputs: true
save_dev_outputs: true
save_test_outputs: true

hydra:
  output_subdir: null
  mode: MULTIRUN
  sweep:
    dir: ${results_dir}/${experiment_id}/training/
    subdir: "\
      ${dataset.id}_\
      ${model.id}_\
      ${optimizer.id}_\
      ${learning_rate}_\
      ${batch_size}_\
      ${training_type}_\
      ${iterations}_\
      ${scheduler.id}_\
      ${augmentation.id}_\
      ${curriculum.short}_\
      ${curriculum.sampling.short}_\
      ${curriculum.scoring.id}_\
      ${curriculum.pacing.id}_\
      ${curriculum.pacing.initial_size}_\
      ${curriculum.pacing.final_iteration}_\
      ${seed}"
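The subdir template above joins the interpolated values with underscores to form one directory name per run. A small Python sketch of the resulting path, assuming hypothetical values for the curriculum-related fields (C, B, SomeScore, and Linear are placeholders, not IDs shipped with aucurriculum):

```python
from pathlib import PurePosixPath


def run_directory(results_dir, experiment_id, values):
    """Compose the sweep directory of a training run from its parameters."""
    name = "_".join(str(value) for value in values)
    return PurePosixPath(results_dir) / experiment_id / "training" / name


# Values in the order of the subdir template; curriculum fields are made up.
values = [
    "ToyTabular-C", "ToyFFNN", "Adam", 0.001, 32, "epoch", 5,
    "None", "None", "C", "B", "SomeScore", "Linear", 0.2, 0.8, 1,
]
path = run_directory("results", "default", values)
print(path)
```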
# _aucurriculum_score_.yaml
defaults:
  - _self_
  - curriculum/scoring: ???
  - plotting: Default
  - override hydra/sweeper: aucurriculum_score_sweeper

batch_size: 32
progress_bar: true
device: cuda:0

hydra:
  mode: MULTIRUN
  sweep:
    dir: ${results_dir}/${experiment_id}/curriculum/
    subdir: ${curriculum.scoring.type}
Hydra Plugins#
By default, aucurriculum uses the hydra-filter-sweeper
plugin to sweep over hyperparameter configurations which are defined in the aucurriculum defaults
and implemented in the core module.
This plugin allows a list of filters to be specified
in the configuration file to filter out unwanted hyperparameter combinations.
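The idea of filtering a sweep can be sketched in plain Python as a list of predicates, where a run is dropped as soon as any predicate matches. This is only a conceptual illustration: apply_filters and the example filter are hypothetical and do not reflect the hydra-filter-sweeper configuration syntax or API.

```python
def apply_filters(runs, filters):
    """Keep only the runs for which no filter returns True."""
    return [run for run in runs if not any(f(run) for f in filters)]


def large_lr_small_batch(run):
    """Hypothetical filter: large learning rate combined with a small batch."""
    return run["learning_rate"] > 0.01 and run["batch_size"] < 32


runs = [
    {"learning_rate": 0.001, "batch_size": 16},
    {"learning_rate": 0.1, "batch_size": 16},
    {"learning_rate": 0.1, "batch_size": 64},
]
kept = apply_filters(runs, [large_lr_small_batch])
```

Only the second run matches the filter, so the first and third runs remain in the sweep.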
To perform more complex hyperparameter sweeps, different sweeper or launcher plugins can be used. For more information on plugins, refer to the autrainer plugins documentation.
Note
Custom sweeper and launcher plugins are not yet supported for curriculum scoring.