Hydra Configurations#

autrainer uses Hydra to configure training experiments. All configurations are stored in the conf/ directory and are defined as YAML files.

Main Configuration#

The main entry point of autrainer is the conf/config.yaml file by default. This file defines the configuration of the training experiments over which a grid search is performed.

conf/config.yaml#
defaults:
  - _autrainer_
  - _self_

results_dir: results
experiment_id: default
iterations: 5

hydra:
  sweeper:
    params:
      +seed: 1
      +batch_size: 32
      +learning_rate: 0.001
      dataset: ToyTabular-C
      model: ToyFFNN
      optimizer: Adam

The main configuration file defines the following parameters:

  • defaults: A list of default configurations (see Defaults List, Optional Defaults and Overrides, and autrainer Defaults).

  • results_dir: The directory where the results of the training experiments are stored.

  • experiment_id: The ID of the experiment.

  • iterations: The number of training iterations (epochs or steps, depending on the training_type) that each run is trained for.

  • hydra/sweeper/params: The hyperparameters to sweep over; the grid search launches one run for each combination.

    • <attr>: An attribute to sweep over from the Defaults List (with comma-separated values, e.g. a list of models).

    • +<attr>: An attribute to sweep over that is not in the Defaults List (with comma-separated values, e.g. a list of batch sizes); see the example after this list.

  • hydra/sweeper/filters: A list of filters to filter out unwanted hyperparameter combinations (see Sweeper Plugins).
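
For example, to sweep over multiple models and batch sizes, specify comma-separated values; the following sketch launches four runs (two models × two batch sizes):

conf/config.yaml#
...
hydra:
  sweeper:
    params:
      +seed: 1
      +batch_size: 16,32
      +learning_rate: 0.001
      dataset: ToyTabular-C
      model: ToyFFNN,MobileNetV3-Large-T
      optimizer: Adam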

Some parameters of the main configuration file are outsourced to the _autrainer_.yaml defaults file to simplify the configuration.

For more information on configuring Hydra, see the Hydra documentation.

Tip

To use a different configuration file as the entry point for training experiments, use the -cn/--config-name argument for the autrainer train CLI command:

autrainer train -cn some_other_config.yaml

Alternatively, use the config_name parameter for the train() CLI wrapper function (without the file extension):

autrainer.cli.train(config_name="some_other_config")

For more information on command line flags, see Hydra’s command line flags documentation.

Configuration Directories#

Configuration files imported through the Defaults List are stored in the conf/ directory. The configuration files are organized in subdirectories (e.g. conf/dataset/, conf/model/, conf/optimizer/, etc.). This directory structure tells Hydra where to look for the configuration files. The following configuration subdirectories are available:

  • conf/augmentation/

  • conf/dataset/

  • conf/model/

  • conf/optimizer/

  • conf/plotting/

  • conf/preprocessing/

  • conf/scheduler/
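
For example, a project with a custom dataset and model configuration alongside the main configuration file might be laid out as follows (a sketch; ExampleDataset and ExampleModel are hypothetical):

conf/
├── config.yaml
├── dataset/
│   └── ExampleDataset.yaml
└── model/
    └── ExampleModel.yaml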

Creating Configurations#

autrainer provides a number of default configurations for models, datasets, optimizers, etc. that can be used out of the box without creating custom configurations. To use a default configuration, e.g. a MobileNetV3-Large-T model, add it to the conf/config.yaml file:

conf/config.yaml#
...
hydra:
  sweeper:
    params:
      ...
      model: MobileNetV3-Large-T

Tip

To discover configurations that are available by default, use the autrainer list CLI command or the list() CLI wrapper function. For example, to discover all available MobileNet configurations, use the following command with a glob pattern:

autrainer list model --pattern="MobileNet*"

Alternatively, use the list() CLI wrapper function with the pattern parameter:

autrainer.cli.list(directory="model", pattern="MobileNet*")

To create a new configuration, create a new YAML file in the appropriate configuration subdirectory. Every configuration file should have:

  • A unique file name.

  • A unique id attribute that matches the file name.

  • A _target_ attribute that specifies the Python import path of the class to be instantiated.

  • Optional attributes that are passed to the class constructor as keyword arguments.

For example, to create a new model configuration, create a new YAML file in the conf/model/ directory:

conf/model/MobileNetV3-Large-T.yaml#
id: MobileNetV3-Large-T
_target_: autrainer.models.TorchvisionModel
torchvision_name: mobilenet_v3_large
transfer: true
transform:
  type: image

For more information on how to create custom models, see Models.

Tip

To modify an existing configuration provided by autrainer, create a new YAML file with the same name in the corresponding subdirectory. The new configuration will override the default one provided by autrainer.

To easily override a configuration, save a local copy of it using the autrainer show command:

autrainer show model "MobileNetV3-Large-T" --save

Alternatively, use the show() CLI wrapper function with the save parameter:

autrainer.cli.show(directory="model", id="MobileNetV3-Large-T", save=True)

Shorthand Syntax#

For configurations that are not primitive types (e.g. numbers or strings) and are not included in the Defaults List, shorthand syntax is used to reduce the number of configuration files required. Instead of a separate configuration file, a shorthand configuration is either a string or a dictionary.

  • If the configuration is a string, it is interpreted as the Python import path of the class to be instantiated.

  • If the configuration is a dictionary, the key is interpreted as the Python import path of the class to be instantiated, and its values are passed as keyword arguments to the class constructor.

For example, in a dataset configuration, the tracking_metric and the list of metrics are specified with shorthand syntax:

conf/dataset/ExampleDataset.yaml#
id: ExampleDataset
_target_: example_dataset.ExampleDataset
...

tracking_metric: autrainer.metrics.Accuracy
metrics:
  - autrainer.metrics.Accuracy
  - autrainer.metrics.UAR
  - autrainer.metrics.F1
  - custom_metric.CustomMetric:
      param1: 1
      param2: 2
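
Here, the custom_metric.CustomMetric entry is roughly equivalent to instantiating custom_metric.CustomMetric(param1=1, param2=2), while the plain strings are instantiated without any keyword arguments.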

Shorthand syntax is used to specify the pipeline of Augmentations, the Loggers, the Metrics of a dataset, as well as the Transforms of datasets and models.

Interpolation Syntax#

autrainer supports OmegaConf variable interpolation to reference attributes from anywhere in the configuration.

For example, the custom loggers tutorial uses the OmegaConf interpolation syntax to reference the results_dir from the main configuration file to set the output directory of the WandBLogger:

conf/config.yaml#
...
loggers:
  - wandb_logger.WandBLogger:
      output_dir: ${results_dir}/.wandb
...

For more information on variable interpolation, refer to the OmegaConf documentation.

Defaults List#

The defaults list instructs Hydra to import configurations from other YAML files to build the final configuration. For more information on the defaults list, see Hydra’s defaults list documentation.

For brevity, the defaults list is outsourced to the _autrainer_.yaml defaults file with the following imports:

_autrainer_.yaml#
defaults:
  - _self_
  - dataset: ??? # placeholder
  - model: ??? # placeholder
  - optimizer: ??? # placeholder
  - scheduler: None # optional default
  - augmentation: None # optional default
  - plotting: Default # optional default

Optional Defaults and Overrides#

Optional defaults like scheduler and augmentation are set to None by default and are not required to be defined in the main configuration. None is a special value that tells Hydra to ignore the default and, e.g., not use a scheduler or augmentation for training.

Optional defaults that are not overridden in the hydra/sweeper/params configuration can be overridden using the override keyword in the defaults list of the main configuration file. This includes the plotting configuration as well as Hydra plugins like sweepers and launchers.
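
For example, to replace the default plotting configuration, add an override to the defaults list of the main configuration file (a sketch, assuming a custom conf/plotting/CustomPlotting.yaml configuration exists):

conf/config.yaml#
defaults:
  - _autrainer_
  - _self_
  - override plotting: CustomPlotting
...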

For more information on overriding defaults, refer to the Hydra overriding documentation.

autrainer Defaults#

The _autrainer_.yaml file contains further default configurations to simplify the main configuration, including:

  • The defaults list and optional defaults list.

  • Global default parameters for training, such as the evaluation frequency, save frequency, inference batch size, CUDA-enabled device, etc.

  • Hydra configurations for always starting a Hydra multirun (grid search) and setting the output directory and experiment name according to the current configuration.

Tip

Any global default parameter can be overridden in the main configuration file by redefining it.
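
For example, to train on the CPU and disable the progress bar, redefine the corresponding parameters in the main configuration file:

conf/config.yaml#
...
device: cpu
progress_bar: false
...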

conf/_autrainer_.yaml#
defaults:
  - _self_
  - dataset: ???
  - model: ???
  - optimizer: ???
  - scheduler: None
  - augmentation: None
  - plotting: Default
  - override hydra/sweeper: autrainer_filter_sweeper

training_type: epoch
eval_frequency: 1
save_frequency: ${eval_frequency}
inference_batch_size: ${batch_size}
device: cuda:0

progress_bar: true
continue_training: true
remove_continued_runs: true
save_train_outputs: true
save_dev_outputs: true
save_test_outputs: true

hydra:
  output_subdir: null
  mode: MULTIRUN
  sweep:
    dir: ${results_dir}/${experiment_id}/training/
    subdir: "\
      ${dataset.id}_\
      ${model.id}_\
      ${optimizer.id}_\
      ${learning_rate}_\
      ${batch_size}_\
      ${training_type}_\
      ${iterations}_\
      ${scheduler.id}_\
      ${augmentation.id}_\
      ${seed}"

Hydra Plugins#

Any Hydra sweeper or launcher plugin can be used to customize the hyperparameter search or parallelize the training jobs.

Tip

Sweepers and launchers can be combined to perform more complex hyperparameter optimizations and parallelize training jobs.

Sweeper Plugins#

By default, autrainer uses the hydra-filter-sweeper plugin to sweep over hyperparameter configurations. The plugin allows specifying a list of filters in the configuration file to filter out unwanted hyperparameter combinations.
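
For example, the following sketch drops all combinations where the swept learning rate exceeds 0.01 while the batch size is 16 (the available filter types are described in the hydra-filter-sweeper documentation):

conf/config.yaml#
...
hydra:
  sweeper:
    filters:
      - type: expr
        expr: batch_size == 16 and learning_rate > 0.01
...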

Tip

To specify custom filters, refer to the filtering out configurations quickstart guide and the hydra-filter-sweeper documentation.

Note

If no filters are specified, the plugin does not filter out any configurations and resembles the behavior of the default Hydra basic sweeper plugin.

To perform more complex hyperparameter sweeps, different sweeper plugins can be used.

For example, the Hydra Optuna Sweeper plugin can be used to perform hyperparameter optimization using Optuna.

To install the Optuna Sweeper plugin, run the following command:

pip install hydra-optuna-sweeper

The following configuration uses the Optuna Sweeper plugin to perform 10 trials with different learning rates:

conf/config.yaml#
defaults:
  - _autrainer_
  - _self_
  - override hydra/sweeper: optuna # override the sweeper to optuna

results_dir: results
experiment_id: default
iterations: 5

hydra:
  sweeper:
    n_trials: 10 # total number of function evaluations
    n_jobs: 10 # number of parallel jobs
    direction: maximize # direction of optimization (depending on the tracking metric)
    params:
      +seed: 1
      +batch_size: 32
      +learning_rate: range(0.001, 0.1, step=0.001) # range of values
      dataset: ToyTabular-C
      model: ToyFFNN
      optimizer: Adam

Note

autrainer returns the best validation value of the dataset tracking metric as the objective for optimization. The direction attribute in the configuration file should be set to minimize or maximize according to the tracking metric of the dataset.

Launcher Plugins#

By default, autrainer uses the Hydra basic launcher to sequentially launch the jobs defined in the configuration file.

To parallelize the training jobs, different launcher plugins can be used.

For example, the Hydra Submitit Launcher plugin can be used to parallelize the training jobs using Submitit.

To install the Submitit Launcher plugin, run the following command:

pip install hydra-submitit-launcher

The following configuration uses the Submitit Launcher plugin to parallelize the training jobs:

conf/config.yaml#
defaults:
  - _autrainer_
  - _self_
  - override hydra/launcher: submitit_slurm # override the launcher to submitit_slurm

results_dir: results
experiment_id: default
iterations: 5

hydra:
  sweeper:
    params:
      +seed: 1
      +batch_size: 32
      +learning_rate: 0.001
      dataset: ToyTabular-C
      model: ToyFFNN
      optimizer: Adam
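
Launcher-specific parameters, such as the Slurm partition or the job time limit, can be set under hydra/launcher. A minimal sketch, assuming a cluster with a hypothetical gpu partition (refer to the Hydra Submitit Launcher documentation for the available parameters):

conf/config.yaml#
...
hydra:
  launcher:
    partition: gpu
    timeout_min: 60
...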