CLI Reference#
autrainer provides a command line interface (CLI) to manage the entire training process, including configuration management, data preprocessing, model training, inference, and postprocessing.
In addition to the CLI, autrainer provides a CLI wrapper to manage configurations, data, training, inference, and postprocessing programmatically, offering the same functionality as the CLI.
autrainer#
usage: autrainer [-h] [-v] <command> ...
A Modular and Extensible Deep Learning Toolkit for Computer Audition Tasks.
Positional Arguments#
- <command>
Possible choices: create, list, show, fetch, preprocess, train, inference, postprocess, rm-failed, rm-states, group
Named Arguments#
- -v, --version
show program’s version number and exit
Configuration Management#
To manage configurations, autrainer create, autrainer list, and autrainer show allow for creating the project structure as well as discovering and saving the default configurations provided by autrainer.
Tip
Default configurations can be discovered through the CLI, the CLI wrapper, and the respective module documentation.
autrainer create#
usage: autrainer create [-h] [-e] [-a] [-f] [directories ...]
Create a new project with default configurations.
Positional Arguments#
- directories
- Configuration directories to create. One or more of:
augmentation
dataset
model
optimizer
plotting
preprocessing
scheduler
Named Arguments#
- -e, --empty
Create an empty project without any configuration directory.
- -a, --all
Create a project with all configuration directories.
- -f, --force
Force overwrite if the configuration directory already exists.
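For example, to create a new project containing only the model and dataset configuration directories, or a project with all configuration directories, the following autrainer create CLI commands can be used:
autrainer create model dataset
autrainer create --all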
autrainer list#
usage: autrainer list [-h] [-l] [-g] [-p P] directory
List local and global configurations.
Positional Arguments#
- directory
- The directory to list configurations from. Choose from:
augmentation
dataset
model
optimizer
plotting
preprocessing
scheduler
Named Arguments#
- -l, --local-only
List local configurations only.
- -g, --global-only
List global configurations only.
- -p, --pattern
Glob pattern to filter configurations.
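For example, to list all available model configurations, or only the global model configurations matching a glob pattern (the pattern shown here is purely illustrative), the following autrainer list CLI commands can be used:
autrainer list model
autrainer list model -g -p "Efficient*"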
autrainer show#
usage: autrainer show [-h] [-s] [-f] directory config
Show and save a global configuration.
Positional Arguments#
- directory
- The directory to list configurations from. Choose from:
augmentation
dataset
model
optimizer
plotting
preprocessing
scheduler
- config
The global configuration to show. Configurations can be discovered using the ‘autrainer list’ command.
Named Arguments#
- -s, --save
Save the global configuration to the local conf/ directory.
- -f, --force
Force overwrite of the local configuration if it exists, in combination with -s/--save.
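For example, to display a global model configuration and save it to the local conf/ directory (ModelConfig is a placeholder for a configuration name discovered via autrainer list), the following autrainer show CLI command can be used:
autrainer show model ModelConfig -s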
Preprocessing#
To avoid race conditions when using Launcher Plugins that may run multiple training jobs in parallel, autrainer fetch and autrainer preprocess allow for downloading and preprocessing of Datasets (and pretrained model states) before training.
Both commands are based on the main configuration file (e.g. conf/config.yaml), such that the specified models and datasets are fetched and preprocessed accordingly. If a model or dataset is already fetched or preprocessed, it will be skipped.
autrainer fetch#
usage: autrainer fetch [-h] [-l]
Fetch the datasets and models specified in a training configuration (Hydra). For more information on Hydra’s command line flags, see: https://hydra.cc/docs/advanced/hydra-command-line-flags/.
Named Arguments#
- -l, --cfg-launcher
Use the launcher specified in the configuration instead of the Hydra basic launcher. Defaults to False.
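For example, to fetch the models and datasets of the main configuration using the launcher specified in the configuration, the following autrainer fetch CLI command can be used:
autrainer fetch -l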
autrainer preprocess#
usage: autrainer preprocess [-h] [-l] [-n N] [-u F]
Launch a data preprocessing configuration (Hydra). For more information on Hydra’s command line flags, see: https://hydra.cc/docs/advanced/hydra-command-line-flags/.
Named Arguments#
- -l, --cfg-launcher
Use the launcher specified in the configuration instead of the Hydra basic launcher. Defaults to False.
- -n, --num-workers
Number of workers (threads) to use for preprocessing. Defaults to 1.
- -u, --update-frequency
Frequency of progress bar updates for each worker (thread). If 0, the progress bar will be disabled. Defaults to 1.
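For example, to preprocess the datasets of the main configuration with 4 worker threads and a progress bar update frequency of 10, the following autrainer preprocess CLI command can be used:
autrainer preprocess -n 4 -u 10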
Training#
Training is managed by autrainer train, which starts the training process
based on the main configuration file (e.g. conf/config.yaml).
autrainer train#
usage: autrainer train [-h]
Launch a training configuration (Hydra). For more information on Hydra’s command line flags, see: https://hydra.cc/docs/advanced/hydra-command-line-flags/.
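For example, to start training with the main configuration, or with an alternative configuration file via Hydra’s standard --config-name flag (MyConfig is a placeholder for a configuration in the conf/ directory), the following autrainer train CLI commands can be used:
autrainer train
autrainer train --config-name MyConfig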
Inference#
autrainer inference allows for the (sliding window) inference of audio data using a trained model.
Both local paths and Hugging Face Hub links are supported for the model. Hugging Face Hub links are automatically downloaded and cached in the torch cache directory.
The following syntax is supported for Hugging Face Hub links: hf:repo_id[@revision][:subdir]#local_dir. This syntax consists of the following components:
- hf: The Hugging Face Hub prefix indicating that the model is fetched from the Hugging Face Hub.
- repo_id: The repository ID of the model, consisting of the user name and the model card name separated by a slash (e.g. autrainer/example).
- revision (optional): The revision as a commit hash, branch name, or tag name (e.g. main). If not specified, the latest revision is used.
- subdir (optional): The subdirectory of the repository containing the model directory (e.g. AudioModel). If not specified, the model directory is automatically inferred. If multiple models are present in the repo_id, subdir must be specified, as the correct model cannot be automatically inferred.
- local_dir (optional): The local directory to which the model is downloaded (e.g. .hf_local). If not specified, the model is placed in the torch hub cache directory.
For example, to download the model from the repository autrainer/example at the revision main from the subdirectory AudioModel and save it to the local directory .hf_local, the following autrainer inference CLI command can be used:
autrainer inference hf:autrainer/example@main:AudioModel#.hf_local input/ output/ -d cuda:0
Tip
To access private repositories, the environment variable HF_HOME should point to the Hugging Face User Access Token. To use a custom endpoint (e.g. for a self-hosted hub), the environment variable HF_ENDPOINT should point to the desired endpoint URL.
To use a local model path, the following autrainer inference CLI command can be used:
autrainer inference /path/to/AudioModel input/ output/ -d cuda:0
autrainer inference#
usage: autrainer inference [-h] [-c C] [-d D] [-e E] [-r] [-emb] [-u F] [-p P] [-w W] [-s S] [-m M] [-sr SR] model input output
Perform inference on a trained model.
Positional Arguments#
- model
Local path to model directory or Hugging Face link of the format: hf:repo_id[@revision][:subdir]#local_dir. Should contain at least one state subdirectory, the model.yaml, file_handler.yaml, target_transform.yaml, and inference_transform.yaml files.
- input
Path to input directory. Should contain audio files of the specified extension.
- output
Path to output directory. Output includes a YAML file with predictions and a CSV file with model outputs.
Named Arguments#
- -c, --checkpoint
Checkpoint to use for evaluation. Defaults to ‘_best’ (on validation set).
- -d, --device
CUDA-enabled device to use for processing. Defaults to ‘cpu’.
- -e, --extension
Type of file to look for in the input directory. Defaults to ‘wav’.
- -r, --recursive
Recursively search for files in the input directory. Defaults to False.
- -emb, --embeddings
Extract embeddings from the model in addition to predictions. For each file, a .pt file with embeddings will be saved. Defaults to False.
- -u, --update-frequency
Frequency of progress bar updates. If 0, the progress bar will be disabled. Defaults to 1.
- -p, --preprocess-cfg
Preprocessing configuration to apply to input. Can be a path to a YAML file or the name of the preprocessing configuration in the local or autrainer ‘conf/preprocessing’ directory. If ‘default’, the default preprocessing configuration used during training will be applied. If ‘None’, no preprocessing will be applied. Defaults to ‘default’.
- -w, --window-length
Window length for sliding window inference in seconds. If None, the entire input will be processed at once. Defaults to None.
- -s, --stride-length
Stride length for sliding window inference in seconds. If None, the entire input will be processed at once. Defaults to None.
- -m, --min-length
Minimum length of audio file to process in seconds. Files shorter than the minimum length are padded with zeros. Sample rate has to be specified for padding. If None, no minimum length is enforced. Defaults to None.
- -sr, --sample-rate
Sample rate of audio files in Hz. Has to be specified for sliding window inference. Defaults to None.
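For example, to perform sliding window inference with a window length of 1 second and a stride of 0.5 seconds on 16 kHz audio files using a local model, the following autrainer inference CLI command can be used:
autrainer inference /path/to/AudioModel input/ output/ -w 1 -s 0.5 -sr 16000 -d cuda:0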
Postprocessing#
Postprocessing allows for the summarization, visualization, and aggregation of the training results using autrainer postprocess. Several cleanup utilities are provided by autrainer rm-failed and autrainer rm-states. Manual grouping of the training results can be done using autrainer group.
autrainer postprocess#
usage: autrainer postprocess [-h] [-m N] [-a A [A ...]] results_dir experiment_id
Postprocess grid search results.
Positional Arguments#
- results_dir
Path to grid search results directory.
- experiment_id
ID of experiment to postprocess.
Named Arguments#
- -m, --max-runs
Maximum number of best runs to plot.
- -a, --aggregate
- Configurations to aggregate. One or more of:
augmentation
batch_size
dataset
iterations
learning_rate
model
optimizer
scheduler
seed
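For example, to postprocess an experiment, plot the 3 best runs, and aggregate the results over seeds and learning rates (results/ and my_experiment are placeholders for the results directory and experiment ID), the following autrainer postprocess CLI command can be used:
autrainer postprocess results/ my_experiment -m 3 -a seed learning_rate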
autrainer rm-failed#
usage: autrainer rm-failed [-h] [-f] results_dir experiment_id
Delete failed runs from an experiment.
Positional Arguments#
- results_dir
Path to grid search results directory.
- experiment_id
ID of experiment to postprocess.
Named Arguments#
- -f, --force
Force deletion of failed runs without confirmation. Defaults to False.
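For example, to delete all failed runs of an experiment without a confirmation prompt (results/ and my_experiment are placeholders for the results directory and experiment ID), the following autrainer rm-failed CLI command can be used:
autrainer rm-failed results/ my_experiment -f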
autrainer rm-states#
usage: autrainer rm-states [-h] [-b] [-r R [R ...]] [-i I [I ...]] results_dir experiment_id
Delete states (.pt files) from an experiment.
Positional Arguments#
- results_dir
Path to grid search results directory.
- experiment_id
ID of experiment to postprocess.
Named Arguments#
- -b, --keep-best
Keep best states. Defaults to True.
- -r, --keep-runs
Runs to keep.
- -i, --keep-iterations
Iterations to keep.
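For example, to delete the saved model states of an experiment while keeping the best states (the default behavior; results/ and my_experiment are placeholders for the results directory and experiment ID), the following autrainer rm-states CLI command can be used:
autrainer rm-states results/ my_experiment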
autrainer group#
usage: autrainer group [-h]
Launch a manual grouping of multiple grid search results (Hydra). For more information on Hydra’s command line flags, see: https://hydra.cc/docs/advanced/hydra-command-line-flags/.