The YAML specification changed after version 2.2 to make use of the Trainer class. You can use the converter tool to convert any legacy YAML to work with later versions:
cszoo config convert_legacy /path/to/v1.yaml -o /path/to/v2.yaml
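When several legacy files need converting, the command above can be assembled programmatically. The following is a minimal sketch, not part of the Cerebras tooling: the helper names `build_convert_command` and `convert_all` are hypothetical, and it assumes only the `cszoo config convert_legacy <input> -o <output>` form shown above. By default it only collects the commands (dry run) so they can be reviewed before anything is executed.

```python
import subprocess
from pathlib import Path


def build_convert_command(legacy_yaml, output_yaml):
    """Return the argv list for the converter command shown above.

    Hypothetical helper: it only mirrors the documented invocation
    `cszoo config convert_legacy <legacy> -o <output>`.
    """
    return ["cszoo", "config", "convert_legacy", str(legacy_yaml), "-o", str(output_yaml)]


def convert_all(legacy_dir, output_dir, dry_run=True):
    """Build one converter command per *.yaml file in legacy_dir.

    With dry_run=True (the default) nothing is executed; the commands
    are only returned. With dry_run=False each command is run and a
    non-zero exit status raises subprocess.CalledProcessError.
    """
    legacy_dir, output_dir = Path(legacy_dir), Path(output_dir)
    commands = []
    for legacy_yaml in sorted(legacy_dir.glob("*.yaml")):
        cmd = build_convert_command(legacy_yaml, output_dir / legacy_yaml.name)
        commands.append(cmd)
        if not dry_run:
            output_dir.mkdir(parents=True, exist_ok=True)
            subprocess.run(cmd, check=True)
    return commands


# Reproduces the single-file command from the text.
print(" ".join(build_convert_command("/path/to/v1.yaml", "/path/to/v2.yaml")))
# prints: cszoo config convert_legacy /path/to/v1.yaml -o /path/to/v2.yaml
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.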
Parameter Index
If you have a Legacy YAML specification and want to know how to specify a particular parameter in the Trainer YAML specification, look the parameter up in the index below. Each entry is listed under the parameter's dotted name (for example, optimizer.learning_rate), followed where available by its YAML form.
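The dotted names used as entry titles in this index correspond to nested keys in the Legacy YAML. As a rough illustration, the sketch below flattens a nested mapping into such names so they can be looked up here. The function `dotted_names` and the sample values are hypothetical, and the mapping is written as a plain Python dict rather than loaded from a file, to keep the example dependency-free.

```python
def dotted_names(config, prefix=""):
    """Yield a dotted name for every leaf of a nested mapping."""
    for key, value in config.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict) and value:
            # Descend into nested sections such as `runconfig`.
            yield from dotted_names(value, name)
        else:
            yield name


# Hypothetical Legacy configuration; the values are placeholders only.
legacy_config = {
    "train_input": {"micro_batch_size": 4},
    "optimizer": {"learning_rate": 0.001, "max_gradient_norm": 1.0},
    "runconfig": {"max_steps": 1000, "checkpoint_steps": 100},
}

for name in sorted(dotted_names(legacy_config)):
    print(name)
# prints:
# optimizer.learning_rate
# optimizer.max_gradient_norm
# runconfig.checkpoint_steps
# runconfig.max_steps
# train_input.micro_batch_size
```

Parameters whose value is itself a mapping (for example runconfig.debug_args) flatten to longer names than the index entry; in that case look up the longest prefix that appears in the index.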
eval_input.micro_batch_size
eval_input:
  micro_batch_size: ...
model
model.compression
model.fp16_type
model.lora_params
model.mixed_precision
model:
  mixed_precision: ...
model.selective_grad
model:
  selective_grad: ...
optimizer
optimizer.grad_accum_steps
optimizer:
  grad_accum_steps: ...
optimizer.initial_loss_scale
optimizer:
  initial_loss_scale: ...
optimizer.learning_rate
optimizer:
  learning_rate:
    ...
optimizer.log_summaries
optimizer:
  log_summaries: ...
optimizer.loss_scaling_factor
optimizer:
  loss_scaling_factor: ...
optimizer.max_gradient_norm
optimizer:
  max_gradient_norm: ...
optimizer.max_gradient_value
optimizer:
  max_gradient_value: ...
optimizer.max_loss_scale
optimizer:
  max_loss_scale: ...
optimizer.min_loss_scale
optimizer:
  min_loss_scale: ...
optimizer.steps_per_increase
optimizer:
  steps_per_increase: ...
runconfig.act_memory_gi
runconfig:
  act_memory_gi: ...
runconfig.autoload_last_checkpoint
runconfig:
  autoload_last_checkpoint: ...
runconfig.check_loss_values
runconfig:
  check_loss_values: ...
runconfig.checkpoint_path
runconfig:
  checkpoint_path: ...
runconfig.checkpoint_steps
runconfig:
  checkpoint_steps: ...
runconfig.cmd_memory_gi
runconfig:
  cmd_memory_gi: ...
runconfig.compile_crd_memory_gi
runconfig:
  compile_crd_memory_gi: ...
runconfig.compile_dir
runconfig:
  compile_dir: ...
runconfig.compile_only
runconfig:
  compile_only: ...
runconfig.credentials_path
runconfig:
  credentials_path: ...
runconfig.debug_args
runconfig:
  debug_args:
    ...
runconfig.debug_args_path
runconfig:
  debug_args_path: ...
runconfig.disable_strict_checkpoint_loading
runconfig:
  disable_strict_checkpoint_loading: ...
runconfig.disable_version_check
runconfig:
  disable_version_check: ...
runconfig.dist_backend
runconfig:
  dist_backend: ...
runconfig.drop_data
runconfig:
  drop_data: ...
runconfig.dump_activations
runconfig:
  dump_activations: ...
runconfig.enable_act_frequency
runconfig:
  enable_act_frequency: ...
runconfig.enable_distributed
runconfig:
  enable_distributed: ...
runconfig.eval_frequency
runconfig:
  eval_frequency: ...
runconfig.eval_steps
runconfig:
  eval_steps: ...
runconfig.execute_crd_memory_gi
runconfig:
  execute_crd_memory_gi: ...
runconfig.experimental.listeners
runconfig:
  experimental:
    listeners:
      ...
runconfig.init_method
runconfig:
  init_method: ...
runconfig.job_labels
runconfig:
  job_labels:
    ...
runconfig.job_priority
runconfig:
  job_priority: ...
runconfig.job_time_sec
runconfig:
  job_time_sec: ...
runconfig.lazy_initialization
runconfig:
  lazy_initialization: ...
runconfig.load_checkpoint_states
runconfig:
  load_checkpoint_states: ...
runconfig.log_initialization
runconfig:
  log_initialization: ...
runconfig.log_input_summaries
runconfig:
  log_input_summaries: ...
runconfig.log_steps
runconfig:
  log_steps: ...
runconfig.logging
runconfig.main_process_id
runconfig:
  main_process_id: ...
runconfig.max_checkpoints
runconfig:
  max_checkpoints: ...
runconfig.max_steps
runconfig:
  max_steps: ...
runconfig.mgmt_address
runconfig:
  mgmt_address: ...
runconfig.mgmt_namespace
runconfig:
  mgmt_namespace: ...
runconfig.model_dir
runconfig:
  model_dir: ...
runconfig.mount_dirs
runconfig:
  mount_dirs:
    ...
runconfig.num_act_servers
runconfig:
  num_act_servers: ...
runconfig.num_csx
runconfig.num_epochs
runconfig:
  num_epochs: ...
runconfig.num_steps
runconfig:
  num_steps: ...
runconfig.num_wgt_servers
runconfig:
  num_wgt_servers: ...
runconfig.num_workers_per_csx
runconfig:
  num_workers_per_csx: ...
runconfig.op_profiler_config
runconfig:
  op_profiler_config:
    ...
runconfig.precision_opt_level
runconfig:
  precision_opt_level: ...
runconfig.python_paths
runconfig:
  python_paths:
    ...
runconfig.retrace_every_iteration
runconfig:
  retrace_every_iteration: ...
runconfig.save_initial_checkpoint
runconfig:
  save_initial_checkpoint: ...
runconfig.seed
runconfig.steps_per_epoch
runconfig:
  steps_per_epoch: ...
runconfig.sync_batchnorm
runconfig:
  sync_batchnorm: ...
runconfig.target_device
runconfig:
  target_device: ...
runconfig.transfer_processes
runconfig:
  transfer_processes: ...
runconfig.validate_only
runconfig:
  validate_only: ...
runconfig.wgt_memory_gi
runconfig:
  wgt_memory_gi: ...
runconfig.wrk_memory_gi
runconfig:
  wrk_memory_gi: ...
runconfig.wsc_log_level
runconfig:
  wsc_log_level:
    ...
sparsity
sparsity.add_summaries
sparsity:
  add_summaries: ...
train_input.micro_batch_size
train_input:
  micro_batch_size: ...
wandb.group
wandb.job_type
wandb.project
wandb.resume
wandb.run_id
wandb.run_name