This page covers multi-phase training with the `Trainer` class. Multi-Phase training allows you to combine multiple training phases with different batch sizes or max sequence lengths in a single config file or Python script. Each phase is defined by its own `Trainer`.
Let’s consider an example. In Pretraining with Upstream Validation, you learned how to construct the `Trainer` for the Llama-3 model. Now, let’s add a new training phase with a different batch size and a new max sequence length. To define each phase, you construct a separate `Trainer` instance. For example:
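In YAML, each phase might be expressed as its own entry in a list of `Trainer` configs, along these lines (a sketch only; field names and values are illustrative and may differ from the exact schema in your ModelZoo version):

```yaml
# Illustrative two-phase sketch; key names are assumptions, not a verbatim schema.
trainer:
- trainer:                      # phase 1: larger batch, shorter sequences
    init:
      model:
        # ... Llama-3 settings from Pretraining with Upstream Validation ...
    fit:
      train_dataloader:
        batch_size: 1024
        max_sequence_length: 2048
- trainer:                      # phase 2: smaller batch, longer sequences
    init:
      model:
        # ... same model settings ...
    fit:
      train_dataloader:
        batch_size: 512
        max_sequence_length: 8192
```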
In YAML, a new `Trainer` instance is constructed for each phase, which adds some overhead to your run due to the time spent on compile and weights transfer. If you are using the Python API, you can instead construct a single `Trainer` object and call `fit` with different `DataLoader` objects.

In the next example, we change the learning rate scheduler between phases from `CosineDecayLR`
to `ConstantLR`. To accomplish this, you need to create two instances of the `Trainer` and carefully manage checkpoint loading between phases to account for the changes in model parameters.
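The two-phase setup described above might be sketched as follows (the scheduler parameters and especially the checkpoint-loading keys are assumptions for illustration; consult Checkpointing for the actual schema):

```yaml
# Illustrative sketch; key names are assumptions, not a verbatim schema.
trainer:
- trainer:                      # phase 1: cosine decay schedule
    init:
      schedulers:
      - CosineDecayLR:
          initial_learning_rate: 1.0e-4
          end_learning_rate: 1.0e-5
          total_iters: 10000
      # ... model, optimizer, etc. ...
- trainer:                      # phase 2: constant schedule
    init:
      schedulers:
      - ConstantLR:
          learning_rate: 1.0e-5
      # ... model, optimizer, etc. ...
    fit:
      # Load only the model weights from the phase-1 checkpoint, since the
      # scheduler state no longer matches. (These key names are assumptions;
      # see Checkpointing for the real ones.)
      ckpt_path: ...
      load_checkpoint_states: model
```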
Each `Trainer` constructs and compiles a model. In the second phase, we changed the scheduler to `ConstantLR`, so to avoid any issues with checkpoint loading, we specify which parameters need to be loaded. For further reading, please follow Checkpointing.
When using a single `Trainer`, you only instantiate a single backend. For example:
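The call pattern can be illustrated with a minimal stand-in (the real `Trainer` comes from the Cerebras ModelZoo and takes the model, optimizer, backend, and so on; this hypothetical stub only mimics the shape of one construction followed by several `fit` calls):

```python
# Illustrative stub only: the real Trainer lives in the Cerebras ModelZoo.
# It shows one Trainer (and thus one backend) reused across phases.

class Trainer:
    def __init__(self, backend):
        # The backend is instantiated once, up front.
        self.backend = backend
        self.phases_run = []

    def fit(self, train_dataloader):
        # Each call reuses the same backend; only the dataloader changes.
        self.phases_run.append(train_dataloader["batch_size"])

# One Trainer, one backend -- two phases with different dataloader settings.
trainer = Trainer(backend="CSX")
trainer.fit(train_dataloader={"batch_size": 1024, "max_sequence_length": 2048})
trainer.fit(train_dataloader={"batch_size": 512, "max_sequence_length": 8192})
print(trainer.phases_run)  # [1024, 512]
```

Because the single `Trainer` (and its backend) survives across the `fit` calls, the per-phase construction overhead described above is paid only once.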