Trainer in the model directory.
model_dir argument to the Trainer’s constructor.
TensorBoardLogger. So, you can see the event files that were written by the TensorBoard writer.
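To locate the TensorBoard event files from a script, a minimal sketch follows. It assumes only that event files carry "tfevents" in their file names, which is the standard TensorBoard naming. The helper name find_event_files and the example path are hypothetical, and the search is recursive so that no particular subdirectory layout under the model directory is assumed.

```python
from pathlib import Path


def find_event_files(model_dir):
    """Return all TensorBoard event files under model_dir, sorted by path.

    Hypothetical helper for illustration. It searches recursively, so it
    does not depend on which subdirectory the logger wrote to.
    """
    root = Path(model_dir)
    if not root.is_dir():
        # Nothing to search if the directory does not exist yet.
        return []
    return sorted(p for p in root.rglob("*tfevents*") if p.is_file())


if __name__ == "__main__":
    # "./model_dir" is a placeholder; use the value passed as model_dir.
    for path in find_event_files("./model_dir"):
        print(path)
```

The directory that contains the returned files is what you would typically pass to TensorBoard's --logdir option.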
The cerebras_logs directory stores various logs and artifacts from compilation and execution. These logs and artifacts are also divided up by datetime (the same datetime as the above-mentioned subdirectory), so you know which ones belong to which run.
Finally, you can see that checkpoints taken during the run are saved in the model directory. These are stored in the base model directory so that future runs with checkpoint autoloading enabled can easily pick them up (see Checkpointing for more details).
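As a rough illustration of why keeping checkpoints in the base model directory makes them easy to pick up, the sketch below selects the checkpoint with the highest step number from that directory. The file-name pattern checkpoint_<step>.mdl and the helper latest_checkpoint are assumptions made for this example. This is not the Trainer's own autoloading logic.

```python
import re
from pathlib import Path

# Assumed naming scheme for this sketch: checkpoint_<step>.mdl
CHECKPOINT_PATTERN = re.compile(r"^checkpoint_(\d+)\.mdl$")


def latest_checkpoint(model_dir):
    """Return the checkpoint with the highest step in model_dir, or None.

    Hypothetical helper. Only the top level of model_dir is scanned,
    mirroring the idea that checkpoints live in the base model directory.
    """
    root = Path(model_dir)
    if not root.is_dir():
        return None
    best_step, best_path = -1, None
    for path in root.iterdir():
        match = CHECKPOINT_PATTERN.match(path.name)
        if match and path.is_file():
            step = int(match.group(1))
            # Compare steps numerically so that 1000 ranks above 200.
            if step > best_step:
                best_step, best_path = step, path
    return best_path


if __name__ == "__main__":
    # "./model_dir" is a placeholder; use the value passed as model_dir.
    print(latest_checkpoint("./model_dir"))
```

Comparing the parsed step as an integer, rather than sorting file names as strings, avoids ranking checkpoint_200 above checkpoint_1000.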
To learn more about how you can use the Trainer in some core workflows, you can check out:
To learn more about how you can extend the capabilities of the Trainer class, you can check out: