Follow this guide to fine-tune your first model on a Cerebras system.
Create Model Directory & Copy Configs
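As a rough illustration of this step, the sketch below is a Python equivalent of `mkdir -p` followed by `cp`: it creates the model directory and copies every `.yaml` file into it. The function name and the source directory are hypothetical placeholders, not paths from the Model Zoo. Only the `finetuning_tutorial` directory name comes from this guide.

```python
import shutil
from pathlib import Path


def copy_tutorial_configs(source_dir, model_dir="finetuning_tutorial"):
    """Create model_dir if needed and copy every .yaml file from source_dir into it.

    source_dir is a placeholder: point it at wherever the tutorial configs live.
    Returns the sorted list of copied file names.
    """
    source = Path(source_dir)
    target = Path(model_dir)
    target.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`

    copied = []
    for config in sorted(source.glob("*.yaml")):
        shutil.copy2(config, target / config.name)  # same effect as `cp`
        copied.append(config.name)
    return copied
```

Calling `copy_tutorial_configs("/path/to/tutorial/configs")` would return the names of the copied files, which makes it easy to confirm that nothing was skipped.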
We use cp here to copy configs specifically designed for this tutorial. For general use with Model Zoo models, we recommend using cszoo config pull. See the CLI command reference for details.
Inspect Configs
Model Config
Evaluation Config
This config evaluates the model on winogrande on a single CSX system.
To learn more about validating models with the Eleuther or BigCode Evaluation Harness, see our documentation.
Data Config
Preprocess Data
Your preprocessed data is stored in finetuning_tutorial/train_data/ and finetuning_tutorial/valid_data/ (see the output_dir parameter in your data configs).
KeyError: 'tags'
This issue occurs due to an outdated version of the huggingface_hub package. To resolve it, update the package to version 0.26.1 by running:
pip install --upgrade huggingface_hub==0.26.1
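Before reinstalling, you may want to confirm which version is actually installed. The sketch below is one way to do that with the standard library. It assumes that any release below 0.26.1 counts as outdated; this guide only states that 0.26.1 resolves the error. The helper names are illustrative.

```python
from importlib import metadata

REQUIRED = "0.26.1"  # the version this guide says resolves the error


def version_tuple(version):
    """Turn '0.26.1' into (0, 26, 1); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def is_older(installed, required=REQUIRED):
    """True if the installed version string sorts below the required one."""
    return version_tuple(installed) < version_tuple(required)


def installed_hub_version():
    """Return the installed huggingface_hub version string, or None if absent."""
    try:
        return metadata.version("huggingface_hub")
    except metadata.PackageNotFoundError:
        return None


current = installed_hub_version()
if current is None:
    print("huggingface_hub is not installed")
elif is_older(current):
    print(f"huggingface_hub {current} is older than {REQUIRED}")
else:
    print(f"huggingface_hub {current} is at or above {REQUIRED}")
```

The comparison is deliberately simple and ignores pre-release suffixes. For stricter handling, a dedicated version-parsing library would be a better fit.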
Inspect Preprocessed Data (optional)
The output includes a URL such as http://172.31.48.239:5000. Copy and paste this into your browser to launch TokenFlow, a tool for interactively visualizing whether loss and attention masks were applied correctly.
Download Checkpoint and Configs
Download the checkpoint and configs into the finetuning_tutorial/from_hf directory.
Convert Checkpoint and Configs
Your finetuning_tutorial/from_hf folder should now contain:
pytorch_model_to_cs-2.3.mdl: The converted model checkpoint.
config_to_cs-2.3.yaml: The converted configuration file.
Set ckpt_path in your finetuning_tutorial/model_config.yaml to the location of this converted checkpoint.
Train and Evaluate Model
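The config edits around this point (ckpt_path, train_dataloader.data_dir, and val_dataloader.data_dir) all amount to setting nested keys to absolute paths. The sketch below shows that idea on a plain dictionary. The dictionary layout is a heavily simplified assumption: in the real model config these keys may be nested more deeply, so adjust the dotted paths to match your file. The helper `set_by_path` is hypothetical and not part of the Model Zoo.

```python
import copy
import os


def set_by_path(config, dotted_key, value):
    """Return a copy of config with the nested key named by dotted_key set to value."""
    updated = copy.deepcopy(config)  # leave the caller's dict untouched
    node = updated
    *parents, leaf = dotted_key.split(".")
    for name in parents:
        node = node.setdefault(name, {})  # create missing levels on the way down
    node[leaf] = value
    return updated


# Assumed, simplified layout; a real model config has many more keys.
config = {
    "ckpt_path": None,
    "train_dataloader": {"data_dir": None},
    "val_dataloader": {"data_dir": None},
}

# Paths below are the directories and file names mentioned in this guide.
config = set_by_path(
    config,
    "ckpt_path",
    os.path.abspath("finetuning_tutorial/from_hf/pytorch_model_to_cs-2.3.mdl"),
)
config = set_by_path(
    config, "train_dataloader.data_dir", os.path.abspath("finetuning_tutorial/train_data")
)
config = set_by_path(
    config, "val_dataloader.data_dir", os.path.abspath("finetuning_tutorial/valid_data")
)
```

To apply this to the actual file, you would load finetuning_tutorial/model_config.yaml into a dictionary with a YAML library of your choice, update it, and write it back. Editing the file by hand works just as well.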
Set train_dataloader.data_dir and val_dataloader.data_dir in your model config to the absolute paths of your preprocessed data.
Training outputs are written to the finetuning_tutorial/model folder (see the model_dir parameter in your model config).
Run Evaluation Tasks
Port Model to Hugging Face
The converted Hugging Face checkpoint and configs are saved in finetuning_tutorial/to_hf.
Validate Checkpoint and Configs (optional)
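As a lightweight preliminary check, and not a substitute for the validation steps of this section, you can confirm that a directory contains the files you expect before loading anything from it. The helper below is a hypothetical sketch. The example uses the two file names this guide lists for finetuning_tutorial/from_hf; the files produced in finetuning_tutorial/to_hf are not listed here, so supply those names yourself.

```python
from pathlib import Path


def missing_artifacts(directory, expected_names):
    """Return the names from expected_names that are not files inside directory."""
    root = Path(directory)
    return [name for name in expected_names if not (root / name).is_file()]


# File names taken from the conversion step of this guide.
missing = missing_artifacts(
    "finetuning_tutorial/from_hf",
    ["pytorch_model_to_cs-2.3.mdl", "config_to_cs-2.3.yaml"],
)
print("missing files:", missing)
```

An empty list means every expected file is present. If the directory itself does not exist, every name is reported as missing.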