Overview

The ModelZoo CLI is a command-line interface that serves as a single entry point for ModelZoo tasks. It covers the machine learning workflow from data preprocessing through model training and validation.

Commands

The ModelZoo CLI tool groups its functionality into commands. The commands used on this page include data_preprocess, fit, checkpoint, and assistant.

Example Workflow: Pretraining a model using the ModelZoo CLI

This workflow guides you through the steps to pretrain a model using the Cerebras ModelZoo CLI. Follow these steps to set up your environment, preprocess data, and run the pretraining process.
Prerequisite: Before proceeding with the steps below, complete the setup and installation guide.
1. Create model directory

Create a directory to hold the files for this pretraining workflow, then copy the tutorial configuration files into it.
mkdir pretraining_tutorial
cp modelzoo/src/cerebras/modelzoo/tutorials/pretraining/* pretraining_tutorial
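Before moving on, it can help to confirm that the copy produced the configuration files that the later steps refer to. The following is a minimal sketch, not part of the ModelZoo CLI: check_tutorial_files is a hypothetical helper, and it assumes the three file names used elsewhere in this guide (train_data_config.yaml, valid_data_config.yaml, model_config.yaml).

```shell
# Hypothetical helper (not a cszoo feature): verify that the tutorial
# directory contains the three config files used in the later steps.
check_tutorial_files() {
    dir="${1:-pretraining_tutorial}"
    missing=0
    for name in train_data_config.yaml valid_data_config.yaml model_config.yaml; do
        if [ ! -f "$dir/$name" ]; then
            # Report each missing file on stderr and remember the failure.
            echo "missing: $dir/$name" >&2
            missing=1
        fi
    done
    # Exit status 0 when every file is present, 1 otherwise.
    return "$missing"
}
```

After running the mkdir and cp commands, calling check_tutorial_files pretraining_tutorial returns a non-zero status and names any file that was not copied. If you rerun this step, mkdir -p pretraining_tutorial avoids an error when the directory already exists.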
2. Preprocess the data

Preprocess the training and validation datasets using the provided configuration files.
cszoo data_preprocess run --config pretraining_tutorial/train_data_config.yaml
cszoo data_preprocess run --config pretraining_tutorial/valid_data_config.yaml
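The two preprocessing commands differ only in the config file, so they can be wrapped in a loop that stops at the first failure. This is a sketch under stated assumptions: preprocess_all is a hypothetical function, and CSZOO_BIN is an illustrative variable (not something cszoo reads) that lets you substitute another executable for a dry run.

```shell
# Hypothetical wrapper: preprocess the training and validation datasets,
# stopping at the first failure.
# CSZOO_BIN is an illustrative override, not a cszoo feature; it defaults to cszoo.
preprocess_all() {
    dir="${1:-pretraining_tutorial}"
    for split in train valid; do
        config="$dir/${split}_data_config.yaml"
        echo "Preprocessing with $config"
        "${CSZOO_BIN:-cszoo}" data_preprocess run --config "$config" || return 1
    done
}
```

Calling preprocess_all pretraining_tutorial runs the same two commands shown above. Setting CSZOO_BIN=echo first prints the commands instead of executing them.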
3. Run model

Run the pretraining process using the provided configuration.
cszoo fit pretraining_tutorial/model_config.yaml
4. Convert checkpoint to HuggingFace

Convert the trained model checkpoint into a HuggingFace-compatible format.
cszoo checkpoint convert \
  --model llama \
  --src-fmt cs-auto \
  --tgt-fmt hf \
  --config pretraining_tutorial/model_config.yaml \
  --output-dir pretraining_tutorial/to_hf \
  pretraining_tutorial/model/checkpoint_0.mdl
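The conversion depends on a checkpoint from the previous step, so a small guard can report a missing checkpoint before the converter is invoked. The sketch below reuses the flags shown above unchanged. The convert_to_hf function and the CSZOO_BIN variable are hypothetical names introduced for illustration, and the default checkpoint path is the one used in this guide, which may differ in your run.

```shell
# Hypothetical wrapper around the conversion command from this step.
# CSZOO_BIN is an illustrative override, not a cszoo feature; it defaults to cszoo.
convert_to_hf() {
    dir="${1:-pretraining_tutorial}"
    # Default checkpoint path taken from this guide; pass a second argument to override.
    ckpt="${2:-$dir/model/checkpoint_0.mdl}"
    if [ ! -e "$ckpt" ]; then
        echo "checkpoint not found: $ckpt" >&2
        return 1
    fi
    "${CSZOO_BIN:-cszoo}" checkpoint convert \
        --model llama \
        --src-fmt cs-auto \
        --tgt-fmt hf \
        --config "$dir/model_config.yaml" \
        --output-dir "$dir/to_hf" \
        "$ckpt"
}
```

For example, convert_to_hf pretraining_tutorial converts the default checkpoint, and a second argument selects a different checkpoint file.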

Getting Help

For detailed information about any command, use the --help flag:
cszoo --help
cszoo <command> --help

CSZoo Assistant

Need help? Our CSZoo Assistant is an LLM agent you can access from the command line with the assistant subcommand. Use it to:
  • Ask questions: cszoo assistant "what is the checkpoint converter?"
  • Perform actions: cszoo assistant "convert my checkpoint from huggingface to cerebras"
CSZoo Assistant will always ask your permission before running a command.
Access to the Cerebras Inference API is required. Provide your API key with the following command:
export CEREBRAS_API_KEY=<your api key>
Don’t have an API key? Follow these instructions.
CSZoo Assistant is a beta feature and it may make mistakes. Always double-check its reasoning and be aware of the following limitations:
  • CSZoo Assistant can currently only access the help manuals found with cszoo ... -h.
  • There is currently no advanced context-length management. The assistant fails with an error if a conversation exceeds the context length.
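Because the assistant needs CEREBRAS_API_KEY, a thin wrapper can check for the key before sending a prompt. This is a minimal sketch: ask_assistant and CSZOO_BIN are hypothetical names introduced here, and only the cszoo assistant "<prompt>" form shown above is assumed.

```shell
# Hypothetical wrapper: refuse to call the assistant when no API key is set.
# CSZOO_BIN is an illustrative override, not a cszoo feature; it defaults to cszoo.
ask_assistant() {
    if [ -z "${CEREBRAS_API_KEY:-}" ]; then
        echo "CEREBRAS_API_KEY is not set; export it before using the assistant" >&2
        return 1
    fi
    # Pass the whole prompt as a single quoted argument.
    "${CSZOO_BIN:-cszoo}" assistant "$1"
}
```

For example, ask_assistant "what is the checkpoint converter?" forwards the question only when the key is present.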