Overview

The ModelZoo CLI (cszoo) is a command-line interface that serves as a single entry point for ModelZoo tasks. It covers machine learning workflows from data preprocessing to model training and validation.

Commands

Below is a list of commands that can be used with the ModelZoo CLI tool.

Example Workflow: Pretraining a model using the ModelZoo CLI

This workflow walks through pretraining a model with the Cerebras ModelZoo CLI: creating a working directory, preprocessing the data, running pretraining, and converting the resulting checkpoint.

Prerequisite: Before proceeding with the steps below, complete the setup and installation guide.

1. Create model directory

Create a directory to store all the files for this pretraining workflow and copy the necessary configuration files.

mkdir pretraining_tutorial
cp modelzoo/src/cerebras/modelzoo/tutorials/pretraining/* pretraining_tutorial
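
Before moving on, it can help to confirm that the copy picked up the configuration files the later steps refer to. The following is a minimal sketch, not part of the CLI: `check_tutorial_files` is a hypothetical helper name, and the file list is taken from the commands used later in this workflow rather than from a documented manifest.

```shell
# Hypothetical helper: report which expected config files are missing.
# The file names are the ones referenced by later steps in this workflow.
check_tutorial_files() {
  dir="$1"
  missing=0
  for f in train_data_config.yaml valid_data_config.yaml model_config.yaml; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f"
      missing=1
    fi
  done
  # Non-zero exit status when at least one file is absent.
  return "$missing"
}

check_tutorial_files pretraining_tutorial || echo "copy the tutorial files before continuing"
```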

2. Preprocess the data

Preprocess the training and validation datasets using the provided configuration files.

cszoo data_preprocess run --config pretraining_tutorial/train_data_config.yaml
cszoo data_preprocess run --config pretraining_tutorial/valid_data_config.yaml
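
The two preprocessing commands can also be run in a loop that stops at the first failure. This is only a sketch: it assumes cszoo returns a non-zero exit status when preprocessing fails, and both `preprocess_all` and the `CSZOO` override variable are hypothetical names introduced here so the loop can be dry-run without the CLI.

```shell
# Run preprocessing for the train and valid splits; stop at the first failure.
# CSZOO is a hypothetical override (for example CSZOO=echo for a dry run);
# when unset, the real cszoo command is used.
preprocess_all() {
  dir="$1"
  for split in train valid; do
    "${CSZOO:-cszoo}" data_preprocess run --config "$dir/${split}_data_config.yaml" || return 1
  done
}

# Dry run: prints the arguments that would be passed to cszoo.
CSZOO=echo preprocess_all pretraining_tutorial
```

To run the real preprocessing, call `preprocess_all pretraining_tutorial` with `CSZOO` unset.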

3. Run model

Run the pretraining process using the provided configuration.

cszoo fit pretraining_tutorial/model_config.yaml

4. Convert checkpoint to HuggingFace

Convert the trained model checkpoint into a HuggingFace-compatible format.

cszoo checkpoint convert \
  --model llama \
  --src-fmt cs-auto \
  --tgt-fmt hf \
  --config pretraining_tutorial/model_config.yaml \
  --output-dir pretraining_tutorial/to_hf \
  pretraining_tutorial/model/checkpoint_0.mdl

Getting Help

For detailed information about any command, use the --help flag:

cszoo --help
cszoo <command> --help

CSZoo Assistant

Need help? Our CSZoo Assistant is an LLM agent you can access from the command line with the assistant subcommand.

Use it to:

  • Ask questions: cszoo assistant "what is the checkpoint converter?"

  • Perform actions: cszoo assistant "convert my checkpoint from huggingface to cerebras"

CSZoo Assistant will always ask your permission before running a command.

The assistant requires access to the Cerebras Inference API. Provide your API key with the following command:

export CEREBRAS_API_KEY=<your api key>
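
In scripts, a small guard can fail early with a clear message when the key has not been exported. This is a minimal sketch; `require_api_key` is a hypothetical helper, not part of cszoo, and it only checks that the variable is non-empty, not that the key is valid.

```shell
# Hypothetical guard: return non-zero if CEREBRAS_API_KEY is unset or empty.
require_api_key() {
  if [ -z "${CEREBRAS_API_KEY:-}" ]; then
    echo "CEREBRAS_API_KEY is not set; export it before running cszoo assistant" >&2
    return 1
  fi
}

# Usage: require_api_key && cszoo assistant "what is the checkpoint converter?"
```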

Don’t have an API key? Follow these instructions.

CSZoo Assistant is a beta feature and may make mistakes. Always double-check its reasoning, and be aware of the following limitations:

  • CSZoo Assistant can currently only access the help manuals found with cszoo ... -h.

  • There is currently no advanced context-length management. The assistant exits with an error if the conversation overflows the context length.