Model Zoo Usage Examples
Set Up the Cerebras Environment
To set up your Cerebras environment for the Model Zoo, follow the instructions on the ../Getting-started/Setup-installation page.
Running an Existing Model
To execute an existing model from the Cerebras Model Zoo, follow these steps:
1. Use the CLI to query the registry and display the supported models:
2. Find the model implementation. For example, to locate the path for GPT-2:
3. Determine which YAML file to use for the model’s parameters. The YAML configurations are located in the configs directory within the model’s folder. For example, GPT-2’s YAML files can be found at:
4. Execute the run.py script, supplying the appropriate YAML file as an argument.
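Step 3 can also be done programmatically. The following is a minimal sketch, assuming the model's folder contains a configs directory holding .yaml or .yml files as described above. The helper name list_yaml_configs is hypothetical and is not part of the Model Zoo.

```python
from pathlib import Path


def list_yaml_configs(model_dir):
    """Return the sorted YAML file names found in <model_dir>/configs.

    Hypothetical helper for illustration only; not part of the Model Zoo.
    """
    configs_dir = Path(model_dir) / "configs"
    if not configs_dir.is_dir():
        # No configs directory: nothing to choose from.
        return []
    # Accept both common YAML extensions and ignore everything else.
    return sorted(
        p.name
        for p in configs_dir.iterdir()
        if p.is_file() and p.suffix in {".yaml", ".yml"}
    )
```

Calling list_yaml_configs with the path to a model's folder returns the candidate parameter files, one of which can then be passed to run.py.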
Editing Configurations
To modify how a run is configured for a specific Model Zoo model, first clone the Model Zoo repository so that you have write access to the YAML files. All reference configuration files in the Model Zoo are located in the configs/ directory of each model’s folder.
Querying Additional Components
To see which losses and dataloaders are provided in the reference examples, you can query the registry using:
Using Config Classes with an Existing Model
Every model in the Model Zoo has a corresponding Config class. When a Config class is associated with a model, the configuration is validated automatically in the backend, so no additional action is required from the user. For more details on Config classes, see Model Zoo Config Classes.
Creating and Registering a New Dataloader
To create a new dataloader, follow the instructions here. This will help you ensure compatibility with the Cerebras system. Next, we’ll cover how to register your new dataloader.
Registering Your New Dataloader
For the system to recognize and utilize your new dataloader, you need to register it within the Model Zoo’s registry framework.
Follow these steps to register your dataloader:
1. Create an __init__.py file
Ensure that there is an __init__.py file in the directory where your dataloader’s code resides. This file makes the directory a Python package, allowing its contents to be imported elsewhere.
2. Add registration code to __init__.py
Within the __init__.py file, include the following code to register your dataloader:
This code snippet accomplishes the following:
- It imports the necessary modules, including the registry from the Model Zoo’s common utilities.
- It determines the path of the current directory (current_path) where the dataloader is located.
- It registers this path with the registry under the “dataloader_path” category, allowing the system to detect and utilize your new dataloader.
By following these steps, you ensure that your new dataloader is properly integrated into the Model Zoo’s infrastructure, ready to be used in your machine learning workflows.
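The pattern described in the bullets above (resolve the current directory, then record it under a named category) can be sketched in a self-contained way. The PathRegistry class below is a hypothetical stand-in written for this illustration; it is not the Model Zoo registry, and its method names are assumptions rather than the real API.

```python
import os


class PathRegistry:
    """Hypothetical stand-in used only for illustration.

    This is NOT the Model Zoo registry. It only mimics the idea of
    recording directories under a named category.
    """

    def __init__(self):
        self._paths = {}

    def register_path(self, category, path):
        # Store each directory at most once per category.
        paths = self._paths.setdefault(category, [])
        if path not in paths:
            paths.append(path)

    def paths_for(self, category):
        # Return a copy so callers cannot mutate internal state.
        return list(self._paths.get(category, []))


def register_package_dir(registry, category, module_file):
    """Register the directory containing module_file under category."""
    current_path = os.path.dirname(os.path.abspath(module_file))
    registry.register_path(category, current_path)
    return current_path
```

Inside a real __init__.py, module_file would be __file__ and the category would be "dataloader_path". The actual import path and registration call should be taken from the Model Zoo source rather than from this sketch.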
Adding and Registering a New Loss Function
To enhance the Model Zoo with your custom loss function, follow these steps to integrate and register it within the framework:
1. Implementation
Implement your new loss function and place the code in the <modelzoo path>/modelzoo/losses directory. This location is where the Model Zoo looks for loss function implementations.
2. Registering the loss function
To make your new loss function recognizable and usable within the Model Zoo, you need to register it:
- Create or modify an __init__.py file
Ensure that an __init__.py file exists in the directory where your loss function code is located. If it doesn’t exist, create one. If it does, you’ll be editing it.
- Add registration code
In the __init__.py file, include the following code to register your loss function:
This code accomplishes the following tasks:
- It imports the necessary modules, including the registry from the Model Zoo’s common utilities.
- It determines the path of the current directory (current_path) where the loss function is located.
- It registers this path with the registry under the “loss_path” category, enabling the system to detect and use your new loss function.
By following these instructions, your custom loss function becomes an integrated part of the Model Zoo, ready to be utilized in various training workflows.
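The "create or modify an __init__.py file" step can be automated with a few lines of standard-library Python. This is a minimal sketch; the helper name ensure_init_py is hypothetical and is not provided by the Model Zoo. It only creates an empty file when none exists, so any registration code still has to be added by hand.

```python
from pathlib import Path


def ensure_init_py(package_dir):
    """Create an empty __init__.py in package_dir if none exists.

    Returns True if the file was created and False if it was already
    present. An existing file is left untouched.
    """
    init_file = Path(package_dir) / "__init__.py"
    if init_file.exists():
        return False
    init_file.touch()
    return True
```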
Creating and Registering a New Model
To develop and integrate a new model with the Cerebras Model Zoo, see the documentation here.
Registering Your New Model
To ensure the Model Zoo recognizes your new model:
1. Create or update an __init__.py file
Ensure there’s an __init__.py file in the directory where your model.py is located. If the file isn’t there, create one.
2. Add registration code
In the __init__.py file, insert the following code to register your model:
Ensure the model’s name matches the name of the directory containing model.py. For example, if your model is named “tinybert”, its code should reside in a directory named “tinybert/”.
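The naming rule above is easy to check before registering a model. The following sketch assumes only what the rule states, namely that model.py sits directly inside a directory named after the model. The function name model_name_matches_dir is hypothetical.

```python
from pathlib import Path


def model_name_matches_dir(model_name, model_py_path):
    """Return True if model.py sits in a directory named model_name."""
    path = Path(model_py_path)
    # The file must be model.py and its parent directory must carry
    # exactly the model's name.
    return path.name == "model.py" and path.parent.name == model_name
```

For the example in the text, model_name_matches_dir("tinybert", "tinybert/model.py") is True, while the same model name paired with a model.py in a differently named directory is False.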
Evaluating Your Model
To effectively evaluate your model during and after training, follow these guides:
Evaluating During Training
For insights on assessing your model’s performance throughout the training process, visit the run-model/eval guide in the Cerebras Developer Documentation. This resource provides comprehensive information on the steps and settings required to evaluate your model during training on the Wafer-Scale Cluster (WSC).
Using EleutherAI’s Evaluation Harness
If you’re working with Large Language Models (LLMs) within the Model Zoo, you might want to use EleutherAI’s Evaluation Harness (EEH) for a more in-depth evaluation. Our guide on downstream validation using EEH offers detailed instructions on how to prepare your data and set up the EEH for evaluating LLMs. The harness provides a structured way to assess model performance across various benchmarks and tasks.
By following these guidelines, you can gain valuable insights into your model’s effectiveness, helping you make informed decisions for further model refinement and deployment.
Adding a New Dataset
To incorporate a new dataset for your model within the Cerebras Model Zoo, you’ll need to ensure it’s specified correctly in the model’s configuration and, for certain models like language models, converted into the appropriate format.
Specifying the Dataset in the YAML Configuration
1. Locate the YAML file
Identify the YAML configuration file associated with your model.
2. Update data_dir
Within the YAML file, under the train_input or eval_input section (depending on whether the dataset is for training or evaluation), specify the path to your dataset using the data_dir entry.
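As an illustration of the data_dir update, the sketch below operates on a configuration that has already been parsed into a nested Python dictionary (for example by a YAML library). The function name set_data_dir, the example paths, and the other keys shown in the usage are hypothetical. Only the train_input, eval_input, and data_dir names come from the text above.

```python
import copy


def set_data_dir(params, section, data_dir):
    """Return a copy of params with params[section]["data_dir"] set.

    section must be "train_input" or "eval_input". The input
    dictionary is not modified.
    """
    if section not in ("train_input", "eval_input"):
        raise ValueError(f"unexpected section: {section!r}")
    updated = copy.deepcopy(params)
    # Create the section if the configuration does not have it yet.
    updated.setdefault(section, {})["data_dir"] = data_dir
    return updated
```

Working on a copy keeps the reference configuration intact, which makes it easier to compare the modified settings against the original.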
Preparing Datasets for Language Models
Language models within the Cerebras ecosystem often require datasets in HDF5 format. If your dataset isn’t already in this format, follow these conversion steps:
1. PyTorch dataset to HDF5
If your dataset is in a PyTorch format, convert it to HDF5 by following the guidelines in the prepare-data/hdf5_preprocessing guide, which walks through the conversion step by step.
2. Raw data to HDF5
For raw data conversion to HDF5, particularly for GPT-style models, refer to the Chunk preprocessing guide. This resource outlines the necessary steps to preprocess and convert your data, ensuring it’s in the right format for model consumption.
By following these procedures, you can successfully add and utilize new datasets with your models in the Cerebras Model Zoo, enhancing the versatility and applicability of your machine learning projects.
Utilizing Checkpoints
The Cerebras Model Zoo includes a “Checkpoint and Config Converter” tool, designed to facilitate the conversion of model implementations between the Model Zoo and other frameworks or repositories. This tool is particularly useful for migrating models into the Model Zoo environment or exporting them for use in different settings. It also allows you to convert checkpoints created through running on previous Cerebras software releases to checkpoints compatible with new software releases.
Checkpoint conversion: Cerebras and HuggingFace, software version updates
To learn more about how to use this tool for converting checkpoints and model configurations, click here. This resource provides detailed instructions on using the converter, ensuring a smooth transition between different coding environments.
Saving and Loading Checkpoints
Proper checkpoint management is crucial for efficiently training and evaluating models. For guidelines on saving and loading checkpoints within the Cerebras environment, consult the checkpointing documentation. This section offers comprehensive insights into checkpoint handling, including saving states during training and loading them for resuming training or evaluation.
By using these resources, you can effectively manage model checkpoints in the Cerebras Model Zoo throughout model development and experimentation.