Error Receiving Activation
cerebras.appliance.errors.ApplianceUnknownError: Ran into error while receiving activation tensor <custom-call …>
Users may see this error when running their own models on the Cerebras System.
Observed Error
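The error message shown on the command line will be similar to the following:
cerebras.appliance.errors.ApplianceUnknownError: Ran into error while receiving activation tensor <custom-call …>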
Explanation
This error has many possible causes, but users can rule out some of them by checking the items below.
Placement of custom dataloader
When creating run scripts for custom models, it is currently necessary to place the dataloader in its own file, in the same directory as the main execution or model script. The error can occur when the dataloader is defined in the run script itself, because the input workers cannot pickle an input function that comes from the __main__ module.
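The pickling behavior behind this can be illustrated with plain Python. The following is a minimal sketch using only the standard library, not Cerebras code; the helper names pickled_reference and defined_in_main are hypothetical, and it assumes the input workers rely on standard pickle semantics. Pickle stores a function as a reference (module name plus qualified name) rather than as code, so a function recorded under __main__ can only be resolved by a process whose __main__ module is the same script.

```python
import json
import pickle


def pickled_reference(fn):
    """Return the (module, qualified name) pair that pickle records for fn."""
    return fn.__module__, fn.__qualname__


def defined_in_main(fn):
    """True if another process would have to look fn up in its own __main__."""
    return fn.__module__ == "__main__"


# Pickling a function stores only a reference to it, not its code.
payload = pickle.dumps(json.dumps)
module_name, qualname = pickled_reference(json.dumps)

print(module_name, qualname)             # json dumps
print(module_name.encode() in payload)   # True: the module name is in the payload
print(defined_in_main(json.dumps))       # False: workers can import json themselves
```

Running defined_in_main on an input function defined directly in the run script returns True, which is the situation the explanation above warns about.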
Workaround
Placement of custom dataloader
A user porting their own model should move the dataloader into its own file, separate from the main entry point of the training run script. Although not the only option, one example of an acceptable directory structure is shown below.
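As an illustration only, a layout along these lines keeps the dataloader out of the entry point. The directory and file names are hypothetical and not taken from the Cerebras documentation:

    user_model/
        data.py     (dataloader and input function)
        model.py    (model definition)
        run.py      (main entry point; imports from data.py)

A minimal sketch of what data.py might contain follows. The function name get_train_dataloader, the params keys, and the pure-Python batches are all assumptions made for this example; a real port would build its actual dataloader here.

```python
# data.py (hypothetical file name): lives next to run.py and model.py
import random


def get_train_dataloader(params):
    """Yield (features, labels) batches; a placeholder for a real dataloader."""
    batch_size = params.get("batch_size", 4)
    num_batches = params.get("num_batches", 2)
    num_features = params.get("num_features", 8)
    rng = random.Random(params.get("seed", 0))  # seeded for reproducible batches

    for _ in range(num_batches):
        features = [
            [rng.random() for _ in range(num_features)]
            for _ in range(batch_size)
        ]
        labels = [rng.randrange(2) for _ in range(batch_size)]
        yield features, labels
```

In run.py, the function would then be brought in with "from data import get_train_dataloader". Its module is recorded as data rather than __main__, so any process that can import data.py can resolve the reference.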
For more information on how to port user models to the Cerebras System, please see Overview of the Cerebras Model Zoo.