The DataExecutor exposes a profiler that you can use to measure performance during a run. The following activities are currently supported:

| Activity | Description |
| --- | --- |
| total_samples | Total number of samples processed so far |
| total_time | Elapsed time so far, in seconds |
| rate | Client-side smoothed samples/second over all samples added since the last query |
| global_rate | Non-smoothed samples/second since the executor context was entered. For a more detailed explanation, see Measure throughput of your model |
| samples_per_sec | Non-smoothed samples/second since the executor context was entered. This value is the same as global_rate |
| flops_utilization | Real FLOPs utilization for the run |

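
To make the difference between the smoothed rate and the non-smoothed global_rate concrete, here is a minimal sketch. ToyRateTracker is a hypothetical stand-in and not the Cerebras implementation. The smoothing method the DataExecutor uses is not described here, so the exponential moving average and the smoothing parameter below are assumptions chosen only for illustration.

```python
class ToyRateTracker:
    """Hypothetical stand-in for illustration; not the Cerebras rate tracker."""

    def __init__(self, smoothing=0.5):
        # Weight given to the newest interval (assumed smoothing scheme).
        self.smoothing = smoothing
        self.total_samples = 0
        self.total_time = 0.0
        self._smoothed = None

    def add(self, samples, seconds):
        """Record `samples` processed over an interval lasting `seconds`."""
        self.total_samples += samples
        self.total_time += seconds
        interval_rate = samples / seconds
        if self._smoothed is None:
            self._smoothed = interval_rate
        else:
            # Exponential moving average: blend the new interval with history.
            self._smoothed = (self.smoothing * interval_rate
                              + (1 - self.smoothing) * self._smoothed)

    @property
    def rate(self):
        """Smoothed samples/second."""
        return self._smoothed if self._smoothed is not None else 0.0

    @property
    def global_rate(self):
        """Non-smoothed samples/second: all samples divided by all elapsed time."""
        return self.total_samples / self.total_time if self.total_time > 0 else 0.0


tracker = ToyRateTracker(smoothing=0.5)
tracker.add(samples=100, seconds=1.0)  # a fast interval: 100 samples/s
tracker.add(samples=100, seconds=4.0)  # a slow interval: 25 samples/s

print(f"Global rate: {tracker.global_rate}")  # 200 samples / 5 s = 40.0
print(f"Smoothed rate: {tracker.rate}")       # 0.5 * 25 + 0.5 * 100 = 62.5
```

In this toy model the global rate is simply total_samples divided by total_time, while the smoothed rate depends on how the recent intervals are weighted.
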
You can read each activity by name through the DataExecutor profiler. For example:
import cerebras.pytorch as cstorch

executor = cstorch.utils.data.DataExecutor(...)
...
# Read the current value of each profiled activity from the rate tracker.
print(f"Total samples: {executor.profiler.rate_tracker.total_samples}")
print(f"Total time: {executor.profiler.rate_tracker.total_time}")
print(f"Rate: {executor.profiler.rate_tracker.rate}")
print(f"Global rate: {executor.profiler.rate_tracker.global_rate}")
print(f"Samples/sec: {executor.profiler.rate_tracker.samples_per_sec}")
print(f"Flops utilization: {executor.profiler.rate_tracker.flops_utilization}")
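
If you log these values repeatedly, it can be convenient to collect them in one place. The sketch below is one possible approach, not part of the Cerebras API: snapshot_metrics and format_metrics are hypothetical helpers, and the SimpleNamespace object is a fake stand-in with made-up values so the example runs without a Cerebras system.

```python
from types import SimpleNamespace

# Activity names taken from the table of supported activities.
METRIC_NAMES = (
    "total_samples",
    "total_time",
    "rate",
    "global_rate",
    "samples_per_sec",
    "flops_utilization",
)


def snapshot_metrics(rate_tracker, names=METRIC_NAMES):
    """Read each named attribute into a dict; missing attributes become None."""
    return {name: getattr(rate_tracker, name, None) for name in names}


def format_metrics(metrics):
    """Render the available metrics as a single log-friendly line."""
    parts = []
    for name, value in metrics.items():
        if value is None:
            continue  # skip metrics the tracker did not provide
        text = f"{value:.2f}" if isinstance(value, float) else str(value)
        parts.append(f"{name}={text}")
    return ", ".join(parts)


# Fake tracker with illustrative values; flops_utilization is left out on purpose.
fake_tracker = SimpleNamespace(
    total_samples=1000,
    total_time=10.0,
    rate=98.5,
    global_rate=100.0,
    samples_per_sec=100.0,
)

metrics = snapshot_metrics(fake_tracker)
line = format_metrics(metrics)
print(line)
```

In a real run you would pass executor.profiler.rate_tracker in place of fake_tracker.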