Learn how Micro Batch Size (MBS) works under the hood, how the platform picks or overrides it, and how to optimize it manually.
micro_batch_size that:
micro_batch_size isn’t optimal or feasible.
micro_batch_size is valid and what happens if it isn’t. Values can be automatically overwritten or result in an error.
batch_size = 133
num_csx = 1
micro_batch_size = 34
Per-box batch size: Ceil(133/1) = 133
NumMicroBatches = {1, 2, 3, …, 133}
NumMicroBatches = Ceil(133/34) = Ceil(3.912) = 4
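The arithmetic in this example can be reproduced with a short Python sketch. The function names `per_box_batch_size` and `num_micro_batches` are hypothetical helpers introduced here for illustration; they are not part of the platform's API.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def per_box_batch_size(batch_size: int, num_csx: int) -> int:
    # Hypothetical helper: the batch handled by each CSX box, Ceil(batch_size / num_csx).
    return ceil_div(batch_size, num_csx)


def num_micro_batches(batch_size: int, num_csx: int, micro_batch_size: int) -> int:
    # Hypothetical helper: Ceil(per-box batch size / micro_batch_size).
    return ceil_div(per_box_batch_size(batch_size, num_csx), micro_batch_size)


if __name__ == "__main__":
    print(per_box_batch_size(133, 1))     # per-box batch size
    print(num_micro_batches(133, 1, 34))  # number of micro batches
```

With batch_size = 133, num_csx = 1, and micro_batch_size = 34, the two calls return 133 and 4, matching the values above.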
batch_size = 673
num_csx = 2
micro_batch_size = 168
Per-box batch size: Ceil(673/2) = 337
NumMicroBatches = {1, 2, 3, …, 337}
INFO: The micro batch size is changed to 169 to allow approximately even distribution across boxes and gradient accumulation iterations
NumMicroBatches = Ceil(337/169) = Ceil(1.994) = 2
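One way to see why 168 becomes 169 is to list the micro batch sizes that split the per-box batch approximately evenly, then take the one closest to the requested value. The sketch below does that. The selection rule is an assumption chosen because it reproduces the numbers in these examples; it is not confirmed to be the platform's actual algorithm, and `nearest_even_micro_batch_size` is a hypothetical name.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def nearest_even_micro_batch_size(batch_size: int, num_csx: int, requested: int) -> int:
    # Per-box batch size, Ceil(batch_size / num_csx).
    per_box = ceil_div(batch_size, num_csx)
    # ASSUMPTION: a micro batch size distributes evenly if it equals
    # Ceil(per_box / n) for some NumMicroBatches n in {1, ..., per_box}.
    candidates = {ceil_div(per_box, n) for n in range(1, per_box + 1)}
    # ASSUMPTION: pick the candidate closest to the requested value;
    # ties go to the smaller candidate (an arbitrary choice for this sketch).
    return min(candidates, key=lambda m: (abs(m - requested), m))


if __name__ == "__main__":
    print(nearest_even_micro_batch_size(673, 2, 168))
    print(nearest_even_micro_batch_size(133, 1, 34))
```

Under these assumptions, a request of 168 with batch_size = 673 and num_csx = 2 resolves to 169, while a request of 34 with batch_size = 133 and num_csx = 1 stays at 34 because Ceil(133/4) = 34 is already an even split.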
batch_size = 240
num_csx = 2
micro_batch_size = 121
Per-box batch size: Ceil(240/2) = 120
NumMicroBatches = {1, 2, 3, …, 120}
The requested micro_batch_size of 121 is larger than the per-box batch size of 120, so you will see the following error message:
ERROR: <unknown>:0:error: Minimum microbatch size 121 must be smaller or equal to the per-box batch size 120 where the number of CSX boxes is 2
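The condition behind this error can be checked before launching a run. The following is a minimal sketch, assuming the only constraint is the one stated in the error message (micro batch size must not exceed the per-box batch size). `validate_micro_batch_size` is a hypothetical helper, and the exception text is this sketch's own wording, not the platform's log output.

```python
def validate_micro_batch_size(batch_size: int, num_csx: int, micro_batch_size: int) -> int:
    """Return the per-box batch size, or raise if micro_batch_size exceeds it."""
    # Per-box batch size, Ceil(batch_size / num_csx), using integer arithmetic.
    per_box = -(-batch_size // num_csx)
    if micro_batch_size > per_box:
        # Wording is illustrative only; the platform emits its own message.
        raise ValueError(
            f"micro_batch_size {micro_batch_size} exceeds per-box batch size "
            f"{per_box} (num_csx={num_csx})"
        )
    return per_box


if __name__ == "__main__":
    try:
        validate_micro_batch_size(240, 2, 121)
    except ValueError as err:
        print(f"rejected: {err}")
```

With batch_size = 240 and num_csx = 2, any micro_batch_size up to 120 passes this check, and 121 is rejected.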