Decoder-only language models from Google DeepMind that interleave local sliding-window and global attention layers and use grouped-query attention (GQA) for high-quality performance at practical scale.
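For orientation, here is a minimal sketch of grouped-query attention, in which several query heads share each key/value head. The function name, shapes, and PyTorch usage are illustrative assumptions, not the ModelZoo implementation.

```python
import torch

def grouped_query_attention(q, k, v):
    """Toy GQA: q has more heads than k/v; each KV head serves a group of query heads.

    q:    [batch, n_query_heads, seq, head_dim]
    k, v: [batch, n_kv_heads,   seq, head_dim], with n_query_heads % n_kv_heads == 0
    """
    group_size = q.shape[1] // k.shape[1]
    # Repeat each KV head so it is shared by `group_size` query heads.
    k = k.repeat_interleave(group_size, dim=1)
    v = v.repeat_interleave(group_size, dim=1)
    scores = (q @ k.transpose(-2, -1)) / (q.shape[-1] ** 0.5)
    return torch.softmax(scores, dim=-1) @ v
```

Sharing key/value heads this way shrinks the KV cache without reducing the number of query heads.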
These models live in the `/gemma2` directory within ModelZoo. Here's how it's organized:
- `gpt2_model.py`
| Configuration | Description |
| --- | --- |
| `params_gemma2_9b_msl8k.yaml` | 9B parameter Gemma 2 model with 8K maximum sequence length (MSL). |
| `params_gemma2_9b_msl8k_swa_4k_sink_512.yaml` | Variant of the 9B model using 4K sliding window attention and 512 sink tokens. |
| `params_gemma2_27b_msl8k.yaml` | 27B parameter Gemma 2 model with 8K MSL. |
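As a rough illustration of the sliding-window-with-sink variant in the table, the sketch below builds a causal attention mask that keeps only a recent window visible while always exposing the first few "sink" tokens. The function name and defaults (mirroring the 4K window and 512 sink tokens above) are assumptions for illustration, not the ModelZoo config schema.

```python
import torch

def sliding_window_sink_mask(seq_len: int, window: int = 4096, sink_tokens: int = 512) -> torch.Tensor:
    """Boolean [seq_len, seq_len] mask: True where a query position may attend to a key position."""
    pos = torch.arange(seq_len)
    q, k = pos[:, None], pos[None, :]
    causal = k <= q                  # never attend to future positions
    in_window = (q - k) < window     # keys within the recent sliding window
    is_sink = k < sink_tokens        # the first `sink_tokens` keys stay visible to every query
    return causal & (in_window | is_sink)
```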