A series of decoder-only transformer LLMs from Meta.
The code for these models lives in the /llama directory within ModelZoo. Here's how it's organized:

- gpt2_model.py: the underlying model implementation (LLaMa reuses the GPT-2 model code path with architectural changes set through the configuration files below).
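Each configuration below is a YAML file that specifies the model and run settings. As a minimal sketch of how you might inspect one before launching a run (assuming PyYAML is available; the file name is one of the configs listed below, and the printed key names depend on your ModelZoo version):

```python
import yaml

# Load one of the configuration files listed below.
# The schema shown at runtime depends on your ModelZoo version;
# check the actual YAML in your checkout for the exact keys.
with open("params_llama3p1_8b_msl_8k.yaml") as f:
    params = yaml.safe_load(f)

# Walk the top-level sections to see how the run is specified.
for section, values in params.items():
    print(section, "->", sorted(values) if isinstance(values, dict) else values)
```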
LLaMa 3.1
| Configuration | Description |
| --- | --- |
| params_llama3p1_70b_msl_128k.yaml | A 70B parameter model with a maximum sequence length of 128K, configured as described in the LLaMa 3.1 blog. |
| params_llama3p1_70b_msl_8k.yaml | A 70B parameter model with a maximum sequence length of 8K, configured as described in the LLaMa 3.1 blog. |
| params_llama3p1_8b_msl_128k.yaml | An 8B parameter model with a maximum sequence length of 128K, configured as described in the LLaMa 3.1 blog. |
| params_llama3p1_8b_msl_32k_swa_8k_sink_512.yaml | An 8B parameter model with a maximum sequence length of 32K, sliding window attention (SWA) starting at 8K, and sink tokens set to 512. Configured as described in the LLaMa 3.1 blog; a sketch of the resulting attention mask follows this table. |
| params_llama3p1_8b_msl_8k.yaml | An 8B parameter model with a maximum sequence length of 8K, configured as described in the LLaMa 3.1 blog. |
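The SWA-with-sink-tokens variant combines a sliding attention window with a small set of always-visible "sink" tokens at the start of the sequence. The following is not the ModelZoo implementation, just a minimal PyTorch sketch of the boolean attention mask such a configuration implies, assuming an 8K window means each query attends to the previous 8K positions plus the first 512 sink positions:

```python
import torch

def swa_sink_mask(seq_len: int, window: int, num_sink: int) -> torch.Tensor:
    """True where query position i is allowed to attend to key position j."""
    i = torch.arange(seq_len).unsqueeze(1)  # query index, column vector
    j = torch.arange(seq_len).unsqueeze(0)  # key index, row vector
    causal = j <= i                  # no attending to future positions
    in_window = (i - j) < window     # within the sliding window
    is_sink = j < num_sink           # sink tokens are always visible
    return causal & (in_window | is_sink)

# Shapes matching the 32K / 8K-window / 512-sink configuration above would be
# swa_sink_mask(32768, 8192, 512); small values are used here so the
# tensor is easy to print and inspect.
mask = swa_sink_mask(seq_len=16, window=4, num_sink=2)
print(mask.int())
```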
LLaMa-2
| Configuration | Description |
| --- | --- |
| params_llama2_7b.yaml | A 7B parameter model configured as described in the LLaMa-2 paper. |
| params_llama2_13b.yaml | A 13B parameter model configured as described in the LLaMa-2 paper. |
| params_llama2_70b.yaml | A 70B parameter model configured as described in the LLaMa-2 paper. |
Code LLaMa
| Configuration | Description |
| --- | --- |
| params_code_llama_7b.yaml | A 7B parameter model configured as described in the Code LLaMa paper. |
| params_code_llama_70b.yaml | A 70B parameter model configured as described in the Code LLaMa paper. |
WizardLM
| Configuration | Description |
| --- | --- |
| params_wizardlm_13b.yaml | A 13B parameter model configured as described in the WizardLM paper. |