Falcon
Series of causal decoder-only transformer models by TII, available with 7B, 40B, and 180B parameters
Model Description
The Falcon series consists of causal decoder-only transformer models with 7B, 40B, and 180B parameters, developed by the Technology Innovation Institute (TII). The models follow an optimized GPT-style architecture with key changes for efficient scaling and throughput:
- Parallel attention and MLP layers within transformer blocks (see the sketch after this list).
- Rotary positional embeddings (RoPE) and multigroup attention (a generalization of multiquery attention) for faster inference and better tensor parallelism.
- GELU activations, no dropout, and z-loss regularization for stable training.
- Context length of 2,048 tokens and a 65K vocabulary.
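Below is a minimal PyTorch sketch of a Falcon-style block, with illustrative module names and hyperparameters; it is not the Model Zoo implementation. It shows two of the points above: multigroup attention, where many query heads share a smaller set of key/value heads (multiquery attention is the special case of a single key/value head), and the parallel block, where the attention and MLP branches read the same normalized input and are summed with the residual. RoPE and z-loss are omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultigroupAttention(nn.Module):
    """Illustrative multigroup (grouped-query) attention sketch."""

    def __init__(self, hidden_size: int, num_q_heads: int, num_kv_heads: int):
        super().__init__()
        assert hidden_size % num_q_heads == 0
        assert num_q_heads % num_kv_heads == 0
        self.head_dim = hidden_size // num_q_heads
        self.num_q_heads = num_q_heads
        self.num_kv_heads = num_kv_heads
        self.q_proj = nn.Linear(hidden_size, num_q_heads * self.head_dim)
        self.kv_proj = nn.Linear(hidden_size, 2 * num_kv_heads * self.head_dim)
        self.out_proj = nn.Linear(num_q_heads * self.head_dim, hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.num_q_heads, self.head_dim).transpose(1, 2)
        kv = self.kv_proj(x).view(b, t, 2, self.num_kv_heads, self.head_dim)
        k = kv[:, :, 0].transpose(1, 2)
        v = kv[:, :, 1].transpose(1, 2)
        # Each key/value head is shared by a group of query heads, which
        # shrinks the KV cache and speeds up inference.
        group_size = self.num_q_heads // self.num_kv_heads
        k = k.repeat_interleave(group_size, dim=1)
        v = v.repeat_interleave(group_size, dim=1)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.out_proj(out.transpose(1, 2).reshape(b, t, -1))


class ParallelBlock(nn.Module):
    """Illustrative Falcon-style block with parallel attention and MLP."""

    def __init__(self, hidden_size: int, num_q_heads: int, num_kv_heads: int):
        super().__init__()
        self.ln = nn.LayerNorm(hidden_size)
        self.attn = MultigroupAttention(hidden_size, num_q_heads, num_kv_heads)
        self.mlp = nn.Sequential(
            nn.Linear(hidden_size, 4 * hidden_size),
            nn.GELU(),  # GELU activation, no dropout
            nn.Linear(4 * hidden_size, hidden_size),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.ln(x)
        # Attention and MLP are applied in parallel to the same normalized
        # input and added to the residual, rather than sequentially.
        return x + self.attn(h) + self.mlp(h)


block = ParallelBlock(hidden_size=512, num_q_heads=8, num_kv_heads=2)
y = block(torch.randn(1, 16, 512))  # (batch, sequence, hidden)
```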
Code Structure
The code for this model is located in the /falcon directory within ModelZoo. Our implementation of Falcon is built on top of our GPT-2 backbone; for more details, see gpt2_model.py.
Available Configurations
| Configuration | Description |
|---|---|
| params_falcon_7b.yaml | Falcon model with 7B parameters. |
| params_falcon_40b.yaml | Falcon model with 40B parameters. |
| params_falcon_180b.yaml | Falcon model with 180B parameters. |
Workflow
For example workflows using language models from the Cerebras Model Zoo, see our tutorials on pretraining and fine-tuning.
For a complete list of Cerebras ModelZoo CLI commands, see the command reference.
References
- The Falcon LLM Team (2023). The Falcon Series of Open Language Models