Learn how to use Grafana to monitor your Cerebras cluster.
replica_id of this replica_type in each chart.
replica_type represents a type of service process for a given job. It can be one of these types: weight, command, activation, broadcastreduce, chief, worker, and coordinator.
replica_id corresponds to the specific replica for a job and a replica type.
replica_type and replica_id
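As an illustration of how the two labels relate, the following minimal Python sketch models a replica as a (replica_type, replica_id) pair and groups replicas by type, roughly the way each chart shows the replica_id values of one replica_type. The `Replica` class and `group_by_type` helper are hypothetical names, not part of any Cerebras or Grafana API, and treating replica_id as a non-negative integer is an assumption.

```python
from collections import defaultdict
from dataclasses import dataclass

# Replica types listed in this guide.
REPLICA_TYPES = frozenset({
    "weight", "command", "activation", "broadcastreduce",
    "chief", "worker", "coordinator",
})


@dataclass(frozen=True)
class Replica:
    """One service process of a job (hypothetical model for illustration)."""
    replica_type: str
    replica_id: int  # assumption: replicas of a type are numbered from 0

    def __post_init__(self):
        if self.replica_type not in REPLICA_TYPES:
            raise ValueError(f"unknown replica_type: {self.replica_type!r}")
        if self.replica_id < 0:
            raise ValueError("replica_id must be non-negative")


def group_by_type(replicas):
    """Return {replica_type: sorted list of replica_id} for the given replicas."""
    grouped = defaultdict(list)
    for r in replicas:
        grouped[r.replica_type].append(r.replica_id)
    return {t: sorted(ids) for t, ids in grouped.items()}
```

For example, `group_by_type([Replica("weight", 1), Replica("weight", 0), Replica("chief", 0)])` yields one entry per replica type, each listing the replica IDs that a chart for that type would show.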
8443 to forward the traffic. You can choose any unoccupied port on your machine.
grafana.CLUSTER-NAME.DOMAIN.com
For example: grafana.mb-systemf102.cerebras.com
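If you are not sure whether port 8443 is already in use on your machine, a quick check like the following Python sketch can help. It relies only on the standard `socket` module. The helper names `is_port_free` and `pick_local_port` are illustrative, and checking on the loopback address 127.0.0.1 is an assumption about how the forwarded port is bound.

```python
import socket


def is_port_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if nothing is currently bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            return False
        return True


def pick_local_port(preferred: int = 8443) -> int:
    """Return the preferred port if it is free, else a port chosen by the OS."""
    if is_port_free(preferred):
        return preferred
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))  # port 0 asks the OS for any unoccupied port
        return s.getsockname()[1]
```

Calling `pick_local_port()` returns 8443 when it is available, and otherwise another unoccupied port that you can use for forwarding instead. The port could still be taken by another process between the check and the moment you start forwarding, so treat the result as a hint.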
/opt/cerebras/certs/grafana_tls.crt on the user node. This certificate is copied during the user node installation process. Download this certificate to your local machine and add it to your browser keychain.
grafana-tls.crt into System keychain certificates. Make sure to set Always Trust when using this certificate.
/etc/hosts file to point the IP of the user node to Grafana:
<USERNODE_IP> grafana.<cluster-name>.<domain>.com
https://grafana.<cluster-name>.<domain>.com
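The hostname, the /etc/hosts entry, and the URL all follow the same pattern, so they can be generated from the cluster name and domain. The sketch below is a small Python helper for that. The function names are hypothetical, the IP address in the usage note is a placeholder, and the optional `port` argument assumes you may need to append the local forwarding port to the URL, which this guide does not state explicitly.

```python
import ipaddress
from typing import Optional


def grafana_hostname(cluster_name: str, domain: str) -> str:
    """Build grafana.<cluster-name>.<domain>.com."""
    return f"grafana.{cluster_name}.{domain}.com"


def hosts_entry(usernode_ip: str, cluster_name: str, domain: str) -> str:
    """Build the /etc/hosts line mapping the user node IP to Grafana."""
    ipaddress.ip_address(usernode_ip)  # raises ValueError on a malformed IP
    return f"{usernode_ip} {grafana_hostname(cluster_name, domain)}"


def grafana_url(cluster_name: str, domain: str, port: Optional[int] = None) -> str:
    """Build the HTTPS URL, optionally with an explicit port (assumption)."""
    host = grafana_hostname(cluster_name, domain)
    return f"https://{host}" if port is None else f"https://{host}:{port}"
```

With the example cluster from this guide and a placeholder IP, `hosts_entry("10.0.0.5", "mb-systemf102", "cerebras")` produces the line `10.0.0.5 grafana.mb-systemf102.cerebras.com`, and `grafana_url("mb-systemf102", "cerebras")` produces `https://grafana.mb-systemf102.cerebras.com`.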
replica_id of this replica_type in each chart.
replica_type represents a type of service process for a given job. It can be one of these types: weight, command, activation, broadcastreduce, chief, worker, and coordinator.
The following figure shows that weight servers achieve a maximum network transmit speed of ~33 MB/s: