Connecting to the Cerebras Cloud¶

Generating an SSH Key¶

If you do not already have an SSH key, generate one using the following command:

ssh-keygen -t rsa -C "your_email@example.com"

Follow the prompts:

Accept the default location or specify a different one.
Optionally, set a passphrase (if set, do not forget it).

Example output

Generating public/private key pair.
Enter file in which to save the key ($HOME/.ssh/id_KEY): <<< The default location is fine.
Enter passphrase (empty for no passphrase): <<< If you set one, do not forget it.
Enter same passphrase again:
Your identification has been saved in $HOME/.ssh/id_KEY <<< Private key. Don’t share it.
Your public key has been saved in $HOME/.ssh/id_KEY.pub <<< Public key. Please share it.

The key fingerprint is:
SHA256:cGoO3HowD3pcViA74UMz1oGNrzkDssLJom8cdA27JlM your_email@example.com
The key's randomart image is:
+-----[ KEY  ]------+
|   B=o.        |
|   ++*o.       |
|   ==. o       |
|. o.E+o=       |
|o+.+*+* S      |
|+++oB%         |
|+..=+o+        |
|. o. .         |
| o.            |
+----[SHA256]-------+
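To avoid overwriting a key you already have, the generation step can be wrapped in a small guard. The following is a minimal sketch, not part of the official procedure: `ensure_ssh_key` is a hypothetical helper name, and the key path and email address you pass in are your own choices.

```shell
# Generate an RSA key pair only if the target private key file is missing.
# ensure_ssh_key is a hypothetical helper, not a Neocortex or Cerebras tool.
ensure_ssh_key() (
    key_file="$1"   # e.g. "$HOME/.ssh/id_KEY"
    comment="$2"    # e.g. "your_email@example.com"
    if [ -f "$key_file" ]; then
        echo "Key already exists: $key_file"
    else
        # ssh-keygen fails if the target directory does not exist.
        mkdir -p "$(dirname "$key_file")"
        # Prompts interactively for the passphrase, as in the example output.
        ssh-keygen -t rsa -C "$comment" -f "$key_file"
    fi
)
```

A call such as `ensure_ssh_key "$HOME/.ssh/id_KEY" "your_email@example.com"` leaves an existing key untouched and otherwise runs the same `ssh-keygen` command shown earlier.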

After generation, display the public key:

cat $HOME/.ssh/id_*.pub

Share only your public key with the Neocortex team while keeping your private key secure.

For more information, please visit the SSH Project webpage.

Connecting to the Cerebras system¶

Once your access is set up, you will receive:

Cerebras Cloud Credentials
VPN Configuration Instructions

Keep in mind that your Cerebras credentials are used to connect to the Cerebras VPN endpoint, while the SSH connection uses the private SSH key that you generated.

VPN Connection¶

Download the GlobalProtect VPN client from: https://access01.vpn.cerebras.net
Use your Cerebras-provided VPN credentials to log in.
Configure the VPN with Portal Address: access01.vpn.cerebras.net
Connect to the VPN.

SSH Connection¶

After establishing a VPN connection, access the system via SSH:

ssh -i $HOME/.ssh/id_KEY <cerebras_username>@cg3-us27.dfw1.cerebrascloud.com

Replace <cerebras_username> with your assigned username and id_KEY with the file name of your private key.
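If you connect often, an entry in your SSH client configuration saves retyping the key path and host name. The sketch below appends such an entry once. The helper name `add_ssh_host_entry` and the alias `cerebras` are assumptions made for illustration. `HostName`, `User`, and `IdentityFile` are standard OpenSSH client options.

```shell
# Append a Host entry for the Cerebras system to an SSH config file,
# unless an entry with the same alias is already there.
# add_ssh_host_entry and the alias "cerebras" are hypothetical names.
add_ssh_host_entry() (
    config_file="$1"   # e.g. "$HOME/.ssh/config"
    username="$2"      # your assigned Cerebras username
    key_file="$3"      # path to your private key
    alias_name="cerebras"
    host="cg3-us27.dfw1.cerebrascloud.com"
    if [ -f "$config_file" ] && grep -q "^Host $alias_name\$" "$config_file"; then
        echo "Entry for $alias_name already present in $config_file"
    else
        printf 'Host %s\n    HostName %s\n    User %s\n    IdentityFile %s\n' \
            "$alias_name" "$host" "$username" "$key_file" >> "$config_file"
    fi
)
```

After running `add_ssh_host_entry "$HOME/.ssh/config" <cerebras_username> "$HOME/.ssh/id_KEY"`, the connection command shortens to `ssh cerebras`.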

To verify VPN connectivity: ping cg3-us27.dfw1.cerebrascloud.com

Example output:

ping cg3-us27.dfw1.cerebrascloud.com

PING cg3-us27.dfw1.cerebrascloud.com (172.16.4.77): 56 data bytes
64 bytes from 172.16.4.77: icmp_seq=0 ttl=62 time=153.591 ms
64 bytes from 172.16.4.77: icmp_seq=1 ttl=62 time=157.990 ms
64 bytes from 172.16.4.77: icmp_seq=2 ttl=62 time=151.645 ms
64 bytes from 172.16.4.77: icmp_seq=3 ttl=62 time=151.368 ms

--- cg3-us27.dfw1.cerebrascloud.com ping statistics ---
4 packets transmitted, 4 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 151.368/153.649/157.990/2.649 ms
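The ping check can also be scripted so that a failed VPN connection is reported before an SSH attempt. This is a minimal sketch. `check_vpn` is a hypothetical helper name, and it assumes that your `ping` accepts `-c` for the packet count, as it does on Linux and macOS.

```shell
# Report whether the Cerebras host answers ping (it only does over the VPN).
# check_vpn is a hypothetical helper; it returns non-zero when unreachable.
check_vpn() (
    host="${1:-cg3-us27.dfw1.cerebrascloud.com}"
    if ping -c 4 "$host" >/dev/null 2>&1; then
        echo "reachable: $host"
    else
        echo "unreachable: $host (is the VPN connected?)"
        exit 1
    fi
)
```

For example, `check_vpn && ssh -i $HOME/.ssh/id_KEY <cerebras_username>@cg3-us27.dfw1.cerebrascloud.com` only attempts the SSH login when the host responds.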

Explanation of Terms in the Cerebras Compile Report¶

When submitting jobs, the Cerebras Compile Report provides insights into:

Model Compilation Time: Duration required for the model to be compiled.
Resource Allocation: CS-3 systems allocated for the job.
Memory Utilization: How efficiently the job uses memory.
Execution Status: Whether the job is QUEUED, RUNNING, FAILED, or COMPLETED.
Optimization Suggestions: Any recommendations to enhance efficiency.

Job Submission and Monitoring Procedures¶

Submitting a Job¶

Each project has a dedicated directory for training jobs, for example, /cra-XYZ/demo/trials. To submit a job:

Navigate to the directory of the desired model.
Run the experiment script: bash run.sh

Monitoring Jobs¶

To check job status: csctl get jobs -a

Running jobs will have a 'RUNNING' status, and queued jobs will have a 'QUEUED' status.
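For a quick overview, the job listing can be reduced to counts per status. The sketch below is illustrative only: `job_summary` is a hypothetical helper, and it assumes nothing about the `csctl` output format beyond the status words RUNNING and QUEUED appearing on each job's line.

```shell
# Count running and queued jobs by searching the csctl listing for the
# status words. job_summary is a hypothetical helper; the exact column
# layout of the csctl output is not assumed.
job_summary() (
    jobs_output="$(csctl get jobs -a)" || exit 1
    # grep -c exits non-zero when the count is 0, hence the "|| true".
    running=$(printf '%s\n' "$jobs_output" | grep -c 'RUNNING' || true)
    queued=$(printf '%s\n' "$jobs_output" | grep -c 'QUEUED' || true)
    echo "running=$running queued=$queued"
)
```

Calling `job_summary` prints a single line of the form `running=N queued=M`. A job whose name happens to contain one of the status words would be miscounted, so treat the result as a rough indicator.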

Monitoring with TensorBoard¶

To visualize training progress: tensorboard --logdir=. --bind_all --port 6006

Access TensorBoard from your browser:¶

Default link: http://cg3-us27.dfw1.cerebrascloud.com:6006
If inaccessible, try using the IP address: http://172.16.4.243:6006/
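If neither link is reachable from your browser, one common alternative is to forward the TensorBoard port over SSH. This is a sketch under assumptions, not a documented Neocortex procedure: `tensorboard_tunnel` is a hypothetical helper, and it presumes TensorBoard is already running on port 6006 on the Cerebras host. `-N` (no remote command) and `-L` (local port forwarding) are standard OpenSSH options.

```shell
# Forward a local port to the TensorBoard port on the Cerebras host.
# tensorboard_tunnel is a hypothetical helper; run it on your own machine
# while the VPN is connected, and stop it with Ctrl-C.
tensorboard_tunnel() (
    username="$1"        # your assigned Cerebras username
    key_file="$2"        # path to your private key
    port="${3:-6006}"    # TensorBoard port, 6006 as in the command above
    ssh -i "$key_file" -N -L "$port:localhost:$port" \
        "$username@cg3-us27.dfw1.cerebrascloud.com"
)
```

While `tensorboard_tunnel <cerebras_username> "$HOME/.ssh/id_KEY"` is running, TensorBoard should be available at http://localhost:6006 on your own machine.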

Killing a Job¶

To terminate a running job: csctl cancel job <jobID>

To find <jobID>, use csctl get jobs -a
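A thin wrapper can guard against calling the cancel command without a job ID. The following is only a sketch: `cancel_job` is a hypothetical helper name, and it does nothing beyond the `csctl cancel job` command shown above.

```shell
# Cancel a job, refusing to run when no job ID is given.
# cancel_job is a hypothetical helper around "csctl cancel job".
cancel_job() (
    job_id="${1:-}"
    if [ -z "$job_id" ]; then
        echo "usage: cancel_job <jobID>" >&2
        exit 2
    fi
    csctl cancel job "$job_id"
)
```

Invoke it as `cancel_job <jobID>`, using an ID taken from the job listing.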

Resource Utilization Best Practices¶

Use tmux to avoid job termination due to disconnection: tmux new -s my_session
Activate the Cerebras Virtual Environment before running jobs: source /cra-XYZ/venvs/2.4.0/bin/activate
Submit jobs well in advance when running a model for the first time, because compilation can take a long time.
Store your data under /cra-XYZ so that your jobs can access it.
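Some of these practices can be checked automatically before a job is submitted. The sketch below is a hypothetical `preflight` helper, not a Neocortex tool. It relies on two general conventions: tmux sets the `TMUX` variable inside a session, and a standard Python virtual environment sets `VIRTUAL_ENV` when activated. That the environment under /cra-XYZ/venvs behaves this way is an assumption.

```shell
# Check the working environment before submitting a job.
# preflight is a hypothetical helper. It warns when not inside tmux and
# fails when no virtual environment is active or run.sh is missing.
preflight() (
    model_dir="${1:-.}"
    status=0
    if [ -z "${TMUX:-}" ]; then
        echo "warning: not inside a tmux session" >&2
    fi
    if [ -z "${VIRTUAL_ENV:-}" ]; then
        echo "error: no virtual environment is active" >&2
        status=1
    fi
    if [ ! -f "$model_dir/run.sh" ]; then
        echo "error: $model_dir/run.sh not found" >&2
        status=1
    fi
    exit "$status"
)
```

From inside the model directory, `preflight . && bash run.sh` submits the job only when the checks pass.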

Neocortex Slack¶

See the Neocortex System Slack section for information on how to connect to our Slack space and use it to advance your project. There you will find:

Official updates from the Neocortex team.
Private project channels for collaboration.
Discussions for AI/ML projects.
Discussions for SDK/HPC projects.

Process for Requesting Support or Additional Resources¶

Please refer to the Getting Help section. You can reach us by email (neocortex@psc.edu), on Slack, or by scheduling an office hour. We will be happy to help.

Support Channels¶

Email: Reach out to the support team by emailing neocortex@psc.edu.
Slack: Post in the appropriate channel or DM a team member.
Office Hours: Schedule a session with the support team.

Requesting Additional Resources¶

To request additional compute resources, submit a formal request to neocortex@psc.edu including:

Project Name
Justification for Additional Resources
Expected Usage Period
Desired Configuration

Requests will be reviewed based on system availability.