Neocortex - CS IP Addresses
We recommend using the system variable ${CS_IP_ADDR} instead of the actual CS machine
IP address whenever you need that value, for example when specifying the --cs_ip
flag value when running jobs.
Good: --cs_ip ${CS_IP_ADDR}
Not recommended: --cs_ip 1.2.3.4
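If you launch jobs from a Python wrapper rather than typing the command by hand, the same recommendation can be followed by reading the variable from the environment. The following is a minimal sketch, not part of the Neocortex tooling: the helper name cs_ip_args and the script name run.py are hypothetical, and it only assumes that CS_IP_ADDR is exported in the environment where the job is launched.

```python
import os


def cs_ip_args(env=None):
    """Return the --cs_ip flag and its value, read from CS_IP_ADDR.

    `env` defaults to the process environment; passing a dict makes the
    helper easy to exercise without touching real environment variables.
    """
    env = os.environ if env is None else env
    addr = env.get("CS_IP_ADDR", "").strip()
    if not addr:
        # Fail early instead of silently falling back to a hard-coded address.
        raise RuntimeError("CS_IP_ADDR is not set in this environment")
    return ["--cs_ip", addr]


# Hypothetical usage: "run.py" is a placeholder for your own job script.
# command = ["python", "run.py"] + cs_ip_args()
```

Raising an error when the variable is missing keeps a literal address from creeping back into scripts as a fallback.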
Please feel free to contact us at neocortex@psc.edu for any questions.
Current Goals/Action Items
- Go over the "Getting Ready to Use the Neocortex System" training files, available under the
"Resources" section below.
- Submit and successfully execute a training job on the CS-2 servers (following the instructions in the
"Getting Ready to Use the Neocortex System" training files).
- Use Bridges-2 to gather the metrics required, as described in the "Key Compilation Metrics
Needed" section below.
- Share the key compilation metrics with the Neocortex team and gain full access to the SDFlex
and the CS-2 servers.
- Reserve a spot here for a project checkpoint session, as needed.
Resources:
- The Neocortex System Slack Organization is now available. Please feel free to join if you want to
communicate with other project team members through Slack.
- The Neocortex Documentation. This is a living document. It has detailed instructions on how to use the
Neocortex system, including step-by-step examples for training MNIST on the CS-2s and
compiling code on either Neocortex or Bridges-2.
- Cerebras Systems Overview: Cerebras whitepaper on the CS-2 system and environment.
- Cerebras Model Zoo R_1.6.0: GitHub repository with the Cerebras modelzoo. It includes sample models in
TensorFlow ready to run on the CS-2.
- Research Plan: We ask you to generate the key metric values and a research plan for your
Neocortex application before we grant you complete access to your Neocortex system allocation.
For the key metrics, please follow the instructions below to port your own code and compile it.
This will verify that your team is ready to run your applications on the CS-2.
- Getting ready to use the Neocortex system training:
- Previous Trainings
  - Webinar - Neocortex: CS-2 Overview. 2022-03-29
- Cerebras Documentation version 1.6.0: Cerebras documentation portal with information about the CS-2,
conceptual guides, step-by-step tutorials, best practices, release notes and FAQs. Please note that some
instructions might not be directly applicable to the Neocortex system because of its different setup
with the SDFlex and Slurm configuration.
- Cerebras Discourse forum: Feel free to use the Cerebras Discourse platform to ask short or open-ended
questions that will be visible to the whole community, similar to Quora or Stack Overflow. Registration
is required, but Neocortex users should be able to access the platform without problems.
- Cerebras Developer Blog
As a reminder, the compute resources you have access to right now are:
- Bridges-2 GPU partition
- Bridges-2 GPU-AI partition
- Bridges-2 EM partition
- Bridges-2 RM partition
- Ocean file system
Once you share the "Key Compilation Metrics" for your code, your team will gain access to:
- Neocortex SDF partition
- Neocortex CS-2 partition
Please complete the "Getting ready to use the Neocortex system" training and
submit the "Key Compilation Metrics Needed" form to get full access to the Superdome Flex and CS-2
systems.
Note: You can get more details regarding your allocation by running the projects command.
Key Compilation Metrics Needed for ML/AI Projects
Before using the CS-2 and SDFlex servers, we ask that you collect representative key metrics from your
dataset and your model. These metrics help inform the Neocortex program and also signal that your code is
ready to be executed on the Cerebras systems.
To learn how to capture these metrics, please visit the section "Compilation key metrics to share"
in the Documentation.
We invite you to record the following metrics in the Neocortex Project Key Metrics file and
submit it through the Key-compilation-metrics Submit Box link:
- Ratio of utilized components (as indicated in the Documentation).
- Cycles per sample (as indicated in the Documentation).
- Data sample size (in MB or GB).
- Number of samples in the training dataset.
- Maximum expected batch size (in number of samples or another representative unit).
Please use the Bridges-2 supercomputer to obtain these values. Instructions for
measuring them are available in the Documentation.
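The three dataset-side metrics in the list above (data sample size, number of training samples, maximum batch size) can be worked out with simple arithmetic. The sketch below is illustrative only: the function and field names are hypothetical, it assumes decimal megabytes (1 MB = 1,000,000 bytes), and it does not cover the ratio of utilized components or cycles per sample, which come from the compilation step described in the Documentation. The example uses the MNIST training split (60,000 images of 28x28 pixels), assumes 4-byte float storage, and uses a placeholder batch size.

```python
from math import prod

# Decimal megabytes; this is an assumption, so state the unit you report.
BYTES_PER_MB = 1_000_000


def sample_size_mb(shape, bytes_per_element):
    """Size of one data sample in MB, from its shape and element width."""
    return prod(shape) * bytes_per_element / BYTES_PER_MB


def data_metrics(num_samples, shape, bytes_per_element, max_batch_size):
    """Collect the three dataset-side metrics in one record."""
    return {
        "data_sample_size_mb": sample_size_mb(shape, bytes_per_element),
        "num_training_samples": num_samples,
        "max_batch_size_samples": max_batch_size,
    }


# MNIST training split: 60,000 images of 28x28 pixels, assumed stored as
# 4-byte floats. The batch size of 256 is a placeholder, not a recommendation.
mnist = data_metrics(60_000, (28, 28), 4, max_batch_size=256)
```

Under these assumptions each MNIST sample is 28 * 28 * 4 = 3,136 bytes. Replace the shape, element width, and batch size with the values for your own dataset and model.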
The Neocortex team