Running Jobs via Pegasus

Introduction

This page points researchers to resources for learning how to run jobs on Neocortex using the Pegasus Workflow Management System. Pegasus automates the management and execution of complex, multi-step computational workflows on high-performance computing (HPC) resources such as Neocortex.

By utilizing Pegasus, researchers can achieve multiple benefits:

  • Workflow Management Efficiency: Automate complex workflows, saving time and reducing errors.
  • Scalability: Manage workflows of varying sizes efficiently on Neocortex's parallel computing resources.
  • Reproducibility: Ensure the consistent and repeatable execution of workflows across different runs.
  • Data Provenance Tracking: Keep track of the origin and processing steps of data used in the workflow, enhancing accuracy and transparency.

Self-Guided Learning Resources

Before scheduling a specialized training session with the Pegasus team, we recommend familiarizing yourself with the following resources, starting with the ACCESS Pegasus Overview:

  1. ACCESS Pegasus Overview: This resource offers a high-level introduction to Pegasus, its key features, and its benefits for research workflows.
  2. ACCESS Pegasus Documentation: This in-depth documentation delves into the practical aspects of using Pegasus on ACCESS, including specific configuration details and code examples relevant to Neocortex.
  3. Pegasus User Guide: The official Pegasus User Guide provides comprehensive documentation on Pegasus's functionalities, architecture, installation, and usage. This guide is a valuable resource for researchers who want a deeper understanding of the system.

Next Steps

Once you have reviewed the self-guided learning resources and feel comfortable with the basic concepts of Pegasus, you can schedule a specialized training session (office hour) with the Pegasus team for more advanced topics or specific Neocortex configurations.

Launching Pegasus Jobs on Neocortex

  • Step 1: Connect to Open OnDemand (OOD), following the instructions outlined in the Neocortex Open OnDemand section.
  • Step 2: Launch Jupyter Notebook
    1. Once logged in, you will be directed to the OnDemand dashboard. From here, navigate to the Interactive Apps menu.
    2. Under Interactive Apps, select Jupyter Notebook.
    3. Configure your session by specifying the desired options, such as the number of CPU cores, memory, and time limit. Set the number of hours based on your needs (for example, 1 hour), and set Number of Nodes to 1.
    4. In the "Extra Slurm Args" field, specify the "pegasus" partition with the following argument:
      --partition=pegasus
      
    5. Click Launch to start the Jupyter Notebook session.
    6. Once the session starts, you are ready to work with Jupyter Notebook on the Neocortex system.
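
As a quick sanity check once the notebook is running, a sketch like the one below can be pasted into a cell. It rests on two assumptions that this page does not confirm: that the notebook runs inside the Slurm job started by Open OnDemand, so Slurm's SLURM_JOB_PARTITION environment variable is visible, and that the Pegasus command-line tools (here pegasus-plan) are on PATH in that session. The function name check_pegasus_session is hypothetical and used only for illustration.

```python
import os
import shutil


def check_pegasus_session(env=None, expected_partition="pegasus"):
    """Return a list of problems found; an empty list means both checks passed."""
    env = os.environ if env is None else env
    problems = []

    # Slurm sets SLURM_JOB_PARTITION inside a running job.
    # Assumption: the notebook kernel inherits the job's environment.
    partition = env.get("SLURM_JOB_PARTITION")
    if partition is None:
        problems.append("SLURM_JOB_PARTITION is not set; this may not be a Slurm job.")
    elif partition != expected_partition:
        problems.append(
            f"Running in partition {partition!r}, expected {expected_partition!r}."
        )

    # pegasus-plan is the Pegasus planner command.
    # Assumption: the Pegasus tools are installed and on PATH in this session.
    if shutil.which("pegasus-plan", path=env.get("PATH")) is None:
        problems.append("pegasus-plan was not found on PATH.")

    return problems


if __name__ == "__main__":
    issues = check_pegasus_session()
    if issues:
        for issue in issues:
            print("WARNING:", issue)
    else:
        print("Session looks ready for Pegasus.")
```

If the partition warning appears, the likely cause is a missing or mistyped --partition=pegasus entry in the "Extra Slurm Args" field. Other warnings are better raised with the Pegasus team during an office hour than worked around.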