Using HPC on AWS and Azure

From DeepSense Docs

Revision as of 21:06, 24 April 2024

To support demanding workloads and specialized use cases with tailored compute resources, we use managed HPC platforms on AWS and Azure alongside customizable virtual machines. This approach combines the convenience of managed clusters with the flexibility of custom VMs.

DeepSense AWS ParallelCluster & Azure CycleCloud

High-Performance Computing (HPC) harnesses the collective power of multiple connected computers (called a cluster) to tackle complex problems far beyond the capabilities of a single machine. HPC is essential for:

  • Large-scale simulations: Modeling weather patterns, fluid dynamics, or complex engineering systems.
  • Massive data analysis: Finding patterns in vast datasets, accelerating scientific discovery, or driving AI-powered insights.
  • Training complex AI/ML models: Developing models that require immense computational resources for tasks like natural language processing or computer vision.

We use two tools, AWS ParallelCluster and Azure CycleCloud, to streamline the deployment and management of HPC clusters.

AWS ParallelCluster

  • Simplified HPC Setup: AWS ParallelCluster handles much of the complexity involved in setting up and managing an HPC cluster on AWS.
  • Flexible Configurations: Choose from a variety of EC2 instance types (CPU, GPU, high-memory, etc.), network options, and storage solutions.
  • Integration: Easily connect your cluster with other AWS services for data processing, storage, and visualization.

Azure CycleCloud

  • Orchestration & Automation: Azure CycleCloud simplifies the creation and management of HPC clusters on Azure, automating scaling and provisioning.
  • Scheduler Support: Select your preferred HPC scheduler (Slurm, PBS Pro, GridEngine, etc.) for efficient job management.
  • Azure Ecosystem: Seamlessly integrate with other Azure services like Azure Blob Storage and Azure Kubernetes Service.

Accessing Your HPC Cluster

AWS & Azure Administrator Setup

  • Account Creation: Your dedicated AWS or Azure administrator will provision the necessary accounts and permissions for our team to set up your HPC environment.
  • Access & MFA: You'll receive secure credentials and instructions for setting up Multi-Factor Authentication (MFA) to enhance security.

Getting Started with HPC

  • Cluster Launch: Our team will configure and launch your cluster using either AWS ParallelCluster or Azure CycleCloud.
  • Connecting: We'll provide detailed instructions on how to connect to your cluster's head node via secure methods like SSH.
  • Job Submission: You'll use your chosen scheduler's commands (e.g., Slurm's sbatch) to submit your computational jobs and scripts to the cluster's queue.

    [Figure: PyJobHPC.png, an example HPC Python job file]

  • Monitoring: The AWS ParallelCluster and Azure CycleCloud dashboards let you track job progress, resource usage, and costs.
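
As an illustration of the job submission step, here is a minimal Python sketch that assumes your cluster uses Slurm. It builds a batch script that runs a single Python file and, on the head node, passes it to `sbatch`. The function names, the `analysis.py` file name, the `job.sbatch` path, and the resource values are all hypothetical placeholders, not our actual job file. Your cluster's partitions and limits may call for different `#SBATCH` options.

```python
import subprocess
from pathlib import Path


def render_job_script(job_name: str, python_file: str, ntasks: int = 1,
                      time_limit: str = "00:10:00") -> str:
    """Return the text of a Slurm batch script that runs one Python file.

    All values here are illustrative; adjust them to your cluster.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --ntasks={ntasks}",
        f"#SBATCH --time={time_limit}",
        f"#SBATCH --output={job_name}-%j.out",  # %j expands to the Slurm job ID
        "",
        f"python3 {python_file}",
    ]
    return "\n".join(lines) + "\n"


def submit(script_text: str, path: str = "job.sbatch") -> str:
    """Write the script to disk and submit it with sbatch.

    Only works where the Slurm client tools are installed,
    typically the cluster's head node.
    """
    Path(path).write_text(script_text)
    result = subprocess.run(
        ["sbatch", path], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()  # sbatch's confirmation message


if __name__ == "__main__":
    # Print the script for review instead of submitting it.
    print(render_job_script("demo", "analysis.py"))
```

Printing the script first lets you review it before anything is queued. Calling `submit(...)` on the head node then hands it to the scheduler.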

Remember: It's crucial to stop or scale down your cluster when not actively in use to avoid unnecessary charges.
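
One way to notice that a cluster is sitting idle is to check whether the scheduler still has any work queued. The sketch below assumes a Slurm cluster and parses the output of `squeue -h -o "%T"`, which prints one job state per line with no header. The function names and the choice of which states count as "active" are assumptions made for illustration. How you then stop or scale down the cluster depends on your platform and is not shown.

```python
import subprocess

# Job states treated as "work still in progress" (an illustrative choice).
ACTIVE_STATES = {"PENDING", "RUNNING", "CONFIGURING", "COMPLETING", "SUSPENDED"}


def count_active_jobs(squeue_output: str) -> int:
    """Count lines of `squeue -h -o "%T"` output that name an active state."""
    states = [line.strip() for line in squeue_output.splitlines() if line.strip()]
    return sum(1 for state in states if state in ACTIVE_STATES)


def cluster_is_idle(squeue_output: str) -> bool:
    """Return True when no active jobs are reported."""
    return count_active_jobs(squeue_output) == 0


def read_squeue_states() -> str:
    """Query Slurm for job states (requires the Slurm client tools)."""
    result = subprocess.run(
        ["squeue", "-h", "-o", "%T"], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # Sample output used in place of a live query.
    sample = "RUNNING\nPENDING\n"
    print("idle" if cluster_is_idle(sample) else "busy")
```

On a live head node, `cluster_is_idle(read_squeue_states())` gives the same check against the real queue.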

Our team has extensive experience with HPC on AWS and Azure. Contact us to explore how these powerful platforms can accelerate your computational workloads!