Run on a multi-node cluster

Run single or multi-node on Lightning Studios

The easiest way to scale models in the cloud. No infrastructure setup required.

basic

Run on an on-prem cluster

Learn to train models on a general compute cluster.

intermediate

Run with Torch Distributed

Run models on a cluster using torch.distributed.

intermediate

Run on a SLURM cluster

Run models on a SLURM-managed cluster.

intermediate

Integrate your own cluster

Learn how to integrate your own cluster.

expert
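
To give a feel for what the torch distributed and SLURM options above have in common, here is a minimal sketch of how a process can work out its place in a multi-node job from environment variables. It assumes the variables that `torchrun` sets (`RANK`, `LOCAL_RANK`, `WORLD_SIZE`) and those that SLURM's `srun` sets (`SLURM_PROCID`, `SLURM_LOCALID`, `SLURM_NTASKS`). The names `ProcessInfo` and `detect_process_info` are hypothetical, chosen for illustration, and are not part of Lightning; in practice the library is expected to do this detection for you.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ProcessInfo:
    """Hypothetical container for a process's position in a distributed job."""
    rank: int        # global index of this process across all nodes
    local_rank: int  # index of this process on its own node
    world_size: int  # total number of processes in the job
    launcher: str    # which launcher the values were read from


def detect_process_info(env: Mapping[str, str] = os.environ) -> ProcessInfo:
    """Read rank information from torchrun or SLURM variables, if present."""
    # torchrun exports RANK, LOCAL_RANK and WORLD_SIZE for every worker.
    if "RANK" in env and "WORLD_SIZE" in env:
        return ProcessInfo(
            rank=int(env["RANK"]),
            local_rank=int(env.get("LOCAL_RANK", "0")),
            world_size=int(env["WORLD_SIZE"]),
            launcher="torchrun",
        )
    # srun exports SLURM_PROCID, SLURM_LOCALID and SLURM_NTASKS for every task.
    if "SLURM_PROCID" in env and "SLURM_NTASKS" in env:
        return ProcessInfo(
            rank=int(env["SLURM_PROCID"]),
            local_rank=int(env.get("SLURM_LOCALID", "0")),
            world_size=int(env["SLURM_NTASKS"]),
            launcher="slurm",
        )
    # Neither launcher detected: assume a single local process.
    return ProcessInfo(rank=0, local_rank=0, world_size=1, launcher="single")


if __name__ == "__main__":
    print(detect_process_info())
```

Passing a plain dictionary instead of `os.environ` makes the function easy to try without a cluster, for example `detect_process_info({"RANK": "3", "LOCAL_RANK": "1", "WORLD_SIZE": "8"})`.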