API References

Accelerator API

Accelerator

The Accelerator Base Class.

CPUAccelerator

Accelerator for CPU devices.

GPUAccelerator

Accelerator for GPU devices.

TPUAccelerator

Accelerator for TPU devices.
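
In practice the accelerator is usually selected through Trainer flags rather than instantiated directly. A minimal sketch, assuming a Lightning version in which the Trainer still accepts the gpus and tpu_cores flags:

    from pytorch_lightning import Trainer

    trainer = Trainer()             # CPU training (CPUAccelerator)
    trainer = Trainer(gpus=2)       # 2 GPUs (GPUAccelerator), requires available GPUs
    trainer = Trainer(tpu_cores=8)  # 8 TPU cores (TPUAccelerator), requires a TPU host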

Core API

datamodule

LightningDataModule for loading DataLoaders with ease.

decorators

Decorator for LightningModule methods.

hooks

Various hooks to be used in the Lightning code.

lightning

The LightningModule - an nn.Module with many additional features.
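
A minimal LightningModule sketch tying these pieces together; the class name LitClassifier and the metric name train_loss are hypothetical placeholders:

    import torch
    from torch import nn
    import pytorch_lightning as pl

    class LitClassifier(pl.LightningModule):
        """Toy classifier used only to illustrate the core hooks."""

        def __init__(self):
            super().__init__()
            self.layer = nn.Linear(28 * 28, 10)

        def forward(self, x):
            # Flatten images and produce class logits.
            return self.layer(x.view(x.size(0), -1))

        def training_step(self, batch, batch_idx):
            x, y = batch
            loss = nn.functional.cross_entropy(self(x), y)
            self.log("train_loss", loss)  # sent to the attached logger
            return loss

        def configure_optimizers(self):
            return torch.optim.Adam(self.parameters(), lr=1e-3)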

Callbacks API

base

Abstract base class used to build new callbacks.

early_stopping

Early Stopping

gpu_stats_monitor

GPU Stats Monitor

gradient_accumulation_scheduler

Gradient Accumulator

lr_monitor

Learning Rate Monitor

model_checkpoint

Model Checkpointing

progress

Progress Bars
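
Callbacks are passed to the Trainer as a list. A short sketch, assuming your LightningModule logs a validation metric named val_loss:

    from pytorch_lightning import Trainer
    from pytorch_lightning.callbacks import EarlyStopping, ModelCheckpoint

    # Stop when the monitored metric stops improving and keep the best checkpoint.
    early_stop = EarlyStopping(monitor="val_loss", patience=3, mode="min")
    checkpoint = ModelCheckpoint(monitor="val_loss", save_top_k=1, mode="min")

    trainer = Trainer(callbacks=[early_stop, checkpoint])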

Loggers API

base

Abstract base class used to build new loggers.

comet

Comet Logger

csv_logs

CSV logger

mlflow

MLflow Logger

neptune

Neptune Logger

tensorboard

TensorBoard Logger

test_tube

Test Tube Logger

wandb

Weights and Biases Logger
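
All loggers share the same interface and are attached through the Trainer's logger argument. A sketch using the TensorBoard logger; the save directory and experiment name are placeholders:

    from pytorch_lightning import Trainer
    from pytorch_lightning.loggers import TensorBoardLogger

    # Logs are written under tb_logs/my_model/version_<n> by default.
    logger = TensorBoardLogger(save_dir="tb_logs", name="my_model")
    trainer = Trainer(logger=logger)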

Plugins API

Training Type Plugins

TrainingTypePlugin

Base class for all training type plugins that change the behaviour of the training, validation and test-loop.

SingleDevicePlugin

Plugin that handles communication on a single device.

ParallelPlugin

Plugin for training with multiple processes in parallel.

DataParallelPlugin

Implements data-parallel training in a single process, i.e., the model gets replicated to each device and each gets a split of the data.

DDPPlugin

Plugin for multi-process single-device training on one or multiple nodes.

DDP2Plugin

DDP2 behaves like DP within a single node, while synchronization across nodes behaves as in DDP.

DDPShardedPlugin

Optimizer and gradient sharded training provided by FairScale.

DDPSpawnShardedPlugin

Optimizer sharded training provided by FairScale.

DDPSpawnPlugin

Spawns processes using the torch.multiprocessing.spawn() method and joins processes after training finishes.

DeepSpeedPlugin

Provides capabilities to run training using the DeepSpeed library, with training optimizations for large billion parameter models.

HorovodPlugin

Plugin for Horovod distributed training integration.

SingleTPUPlugin

Plugin for training on a single TPU device.

TPUSpawnPlugin

Plugin for training multiple TPU devices using the torch.multiprocessing.spawn() method.
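
Training type plugins are normally selected implicitly by the Trainer's distributed flags, but an instance can be passed to customise behaviour. A sketch, assuming a Lightning version in which accelerator="ddp" and the plugins argument are the way to configure DDP (flag names have changed across releases):

    from pytorch_lightning import Trainer
    from pytorch_lightning.plugins import DDPPlugin

    # Multi-GPU DDP training with a customised plugin instance.
    trainer = Trainer(
        gpus=2,
        accelerator="ddp",
        plugins=DDPPlugin(find_unused_parameters=False),
    )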

Precision Plugins

PrecisionPlugin

Base class for all plugins handling the precision-specific parts of the training.

NativeMixedPrecisionPlugin

Plugin for native mixed precision training with torch.cuda.amp.

ShardedNativeMixedPrecisionPlugin

Mixed Precision for Sharded Training

ApexMixedPrecisionPlugin

Mixed-precision plugin based on NVIDIA Apex (https://github.com/NVIDIA/apex).

DeepSpeedPrecisionPlugin

Precision plugin for DeepSpeed integration.

TPUHalfPrecisionPlugin

Plugin that enables bfloat16 precision on TPUs.

DoublePrecisionPlugin

Plugin for training with double (torch.float64) precision.
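
Precision plugins are normally chosen through the Trainer's precision flag rather than constructed by hand; the matching plugin is created internally:

    from pytorch_lightning import Trainer

    trainer = Trainer(gpus=1, precision=16)  # native mixed precision via torch.cuda.amp
    trainer = Trainer(precision=64)          # double precision (torch.float64)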

Cluster Environments

ClusterEnvironment

Specification of a cluster environment.

LightningEnvironment

The default environment used by Lightning for a single node or free cluster (not managed).

LSFEnvironment

An environment for running on clusters managed by the LSF resource manager.

TorchElasticEnvironment

Environment for fault-tolerant and elastic training with torchelastic.

KubeflowEnvironment

Environment for distributed training using the PyTorchJob operator from Kubeflow.

SLURMEnvironment

Cluster environment for training on a cluster managed by SLURM.
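
Lightning normally detects the cluster environment from the scheduler's environment variables; it can also be selected explicitly. A sketch, assuming a Lightning version whose Trainer accepts a ClusterEnvironment instance through the plugins argument:

    from pytorch_lightning import Trainer
    from pytorch_lightning.plugins.environments import SLURMEnvironment

    # Explicitly pin the SLURM environment instead of relying on auto-detection.
    trainer = Trainer(
        gpus=2,
        num_nodes=2,
        accelerator="ddp",
        plugins=[SLURMEnvironment()],
    )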

Profiler API

AbstractProfiler

Specification of a profiler.

AdvancedProfiler

This profiler uses Python's cProfile module to record more detailed information about the time spent in each function call during a given action.

BaseProfiler

If you wish to write a custom profiler, you should inherit from this class.

PassThroughProfiler

This class should be used when you don’t want the (small) overhead of profiling.

PyTorchProfiler

This profiler uses PyTorch's autograd profiler and lets you inspect the cost of different operators inside your model, on both the CPU and the GPU.

SimpleProfiler

This profiler simply records the duration of actions (in seconds) and reports the mean duration of each action and the total time spent over the entire training run.
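
Profilers are attached through the Trainer's profiler argument, either as a string shortcut or as a configured profiler instance:

    from pytorch_lightning import Trainer

    trainer = Trainer(profiler="simple")    # SimpleProfiler
    trainer = Trainer(profiler="advanced")  # AdvancedProfiler (cProfile-based)
    trainer = Trainer(profiler="pytorch")   # PyTorchProfiler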

Trainer API

trainer

Trainer to automate the training.
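
A minimal end-to-end sketch; LitClassifier is the hypothetical LightningModule sketched under Core API, and the random tensors exist only to make the example self-contained:

    import torch
    from torch.utils.data import DataLoader, TensorDataset
    from pytorch_lightning import Trainer

    # Random MNIST-shaped data so the sketch runs without downloads.
    train_loader = DataLoader(
        TensorDataset(torch.randn(64, 1, 28, 28), torch.randint(0, 10, (64,))),
        batch_size=32,
    )

    model = LitClassifier()  # see the Core API sketch above
    trainer = Trainer(max_epochs=5)
    trainer.fit(model, train_loader)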

Tuner API

Tuner

Tuner class to tune your model's hyperparameters, such as the learning rate and batch size.
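
The Tuner is driven through Trainer flags and trainer.tune(). A sketch, assuming the model defines its own dataloaders and exposes learning_rate and batch_size attributes (or hparams) that the tuner can overwrite:

    from pytorch_lightning import Trainer

    trainer = Trainer(auto_lr_find=True, auto_scale_batch_size=True)
    trainer.tune(model)  # `model` is any LightningModule; tuned values are written back to it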

Utilities API

cli

Utilities for configuring Lightning from the command line with LightningCLI.

argparse

Utilities for adding and parsing Trainer arguments with argparse.

seed

Helper functions to help with reproducibility of models.
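
A typical reproducibility setup combining seed_everything with the Trainer's deterministic flag:

    from pytorch_lightning import Trainer, seed_everything

    seed_everything(42)                    # seeds Python, NumPy and PyTorch RNGs
    trainer = Trainer(deterministic=True)  # ask PyTorch for deterministic ops where possible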