accelerators¶

`Accelerator`	The Accelerator Base Class.
`CPUAccelerator`	Accelerator for CPU devices.
`CUDAAccelerator`	Accelerator for NVIDIA CUDA devices.
`HPUAccelerator`	Accelerator for HPU devices.
`IPUAccelerator`	Accelerator for IPUs.
`TPUAccelerator`	Accelerator for TPU devices.

callbacks¶

`BackboneFinetuning`	Finetune a backbone model based on a user-defined learning rate schedule.
`BaseFinetuning`	This class implements the base logic for writing your own Finetuning Callback.
`BasePredictionWriter`	Base class to implement how the predictions should be stored.
`Callback`	Abstract base class used to build new callbacks.
`DeviceStatsMonitor`	Automatically monitors and logs device stats during the training stage.
`EarlyStopping`	Monitor a metric and stop training when it stops improving.
`GradientAccumulationScheduler`	Change gradient accumulation factor according to scheduling.
`LambdaCallback`	Create a simple callback on the fly using lambda functions.
`LearningRateMonitor`	Automatically monitors and logs the learning rate for learning rate schedulers during training.
`ModelCheckpoint`	Save the model periodically by monitoring a quantity.
`ModelPruning`	Model pruning Callback, using PyTorch's prune utilities.
`ModelSummary`	Generates a summary of all layers in a `LightningModule`.
`ProgressBarBase`	The base class for progress bars in Lightning.
`QuantizationAwareTraining`	Quantization allows speeding up inference and decreasing memory requirements by performing computations and storing tensors at lower bitwidths (such as INT8 or FLOAT16) than floating point precision.
`RichModelSummary`	Generates a summary of all layers in a `LightningModule` with rich text formatting.
`RichProgressBar`	Create a progress bar with rich text formatting.
`StochasticWeightAveraging`	Implements the Stochastic Weight Averaging (SWA) Callback to average a model.
`Timer`	The Timer callback tracks the time spent in the training, validation, and test loops and interrupts the Trainer if the given time limit for the training loop is reached.
`TQDMProgressBar`	This is the default progress bar used by Lightning.
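
The `EarlyStopping` entry above describes monitoring a metric and stopping once it no longer improves. A minimal sketch of that idea in plain Python follows. It is not Lightning's implementation: `PatienceTracker`, `epochs_run`, and the `patience`, `min_delta`, and `mode` parameters are hypothetical names chosen for illustration.

```python
class PatienceTracker:
    """Hypothetical helper illustrating patience-based early stopping."""

    def __init__(self, patience=3, min_delta=0.0, mode="min"):
        if mode not in ("min", "max"):
            raise ValueError("mode must be 'min' or 'max'")
        self.patience = patience    # checks without improvement to tolerate
        self.min_delta = min_delta  # smallest change that counts as improvement
        self.mode = mode            # "min" for losses, "max" for accuracies
        self.best = None
        self.wait = 0

    def _improved(self, value):
        if self.best is None:
            return True
        if self.mode == "min":
            return value < self.best - self.min_delta
        return value > self.best + self.min_delta

    def update(self, value):
        """Record a new metric value; return True when training should stop."""
        if self._improved(value):
            self.best = value
            self.wait = 0
            return False
        self.wait += 1
        return self.wait >= self.patience


def epochs_run(losses, patience):
    """Return how many epochs run before the tracker asks to stop."""
    tracker = PatienceTracker(patience=patience)
    for epoch, loss in enumerate(losses, start=1):
        if tracker.update(loss):
            return epoch
    return len(losses)
```

In this sketch the counter resets whenever the metric improves, so a late improvement keeps training alive only if it arrives before `patience` consecutive non-improving checks.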

core¶

`CheckpointHooks`	Hooks to be used with Checkpointing.
`DataHooks`	Hooks to be used for data-related operations.
`ModelHooks`	Hooks to be used in LightningModule.
`LightningDataModule`	A DataModule standardizes the training, validation, and test splits, data preparation, and transforms.
`LightningModule`
`DeviceDtypeModuleMixin`	Initializes internal Module state, shared by both nn.Module and ScriptModule.
`HyperparametersMixin`
`LightningOptimizer`	This class wraps the user's optimizers and properly handles the backward and optimizer_step logic across accelerators, AMP, and accumulate_grad_batches.
`ModelIO`

lightninglite¶

LightningLite

Lite accelerates your PyTorch training or inference code with minimal changes required.

loggers¶

`base`
`comet`	Comet Logger
`csv_logs`	CSV logger
`mlflow`	MLflow Logger
`neptune`	Neptune Logger
`tensorboard`	TensorBoard Logger
`wandb`	Weights and Biases Logger

loops¶

Base Classes¶

`DataLoaderLoop`	Base class to loop over all dataloaders.
`Loop`

Training¶

`TrainingBatchLoop`	Runs over a single batch of data.
`TrainingEpochLoop`	Runs over all batches in a dataloader (one epoch).
`FitLoop`	This Loop iterates over the epochs to run the training.
`ManualOptimization`	A special loop implementing what Lightning calls Manual Optimization, where the optimization happens entirely in `training_step()`, so the user is responsible for back-propagating gradients and calling the optimizers.
`OptimizerLoop`	Runs over a sequence of optimizers.
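
Going only by the descriptions in the table above, the training loops can be pictured as nested iteration: epochs, then batches, then optimizers. The sketch below is plain Python under that assumption and is not Lightning's code. `run_fit` and `step_fn` are hypothetical names, and the comments only indicate which listed class plays a comparable role.

```python
def run_fit(max_epochs, dataloader, optimizers, step_fn):
    """Hypothetical nested-loop sketch: epochs -> batches -> optimizers."""
    calls = []
    for epoch in range(max_epochs):  # comparable to FitLoop
        for batch_idx, batch in enumerate(dataloader):  # comparable to TrainingEpochLoop
            # Handling one batch is the role described for TrainingBatchLoop.
            for opt_idx, optimizer in enumerate(optimizers):  # comparable to OptimizerLoop
                step_fn(batch, optimizer)
                calls.append((epoch, batch_idx, opt_idx))
    return calls
```

With two epochs, three batches, and two optimizers, `step_fn` is invoked 2 × 3 × 2 = 12 times, in epoch-major order.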

Validation and Testing¶

`EvaluationEpochLoop`	This is the loop performing the evaluation.
`EvaluationLoop`	Loops over all dataloaders for evaluation.

Prediction¶

`PredictionEpochLoop`	Loop performing prediction on arbitrary sequentially used dataloaders.
`PredictionLoop`	Loop to run over dataloaders for prediction.

plugins¶

precision¶

`ApexMixedPrecisionPlugin`	Mixed Precision Plugin based on NVIDIA Apex (https://github.com/NVIDIA/apex).
`DeepSpeedPrecisionPlugin`	Precision plugin for DeepSpeed integration.
`DoublePrecisionPlugin`	Plugin for training with double (`torch.float64`) precision.
`FullyShardedNativeMixedPrecisionPlugin`	Native AMP for Fully Sharded Training.
`FullyShardedNativeNativeMixedPrecisionPlugin`	Native AMP for Fully Sharded Native Training.
`HPUPrecisionPlugin`	Plugin that enables bfloat/half support on HPUs.
`IPUPrecisionPlugin`	Precision plugin for IPU integration.
`MixedPrecisionPlugin`	Base Class for mixed precision.
`NativeMixedPrecisionPlugin`	Plugin for Native Mixed Precision (AMP) training with `torch.autocast`.
`PrecisionPlugin`	Base class for all plugins handling the precision-specific parts of the training.
`ShardedNativeMixedPrecisionPlugin`	Native AMP for Sharded Training.
`TPUBf16PrecisionPlugin`	Plugin that enables bfloat16 on TPUs.
`TPUPrecisionPlugin`	Precision plugin for TPU integration.

environments¶

`ClusterEnvironment`	Specification of a cluster environment.
`KubeflowEnvironment`	Environment for distributed training using the PyTorchJob operator from Kubeflow.
`LightningEnvironment`	The default environment used by Lightning for a single node or free cluster (not managed).
`LSFEnvironment`	An environment for running on clusters managed by the LSF resource manager.
`SLURMEnvironment`	Cluster environment for training on a cluster managed by SLURM.
`TorchElasticEnvironment`	Environment for fault-tolerant and elastic training with torchelastic.
`XLAEnvironment`	Cluster environment for training on a TPU Pod with the PyTorch/XLA library.

io¶

`AsyncCheckpointIO`	`AsyncCheckpointIO` enables saving the checkpoints asynchronously in a thread.
`CheckpointIO`	Interface to save/load checkpoints as they are saved through the `Strategy`.
`HPUCheckpointIO`	CheckpointIO to save checkpoints for HPU training strategies.
`TorchCheckpointIO`	CheckpointIO that utilizes `torch.save()` and `torch.load()` to save and load checkpoints respectively, common for most use cases.
`XLACheckpointIO`	CheckpointIO that utilizes `xm.save()` to save checkpoints for TPU training strategies.
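
The `AsyncCheckpointIO` entry above mentions saving checkpoints asynchronously in a thread. A rough standard-library sketch of that pattern follows. It is an illustration only and does not reflect Lightning's classes or signatures: `ThreadedSaver`, `roundtrip`, and the use of `pickle` are assumptions made for the example.

```python
import pickle
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


class ThreadedSaver:
    """Hypothetical saver that writes checkpoints on a background thread."""

    def __init__(self):
        # A single worker keeps writes in submission order.
        self._executor = ThreadPoolExecutor(max_workers=1)
        self._pending = []

    def save(self, state, path):
        # Returns immediately; the write happens on the worker thread.
        future = self._executor.submit(self._write, state, Path(path))
        self._pending.append(future)

    @staticmethod
    def _write(state, path):
        with path.open("wb") as f:
            pickle.dump(state, f)

    def teardown(self):
        # Wait for outstanding writes; result() re-raises worker exceptions.
        for future in self._pending:
            future.result()
        self._pending.clear()
        self._executor.shutdown(wait=True)


def roundtrip(state):
    """Save `state` asynchronously, wait for completion, then load it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "ckpt.pkl"
        saver = ThreadedSaver()
        saver.save(state, path)
        saver.teardown()
        with path.open("rb") as f:
            return pickle.load(f)
```

One caveat of this sketch: the caller must not mutate `state` between `save()` and `teardown()`, because the worker serializes the object it was handed rather than a copy.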

others¶

`LayerSync`	Abstract base class for creating plugins that wrap layers of a model with synchronization logic for multiprocessing.
`NativeSyncBatchNorm`	A plugin that wraps all batch normalization layers of a model with synchronization logic for multiprocessing.

profiler¶

`AdvancedProfiler`	This profiler uses Python's `cProfile` to record more detailed information about the time spent in each function call recorded during a given action.
`PassThroughProfiler`	This class should be used when you don't want the (small) overhead of profiling.
`Profiler`	If you wish to write a custom profiler, you should inherit from this class.
`PyTorchProfiler`	This profiler uses PyTorch's Autograd Profiler and lets you inspect the cost of different operators inside your model, both on the CPU and GPU.
`SimpleProfiler`	This profiler simply records the duration of actions (in seconds) and reports the mean duration of each action and the total time spent over the entire training run.
`XLAProfiler`	XLA Profiler will help you debug and optimize training workload performance for your models using Cloud TPU performance tools.
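
The `SimpleProfiler` entry above describes recording the duration of actions and reporting the mean and total per action. The sketch below shows one way to do that with the standard library. It is not Lightning's profiler: `DurationRecorder`, `profile`, `summary`, and the injectable `clock` parameter are hypothetical names used for illustration.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class DurationRecorder:
    """Hypothetical recorder of per-action durations, in seconds."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock  # injectable so the recorder can be tested
        self._durations = defaultdict(list)

    @contextmanager
    def profile(self, action_name):
        start = self._clock()
        try:
            yield
        finally:
            # Record the duration even if the profiled block raises.
            self._durations[action_name].append(self._clock() - start)

    def summary(self):
        """Return call count, mean, and total duration for each action."""
        return {
            name: {
                "calls": len(values),
                "mean": sum(values) / len(values),
                "total": sum(values),
            }
            for name, values in self._durations.items()
        }
```

A block is timed with `with recorder.profile("training_step"): ...`, and `recorder.summary()` can be read at the end of the run.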

trainer¶

Trainer

Customize every aspect of training via flags.

strategies¶

`BaguaStrategy`	Strategy for training using the Bagua library, with advanced distributed training algorithms and system optimizations.
`HivemindStrategy`	Provides capabilities to train using the Hivemind Library, training collaboratively across the internet with unreliable machines.
`DDPFullyShardedStrategy`	Plugin for Fully Sharded Data Parallel provided by FairScale.
`DDPShardedStrategy`	Optimizer and gradient sharded training provided by FairScale.
`DDPSpawnShardedStrategy`	Optimizer sharded training provided by FairScale.
`DDPSpawnStrategy`	Spawns processes using the `torch.multiprocessing.spawn()` method and joins processes after training finishes.
`DDPStrategy`	Strategy for multi-process single-device training on one or multiple nodes.
`DataParallelStrategy`	Implements data-parallel training in a single process, i.e., the model gets replicated to each device and each gets a split of the data.
`DeepSpeedStrategy`	Provides capabilities to run training using the DeepSpeed library, with training optimizations for large billion parameter models.
`HorovodStrategy`	Plugin for Horovod distributed training integration.
`HPUParallelStrategy`	Strategy for distributed training on multiple HPU devices.
`IPUStrategy`	Plugin for training on IPU devices.
`ParallelStrategy`	Plugin for training with multiple processes in parallel.
`SingleDeviceStrategy`	Strategy that handles communication on a single device.
`SingleHPUStrategy`	Strategy for training on single HPU device.
`SingleTPUStrategy`	Strategy for training on a single TPU device.
`Strategy`	Base class for all strategies that change the behaviour of the training, validation, and test loops.
`TPUSpawnStrategy`	Strategy for training on multiple TPU devices using the `torch_xla.distributed.xla_multiprocessing.spawn()` method.

tuner¶

Tuner

Tuner class to tune your model.

utilities¶

`apply_func`	Utilities used for collections.
`argparse`	Utilities for Argument Parsing within Lightning Components.
`cli`	Deprecated utilities for LightningCLI.
`cloud_io`	Utilities related to data saving/loading.
`deepspeed`	Utilities that can be used with DeepSpeed.
`distributed`	Utilities that can be used with distributed training.
`finite_checks`	Helper functions to detect NaN/Inf values.
`memory`	Utilities related to memory.
`model_summary`
`optimizer`
`parsing`	Utilities used for parameter parsing.
`rank_zero`	Utilities that can be used for calling functions on a particular rank.
`seed`	Utilities to help with reproducibility of models.
`warnings`	Warning-related utilities.