Accelerator¶

class pytorch_lightning.accelerators.Accelerator(precision_plugin, training_type_plugin)[source]¶

Bases: object

The Accelerator Base Class. An Accelerator is meant to deal with one type of Hardware.

Currently there are accelerators for:

CPU
GPU
TPU

Each Accelerator gets two plugins upon initialization: One to handle differences from the training routine and one to handle different precisions.

Parameters

precision_plugin¶ (PrecisionPlugin) – the plugin to handle precision-specific parts
training_type_plugin¶ (TrainingTypePlugin) – the plugin to handle different training routines

all_gather(tensor, group=None, sync_grads=False)[source]¶

Function to gather a tensor from several distributed processes.

Parameters

tensor¶ (Tensor) – tensor of shape (batch, …)
group¶ (Optional[Any]) – the process group to gather results from. Defaults to all processes (world)
sync_grads¶ (bool) – flag that allows users to synchronize gradients for all_gather op

Return type

Tensor

Returns

A tensor of shape (world_size, batch, …)

backward(closure_loss, optimizer, *args, **kwargs)[source]¶

Forwards backward-calls to the precision plugin.

Parameters: closure_loss¶ (Tensor) – a tensor holding the loss value to backpropagate
Return type: Tensor

batch_to_device(batch, device=None, dataloader_idx=None)[source]¶

Moves the batch to the correct device. The returned batch is of the same type as the input batch, just having all tensors on the correct device.

Parameters

batch¶ (Any) – The batch of samples to move to the correct device
device¶ (Optional[device]) – The target device
dataloader_idx¶ (Optional[int]) – The index of the dataloader to which the batch belongs.

Return type

Any

broadcast(obj, src=0)[source]¶

Broadcasts an object to all processes, such that the src object is broadcast to all other ranks if needed.

Parameters

obj¶ (object) – Object to broadcast to all process, usually a tensor or collection of tensors.
src¶ (int) – The source rank of which the object will be broadcast from

Return type

object

clip_gradients(optimizer, clip_val, gradient_clip_algorithm=<GradClipAlgorithmType.NORM: 'norm'>)[source]¶

clips all the optimizer parameters to the given value

Return type: None

connect(model)[source]¶

Transfers ownership of the model to this plugin

Return type: None

connect_precision_plugin(plugin)[source]¶

Attaches the precision plugin to the accelerator

Return type: None

connect_training_type_plugin(plugin, model)[source]¶

Attaches the training type plugin to the accelerator. Also transfers ownership of the model to this plugin

Return type: None

dispatch(trainer)[source]¶

Hook to do something before the training/evaluation/prediction starts.

Return type: None

lightning_module_state_dict()[source]¶

Returns state of model. Allows for syncing/collating model state from processes in custom plugins.

Return type: Dict[str, Union[Any, Tensor]]

model_sharded_context()[source]¶

Provide hook to create modules in a distributed aware context. This is useful for when we’d like to shard the model instantly - useful for extremely large models. Can save memory and initialization time.

Return type: Generator[None, None, None]
Returns: Model parallel context.

on_predict_end()[source]¶

Called when predict ends.

Return type: None

on_predict_start()[source]¶

Called when predict begins.

Return type: None

on_reset_predict_dataloader(dataloader)[source]¶

Called before resetting the predict dataloader.

Return type: Union[Iterable, DataLoader]

on_reset_test_dataloader(dataloader)[source]¶

Called before resetting the test dataloader.

Return type: Union[Iterable, DataLoader]

on_reset_train_dataloader(dataloader)[source]¶

Called before resetting the train dataloader.

Return type: Union[Iterable, DataLoader]

on_reset_val_dataloader(dataloader)[source]¶

Called before resetting the val dataloader.

Return type: Union[Iterable, DataLoader]

on_test_end()[source]¶

Called when test end.

Return type: None

on_test_start()[source]¶

Called when test begins.

Return type: None

on_train_batch_start(batch, batch_idx, dataloader_idx)[source]¶

Called in the training loop before anything happens for that batch.

Return type: None

on_train_end()[source]¶

Called when train ends.

Return type: None

on_train_epoch_end()[source]¶

Hook to do something on the end of an training epoch.

Return type: None

on_train_start()[source]¶

Called when train begins.

Return type: None

on_validation_end()[source]¶

Called when validation ends.

Return type: None

on_validation_start()[source]¶

Called when validation begins.

Return type: None

optimizer_state(optimizer)[source]¶

Returns state of an optimizer. Allows for syncing/collating optimizer state from processes in custom plugins.

Return type: Dict[str, Tensor]

optimizer_step(optimizer, opt_idx, lambda_closure, **kwargs)[source]¶

performs the actual optimizer step.

Parameters

optimizer¶ (Optimizer) – the optimizer performing the step
opt_idx¶ (int) – index of the current optimizer
lambda_closure¶ (Callable) – closure calculating the loss value

Return type

None

optimizer_zero_grad(current_epoch, batch_idx, optimizer, opt_idx)[source]¶

Zeros all model parameter’s gradients

Return type: None

post_dispatch(trainer)[source]¶

Hook to do something after the training/evaluation/prediction starts.

Return type: None

pre_dispatch(trainer)[source]¶

Hook to do something before the training/evaluation/prediction starts.

Return type: None

predict_step(step_kwargs)[source]¶

The actual predict step.

Parameters

step_kwargs¶ (Dict[str, Union[Any, int]]) –

the arguments for the models predict step. Can consist of the following:

batch (Tensor | (Tensor, …) | [Tensor, …]): The output of your DataLoader. A tensor, tuple or list.
batch_idx (int): The index of this batch.
dataloader_idx (int): The index of the dataloader that produced this batch (only if multiple predict dataloaders used).

Return type

Union[Tensor, Dict[str, Any]]

process_dataloader(dataloader)[source]¶

Wraps the dataloader if necessary

Parameters: dataloader¶ (Union[Iterable, DataLoader]) – iterable. Ideally of type: torch.utils.data.DataLoader
Return type: Union[Iterable, DataLoader]

save_checkpoint(checkpoint, filepath)[source]¶

Save model/training states as a checkpoint file through state-dump and file-write.

Parameters

checkpoint¶ (Dict[str, Any]) – dict containing model and trainer state
filepath¶ (str) – write-target file’s path

Return type

None

setup(trainer, model)[source]¶

Setup plugins for the trainer fit and creates optimizers.

Parameters

trainer¶ (Trainer) – the trainer instance
model¶ (LightningModule) – the LightningModule

Return type

None

setup_environment()[source]¶

Setup any processes or distributed connections. This is called before the LightningModule/DataModule setup hook which allows the user to access the accelerator environment before setup is complete.

Return type: None

setup_optimizers(trainer)[source]¶

Creates optimizers and schedulers

Parameters: trainer¶ (Trainer) – the Trainer, these optimizers should be connected to
Return type: None

setup_precision_plugin()[source]¶

Attaches the precision plugin to the accelerator

Return type: None

setup_training_type_plugin(model)[source]¶

Attaches the training type plugin to the accelerator.

Return type: None

teardown()[source]¶

This method is called to teardown the training process. It is the right place to release memory and free other resources.

Return type: None

test_step(step_kwargs)[source]¶

The actual test step.

Parameters

step_kwargs¶ (Dict[str, Union[Any, int]]) –

the arguments for the models test step. Can consist of the following:

batch (Tensor | (Tensor, …) | [Tensor, …]): The output of your DataLoader. A tensor, tuple or list.
batch_idx (int): The index of this batch.
dataloader_idx (int): The index of the dataloader that produced this batch (only if multiple test dataloaders used).

Return type

Union[Tensor, Dict[str, Any], None]

test_step_end(output)[source]¶

A hook to do something at the end of the test step

Parameters: output¶ (Union[Tensor, Dict[str, Any], None]) – the output of the test step
Return type: Union[Tensor, Dict[str, Any], None]

training_step(step_kwargs)[source]¶

The actual training step.

Parameters

step_kwargs¶ (Dict[str, Union[Any, int]]) –

the arguments for the models training step. Can consist of the following:

batch (Tensor | (Tensor, …) | [Tensor, …]): The output of your DataLoader. A tensor, tuple or list.
batch_idx (int): Integer displaying index of this batch
optimizer_idx (int): When using multiple optimizers, this argument will also be present.
hiddens(Tensor): Passed in if truncated_bptt_steps > 0.

Return type

Union[Tensor, Dict[str, Any]]

training_step_end(output)[source]¶

A hook to do something at the end of the training step

Parameters: output¶ (Union[Tensor, Dict[str, Any]]) – the output of the training step
Return type: Union[Tensor, Dict[str, Any]]

validation_step(step_kwargs)[source]¶

The actual validation step.

Parameters

step_kwargs¶ (Dict[str, Union[Any, int]]) –

the arguments for the models validation step. Can consist of the following:

batch (Tensor | (Tensor, …) | [Tensor, …]): The output of your DataLoader. A tensor, tuple or list.
batch_idx (int): The index of this batch
dataloader_idx (int): The index of the dataloader that produced this batch (only if multiple val dataloaders used)

Return type

Union[Tensor, Dict[str, Any], None]

validation_step_end(output)[source]¶

A hook to do something at the end of the validation step

Parameters: output¶ (Union[Tensor, Dict[str, Any], None]) – the output of the validation step
Return type: Union[Tensor, Dict[str, Any], None]

property call_configure_sharded_model_hook: bool¶

Allow model parallel hook to be called in suitable environments determined by the training type plugin. This is useful for when we want to shard the model once within fit.

Return type: bool
Returns: True if we want to call the model parallel setup hook.

property lightning_module: pytorch_lightning.core.lightning.LightningModule¶

Returns the pure LightningModule. To get the potentially wrapped model use Accelerator.model

Return type: LightningModule

property model: torch.nn.Module¶

Returns the model. This can also be a wrapped LightningModule. For retrieving the pure LightningModule use Accelerator.lightning_module

Return type: Module

property results: Any¶

The results of the last run will be cached within the training type plugin. In distributed training, we make sure to transfer the results to the appropriate master process.

Return type: Any

property root_device: torch.device¶

Returns the root device

Return type: device

property setup_optimizers_in_pre_dispatch: bool¶

Override to delay setting optimizers and schedulers till after dispatch. This is useful when the TrainingTypePlugin requires operating on the wrapped accelerator model. However this may break certain precision plugins such as APEX which require optimizers to be set.

Return type: bool
Returns: If True, delay setup optimizers until pre_dispatch, else call within setup.