Accelerator

class pytorch_lightning.accelerators.Accelerator(precision_plugin, training_type_plugin)[source]

Bases: object

The Accelerator base class. An Accelerator is meant to deal with one type of hardware.

Currently there are accelerators for:

  • CPU

  • GPU

  • TPU

Each Accelerator gets two plugins upon initialization: one to handle differences in the training routine and one to handle different precisions.

Parameters
  • precision_plugin (PrecisionPlugin) – the plugin to handle precision-specific parts

  • training_type_plugin (TrainingTypePlugin) – the plugin to handle different training routines
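The two-plugin design can be pictured as plain delegation. The sketch below uses hypothetical stand-in classes (ToyPrecisionPlugin, ToyTrainingTypePlugin, ToyAccelerator), not the Lightning classes themselves, and only illustrates how an accelerator might forward each call to the plugin responsible for that concern.

```python
class ToyPrecisionPlugin:
    """Hypothetical stand-in: owns precision-specific behaviour."""

    def backward(self, loss):
        # A real plugin might scale the loss before backpropagating;
        # here we only report that the call arrived.
        return f"backward({loss}) via precision plugin"


class ToyTrainingTypePlugin:
    """Hypothetical stand-in: owns the training-routine behaviour."""

    def training_step(self, *args):
        return f"training_step{args} via training type plugin"


class ToyAccelerator:
    """Hypothetical stand-in that forwards calls to its two plugins."""

    def __init__(self, precision_plugin, training_type_plugin):
        self.precision_plugin = precision_plugin
        self.training_type_plugin = training_type_plugin

    def backward(self, loss):
        return self.precision_plugin.backward(loss)

    def training_step(self, args):
        return self.training_type_plugin.training_step(*args)


accelerator = ToyAccelerator(ToyPrecisionPlugin(), ToyTrainingTypePlugin())
backward_msg = accelerator.backward(0.5)
step_msg = accelerator.training_step(["batch", 0])
```

Keeping the two concerns in separate objects means a precision strategy and a training routine can, in principle, be combined freely.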

all_gather(tensor, group=None, sync_grads=False)[source]

Function to gather a tensor from several distributed processes.

Parameters
  • tensor (Tensor) – tensor of shape (batch, …)

  • group (Optional[Any]) – the process group to gather results from. Defaults to all processes (world)

  • sync_grads (bool) – flag that allows users to synchronize gradients for the all_gather operation

Return type

Tensor

Returns

A tensor of shape (world_size, batch, …)
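The shape change described above, from (batch, …) per process to (world_size, batch, …) after gathering, can be illustrated without any distributed backend. This is a minimal sketch that simulates three processes with nested Python lists; toy_all_gather and shape are hypothetical helpers, not part of the library.

```python
def shape(nested):
    """Return the shape of a rectangular nested list as a tuple."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


def toy_all_gather(per_process_tensors):
    """Hypothetical stand-in: stack each process's tensor along a new leading axis."""
    return list(per_process_tensors)


# Three simulated processes, each holding a tensor of shape (batch=2, features=4).
world = [[[float(rank)] * 4 for _ in range(2)] for rank in range(3)]
gathered = toy_all_gather(world)
```

Here shape(world[0]) is (2, 4) and shape(gathered) is (3, 2, 4): the new leading dimension indexes the process each slice came from.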

backward(closure_loss, optimizer, optimizer_idx, should_accumulate, *args, **kwargs)[source]

Forwards backward-calls to the precision plugin.

Parameters
  • closure_loss (Tensor) – a tensor holding the loss value to backpropagate

  • should_accumulate (bool) – whether to accumulate gradients

Return type

Tensor

batch_to_device(batch, device=None)[source]

Moves the batch to the correct device. The returned batch is of the same type as the input batch, just having all tensors on the correct device.

Parameters
  • batch (Any) – The batch of samples to move to the correct device

  • device (Optional[device]) – The target device

Return type

Any
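The "same type as the input batch" guarantee is easiest to see with a recursive transfer. The following is a minimal sketch under stated assumptions: ToyTensor is a hypothetical stand-in for a tensor with a to(device) method, and move_to_device is an illustrative helper, not the library's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyTensor:
    """Hypothetical stand-in for a tensor that knows which device it lives on."""

    value: float
    device: str = "cpu"

    def to(self, device):
        return ToyTensor(self.value, device)


def move_to_device(batch, device):
    """Recursively move every ToyTensor in batch, preserving container types."""
    if isinstance(batch, ToyTensor):
        return batch.to(device)
    if isinstance(batch, dict):
        return {key: move_to_device(value, device) for key, value in batch.items()}
    if isinstance(batch, (list, tuple)):
        return type(batch)(move_to_device(item, device) for item in batch)
    return batch  # non-tensor leaves (ints, strings, ...) are returned unchanged


batch = {"x": [ToyTensor(1.0), ToyTensor(2.0)], "y": (ToyTensor(3.0), "label")}
moved = move_to_device(batch, "cuda:0")
```

The result is still a dict holding a list and a tuple; only the tensor leaves carry the new device, and the input batch is left untouched.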

broadcast(obj, src=0)[source]

Broadcasts an object to all processes, such that the src object is broadcast to all other ranks if needed.

Parameters
  • obj (object) – Object to broadcast to all processes, usually a tensor or collection of tensors.

  • src (int) – The source rank from which the object will be broadcast

Return type

object

clip_gradients(optimizer, clip_val, gradient_clip_algorithm=<GradClipAlgorithmType.NORM: 'norm'>)[source]

Clips all the optimizer parameters to the given value.

Return type

None
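The default gradient_clip_algorithm in the signature is the 'norm' variant. Assuming this means rescaling by the global L2 norm of all gradients (the behaviour of torch.nn.utils.clip_grad_norm_), the arithmetic can be sketched in plain Python. clip_by_global_norm is a hypothetical helper operating on a flat list of floats, not the library's code.

```python
import math


def clip_by_global_norm(grads, clip_val, eps=1e-6):
    """Rescale gradient values so that their combined L2 norm is at most clip_val."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    clip_coef = clip_val / (total_norm + eps)
    if clip_coef < 1.0:
        # Only shrink; gradients already within the limit are left as they are.
        grads = [g * clip_coef for g in grads]
    return grads, total_norm


clipped, norm_before = clip_by_global_norm([3.0, 4.0], clip_val=1.0)
norm_after = math.sqrt(sum(g * g for g in clipped))
unchanged, _ = clip_by_global_norm([0.3, 0.4], clip_val=1.0)
```

All gradients are scaled by the same coefficient, so the direction of the update is preserved and only its magnitude is limited.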

connect(model)[source]

Transfers ownership of the model to this plugin

Return type

None

connect_precision_plugin(plugin)[source]

Attaches the precision plugin to the accelerator

Return type

None

connect_training_type_plugin(plugin, model)[source]

Attaches the training type plugin to the accelerator. Also transfers ownership of the model to this plugin

Return type

None

dispatch(trainer)[source]

Hook to do something before the training/evaluation/prediction starts.

Return type

None

model_sharded_context()[source]

Provides a hook to create modules in a distributed-aware context. This is useful when the model should be sharded as soon as it is created, for example with extremely large models, and can save memory and initialization time.

Return type

Generator[None, None, None]

Returns

Model parallel context.
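A return type of Generator[None, None, None] is what a function decorated with contextlib.contextmanager has before decoration. As a minimal sketch, assuming the hook is used as a context manager around module creation, the pattern looks like this; toy_model_sharded_context and the events list are hypothetical and only record the order of operations.

```python
from contextlib import contextmanager
from typing import Generator

events = []


@contextmanager
def toy_model_sharded_context() -> Generator[None, None, None]:
    """Hypothetical stand-in: bracket module creation with setup and cleanup."""
    events.append("enter sharded context")
    try:
        yield
    finally:
        # Runs even if module creation raises, so the context is always closed.
        events.append("exit sharded context")


with toy_model_sharded_context():
    events.append("create layers")
```

Everything created inside the with block is constructed while the context is active, which is where a sharding-aware plugin could intercept module creation.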

on_train_end()[source]

Hook to do something at the end of the training

Return type

None

on_train_epoch_end()[source]

Hook to do something at the end of a training epoch.

Return type

None

on_train_start()[source]

Hook to do something upon the training start

Return type

None

optimizer_state(optimizer)[source]

Returns state of an optimizer. Allows for syncing/collating optimizer state from processes in custom plugins.

Return type

Dict[str, Tensor]

optimizer_step(optimizer, opt_idx, lambda_closure, **kwargs)[source]

Performs the actual optimizer step.

Parameters
  • optimizer (Optimizer) – the optimizer performing the step

  • opt_idx (int) – index of the current optimizer

  • lambda_closure (Callable) – closure calculating the loss value

Return type

None
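The role of lambda_closure follows the closure convention of torch.optim.Optimizer.step: the optimizer calls the closure to recompute the loss and gradients, then applies its update. The sketch below mimics that convention with a hypothetical ToySGD class and a single scalar parameter stored in a dict; it is not the library's implementation.

```python
class ToySGD:
    """Hypothetical stand-in optimizer for a single scalar parameter."""

    def __init__(self, param, lr):
        self.param = param  # dict with "value" and "grad" entries
        self.lr = lr

    def step(self, closure):
        loss = closure()  # re-evaluates the loss and populates the gradient
        self.param["value"] -= self.lr * self.param["grad"]
        return loss


param = {"value": 3.0, "grad": 0.0}


def lambda_closure():
    # loss = value ** 2, so d(loss)/d(value) = 2 * value
    param["grad"] = 2.0 * param["value"]
    return param["value"] ** 2


optimizer = ToySGD(param, lr=0.25)
loss = optimizer.step(closure=lambda_closure)
```

After one step the loss returned by the closure is 9.0 and the parameter has moved from 3.0 to 1.5.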

optimizer_zero_grad(current_epoch, batch_idx, optimizer, opt_idx)[source]

Zeros the gradients of all model parameters.

Return type

None

post_dispatch(trainer)[source]

Hook to do something after the training/evaluation/prediction finishes.

Return type

None

pre_dispatch(trainer)[source]

Hook to do something before the training/evaluation/prediction starts.

Return type

None

predict_step(args)[source]

The actual predict step.

Parameters

args (List[Union[Any, int]]) –

the arguments for the model’s predict step. Can consist of the following:

  • batch (Tensor | (Tensor, …) | [Tensor, …]): The output of your DataLoader. A tensor, tuple or list.

  • batch_idx (int): The index of this batch.

  • dataloader_idx (int): The index of the dataloader that produced this batch (only if multiple predict dataloaders used).

Return type

Union[Tensor, Dict[str, Any]]

process_dataloader(dataloader)[source]

Wraps the dataloader if necessary

Parameters

dataloader (Union[Iterable, DataLoader]) – iterable. Ideally of type: torch.utils.data.DataLoader

Return type

Union[Iterable, DataLoader]

save_checkpoint(checkpoint, filepath)[source]

Save model/training states as a checkpoint file through state-dump and file-write.

Parameters
  • checkpoint (Dict[str, Any]) – dict containing model and trainer state

  • filepath (str) – write-target file’s path

Return type

None
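"State-dump and file-write" can be read as two steps: serialize the checkpoint dict, then write the bytes to filepath. The sketch below illustrates those two steps with the standard library only. Using pickle as the serializer is an assumption made for illustration, and toy_save_checkpoint and the example dict keys are hypothetical.

```python
import os
import pickle
import tempfile


def toy_save_checkpoint(checkpoint, filepath):
    """Serialize the checkpoint dict (state-dump), then write it to filepath."""
    payload = pickle.dumps(checkpoint)
    with open(filepath, "wb") as handle:
        handle.write(payload)


# Illustrative contents only; a real checkpoint holds model and trainer state.
checkpoint = {"epoch": 3, "global_step": 1200, "state_dict": {"weight": [0.1, 0.2]}}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy.ckpt")
    toy_save_checkpoint(checkpoint, path)
    with open(path, "rb") as handle:
        restored = pickle.load(handle)
```

Reading the file back yields a dict equal to the one that was saved.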

setup(trainer, model)[source]

Sets up plugins for the trainer fit and creates optimizers.

Parameters
  • trainer (Trainer) – the trainer instance

  • model (LightningModule) – the LightningModule

Return type

None

setup_environment()[source]

Sets up any processes or distributed connections. This is called before the LightningModule/DataModule setup hook, which allows the user to access the accelerator environment before setup is complete.

Return type

None

setup_optimizers(trainer)[source]

Creates optimizers and schedulers

Parameters

trainer (Trainer) – the Trainer these optimizers should be connected to

Return type

None

setup_precision_plugin(plugin)[source]

Attaches the precision plugin to the accelerator

Return type

None

setup_training_type_plugin(plugin, model)[source]

Attaches the training type plugin to the accelerator.

Return type

None

teardown()[source]

This method is called to tear down the training process. It is the right place to release memory and free other resources.

By default we add a barrier here to synchronize processes before returning control back to the caller.

Return type

None

test_step(args)[source]

The actual test step.

Parameters

args (List[Union[Any, int]]) –

the arguments for the model’s test step. Can consist of the following:

  • batch (Tensor | (Tensor, …) | [Tensor, …]): The output of your DataLoader. A tensor, tuple or list.

  • batch_idx (int): The index of this batch.

  • dataloader_idx (int): The index of the dataloader that produced this batch (only if multiple test dataloaders used).

Return type

Union[Tensor, Dict[str, Any], None]

test_step_end(output)[source]

A hook to do something at the end of the test step

Parameters

output (Union[Tensor, Dict[str, Any], None]) – the output of the test step

Return type

Union[Tensor, Dict[str, Any], None]

to_device(batch)[source]

Pushes the batch to the root device

Return type

Any

training_step(args)[source]

The actual training step.

Parameters

args (List[Union[Any, int]]) –

the arguments for the model’s training step. Can consist of the following:

  • batch (Tensor | (Tensor, …) | [Tensor, …]): The output of your DataLoader. A tensor, tuple or list.

  • batch_idx (int): The index of this batch.

  • optimizer_idx (int): When using multiple optimizers, this argument will also be present.

  • hiddens (Tensor): Passed in if truncated_bptt_steps > 0.

Return type

Union[Tensor, Dict[str, Any]]
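Because args is a single list whose length depends on the configuration, a natural reading is that it is unpacked positionally into the model's training step. The sketch below assumes that reading; ToyModule and toy_training_step are hypothetical stand-ins, and the batch is a plain list of floats rather than a tensor.

```python
class ToyModule:
    """Hypothetical stand-in for a module with a training_step method."""

    def training_step(self, batch, batch_idx, optimizer_idx=None):
        return {
            "loss": sum(batch),
            "batch_idx": batch_idx,
            "optimizer_idx": optimizer_idx,
        }


def toy_training_step(model, args):
    """Forward the positional argument list to the model's training step."""
    return model.training_step(*args)


model = ToyModule()
single_opt = toy_training_step(model, [[1.0, 2.0], 0])
multi_opt = toy_training_step(model, [[1.0, 2.0], 0, 1])
```

With one optimizer the list holds only batch and batch_idx; with several, optimizer_idx is appended and arrives as the third positional argument.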

training_step_end(output)[source]

A hook to do something at the end of the training step

Parameters

output (Union[Tensor, Dict[str, Any]]) – the output of the training step

Return type

Union[Tensor, Dict[str, Any]]

validation_step(args)[source]

The actual validation step.

Parameters

args (List[Union[Any, int]]) –

the arguments for the model’s validation step. Can consist of the following:

  • batch (Tensor | (Tensor, …) | [Tensor, …]): The output of your DataLoader. A tensor, tuple or list.

  • batch_idx (int): The index of this batch

  • dataloader_idx (int): The index of the dataloader that produced this batch (only if multiple val dataloaders used)

Return type

Union[Tensor, Dict[str, Any], None]

validation_step_end(output)[source]

A hook to do something at the end of the validation step

Parameters

output (Union[Tensor, Dict[str, Any], None]) – the output of the validation step

Return type

Union[Tensor, Dict[str, Any], None]

property call_configure_sharded_model_hook

Allows the model parallel hook to be called in suitable environments determined by the training type plugin. This is useful when the model should be sharded once within fit.

Return type

bool

Returns

True if we want to call the model parallel setup hook.

property lightning_module

Returns the pure LightningModule. To get the potentially wrapped model, use Accelerator.model.

Return type

LightningModule

property model

Returns the model. This can also be a wrapped LightningModule. To retrieve the pure LightningModule, use Accelerator.lightning_module.

Return type

Module

property results

The results of the last run will be cached within the training type plugin. In distributed training, we make sure to transfer the results to the appropriate master process.

Return type

Any

property setup_optimizers_in_pre_dispatch

Override to delay setting optimizers and schedulers until after dispatch. This is useful when the TrainingTypePlugin requires operating on the wrapped accelerator model. However, this may break certain precision plugins, such as APEX, which require optimizers to be set.

Return type

bool

Returns

If True, delay setup optimizers until pre_dispatch, else call within setup.
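The effect of this flag can be sketched as a simple switch between two call sites. The classes below are hypothetical stand-ins rather than the Lightning classes, and the control flow is an assumption based only on the description above: optimizers are created in setup when the property is False and in pre_dispatch when it is True.

```python
class ToyAccelerator:
    """Hypothetical stand-in showing where optimizer setup happens."""

    def __init__(self):
        self.calls = []

    @property
    def setup_optimizers_in_pre_dispatch(self) -> bool:
        return False

    def setup_optimizers(self, stage):
        self.calls.append(f"setup_optimizers during {stage}")

    def setup(self):
        if not self.setup_optimizers_in_pre_dispatch:
            self.setup_optimizers("setup")

    def pre_dispatch(self):
        if self.setup_optimizers_in_pre_dispatch:
            self.setup_optimizers("pre_dispatch")


class DelayedAccelerator(ToyAccelerator):
    """Overrides the property to postpone optimizer creation."""

    @property
    def setup_optimizers_in_pre_dispatch(self) -> bool:
        return True


def run(accelerator):
    accelerator.setup()
    accelerator.pre_dispatch()
    return accelerator.calls


default_calls = run(ToyAccelerator())
delayed_calls = run(DelayedAccelerator())
```

Either way the optimizers are created exactly once; the override only changes the stage at which that happens.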