ParallelPlugin
-
class pytorch_lightning.plugins.training_type.ParallelPlugin(parallel_devices=None, cluster_environment=None)
Bases: pytorch_lightning.plugins.training_type.training_type_plugin.TrainingTypePlugin, abc.ABC
Plugin for training with multiple processes in parallel.
-
all_gather(tensor, group=None, sync_grads=False)
Perform an all_gather across all processes.
- Return type
-
block_backward_sync()
Blocks DDP gradient synchronization on the backward pass. This is useful for skipping synchronization while accumulating gradients, which reduces communication overhead.
Returns: a context manager with sync behaviour turned off.
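The gradient-accumulation pattern described above can be sketched in plain Python. This is an illustration, not the plugin's implementation: `FakeDistributedModel`, its `backward()` method, and `run_accumulation` are hypothetical stand-ins, and the sketch assumes the wrapped model exposes a `no_sync()` context manager, as `torch.nn.parallel.DistributedDataParallel` does.

```python
from contextlib import contextmanager


class FakeDistributedModel:
    """Hypothetical stand-in for a DDP-wrapped model (not a Lightning class)."""

    def __init__(self):
        self.sync_enabled = True
        self.sync_count = 0  # number of backward passes that synchronized

    @contextmanager
    def no_sync(self):
        # Temporarily disable gradient synchronization, then restore it.
        previous = self.sync_enabled
        self.sync_enabled = False
        try:
            yield
        finally:
            self.sync_enabled = previous

    def backward(self):
        # A real model would all-reduce gradients here when sync is enabled.
        if self.sync_enabled:
            self.sync_count += 1


@contextmanager
def block_backward_sync(model):
    """Sketch: skip gradient sync if the model supports no_sync()."""
    no_sync = getattr(model, "no_sync", None)
    if no_sync is not None:
        with no_sync():
            yield
    else:
        yield


def run_accumulation(model, num_batches, accumulate_every):
    """Synchronize only on the last batch of each accumulation window."""
    for i in range(num_batches):
        is_sync_step = (i + 1) % accumulate_every == 0
        if is_sync_step:
            model.backward()
        else:
            with block_backward_sync(model):
                model.backward()
    return model.sync_count
```

With 8 batches and an accumulation window of 4, only two backward passes in this sketch trigger synchronization; the other six skip it.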
-
static configure_sync_batchnorm(model)
Add global batchnorm for a model spread across multiple GPUs and nodes.
Override to synchronize batchnorm between specific process groups instead of the whole world, or to use a different sync_bn implementation such as apex’s version.
- Parameters
model (LightningModule) – pointer to the current LightningModule.
- Return type
- Returns
LightningModule with batchnorm layers synchronized between process groups
-
reduce_boolean_decision(decision)
Reduce the early stopping decision across all processes.
- Return type
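One plausible reduction rule is sketched below. It assumes that the combined decision is True only when every process votes True, computed by summing the votes and comparing the total with the world size. The source does not state this rule, so treat it as an assumption. The function takes a list of per-process votes, a hypothetical simplification that stands in for a real collective operation.

```python
def reduce_boolean_decision(decisions):
    """Sketch: combine one boolean vote per process into a single decision.

    `decisions` simulates the values held by each process. In a real
    distributed run, the sum would come from a SUM all-reduce.
    """
    world_size = len(decisions)
    total = sum(int(d) for d in decisions)  # stands in for a SUM all-reduce
    # Assumption: the decision holds only if all processes agree.
    return total == world_size
```

Under this assumption, a single process voting False is enough to keep the others from stopping early.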
-
property is_global_zero
Whether the current process is the rank zero process across all nodes, not only on the local node.
- Return type
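The difference between local rank zero and global rank zero can be illustrated with a small sketch. It assumes the common convention that the global rank equals `node_rank * procs_per_node + local_rank`. The helper names and parameters are hypothetical and are not part of the plugin API.

```python
def global_rank(node_rank, local_rank, procs_per_node):
    """Sketch: global rank under the assumed node-major numbering."""
    return node_rank * procs_per_node + local_rank


def is_global_zero(node_rank, local_rank, procs_per_node):
    """True only for the first process on the first node."""
    return global_rank(node_rank, local_rank, procs_per_node) == 0
```

In a run with 2 nodes and 4 processes per node, the process with local rank 0 on node 1 has global rank 4 under this convention, so it is local zero but not global zero.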
-
property lightning_module
Returns the pure LightningModule without potential wrappers.
-
property on_gpu
Returns whether the current process runs on a GPU.
-
abstract property root_device
Returns the root device.
-