DDPPlugin
class pytorch_lightning.plugins.training_type.DDPPlugin(parallel_devices=None, num_nodes=1, cluster_environment=None, sync_batchnorm=False, ddp_comm_state=None, ddp_comm_hook=None, ddp_comm_wrapper=None, **kwargs)

Bases: pytorch_lightning.plugins.training_type.parallel.ParallelPlugin
Plugin for multi-process single-device training on one or multiple nodes.
The master process in each node spawns N-1 child processes via subprocess.Popen(), where N is the number of devices (e.g. GPUs) per node. It is very similar to how torch.distributed.launch launches processes.
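A minimal usage sketch, assuming the PyTorch Lightning 1.x Trainer API that this plugin belongs to; find_unused_parameters is one example of a keyword argument forwarded through **kwargs to torch.nn.parallel.DistributedDataParallel:

    import pytorch_lightning as pl
    from pytorch_lightning.plugins import DDPPlugin

    # Select DDP explicitly and tune the underlying DistributedDataParallel
    # wrapper via **kwargs.
    trainer = pl.Trainer(
        gpus=2,
        num_nodes=1,
        accelerator="ddp",
        plugins=[DDPPlugin(find_unused_parameters=False)],
    )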
barrier(*args, **kwargs)

Forces all possibly joined processes to wait for each other.

Return type: None
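A hedged sketch of calling barrier(), continuing the trainer from the example above; prepare_dataset_cache is a hypothetical helper:

    # Make every rank wait until rank 0 has finished preparing shared
    # state on disk before anyone proceeds.
    if trainer.is_global_zero:
        prepare_dataset_cache()  # hypothetical helper, run only on rank 0
    trainer.training_type_plugin.barrier()  # all ranks block here until everyone arrives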
post_dispatch()

Hook to do something after training/evaluation/prediction finishes.

Return type: None
pre_backward(closure_loss, should_accumulate, optimizer, opt_idx)

Run before the precision plugin executes backward.
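Subclassing is one way to customize this hook. A minimal sketch, assuming the signature shown above; LoggingDDPPlugin is a hypothetical name:

    from pytorch_lightning.plugins import DDPPlugin

    class LoggingDDPPlugin(DDPPlugin):
        """Hypothetical subclass that inspects the loss right before backward."""

        def pre_backward(self, closure_loss, should_accumulate, optimizer, opt_idx):
            # closure_loss is the loss tensor about to be backpropagated.
            print(f"pre_backward: loss={closure_loss.detach().item():.4f}")
            super().pre_backward(closure_loss, should_accumulate, optimizer, opt_idx)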
reduce(tensor, group=None, reduce_op='mean')

Reduces a tensor from several distributed processes to one aggregated tensor.

Parameters:
    tensor – the tensor to sync and reduce
    group – the process group to reduce within; defaults to all processes (the world)
    reduce_op – the reduction operation; defaults to 'mean'. Can also be a string such as 'sum'.

Returns: the reduced value, except when the input was not a tensor, in which case the output is returned unchanged
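A hedged sketch of reduce(); plugin is assumed to be the active DDPPlugin (e.g. trainer.training_type_plugin) inside a running DDP job:

    import torch

    local_metric = torch.tensor(0.25)  # each rank holds its own value
    mean_metric = plugin.reduce(local_metric, reduce_op="mean")  # averaged over all ranks
    sum_metric = plugin.reduce(local_metric, reduce_op="sum")    # summed over all ranks
    passthrough = plugin.reduce(42)  # non-tensor inputs are returned unchanged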
setup_environment()

Set up any processes or distributed connections. This is called before the LightningModule/DataModule setup hook, which allows the user to access the accelerator environment before setup is complete.
property root_device

Returns the root device.
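A small sketch of reading the property inside a running job; trainer and batch are assumed from context:

    device = trainer.training_type_plugin.root_device  # this process's own device, e.g. torch.device("cuda", 0)
    batch = batch.to(device)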