DDPPlugin(parallel_devices=None, num_nodes=None, cluster_environment=None, sync_batchnorm=None, ddp_comm_state=None, ddp_comm_hook=None, ddp_comm_wrapper=None, **kwargs)¶
Plugin for multi-process single-device training on one or multiple nodes.
The master process on each node spawns N-1 child processes via
subprocess.Popen(), where N is the number of devices (e.g. GPUs) per node. It is very similar to how torch.distributed.launch launches processes.
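As a rough sketch (not Lightning's actual implementation), the per-node spawning described above could look like the following. The function name, command argument, and the `LOCAL_RANK` environment variable are illustrative assumptions:

```python
import os
import subprocess
import sys

def spawn_child_processes(command, num_devices):
    """Sketch: the master process (local rank 0) launches N-1 children
    via subprocess.Popen, one per remaining device. The LOCAL_RANK
    environment variable name is an assumption for illustration."""
    children = []
    for local_rank in range(1, num_devices):
        env = os.environ.copy()
        env["LOCAL_RANK"] = str(local_rank)  # tell the child which device it owns
        children.append(subprocess.Popen(command, env=env))
    return children

# Example: 3 devices total -> master keeps rank 0 and spawns 2 children
procs = spawn_child_processes([sys.executable, "-c", "pass"], 3)
for p in procs:
    p.wait()
```

Each child then rejoins the distributed group via the usual process-group initialization, so all N processes per node run the same training script.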
Forces all possibly joined processes to wait for each other
Moves the model to the correct device
Hook to do something after the training/evaluation/prediction finishes.
pre_backward(closure_loss, should_accumulate, optimizer, opt_idx)¶
Run before the precision plugin executes backward.
Hook to do something before the training/evaluation/prediction starts.
reduce(tensor, group=None, reduce_op='mean')¶
Reduces a tensor from several distributed processes to one aggregated tensor.
The reduced value, except when the input was not a tensor, in which case the output remains unchanged.
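The reduce contract above can be sketched in plain Python. This is a toy stand-in for the distributed tensor operation, not Lightning's implementation; the function name and the list-of-per-process-values model are assumptions:

```python
def reduce_across_processes(values, reduce_op="mean"):
    """Toy sketch of the reduce contract: aggregate one value per
    process into a single result. Non-numeric ("non-tensor") input
    passes through unchanged, mirroring the documented behavior."""
    if not all(isinstance(v, (int, float)) for v in values):
        return values  # not a tensor-like input: output remains unchanged
    if reduce_op == "mean":
        return sum(values) / len(values)
    if reduce_op == "sum":
        return sum(values)
    raise ValueError(f"unsupported reduce_op: {reduce_op}")
```

For example, reducing the per-process losses `[1.0, 2.0, 3.0]` with the default `'mean'` op yields `2.0`, while a string input would be returned as-is.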
Set up any processes or distributed connections. This is called before the LightningModule/DataModule setup hook, which allows the user to access the accelerator environment before setup is complete.
Returns the root device