DDP2Plugin(parallel_devices=None, num_nodes=None, cluster_environment=None, sync_batchnorm=None, ddp_comm_state=None, ddp_comm_hook=None, ddp_comm_wrapper=None, **kwargs)
DDP2 behaves like DP within a single node, while synchronization across nodes behaves like DDP.
Moves the model to the correct device.
reduce(tensor, *args, **kwargs)
Reduces a tensor from all processes to one aggregated tensor. In DDP2, the reduction here is only across local devices within the node.
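As a rough illustration of what a node-local reduction means, the sketch below averages the values reported by the devices of one node. It is plain Python and not the plugin's implementation. The helper name reduce_local is hypothetical, and using the mean as the reduction is an assumption, since the description above says only that the reduction stays within the node.

```python
from statistics import fmean
from typing import Sequence


def reduce_local(per_device_values: Sequence[float]) -> float:
    """Hypothetical helper: combine the values from one node's devices.

    Assumption: the values are combined by taking their mean.
    """
    if not per_device_values:
        raise ValueError("expected at least one per-device value")
    return fmean(per_device_values)


# One node with four local devices, each reporting its own loss.
node_losses = [0.5, 0.25, 0.75, 0.5]
local_loss = reduce_local(node_losses)
```

Values from other nodes never enter this computation. Under the description above, cross-node synchronization is handled separately, in the DDP manner.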
Called by the accelerator to finish setup.
Returns the root device.