LightningOptimizer

class pytorch_lightning.core.optimizer.LightningOptimizer(optimizer)[source]

Bases: object

This class wraps the user's optimizers and properly handles the backward and optimizer_step logic across accelerators, AMP, and accumulate_grad_batches.
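
For example, a minimal sketch of a LightningModule using manual optimization (the module name, layer sizes, and loss are illustrative, not part of this API); self.optimizers() returns the LightningOptimizer wrapper(s) around the optimizers returned from configure_optimizers():

import torch
import pytorch_lightning as pl


class MyModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        # manual optimization hands control of zero_grad/backward/step to training_step
        self.automatic_optimization = False
        self.layer = torch.nn.Linear(32, 2)

    def configure_optimizers(self):
        return torch.optim.SGD(self.parameters(), lr=0.1)

    def training_step(self, batch, batch_idx):
        # opt is a LightningOptimizer wrapping the SGD optimizer above
        opt = self.optimizers()
        loss = self.layer(batch).sum()
        opt.zero_grad()
        self.manual_backward(loss)
        opt.step()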

step(closure=None, **kwargs)[source]

Performs a single optimization step (parameter update).

Parameters
  • closure (Optional[Callable[[], Any]]) – An optional optimizer closure.

  • kwargs (Any) – Any additional arguments to the optimizer.step() call.

Return type

Any

Returns

The output from the step call, which is generally the output of the closure execution.

Example:

# Scenario for a GAN using manual optimization
def training_step(...):
    opt_gen, opt_dis = self.optimizers()

    ...

    # compute generator loss
    loss_gen = self.compute_generator_loss(...)
    # zero_grad needs to be called before backward
    opt_gen.zero_grad()
    self.manual_backward(loss_gen)
    opt_gen.step()

    # compute discriminator loss
    loss_dis = self.compute_discriminator_loss(...)

    # zero_grad needs to be called before backward
    opt_dis.zero_grad()
    self.manual_backward(loss_dis)
    opt_dis.step()


# A more advanced example
def training_step(self, batch, batch_idx, ...):
    opt_gen, opt_dis = self.optimizers()

    ...
    accumulated_grad_batches = batch_idx % 2 == 0

    # compute generator loss
    def closure_gen():
        loss_gen = self.compute_generator_loss(...)
        self.manual_backward(loss_gen)
        if accumulated_grad_batches:
            opt_gen.zero_grad()

    with opt_gen.toggle_model(sync_grad=accumulated_grad_batches):
        opt_gen.step(closure=closure_gen)

    def closure_dis():
        loss_dis = self.compute_discriminator_loss(...)
        self.manual_backward(loss_dis)
        if accumulated_grad_batches:
            opt_dis.zero_grad()

    with opt_dis.toggle_model(sync_grad=accumulated_grad_batches):
        opt_dis.step(closure=closure_dis)

toggle_model(sync_grad=True)[source]

This function is just a helper for advanced users.

Consider the current optimizer as A and all other optimizers as B. Toggling means that all parameters belonging to B but not to A will have requires_grad set to False.

When performing gradient accumulation, there is no need to synchronize gradients during the accumulation phase. Setting sync_grad to False blocks this synchronization and improves performance.

Return type

Generator[None, None, None]
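
For example, a minimal sketch of using toggle_model() as a context manager during manual optimization (the loss helper is illustrative, reusing the GAN example above):

def training_step(self, batch, batch_idx):
    opt_gen, opt_dis = self.optimizers()

    # Inside this context, parameters that belong only to the other
    # optimizer(s) (the discriminator here) have requires_grad set to False,
    # so the generator backward pass does not accumulate gradients into them.
    with opt_gen.toggle_model():
        loss_gen = self.compute_generator_loss(batch)
        opt_gen.zero_grad()
        self.manual_backward(loss_gen)
        opt_gen.step()
    # On exiting the context, the previous requires_grad values are restored.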
