Learning Rate Finder
For training deep neural networks, selecting a good learning rate is essential for both better performance and faster convergence. Even optimizers such as Adam, which adapt the learning rate on their own, can benefit from a better initial choice.
To reduce the guesswork in choosing a good initial learning rate, a learning rate finder can be used. As described in this paper, a learning rate finder does a small run in which the learning rate is increased after each processed batch and the corresponding loss is logged. The result is an lr vs. loss plot that can be used as guidance for choosing an optimal initial lr.
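The short run described above can be sketched in plain Python. This is only an illustration on a toy problem, not Lightning's implementation: the function name lr_range_test, the four-sample linear regression data, and the exponential schedule between min_lr and max_lr are all assumptions made for the example.

```python
import math


def lr_range_test(min_lr=1e-6, max_lr=10.0, num_steps=100):
    """Toy LR range test: fit y = 3x with plain gradient descent on one weight.

    The learning rate grows exponentially from min_lr to max_lr, one step per
    processed batch, and the loss seen at each step is logged.
    """
    # Hypothetical fixed "batch" so the run is deterministic.
    xs = [-1.0, -0.5, 0.5, 1.0]
    ys = [3.0 * x for x in xs]

    w = 0.0  # the single trainable weight
    lrs, losses = [], []
    for step in range(num_steps):
        # Exponential interpolation between min_lr and max_lr.
        lr = min_lr * (max_lr / min_lr) ** (step / (num_steps - 1))

        # Mean squared error and its gradient with respect to w.
        loss = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
        grad = sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

        # Log the (lr, loss) pair, then take the update with the current lr.
        lrs.append(lr)
        losses.append(loss)
        w -= lr * grad
    return lrs, losses


lrs, losses = lr_range_test()
best = min(range(len(losses)), key=losses.__getitem__)
print(f"lowest loss {losses[best]:.3g} at lr={lrs[best]:.3g}")
```

Plotting losses against lrs on a logarithmic x-axis gives the kind of lr vs. loss curve the finder produces: flat for very small learning rates, falling as the rate grows, and rising again once the rate is large enough for training to diverge.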
For the moment, this feature only works with models that have a single optimizer. LR finder support for DDP and its variants is not implemented yet, but is planned.
Using Lightning’s built-in LR finder
To enable the learning rate finder, your lightning module needs to have a learning_rate or lr attribute. Then, set Trainer(auto_lr_find=True) during trainer construction, and call trainer.tune(model) to run the LR finder. The suggested learning rate will be written to the console and will be automatically set on your lightning module, where it can be accessed via self.learning_rate or self.lr.
    from pytorch_lightning import LightningModule, Trainer
    from torch.optim import Adam


    class LitModel(LightningModule):
        def __init__(self, learning_rate=1e-3):
            super().__init__()
            # the LR finder looks for an attribute named lr or learning_rate
            self.learning_rate = learning_rate

        def configure_optimizers(self):
            return Adam(self.parameters(), lr=self.learning_rate)


    model = LitModel()

    # finds learning rate automatically
    # sets hparams.lr or hparams.learning_rate to that learning rate
    trainer = Trainer(auto_lr_find=True)
    trainer.tune(model)
If your model stores its learning rate under a different name than self.lr or self.learning_rate, pass that name as the value of auto_lr_find:
    model = LitModel()

    # to set to your own hparams.my_value
    trainer = Trainer(auto_lr_find="my_value")
    trainer.tune(model)
You can also inspect the results of the learning rate finder or just play around
with the parameters of the algorithm. This can be done by invoking the
lr_find() method. A typical example of this would look like:
    model = MyModelClass(hparams)
    trainer = Trainer()

    # Run learning rate finder
    lr_finder = trainer.tuner.lr_find(model)

    # Results can be found in lr_finder.results

    # Plot with
    fig = lr_finder.plot(suggest=True)
    fig.show()

    # Pick point based on plot, or get suggestion
    new_lr = lr_finder.suggestion()

    # update hparams of the model
    model.hparams.lr = new_lr

    # Fit model
    trainer.fit(model)
The figure produced by
lr_finder.plot() should look something like the figure
below. It is recommended to not pick the learning rate that achieves the lowest
loss, but instead something in the middle of the sharpest downward slope (red point).
This is the point returned by lr_finder.suggestion().
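The "sharpest downward slope" heuristic can be illustrated with a few lines of plain Python. This is a minimal sketch under stated assumptions, not the code Lightning runs: the helper name steepest_slope_lr and the sample numbers are made up for the example, and the slope is approximated by the difference between consecutive logged losses.

```python
def steepest_slope_lr(lrs, losses):
    """Return the learning rate at the start of the steepest loss decrease."""
    if len(lrs) != len(losses) or len(lrs) < 2:
        raise ValueError("need two or more (lr, loss) pairs of equal length")

    # Finite-difference slope of the loss between neighbouring steps.
    slopes = [losses[i + 1] - losses[i] for i in range(len(losses) - 1)]

    # The most negative slope marks the sharpest downward step.
    steepest = min(range(len(slopes)), key=slopes.__getitem__)
    return lrs[steepest]


# Hypothetical finder output: the loss falls, bottoms out, then diverges.
lrs = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1.0]
losses = [2.30, 2.28, 2.10, 1.20, 0.90, 4.00]

suggested = steepest_slope_lr(lrs, losses)
lowest_loss_lr = lrs[losses.index(min(losses))]
print(suggested, lowest_loss_lr)
```

In this made-up run the lowest loss occurs at lr=0.1, right before the loss blows up, while the steepest decrease starts at lr=0.001. Picking the latter leaves a safety margin below the point where training becomes unstable.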
The parameters of the algorithm can be seen below.