Result¶
Lightning has two result objects, TrainResult and EvalResult.
Use these to control (a combined sketch follows this list):
When to log (each step and/or epoch aggregate).
Where to log (progress bar or a logger).
How to sync across accelerators.
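For example, a single training_step can exercise all three controls at once. The sketch below is illustrative only; the loss computation and metric names are placeholders:
def training_step(self, batch, batch_idx):
    loss = ...
    result = pl.TrainResult(minimize=loss)
    # log per step AND as an epoch aggregate, to both the logger and the progress bar,
    # and reduce across devices when running on multiple GPUs/TPUs
    result.log('train_loss', loss, on_step=True, on_epoch=True, prog_bar=True, sync_dist=True)
    return result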
Training loop example¶
Return a TrainResult from the training loop.
def training_step(self, batch_subset, batch_idx):
    loss = ...
    result = pl.TrainResult(minimize=loss)
    result.log('train_loss', loss, prog_bar=True)
    return result
If you’d like to do something special with the outputs other than logging, implement training_epoch_end.
def training_step(self, batch, batch_idx):
    result = pl.TrainResult(loss)
    result.some_prediction = some_prediction
    return result

def training_epoch_end(self, training_step_output_result):
    all_train_predictions = training_step_output_result.some_prediction
    training_step_output_result.some_new_prediction = some_new_prediction
    return training_step_output_result
Validation/Test loop example¶
Return an EvalResult object from the validation/test loop.
def validation_step(self, batch, batch_idx):
    some_metric = ...
    result = pl.EvalResult(checkpoint_on=some_metric)
    result.log('some_metric', some_metric, prog_bar=True)
    return result
If you’d like to do something special with the outputs other than logging, implement validation_epoch_end (or test_epoch_end for the test loop).
def validation_step(self, batch, batch_idx):
    result = pl.EvalResult(checkpoint_on=some_metric)
    result.a_prediction = some_prediction
    return result

def validation_epoch_end(self, validation_step_output_result):
    all_validation_step_predictions = validation_step_output_result.a_prediction
    # do something with the predictions from all validation_steps
    return validation_step_output_result
TrainResult¶
The basic usage of TrainResult is:
minimize¶
def training_step(...):
    return TrainResult(some_metric)
checkpoint/early_stop¶
If you are only using a training loop (no val), you can also specify what to monitor for checkpointing or early stopping:
def training_step(...):
    return TrainResult(some_metric, checkpoint_on=metric_a, early_stop_on=metric_b)
In the manual loop, checkpointing and early stopping are based only on the returned loss. With a TrainResult you can change the monitored value every batch, or even monitor a different metric for each purpose (a sketch follows the snippet below).
# early stop + checkpoint can only use the `loss` when done manually via dictionaries
def training_step(...):
    return loss

def training_step(...):
    return {'loss': loss}
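By contrast, a minimal sketch of the TrainResult version; metric_a and metric_b are placeholder metrics computed each batch:
def training_step(self, batch, batch_idx):
    loss = ...
    metric_a = ...
    metric_b = ...
    # checkpoint on one metric, early stop on another;
    # since this runs every batch, the monitored values can change per batch
    return pl.TrainResult(minimize=loss, checkpoint_on=metric_a, early_stop_on=metric_b)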
logging¶
The main benefit of the TrainResult is automatic logging at whatever level you want.
result = TrainResult(loss)
result.log('train_loss', loss)
# equivalent
result.log('train_loss', loss, on_step=True, on_epoch=False, logger=True, prog_bar=False, reduce_fx=torch.mean)
By default, any log call logs only that step’s metrics to the logger. To change when and where to log, update the defaults as needed.
Change where to log:
# to logger only (default)
result.log('train_loss', loss)
# logger + progress bar
result.log('train_loss', loss, prog_bar=True)
# progress bar only
result.log('train_loss', loss, prog_bar=True, logger=False)
Sometimes you may also want to get epoch level statistics:
# loss at this step
result.log('train_loss', loss)
# loss for the epoch
result.log('train_loss', loss, on_step=False, on_epoch=True)
# loss for the epoch AND step
# the logger will show 2 charts: step_train_loss, epoch_train_loss
result.log('train_loss', loss, on_epoch=True)
Finally, you can use your own reduction function instead:
# the total sum for all batches of an epoch
result.log('train_loss', loss, on_epoch=True, reduce_fx=torch.sum)

def my_reduce_fx(all_train_loss):
    # reduce the per-step values collected over the epoch, e.g. take the median
    return all_train_loss.median()

result.log('train_loss', loss, on_epoch=True, reduce_fx=my_reduce_fx)
Note
Use this ONLY when your loop is simple and does nothing more than log.
Finally, you may need logger-specific logging, such as images:
def training_step(...):
    result = TrainResult(some_metric)
    result.log('train_loss', loss)

    # also log artifacts your logger supports natively, e.g. figures
    self.logger.experiment.log_figure(...)

    return result
Sync across devices¶
When training on multiple GPUs/CPUs/TPU cores, calculate the global mean of a logged metric as follows:
result.log('train_loss', loss, sync_dist=True)
TrainResult API¶
class pytorch_lightning.core.step_result.TrainResult(minimize=None, early_stop_on=None, checkpoint_on=None, hiddens=None)

Bases: pytorch_lightning.core.step_result.Result

Used in the training loop to auto-log to a logger or progress bar without needing to define a training_step_end or training_epoch_end method.

Example:

def training_step(self, batch, batch_idx):
    loss = ...
    result = pl.TrainResult(loss)
    result.log('train_loss', loss)
    return result

# without a val/test loop, the model can also checkpoint or early stop on a training metric
def training_step(self, batch, batch_idx):
    loss = ...
    result = pl.TrainResult(loss, early_stop_on=loss, checkpoint_on=loss)
    result.log('train_loss', loss)
    return result

log(name, value, prog_bar=False, logger=True, on_step=True, on_epoch=False, reduce_fx=torch.mean, tbptt_reduce_fx=torch.mean, tbptt_pad_token=0, enable_graph=False, sync_dist=False, sync_dist_op='mean', sync_dist_group=None)

Log a key/value pair.

Example:

result.log('train_loss', loss)

# defaults used
result.log(
    name, value,
    on_step=True, on_epoch=False,
    logger=True, prog_bar=False,
    reduce_fx=torch.mean, enable_graph=False
)

Parameters
    name – key name
    value – the value to log
    on_step (bool) – if True, logs the value at this step
    on_epoch (bool) – if True, logs the value aggregated over the epoch
    tbptt_reduce_fx (Callable) – function to reduce over truncated back-propagation steps
    enable_graph (bool) – if True, will not auto-detach the graph
    sync_dist (bool) – if True, reduces the metric across GPUs/TPUs

log_dict(dictionary, prog_bar=False, logger=True, on_step=False, on_epoch=True, reduce_fx=torch.mean, tbptt_reduce_fx=torch.mean, tbptt_pad_token=0, enable_graph=False, sync_dist=False, sync_dist_op='mean', sync_dist_group=None)

Log a dictionary of values at once.

Example:

values = {'loss': loss, 'acc': acc, ..., 'metric_n': metric_n}
result.log_dict(values)

Parameters
    on_step (bool) – if True, logs the values at this step
    on_epoch (bool) – if True, logs the values aggregated over the epoch
    tbptt_reduce_fx (Callable) – function to reduce over truncated back-propagation steps
    enable_graph (bool) – if True, will not auto-detach the graph
    sync_dist (bool) – if True, reduces the metric across GPUs/TPUs
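Putting the two methods together, a training_step might log several metrics at once with log_dict. This is a minimal sketch; the metric names are placeholders:
def training_step(self, batch, batch_idx):
    loss = ...
    acc = ...
    result = pl.TrainResult(minimize=loss)
    # every key/value pair in the dictionary is logged with the same defaults as .log()
    result.log_dict({'train_loss': loss, 'train_acc': acc})
    return result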
EvalResult¶
The EvalResult object has the same usage as the TrainResult object.
def validation_step(...):
    return EvalResult()

def test_step(...):
    return EvalResult()
However, there are some differences:
Eval minimize¶
There is no minimize argument (since we don’t learn during validation).
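A minimal sketch of the contrast (loss and some_metric are placeholders):
# training: a value to minimize is passed
result = pl.TrainResult(minimize=loss)

# validation/test: nothing is minimized, only metrics to monitor
result = pl.EvalResult(checkpoint_on=some_metric)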
Eval checkpoint/early_stopping¶
If checkpoint_on or early_stop_on is defined in both the TrainResult and the EvalResult, the one in the EvalResult takes precedence.
def training_step(...):
    return TrainResult(loss, checkpoint_on=metric, early_stop_on=metric)

# metric_a and metric_b will be used for the callbacks and NOT metric
def validation_step(...):
    return EvalResult(checkpoint_on=metric_a, early_stop_on=metric_b)
Eval logging¶
Logging has the same behavior as TrainResult but the logging defaults are different:
# TrainResult logs by default at each step only
TrainResult().log('val', val, on_step=True, on_epoch=False, logger=True, prog_bar=False, reduce_fx=torch.mean)
# EvalResult logs by default at the end of an epoch only
EvalResult().log('val', val, on_step=False, on_epoch=True, logger=True, prog_bar=False, reduce_fx=torch.mean)
Val/Test loop¶
An EvalResult can be used in both validation_step and test_step.
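For example, the same pattern works unchanged in both hooks. A minimal sketch with placeholder metrics:
def validation_step(self, batch, batch_idx):
    val_loss = ...
    result = pl.EvalResult(checkpoint_on=val_loss)
    result.log('val_loss', val_loss)
    return result

def test_step(self, batch, batch_idx):
    test_loss = ...
    result = pl.EvalResult()
    result.log('test_loss', test_loss)
    return result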
Sync across devices¶
When training on multiple GPUs/CPUs/TPU cores, calculate the global mean of a logged metric as follows:
result.log('val_loss', loss, sync_dist=True)
EvalResult API¶
class pytorch_lightning.core.step_result.EvalResult(early_stop_on=None, checkpoint_on=None, hiddens=None)

Bases: pytorch_lightning.core.step_result.Result

Used in the validation/test loop to auto-log to a logger or progress bar without needing to define a _step_end or _epoch_end method.

Example:

def validation_step(self, batch, batch_idx):
    loss = ...
    result = EvalResult()
    result.log('val_loss', loss)
    return result

def test_step(self, batch, batch_idx):
    loss = ...
    result = EvalResult()
    result.log('test_loss', loss)
    return result

log(name, value, prog_bar=False, logger=True, on_step=False, on_epoch=True, reduce_fx=torch.mean, tbptt_reduce_fx=torch.mean, tbptt_pad_token=0, enable_graph=False, sync_dist=False, sync_dist_op='mean', sync_dist_group=None)

Log a key/value pair.

Example:

result.log('val_loss', loss)

# defaults used
result.log(
    name, value,
    on_step=False, on_epoch=True,
    logger=True, prog_bar=False,
    reduce_fx=torch.mean
)

Parameters
    name – key name
    value – the value to log
    on_step (bool) – if True, logs the output of validation_step or test_step at each step
    on_epoch (bool) – if True, logs the value aggregated over the epoch
    tbptt_reduce_fx (Callable) – function to reduce over truncated back-propagation steps
    enable_graph (bool) – if True, will not auto-detach the graph
    sync_dist (bool) – if True, reduces the metric across GPUs/TPUs

log_dict(dictionary, prog_bar=False, logger=True, on_step=False, on_epoch=True, reduce_fx=torch.mean, tbptt_reduce_fx=torch.mean, tbptt_pad_token=0, enable_graph=False, sync_dist=False, sync_dist_op='mean', sync_dist_group=None)

Log a dictionary of values at once.

Example:

values = {'loss': loss, 'acc': acc, ..., 'metric_n': metric_n}
result.log_dict(values)

Parameters
    on_step (bool) – if True, logs the values at this step
    on_epoch (bool) – if True, logs the values aggregated over the epoch
    tbptt_reduce_fx (Callable) – function to reduce over truncated back-propagation steps
    enable_graph (bool) – if True, will not auto-detach the graph
    sync_dist (bool) – if True, reduces the metric across GPUs/TPUs

write(name, values, filename='predictions.pt')

Add a feature name and value pair to the collection of predictions that will be written to disk on validation_end or test_end. If running on multiple GPUs, you will get n_gpu separate prediction files, with the rank prepended to the filename.

Example:

result = pl.EvalResult()
result.write('ids', [0, 1, 2])
result.write('preds', ['cat', 'dog', 'dog'])

write_dict(predictions_dict, filename='predictions.pt')

Calls EvalResult.write() for each key/value pair in predictions_dict.

It is recommended that you use this method instead of .write() if you need to store more than one column of predictions in your output file.

Example:

predictions_to_write = {'preds': ['cat', 'dog'], 'ids': tensor([0, 1])}
result.write_dict(predictions_to_write)