
pytorch_lightning.metrics.classification module

class pytorch_lightning.metrics.classification.AUROC(pos_label=1, reduce_group=None, reduce_op=None)

Bases: pytorch_lightning.metrics.metric.TensorMetric

Computes the area under the curve (AUC) of the receiver operating characteristic (ROC).


>>> pred = torch.tensor([0, 1, 2, 3])
>>> target = torch.tensor([0, 1, 2, 2])
>>> metric = AUROC()
>>> metric(pred, target)

Parameters

  • pos_label (int) – positive label indicator

  • reduce_group (Optional[Any]) – the process group to reduce metric results from DDP

  • reduce_op (Optional[Any]) – the operation to perform for DDP reduction

forward(pred, target, sample_weight=None)

Actual metric computation

Returns

classification score

Return type

Tensor

_device = None
_dtype = None
class pytorch_lightning.metrics.classification.Accuracy(num_classes=None, reduction='elementwise_mean', reduce_group=None, reduce_op=None)

Bases: pytorch_lightning.metrics.metric.TensorMetric

Computes the accuracy classification score


>>> pred = torch.tensor([0, 1, 2, 3])
>>> target = torch.tensor([0, 1, 2, 2])
>>> metric = Accuracy()
>>> metric(pred, target)

Parameters

  • num_classes (Optional[int]) – number of classes

  • reduction (str) –

    a method for reducing accuracies over labels (default: takes the mean). Available reduction methods:

    • elementwise_mean: takes the mean

    • none: pass array

    • sum: add elements

  • reduce_group (Optional[Any]) – the process group to reduce metric results from DDP

  • reduce_op (Optional[Any]) – the operation to perform for DDP reduction

forward(pred, target)

Actual metric computation

Parameters

  • pred (Tensor) – predicted labels

  • target (Tensor) – ground truth labels

Return type

Tensor

Returns

A Tensor with the classification score.

_device = None
_dtype = None
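
The three `reduction` options listed above can be illustrated with a minimal pure-Python sketch. The helper name `reduce_scores` and the per-class values are hypothetical and chosen only for illustration; this is not the library's implementation, which operates on tensors.

```python
from statistics import mean


def reduce_scores(scores, reduction="elementwise_mean"):
    """Reduce per-class scores following the documented option names.

    Hypothetical helper for illustration; not part of pytorch_lightning.
    """
    if reduction == "elementwise_mean":
        return mean(scores)      # average over the classes
    if reduction == "sum":
        return sum(scores)       # add the per-class scores
    if reduction == "none":
        return list(scores)      # pass the per-class scores through unchanged
    raise ValueError(f"unknown reduction: {reduction!r}")


per_class = [1.0, 1.0, 0.5, 0.0]  # made-up per-class scores
mean_score = reduce_scores(per_class)
total_score = reduce_scores(per_class, "sum")
raw_scores = reduce_scores(per_class, "none")
```

With `none`, the caller receives one value per class and can apply any other aggregation afterwards.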
class pytorch_lightning.metrics.classification.AveragePrecision(pos_label=1, reduce_group=None, reduce_op=None)

Bases: pytorch_lightning.metrics.metric.TensorMetric

Computes the average precision score


>>> pred = torch.tensor([0, 1, 2, 3])
>>> target = torch.tensor([0, 1, 2, 2])
>>> metric = AveragePrecision()
>>> metric(pred, target)

Parameters

  • pos_label (int) – positive label indicator

  • reduce_group (Optional[Any]) – the process group to reduce metric results from DDP

  • reduce_op (Optional[Any]) – the operation to perform for DDP reduction

forward(pred, target, sample_weight=None)

Actual metric computation

Returns

classification score

Return type

Tensor

_device = None
_dtype = None
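
As a rough illustration of what an average precision score summarizes, the sketch below applies one common definition, the step-wise sum over a precision-recall curve (the rule scikit-learn uses). Whether `AveragePrecision` follows exactly this rule is an assumption, and `average_precision` is a hypothetical helper name. The curve values are copied from the `PrecisionRecall` example in this module, which uses the same `pred` and `target`.

```python
def average_precision(precision, recall):
    """Step-wise area under a precision-recall curve.

    Expects recall in decreasing order, as in the PrecisionRecall example.
    Hypothetical helper for illustration; not part of pytorch_lightning.
    """
    return sum(
        (recall[i] - recall[i + 1]) * precision[i]
        for i in range(len(recall) - 1)
    )


# Values from the PrecisionRecall example (0.3333 written as 1/3).
precision = [1 / 3, 0.0, 0.0, 1.0]
recall = [1.0, 0.0, 0.0, 0.0]
ap = average_precision(precision, recall)
```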
class pytorch_lightning.metrics.classification.ConfusionMatrix(normalize=False, reduce_group=None, reduce_op=None)

Bases: pytorch_lightning.metrics.metric.TensorMetric

Computes the confusion matrix C where each entry C_{i,j} is the number of observations in group i that were predicted in group j.


>>> pred = torch.tensor([0, 1, 2, 2])
>>> target = torch.tensor([0, 1, 2, 2])
>>> metric = ConfusionMatrix()
>>> metric(pred, target)
tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 2.]])

Parameters

  • normalize (bool) – whether to compute a normalized confusion matrix

  • reduce_group (Optional[Any]) – the process group to reduce metric results from DDP

  • reduce_op (Optional[Any]) – the operation to perform for DDP reduction

forward(pred, target)

Actual metric computation

Parameters

  • pred (Tensor) – predicted labels

  • target (Tensor) – ground truth labels

Return type

Tensor

Returns

A Tensor with the confusion matrix.

_device = None
_dtype = None
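
The definition of C_{i,j} above can be sketched in plain Python. The function below is a hypothetical illustration, not the library's code. In particular, dividing each row by its total when `normalize=True` is an assumption about what "normalized" means here.

```python
def confusion_matrix(pred, target, normalize=False):
    """Count how often true class i was predicted as class j.

    Hypothetical helper for illustration; not part of pytorch_lightning.
    """
    num_classes = max(max(pred), max(target)) + 1
    matrix = [[0.0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        matrix[t][p] += 1.0  # row: true class, column: predicted class
    if normalize:
        # Assumption: divide each row by its total; empty rows stay zero.
        matrix = [
            [v / sum(row) if sum(row) else 0.0 for v in row]
            for row in matrix
        ]
    return matrix


cm = confusion_matrix([0, 1, 2, 2], [0, 1, 2, 2])
cm_norm = confusion_matrix([0, 1, 2, 3], [0, 1, 2, 2], normalize=True)
```

The first call reproduces the counts shown in the example above; the second shows a row split between a correct and an incorrect prediction.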
class pytorch_lightning.metrics.classification.DiceCoefficient(include_background=False, nan_score=0.0, no_fg_score=0.0, reduction='elementwise_mean', reduce_group=None, reduce_op=None)

Bases: pytorch_lightning.metrics.metric.TensorMetric

Computes the dice coefficient


>>> pred = torch.tensor([[0.85, 0.05, 0.05, 0.05],
...                      [0.05, 0.85, 0.05, 0.05],
...                      [0.05, 0.05, 0.85, 0.05],
...                      [0.05, 0.05, 0.05, 0.85]])
>>> target = torch.tensor([0, 1, 3, 2])
>>> metric = DiceCoefficient()
>>> metric(pred, target)

Parameters

  • include_background (bool) – whether to also compute dice for the background

  • nan_score (float) – score to return if a NaN occurs during computation (zero denominator)

  • no_fg_score (float) – score to return if no foreground pixel was found in target

  • reduction (str) –

    a method for reducing accuracies over labels (default: takes the mean). Available reduction methods:

    • elementwise_mean: takes the mean

    • none: pass array

    • sum: add elements

  • reduce_group (Optional[Any]) – the process group to reduce metric results from DDP

  • reduce_op (Optional[Any]) – the operation to perform for DDP reduction

forward(pred, target)

Actual metric computation

Parameters

  • pred (Tensor) – predicted probability for each label

  • target (Tensor) – ground truth labels

Returns

the calculated dice coefficient

Return type

Tensor

_device = None
_dtype = None
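
For orientation, the usual per-class dice formula, 2·TP / (2·TP + FP + FN), can be sketched in plain Python. This is a hypothetical illustration rather than the library's code: it assumes the predicted label is the argmax of each probability row, and `dice_score` is a made-up helper name.

```python
def dice_score(pred_labels, target, label, nan_score=0.0):
    """Dice coefficient for one class: 2*TP / (2*TP + FP + FN).

    Hypothetical helper for illustration; not part of pytorch_lightning.
    """
    pairs = list(zip(pred_labels, target))
    tp = sum(p == label and t == label for p, t in pairs)
    fp = sum(p == label and t != label for p, t in pairs)
    fn = sum(p != label and t == label for p, t in pairs)
    denom = 2 * tp + fp + fn
    # Fall back to nan_score when the class never occurs (zero denominator).
    return 2 * tp / denom if denom else nan_score


pred_labels = [0, 1, 2, 3]  # argmax of each row of `pred` in the example above
target = [0, 1, 3, 2]
# Skip class 0, treating it as background.
scores = [dice_score(pred_labels, target, label) for label in (1, 2, 3)]
```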
class pytorch_lightning.metrics.classification.F1(num_classes=None, reduction='elementwise_mean', reduce_group=None, reduce_op=None)

Bases: pytorch_lightning.metrics.metric.TensorMetric

Computes the F1 score, the harmonic mean of precision and recall. It ranges from 0 (worst) to 1 (perfect).


>>> pred = torch.tensor([0, 1, 2, 3])
>>> target = torch.tensor([0, 1, 2, 2])
>>> metric = F1()
>>> metric(pred, target)

Parameters

  • num_classes (Optional[int]) – number of classes

  • reduction (str) –

    a method for reducing accuracies over labels (default: takes the mean). Available reduction methods:

    • elementwise_mean: takes the mean

    • none: pass array

    • sum: add elements

  • reduce_group (Optional[Any]) – the process group to reduce metric results from DDP

  • reduce_op (Optional[Any]) – the operation to perform for DDP reduction

forward(pred, target)

Actual metric computation

Returns

classification score

Return type

Tensor

_device = None
_dtype = None
class pytorch_lightning.metrics.classification.FBeta(beta, num_classes=None, reduction='elementwise_mean', reduce_group=None, reduce_op=None)

Bases: pytorch_lightning.metrics.metric.TensorMetric

Computes the FBeta score, the weighted harmonic mean of precision and recall.

It ranges from 0 (worst) to 1 (perfect).


>>> pred = torch.tensor([0, 1, 2, 3])
>>> target = torch.tensor([0, 1, 2, 2])
>>> metric = FBeta(0.25)
>>> metric(pred, target)

Parameters

  • beta (float) – determines the weight of recall in the combined score.

  • num_classes (Optional[int]) – number of classes

  • reduction (str) –

    a method for reducing accuracies over labels (default: takes the mean). Available reduction methods:

    • elementwise_mean: takes the mean

    • none: pass array

    • sum: add elements

  • reduce_group (Optional[Any]) – the process group to reduce metric results from DDP

  • reduce_op (Optional[Any]) – the operation to perform for DDP reduction

forward(pred, target)

Actual metric computation

Returns

classification score

Return type

Tensor

_device = None
_dtype = None
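
To make the role of `beta` concrete, here is a minimal sketch of the standard F-beta formula, (1 + β²)·P·R / (β²·P + R), where P is precision and R is recall. The helper `fbeta_score` and the input values are hypothetical; returning 0.0 for a zero denominator is an assumption, and the library may handle that case differently.

```python
def fbeta_score(precision, recall, beta):
    """Weighted harmonic mean of precision and recall.

    Hypothetical helper for illustration; not part of pytorch_lightning.
    """
    denom = beta ** 2 * precision + recall
    if denom == 0:
        return 0.0  # assumption: score is 0 when precision and recall are both 0
    return (1 + beta ** 2) * precision * recall / denom


# Made-up inputs: low precision, perfect recall.
f_one = fbeta_score(0.5, 1.0, beta=1.0)       # beta = 1 is the F1 score
f_quarter = fbeta_score(0.5, 1.0, beta=0.25)  # beta < 1 leans toward precision
f_two = fbeta_score(0.5, 1.0, beta=2.0)       # beta > 1 leans toward recall
```

Because recall is the stronger of the two inputs here, the score rises as `beta` grows.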
class pytorch_lightning.metrics.classification.IoU(remove_bg=False, reduction='elementwise_mean')

Bases: pytorch_lightning.metrics.metric.TensorMetric

Computes the intersection over union.


>>> pred = torch.tensor([[0, 0, 0, 0, 0, 0, 0, 0],
...                      [0, 0, 1, 1, 1, 0, 0, 0],
...                      [0, 0, 0, 0, 0, 0, 0, 0]])
>>> target = torch.tensor([[0, 0, 0, 0, 0, 0, 0, 0],
...                        [0, 0, 0, 1, 1, 1, 0, 0],
...                        [0, 0, 0, 0, 0, 0, 0, 0]])
>>> metric = IoU()
>>> metric(pred, target)

Parameters

  • remove_bg (bool) – flag stating whether a background class is included in the inputs. If True, the background class is removed; if False, IoU is returned over all classes. Assumes that the background is class 0 in the input tensor.

  • reduction (str) –

    a method for reducing IoU over labels (default: takes the mean) Available reduction methods:

    • elementwise_mean: takes the mean

    • none: pass array

    • sum: add elements

forward(y_pred, y_true, sample_weight=None)

Actual metric calculation.

_device = None
_dtype = None
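
The intersection-over-union idea can be checked by hand on the masks from the example above. The following pure-Python sketch is a hypothetical illustration with a made-up helper name `iou`; it is not the library's implementation.

```python
def iou(pred, target, label):
    """Intersection over union of one class across two label masks.

    Hypothetical helper for illustration; not part of pytorch_lightning.
    """
    pairs = [
        (p, t)
        for pred_row, target_row in zip(pred, target)
        for p, t in zip(pred_row, target_row)
    ]
    intersection = sum(p == label and t == label for p, t in pairs)
    union = sum(p == label or t == label for p, t in pairs)
    return intersection / union if union else float("nan")


pred = [[0, 0, 0, 0, 0, 0, 0, 0],
        [0, 0, 1, 1, 1, 0, 0, 0],
        [0, 0, 0, 0, 0, 0, 0, 0]]
target = [[0, 0, 0, 0, 0, 0, 0, 0],
          [0, 0, 0, 1, 1, 1, 0, 0],
          [0, 0, 0, 0, 0, 0, 0, 0]]
iou_foreground = iou(pred, target, 1)  # 2 shared pixels out of 4 covered
iou_background = iou(pred, target, 0)  # 20 shared pixels out of 22 covered
```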
class pytorch_lightning.metrics.classification.MulticlassPrecisionRecall(num_classes=None, reduce_group=None, reduce_op=None)

Bases: pytorch_lightning.metrics.metric.TensorCollectionMetric

Computes the multiclass PR Curve


>>> pred = torch.tensor([[0.85, 0.05, 0.05, 0.05],
...                     [0.05, 0.85, 0.05, 0.05],
...                     [0.05, 0.05, 0.85, 0.05],
...                     [0.05, 0.05, 0.05, 0.85]])
>>> target = torch.tensor([0, 1, 3, 2])
>>> metric = MulticlassPrecisionRecall()
>>> metric(pred, target)   
((tensor([1., 1.]), tensor([1., 0.]), tensor([0.8500])),
 (tensor([1., 1.]), tensor([1., 0.]), tensor([0.8500])),
 (tensor([0.2500, 0.0000, 1.0000]), tensor([1., 0., 0.]), tensor([0.0500, 0.8500])),
 (tensor([0.2500, 0.0000, 1.0000]), tensor([1., 0., 0.]), tensor([0.0500, 0.8500])))

Parameters

  • num_classes (Optional[int]) – number of classes

  • reduction –

    a method for reducing accuracies over labels (default: takes the mean). Available reduction methods:

    • elementwise_mean: takes the mean

    • none: pass array

    • sum: add elements

  • reduce_group (Optional[Any]) – the process group to reduce metric results from DDP

  • reduce_op (Optional[Any]) – the operation to perform for DDP reduction

forward(pred, target, sample_weight=None)

Actual metric computation

Parameters

  • pred (Tensor) – predicted probability for each label

  • target (Tensor) – ground truth labels

  • sample_weight (Optional[Sequence]) – weights for each sample, defining the sample’s impact on the score

Returns

A tuple consisting of one tuple per class, holding precision, recall and thresholds

Return type

Tuple[Tuple[Tensor, Tensor, Tensor], ...]

_device = None
_dtype = None
class pytorch_lightning.metrics.classification.MulticlassROC(num_classes=None, reduce_group=None, reduce_op=None)

Bases: pytorch_lightning.metrics.metric.TensorCollectionMetric

Computes the multiclass ROC


>>> pred = torch.tensor([[0.85, 0.05, 0.05, 0.05],
...                     [0.05, 0.85, 0.05, 0.05],
...                     [0.05, 0.05, 0.85, 0.05],
...                     [0.05, 0.05, 0.05, 0.85]])
>>> target = torch.tensor([0, 1, 3, 2])
>>> metric = MulticlassROC()
>>> classes_roc = metric(pred, target)
>>> metric(pred, target)   
((tensor([0., 0., 1.]), tensor([0., 1., 1.]), tensor([1.8500, 0.8500, 0.0500])),
 (tensor([0., 0., 1.]), tensor([0., 1., 1.]), tensor([1.8500, 0.8500, 0.0500])),
 (tensor([0.0000, 0.3333, 1.0000]), tensor([0., 0., 1.]), tensor([1.8500, 0.8500, 0.0500])),
 (tensor([0.0000, 0.3333, 1.0000]), tensor([0., 0., 1.]), tensor([1.8500, 0.8500, 0.0500])))

Parameters

  • num_classes (Optional[int]) – number of classes

  • reduction –

    a method for reducing accuracies over labels (default: takes the mean). Available reduction methods:

    • elementwise_mean: takes the mean

    • none: pass array

    • sum: add elements

  • reduce_group (Optional[Any]) – the process group to reduce metric results from DDP

  • reduce_op (Optional[Any]) – the operation to perform for DDP reduction

forward(pred, target, sample_weight=None)

Actual metric computation

Parameters

  • pred (Tensor) – predicted probability for each label

  • target (Tensor) – ground truth labels

  • sample_weight (Optional[Sequence]) – weights for each sample, defining the sample’s impact on the score

Returns

A tuple consisting of one tuple per class, holding false positive rate, true positive rate and thresholds

Return type

Tuple[Tuple[Tensor, Tensor, Tensor], ...]

_device = None
_dtype = None
class pytorch_lightning.metrics.classification.Precision(num_classes=None, reduction='elementwise_mean', reduce_group=None, reduce_op=None)

Bases: pytorch_lightning.metrics.metric.TensorMetric

Computes the precision score


>>> pred = torch.tensor([0, 1, 2, 3])
>>> target = torch.tensor([0, 1, 2, 2])
>>> metric = Precision(num_classes=4)
>>> metric(pred, target)

Parameters

  • num_classes (Optional[int]) – number of classes

  • reduction (str) –

    a method for reducing accuracies over labels (default: takes the mean). Available reduction methods:

    • elementwise_mean: takes the mean

    • none: pass array

    • sum: add elements

  • reduce_group (Optional[Any]) – the process group to reduce metric results from DDP

  • reduce_op (Optional[Any]) – the operation to perform for DDP reduction

forward(pred, target)

Actual metric computation

Parameters

  • pred (Tensor) – predicted labels

  • target (Tensor) – ground truth labels

Return type

Tensor

Returns

A Tensor with the classification score.

_device = None
_dtype = None
class pytorch_lightning.metrics.classification.PrecisionRecall(pos_label=1, reduce_group=None, reduce_op=None)

Bases: pytorch_lightning.metrics.metric.TensorCollectionMetric

Computes the precision recall curve


>>> pred = torch.tensor([0, 1, 2, 3])
>>> target = torch.tensor([0, 1, 2, 2])
>>> metric = PrecisionRecall()
>>> prec, recall, thr = metric(pred, target)
>>> prec
tensor([0.3333, 0.0000, 0.0000, 1.0000])
>>> recall
tensor([1., 0., 0., 0.])
>>> thr
tensor([1., 2., 3.])

Parameters

  • pos_label (int) – positive label indicator

  • reduce_group (Optional[Any]) – the process group to reduce metric results from DDP

  • reduce_op (Optional[Any]) – the operation to perform for DDP reduction

forward(pred, target, sample_weight=None)

Actual metric computation

Return type

Tuple[Tensor, Tensor, Tensor]

Returns

  • precision values

  • recall values

  • threshold values

_device = None
_dtype = None
class pytorch_lightning.metrics.classification.ROC(pos_label=1, reduce_group=None, reduce_op=None)

Bases: pytorch_lightning.metrics.metric.TensorCollectionMetric

Computes the receiver operating characteristic (ROC) curve


>>> pred = torch.tensor([0, 1, 2, 3])
>>> target = torch.tensor([0, 1, 2, 2])
>>> metric = ROC()
>>> fps, tps, thresholds = metric(pred, target)
>>> fps
tensor([0.0000, 0.3333, 0.6667, 0.6667, 1.0000])
>>> tps
tensor([0., 0., 0., 1., 1.])
>>> thresholds
tensor([4., 3., 2., 1., 0.])

Parameters

  • pos_label (int) – positive label indicator

  • reduce_group (Optional[Any]) – the process group to reduce metric results from DDP

  • reduce_op (Optional[Any]) – the operation to perform for DDP reduction

forward(pred, target, sample_weight=None)

Actual metric computation

Return type

Tuple[Tensor, Tensor, Tensor]

Returns

  • false positive rate

  • true positive rate

  • thresholds

_device = None
_dtype = None
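
The link between this curve and the `AUROC` metric can be sketched by integrating the false and true positive rates from the example above with the trapezoidal rule. The helper `trapezoid_auc` is hypothetical, and that `AUROC` uses this exact integration rule is an assumption.

```python
def trapezoid_auc(x, y):
    """Area under the piecewise-linear curve through the points (x[i], y[i]).

    Hypothetical helper for illustration; not part of pytorch_lightning.
    """
    return sum(
        (x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2
        for i in range(len(x) - 1)
    )


# Rates from the ROC example (0.3333 and 0.6667 written as exact fractions).
fps = [0.0, 1 / 3, 2 / 3, 2 / 3, 1.0]
tps = [0.0, 0.0, 0.0, 1.0, 1.0]
auc = trapezoid_auc(fps, tps)
```

Only the last segment, where the true positive rate is already 1, contributes area in this example.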
class pytorch_lightning.metrics.classification.Recall(num_classes=None, reduction='elementwise_mean', reduce_group=None, reduce_op=None)

Bases: pytorch_lightning.metrics.metric.TensorMetric

Computes the recall score


>>> pred = torch.tensor([0, 1, 2, 3])
>>> target = torch.tensor([0, 1, 2, 2])
>>> metric = Recall()
>>> metric(pred, target)

Parameters

  • num_classes (Optional[int]) – number of classes

  • reduction (str) –

    a method for reducing accuracies over labels (default: takes the mean). Available reduction methods:

    • elementwise_mean: takes the mean

    • none: pass array

    • sum: add elements

  • reduce_group (Optional[Any]) – the process group to reduce metric results from DDP

  • reduce_op (Optional[Any]) – the operation to perform for DDP reduction

forward(pred, target)

Actual metric computation

Parameters

  • pred (Tensor) – predicted labels

  • target (Tensor) – ground truth labels

Return type

Tensor

Returns

A Tensor with the classification score.

_device = None
_dtype = None
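
As a reference point for the `Precision` and `Recall` classes, the sketch below computes per-class precision, TP / (TP + FP), and recall, TP / (TP + FN), in plain Python on the `pred` and `target` values used in their examples. The helper `precision_recall` is hypothetical, and returning 0.0 when a class has no predictions or no targets is an assumption; the library may treat that case differently.

```python
def precision_recall(pred, target, label):
    """Precision and recall of one class from label sequences.

    Hypothetical helper for illustration; not part of pytorch_lightning.
    """
    pairs = list(zip(pred, target))
    tp = sum(p == label and t == label for p, t in pairs)
    fp = sum(p == label and t != label for p, t in pairs)
    fn = sum(p != label and t == label for p, t in pairs)
    # Assumption: an undefined ratio (zero denominator) is reported as 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


pred = [0, 1, 2, 3]
target = [0, 1, 2, 2]
per_class = [precision_recall(pred, target, label) for label in range(4)]
```

Class 2 is predicted correctly once but missed once, so its precision is 1.0 while its recall is 0.5; class 3 is predicted but never occurs in `target`.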