pytorch_lightning.metrics.sklearns module¶

class pytorch_lightning.metrics.sklearns.AUC(reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Calculates the Area Under the Curve using the trapezoidal rule

Warning

Every metric call will cause a GPU synchronization, which may slow down your code

Example

>>> y_pred = torch.tensor([0, 1, 2, 3])
>>> y_true = torch.tensor([0, 1, 2, 2])
>>> metric = AUC()
>>> metric(y_pred, y_true)
tensor([4.])

Parameters

reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(x, y)[source]¶

Computes the AUC

Parameters

x¶ (ndarray) – x coordinates.
y¶ (ndarray) – y coordinates.

Return type

float

Returns

AUC calculated with trapezoidal rule
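
The trapezoidal rule can be sketched in plain Python as follows. This is an illustration only, not the library's implementation; the helper name `trapezoid_auc` is hypothetical, and it assumes `x` and `y` are equal-length sequences of numbers.

```python
def trapezoid_auc(x, y):
    """Area under the piecewise-linear curve through the points (x[i], y[i])."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    area = 0.0
    for i in range(1, len(x)):
        width = x[i] - x[i - 1]                # horizontal step
        mean_height = (y[i] + y[i - 1]) / 2.0  # average of the two heights
        area += width * mean_height
    return area


# Same inputs as the AUC example above
print(trapezoid_auc([0, 1, 2, 3], [0, 1, 2, 2]))  # 4.0
```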

class pytorch_lightning.metrics.sklearns.AUROC(average='macro', reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Compute Area Under the Receiver Operating Characteristic Curve (ROC AUC) from prediction scores

Note

This implementation is restricted to binary classification tasks and to multilabel classification tasks in label indicator format.

Warning

Every metric call will cause a GPU synchronization, which may slow down your code

Parameters

average¶ (Optional[str]) –
If None, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data:
- If ‘micro’: Calculate metrics globally by considering each element of the label indicator matrix as a label.
- If ‘macro’: Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.
- If ‘weighted’: Calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label).
- If ‘samples’: Calculate metrics for each instance, and find their average.
reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(y_score, y_true, sample_weight=None)[source]¶

Parameters

y_score¶ (ndarray) – Target scores, can either be probability estimates of the positive class, confidence values, or binary decisions.
y_true¶ (ndarray) – True binary labels in binary label indicators.
sample_weight¶ (Optional[ndarray]) – Sample weights.

Return type

float

Returns

Area Under Receiver Operating Characteristic Curve
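
For the binary case, the ROC AUC equals the probability that a randomly chosen positive sample is scored higher than a randomly chosen negative one, with ties counting one half. The sketch below uses that pairwise definition. It is not how the wrapped scikit-learn routine computes the value; the name `binary_auroc` is hypothetical, and labels are assumed to be 0 and 1.

```python
def binary_auroc(y_score, y_true):
    """Pairwise (Mann-Whitney) estimate of the ROC AUC for 0/1 labels."""
    pos = [s for s, t in zip(y_score, y_true) if t == 1]
    neg = [s for s, t in zip(y_score, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present in y_true")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count half
    return wins / (len(pos) * len(neg))


print(binary_auroc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```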

class pytorch_lightning.metrics.sklearns.Accuracy(normalize=True, reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Calculates the Accuracy Score

Warning

Every metric call will cause a GPU synchronization, which may slow down your code

Example

>>> y_pred = torch.tensor([0, 1, 2, 3])
>>> y_true = torch.tensor([0, 1, 2, 2])
>>> metric = Accuracy()
>>> metric(y_pred, y_true)
tensor([0.7500])

Parameters

normalize¶ (bool) – If False, return the number of correctly classified samples. Otherwise, return the fraction of correctly classified samples.
reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(y_pred, y_true, sample_weight=None)[source]¶

Computes the accuracy

Parameters

y_pred¶ (ndarray) – the array containing the predictions (already in categorical form)
y_true¶ (ndarray) – the array containing the targets (in categorical form)
sample_weight¶ (Optional[ndarray]) – Sample weights.

Return type

float

Returns

Accuracy Score

class pytorch_lightning.metrics.sklearns.AveragePrecision(average='macro', reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Calculates the average precision (AP) score.

Parameters

average¶ (Optional[str]) –
If None, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data:
- If ‘micro’: Calculate metrics globally by considering each element of the label indicator matrix as a label.
- If ‘macro’: Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.
- If ‘weighted’: Calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label).
- If ‘samples’: Calculate metrics for each instance, and find their average.
reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(y_score, y_true, sample_weight=None)[source]¶

Parameters

y_score¶ (ndarray) – Target scores, can either be probability estimates of the positive class, confidence values, or binary decisions.
y_true¶ (ndarray) – True binary labels in binary label indicators.
sample_weight¶ (Optional[ndarray]) – Sample weights.

Return type

float

Returns

average precision score

class pytorch_lightning.metrics.sklearns.BalancedAccuracy(adjusted=False, reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Compute the balanced accuracy score

Warning

Every metric call will cause a GPU synchronization, which may slow down your code

Example

>>> y_pred = torch.tensor([0, 0, 0, 1])
>>> y_true = torch.tensor([0, 0, 1, 1])
>>> metric = BalancedAccuracy()
>>> metric(y_pred, y_true)
tensor([0.7500])

Parameters

adjusted¶ (bool) – If True, the result is adjusted for chance, such that random performance corresponds to 0 and perfect performance corresponds to 1
reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(y_pred, y_true, sample_weight=None)[source]¶

Parameters

y_pred¶ (ndarray) – the array containing the predictions (already in categorical form)
y_true¶ (ndarray) – the array containing the targets (in categorical form)
sample_weight¶ (Optional[ndarray]) – Sample weights.

Return type

float

Returns

balanced accuracy score
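
Balanced accuracy is the unweighted mean of the per-class recall. The following sketch shows that definition along with the chance adjustment described for `adjusted`. The helper name `balanced_accuracy` is hypothetical, and sample weights are left out for brevity.

```python
def balanced_accuracy(y_pred, y_true, adjusted=False):
    """Mean per-class recall, optionally rescaled so chance level maps to 0."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        idx = [i for i, t in enumerate(y_true) if t == c]
        hits = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(hits / len(idx))
    score = sum(recalls) / len(recalls)
    if adjusted:
        if len(classes) < 2:
            raise ValueError("chance adjustment needs at least two classes")
        chance = 1.0 / len(classes)
        score = (score - chance) / (1.0 - chance)
    return score


# Same inputs as the BalancedAccuracy example above
print(balanced_accuracy([0, 0, 0, 1], [0, 0, 1, 1]))                 # 0.75
print(balanced_accuracy([0, 0, 0, 1], [0, 0, 1, 1], adjusted=True))  # 0.5
```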

class pytorch_lightning.metrics.sklearns.CohenKappaScore(labels=None, weights=None, reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Calculates Cohen's kappa: a statistic that measures inter-annotator agreement

Example

>>> y_pred = torch.tensor([1, 2, 0, 2])
>>> y_true = torch.tensor([2, 2, 2, 1])
>>> metric = CohenKappaScore()
>>> metric(y_pred, y_true)
tensor([-0.3333])

Parameters

labels¶ (Optional[Sequence]) – List of labels to index the matrix. This may be used to reorder or select a subset of labels. If none is given, those that appear at least once in y1 or y2 are used in sorted order.
weights¶ (Optional[str]) – string indicating the weighting type used in scoring. None means no weighting, 'linear' means linear weighting and 'quadratic' means quadratic weighting
reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(y1, y2, sample_weight=None)[source]¶

Parameters

y1¶ – Labels assigned by first annotator
y2¶ – Labels assigned by second annotator
sample_weight¶ (Optional[ndarray]) – Sample weights.

Return type

float

Returns

Cohen's kappa score
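
The unweighted kappa compares the observed agreement with the agreement expected by chance from each annotator's label frequencies. A minimal sketch of that computation follows; the name `cohen_kappa` is hypothetical, and the `labels`, `weights` and `sample_weight` options are not handled.

```python
from collections import Counter


def cohen_kappa(y1, y2):
    """Unweighted Cohen's kappa for two equal-length label sequences."""
    n = len(y1)
    # Fraction of items on which both annotators agree
    observed = sum(1 for a, b in zip(y1, y2) if a == b) / n
    # Agreement expected by chance, from the marginal label counts
    c1, c2 = Counter(y1), Counter(y2)
    expected = sum(c1[label] * c2[label] for label in set(c1) | set(c2)) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Same inputs as the CohenKappaScore example above
print(round(cohen_kappa([1, 2, 0, 2], [2, 2, 2, 1]), 4))  # -0.3333
```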

class pytorch_lightning.metrics.sklearns.ConfusionMatrix(labels=None, reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Compute the confusion matrix to evaluate the accuracy of a classification. By definition, a confusion matrix $C$ is such that $C_{i, j}$ is equal to the number of observations known to be in group $i$ but predicted to be in group $j$.

Example

>>> y_pred = torch.tensor([0, 1, 2, 1])
>>> y_true = torch.tensor([0, 1, 2, 2])
>>> metric = ConfusionMatrix()
>>> metric(y_pred, y_true)
tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 1., 1.]])

Parameters

labels¶ (Optional[Sequence]) – List of labels to index the matrix. This may be used to reorder or select a subset of labels. If none is given, those that appear at least once in y_true or y_pred are used in sorted order.
reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(y_pred, y_true)[source]¶

Parameters

y_pred¶ (ndarray) – Estimated targets as returned by a classifier.
y_true¶ (ndarray) – Ground truth (correct) target values.

Return type

ndarray

Returns

Confusion matrix (array of shape [num_classes, num_classes])
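
The definition of $C_{i, j}$ (rows index the true group, columns the predicted group) can be illustrated with a short pure-Python sketch. The function below is hypothetical and returns nested lists rather than an ndarray; pairs whose label is not in `labels` are skipped.

```python
def confusion_matrix(y_pred, y_true, labels=None):
    """Count matrix with true labels on rows and predicted labels on columns."""
    if labels is None:
        # Labels that appear at least once, in sorted order
        labels = sorted(set(y_true) | set(y_pred))
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        if t in index and p in index:
            matrix[index[t]][index[p]] += 1
    return matrix


# Same inputs as the ConfusionMatrix example above
print(confusion_matrix([0, 1, 2, 1], [0, 1, 2, 2]))
# [[1, 0, 0], [0, 1, 0], [0, 1, 1]]
```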

class pytorch_lightning.metrics.sklearns.DCG(k=None, log_base=2, ignore_ties=False, reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Compute discounted cumulative gain

Warning

Every metric call will cause a GPU synchronization, which may slow down your code

Example

>>> y_score = torch.tensor([[.1, .2, .3, 4, 70]])
>>> y_true = torch.tensor([[10, 0, 0, 1, 5]])
>>> metric = DCG()
>>> metric(y_score, y_true)
tensor([9.4995])

Parameters

k¶ (Optional[int]) – only consider the highest k scores in the ranking
log_base¶ (float) – base of the logarithm used for the discount
ignore_ties¶ (bool) – If True, assume there are no ties in y_score for efficiency gains
reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(y_score, y_true, sample_weight=None)[source]¶

Parameters

y_score¶ (ndarray) – target scores, either probability estimates, confidence values or non-thresholded measures of decisions
y_true¶ (ndarray) – Ground truth (correct) target values.
sample_weight¶ (Optional[ndarray]) – Sample weights.

Return type

float

Returns

DCG score
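
DCG ranks the true relevances by predicted score and sums them with a logarithmic discount. The sketch below handles a single query and assumes there are no ties in `y_score` (comparable to `ignore_ties=True`); the function name `dcg` is hypothetical.

```python
import math


def dcg(y_score, y_true, k=None, log_base=2):
    """Discounted cumulative gain for one query, assuming no tied scores."""
    # Indices sorted from highest to lowest predicted score
    order = sorted(range(len(y_score)), key=lambda i: y_score[i], reverse=True)
    if k is not None:
        order = order[:k]
    # The item at 0-based rank r is discounted by log_base(r + 2)
    return sum(y_true[i] / math.log(rank + 2, log_base)
               for rank, i in enumerate(order))


# Same inputs as the DCG example above (single row)
print(round(dcg([.1, .2, .3, 4, 70], [10, 0, 0, 1, 5]), 4))  # 9.4995
```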

class pytorch_lightning.metrics.sklearns.ExplainedVariance(multioutput='variance_weighted', reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Calculates explained variance score

Warning

Every metric call will cause a GPU synchronization, which may slow down your code

Example

>>> y_pred = torch.tensor([2.5, 0.0, 2, 8])
>>> y_true = torch.tensor([3, -0.5, 2, 7])
>>> metric = ExplainedVariance()
>>> metric(y_pred, y_true)
tensor([0.9572])

Parameters

multioutput¶ (Union[str, List[float], None]) – either one of the strings [‘raw_values’, ‘uniform_average’, ‘variance_weighted’] or an array with shape (n_outputs,) that defines how multiple output values should be aggregated.
reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(y_pred, y_true, sample_weight=None)[source]¶

Parameters

y_pred¶ (ndarray) – Estimated target values
y_true¶ (ndarray) – Ground truth (correct) target values.
sample_weight¶ (Optional[ndarray]) – Sample weights.

Returns

Explained variance score
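
For a single output, the explained variance score is one minus the ratio of the residual variance to the variance of the targets. A minimal sketch, with the hypothetical name `explained_variance` and without `multioutput` or `sample_weight` handling:

```python
def explained_variance(y_pred, y_true):
    """1 - Var(y_true - y_pred) / Var(y_true) for a single output."""
    def variance(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / len(values)

    residuals = [t - p for p, t in zip(y_pred, y_true)]
    return 1.0 - variance(residuals) / variance(y_true)


# Same inputs as the ExplainedVariance example above
print(round(explained_variance([2.5, 0.0, 2, 8], [3, -0.5, 2, 7]), 4))  # 0.9572
```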

class pytorch_lightning.metrics.sklearns.F1(labels=None, pos_label=1, average='macro', reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Compute the F1 score, also known as balanced F-score or F-measure. The F1 score can be interpreted as a weighted average of the precision and recall, where an F1 score reaches its best value at 1 and worst score at 0. The relative contributions of precision and recall to the F1 score are equal. The formula for the F1 score is:

$F_1 = 2 \cdot \frac{precision \cdot recall}{precision + recall}$

In the multi-class and multi-label case, this is the weighted average of the F1 score of each class.

Example

>>> y_pred = torch.tensor([0, 1, 2, 3])
>>> y_true = torch.tensor([0, 1, 2, 2])
>>> metric = F1()
>>> metric(y_pred, y_true)
tensor([0.6667])

References

[1] Wikipedia entry for the F1-score

Parameters

labels¶ (Optional[Sequence]) – Integer array of labels.
pos_label¶ (Union[str, int]) – The class to report if average='binary'.
average¶ (Optional[str]) –
This parameter is required for multiclass/multilabel targets. If None, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data:
- 'binary': Only report results for the class specified by pos_label. This is applicable only if targets (y_{true,pred}) are binary.
- 'micro': Calculate metrics globally by counting the total true positives, false negatives and false positives.
- 'macro': Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.
- 'weighted': Calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label). This alters ‘macro’ to account for label imbalance; it can result in an F-score that is not between precision and recall.
- 'samples': Calculate metrics for each instance, and find their average (only meaningful for multilabel classification where this differs from accuracy_score()).
Note that if pos_label is given in binary classification with average != ‘binary’, only that positive class is reported. This behavior is deprecated and will change in version 0.18.
reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(y_pred, y_true, sample_weight=None)[source]¶

Parameters

y_pred¶ (ndarray) – Estimated targets as returned by a classifier.
y_true¶ (ndarray) – Ground truth (correct) target values.
sample_weight¶ (Optional[ndarray]) – Sample weights.

Return type

Union[ndarray, float]

Returns

F1 score of the positive class in binary classification or weighted average of the F1 scores of each class for the multiclass task.

class pytorch_lightning.metrics.sklearns.FBeta(beta, labels=None, pos_label=1, average='macro', reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Compute the F-beta score. The beta parameter determines the weight of recall in the combined score. beta < 1 lends more weight to precision, while beta > 1 favors recall (beta -> 0 considers only precision, beta -> inf only recall).

Example

>>> y_pred = torch.tensor([0, 1, 2, 3])
>>> y_true = torch.tensor([0, 1, 2, 2])
>>> metric = FBeta(beta=0.25)
>>> metric(y_pred, y_true)
tensor([0.7361])

References

[1] R. Baeza-Yates and B. Ribeiro-Neto (2011). Modern Information Retrieval. Addison Wesley, pp. 327-328.
[2] Wikipedia entry for the F1-score

Parameters

beta¶ (float) – Weight of recall relative to precision in the harmonic mean.
labels¶ (Optional[Sequence]) – Integer array of labels.
pos_label¶ (Union[str, int]) – The class to report if average='binary'.
average¶ (Optional[str]) –
This parameter is required for multiclass/multilabel targets. If None, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data:
- 'binary': Only report results for the class specified by pos_label. This is applicable only if targets (y_{true,pred}) are binary.
- 'micro': Calculate metrics globally by counting the total true positives, false negatives and false positives.
- 'macro': Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.
- 'weighted': Calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label). This alters ‘macro’ to account for label imbalance; it can result in an F-score that is not between precision and recall.
- 'samples': Calculate metrics for each instance, and find their average (only meaningful for multilabel classification where this differs from accuracy_score()).
Note that if pos_label is given in binary classification with average != ‘binary’, only that positive class is reported. This behavior is deprecated and will change in version 0.18.
reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(y_pred, y_true, sample_weight=None)[source]¶

Parameters

y_pred¶ (ndarray) – Estimated targets as returned by a classifier.
y_true¶ (ndarray) – Ground truth (correct) target values.
sample_weight¶ (Optional[ndarray]) – Sample weights.

Return type

Union[ndarray, float]

Returns

FBeta score of the positive class in binary classification or weighted average of the FBeta scores of each class for the multiclass task.
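
With `average='macro'`, the score is computed per label from true positives, false positives and false negatives, then averaged without weights. The sketch below illustrates this with a hypothetical helper `macro_fbeta`; setting `beta=1` gives the F1 score. It does not cover the other averaging modes or sample weights.

```python
def macro_fbeta(y_pred, y_true, beta=1.0):
    """Unweighted mean over labels of the per-label F-beta score."""
    labels = sorted(set(y_true) | set(y_pred))
    b2 = beta ** 2
    scores = []
    for c in labels:
        pairs = list(zip(y_pred, y_true))
        tp = sum(1 for p, t in pairs if p == c and t == c)
        fp = sum(1 for p, t in pairs if p == c and t != c)
        fn = sum(1 for p, t in pairs if p != c and t == c)
        # Equivalent to (1 + b2) * P * R / (b2 * P + R), written in counts
        denom = (1 + b2) * tp + b2 * fn + fp
        scores.append((1 + b2) * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


y_pred, y_true = [0, 1, 2, 3], [0, 1, 2, 2]
print(round(macro_fbeta(y_pred, y_true, beta=1.0), 4))   # 0.6667
print(round(macro_fbeta(y_pred, y_true, beta=0.25), 4))  # 0.7361
```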

class pytorch_lightning.metrics.sklearns.Hamming(reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Computes the average Hamming loss

Example

>>> y_pred = torch.tensor([0, 1, 2, 3])
>>> y_true = torch.tensor([1, 1, 2, 3])
>>> metric = Hamming()
>>> metric(y_pred, y_true)
tensor([0.2500])

Parameters

reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(y_pred, y_true, sample_weight=None)[source]¶

Parameters

y_pred¶ (ndarray) – Estimated targets as returned by a classifier.
y_true¶ (ndarray) – Ground truth (correct) target values.
sample_weight¶ (Optional[ndarray]) – Sample weights.

Return type

Union[ndarray, float]

Returns

Average Hamming loss

class pytorch_lightning.metrics.sklearns.Hinge(labels=None, reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Computes the average hinge loss

Example

>>> pred_decision = torch.tensor([-2.17, -0.97, -0.19, -0.43])
>>> y_true = torch.tensor([1, 1, 0, 0])
>>> metric = Hinge()
>>> metric(pred_decision, y_true)
tensor([1.6300])

Parameters

labels¶ (Optional[Sequence]) – Integer array of labels.
reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(pred_decision, y_true, sample_weight=None)[source]¶

Parameters

pred_decision¶ (ndarray) – Predicted decisions
y_true¶ (ndarray) – Ground truth (correct) target values.
sample_weight¶ (Optional[ndarray]) – Sample weights.

Return type

float

Returns

Average hinge loss
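
In the binary case, each sample contributes max(0, 1 - margin), where the margin is the decision value multiplied by +1 for the positive class and -1 for the negative class. A minimal sketch, assuming labels are 0 and 1 with 1 as the positive class; the name `binary_hinge_loss` is hypothetical.

```python
def binary_hinge_loss(pred_decision, y_true):
    """Average hinge loss for 0/1 labels, treating 1 as the positive class."""
    losses = []
    for decision, target in zip(pred_decision, y_true):
        sign = 1.0 if target == 1 else -1.0
        margin = sign * decision
        losses.append(max(0.0, 1.0 - margin))
    return sum(losses) / len(losses)


# Same inputs as the Hinge example above
print(round(binary_hinge_loss([-2.17, -0.97, -0.19, -0.43], [1, 1, 0, 0]), 4))  # 1.63
```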

class pytorch_lightning.metrics.sklearns.Jaccard(labels=None, pos_label=1, average='macro', reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Calculates the Jaccard similarity coefficient score

Example

>>> y_pred = torch.tensor([1, 1, 1])
>>> y_true = torch.tensor([0, 1, 1])
>>> metric = Jaccard()
>>> metric(y_pred, y_true)
tensor([0.3333])

Parameters

labels¶ (Optional[Sequence]) – Integer array of labels.
pos_label¶ (Union[str, int]) – The class to report if average='binary'.
average¶ (Optional[str]) –
This parameter is required for multiclass/multilabel targets. If None, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data:
- 'binary': Only report results for the class specified by pos_label. This is applicable only if targets (y_{true,pred}) are binary.
- 'micro': Calculate metrics globally by counting the total true positives, false negatives and false positives.
- 'macro': Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.
- 'weighted': Calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label). This alters ‘macro’ to account for label imbalance; it can result in an F-score that is not between precision and recall.
- 'samples': Calculate metrics for each instance, and find their average (only meaningful for multilabel classification where this differs from accuracy_score()).
Note that if pos_label is given in binary classification with average != ‘binary’, only that positive class is reported. This behavior is deprecated and will change in version 0.18.
reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(y_pred, y_true, sample_weight=None)[source]¶

Parameters

y_pred¶ (ndarray) – Estimated targets as returned by a classifier.
y_true¶ (ndarray) – Ground truth (correct) target values.
sample_weight¶ (Optional[ndarray]) – Sample weights.

Return type

Union[ndarray, float]

Returns

Jaccard similarity score
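
Per label, the Jaccard coefficient is the size of the intersection of predicted and true positives divided by the size of their union, that is tp / (tp + fp + fn). The following sketch averages it over labels as `average='macro'` does; the name `macro_jaccard` is hypothetical.

```python
def macro_jaccard(y_pred, y_true):
    """Unweighted mean over labels of tp / (tp + fp + fn)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        pairs = list(zip(y_pred, y_true))
        tp = sum(1 for p, t in pairs if p == c and t == c)
        fp = sum(1 for p, t in pairs if p == c and t != c)
        fn = sum(1 for p, t in pairs if p != c and t == c)
        # Every label occurs in y_pred or y_true, so the union is non-empty
        scores.append(tp / (tp + fp + fn))
    return sum(scores) / len(scores)


# Same inputs as the Jaccard example above
print(round(macro_jaccard([1, 1, 1], [0, 1, 1]), 4))  # 0.3333
```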

class pytorch_lightning.metrics.sklearns.MeanAbsoluteError(multioutput='uniform_average', reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Compute mean absolute error regression loss

Warning

Every metric call will cause a GPU synchronization, which may slow down your code

Example

>>> y_pred = torch.tensor([2.5, 0.0, 2, 8])
>>> y_true = torch.tensor([3, -0.5, 2, 7])
>>> metric = MeanAbsoluteError()
>>> metric(y_pred, y_true)
tensor([0.5000])

Parameters

multioutput¶ (Union[str, List[float], None]) – either one of the strings [‘raw_values’, ‘uniform_average’] or an array with shape (n_outputs,) that defines how multiple output values should be aggregated.
reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(y_pred, y_true, sample_weight=None)[source]¶

Parameters

y_pred¶ (ndarray) – Estimated target values
y_true¶ (ndarray) – Ground truth (correct) target values.
sample_weight¶ (Optional[ndarray]) – Sample weights.

Returns

Mean absolute error

class pytorch_lightning.metrics.sklearns.MeanGammaDeviance(reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Calculates the mean gamma deviance regression loss

Warning

Every metric call will cause a GPU synchronization, which may slow down your code

Example

>>> y_pred = torch.tensor([0.5, 0.5, 2., 2.])
>>> y_true = torch.tensor([2, 0.5, 1, 4])
>>> metric = MeanGammaDeviance()
>>> metric(y_pred, y_true)
tensor([1.0569])

Parameters

reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(y_pred, y_true, sample_weight=None)[source]¶

Parameters

y_pred¶ (ndarray) – Estimated target values
y_true¶ (ndarray) – Ground truth (correct) target values.
sample_weight¶ (Optional[ndarray]) – Sample weights.

Returns

Mean gamma deviance

class pytorch_lightning.metrics.sklearns.MeanPoissonDeviance(reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Calculates the mean Poisson deviance regression loss

Warning

Every metric call will cause a GPU synchronization, which may slow down your code

Example

>>> y_pred = torch.tensor([2, 0.5, 1, 4])
>>> y_true = torch.tensor([0.5, 0.5, 2., 2.])
>>> metric = MeanPoissonDeviance()
>>> metric(y_pred, y_true)
tensor([0.9034])

Parameters

reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(y_pred, y_true, sample_weight=None)[source]¶

Parameters

y_pred¶ (ndarray) – Estimated target values
y_true¶ (ndarray) – Ground truth (correct) target values.
sample_weight¶ (Optional[ndarray]) – Sample weights.

Returns

Mean Poisson deviance

class pytorch_lightning.metrics.sklearns.MeanSquaredError(multioutput='uniform_average', squared=False, reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Compute mean squared error loss

Warning

Every metric call will cause a GPU synchronization, which may slow down your code

Example

>>> y_pred = torch.tensor([2.5, 0.0, 2, 8])
>>> y_true = torch.tensor([3, -0.5, 2, 7])
>>> metric = MeanSquaredError()
>>> metric(y_pred, y_true)
tensor([0.3750])
>>> metric = MeanSquaredError(squared=True)
>>> metric(y_pred, y_true)
tensor([0.6124])

Parameters

multioutput¶ (Union[str, List[float], None]) – either one of the strings [‘raw_values’, ‘uniform_average’] or an array with shape (n_outputs,) that defines how multiple output values should be aggregated.
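squared¶ (bool) – selects between the MSE and the RMSE value. In the example above, the default squared=False returns the MSE (0.3750) and squared=True returns the RMSE (0.6124).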
reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(y_pred, y_true, sample_weight=None)[source]¶

Parameters

y_pred¶ (ndarray) – Estimated target values
y_true¶ (ndarray) – Ground truth (correct) target values.
sample_weight¶ (Optional[ndarray]) – Sample weights.

Returns

Mean squared error
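
The relationship between the MSE and the RMSE can be sketched as follows. The helper and its `take_root` flag are hypothetical names chosen for this illustration and are not part of the class's interface.

```python
import math


def mean_squared_error(y_pred, y_true, take_root=False):
    """Mean of squared differences; optionally its square root (RMSE)."""
    mse = sum((t - p) ** 2 for p, t in zip(y_pred, y_true)) / len(y_true)
    return math.sqrt(mse) if take_root else mse


# Same inputs as the MeanSquaredError example above
y_pred, y_true = [2.5, 0.0, 2, 8], [3, -0.5, 2, 7]
print(mean_squared_error(y_pred, y_true))                            # 0.375
print(round(mean_squared_error(y_pred, y_true, take_root=True), 4))  # 0.6124
```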

class pytorch_lightning.metrics.sklearns.MeanSquaredLogError(multioutput='uniform_average', reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Calculates the mean squared log error

Warning

Every metric call will cause a GPU synchronization, which may slow down your code

Example

>>> y_pred = torch.tensor([2.5, 5, 4, 8])
>>> y_true = torch.tensor([3, 5, 2.5, 7])
>>> metric = MeanSquaredLogError()
>>> metric(y_pred, y_true)
tensor([0.0397])

Parameters

multioutput¶ (Union[str, List[float], None]) – either one of the strings [‘raw_values’, ‘uniform_average’] or an array with shape (n_outputs,) that defines how multiple output values should be aggregated.
reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(y_pred, y_true, sample_weight=None)[source]¶

Parameters

y_pred¶ (ndarray) – Estimated target values
y_true¶ (ndarray) – Ground truth (correct) target values.
sample_weight¶ (Optional[ndarray]) – Sample weights.

Returns

Mean squared log error

class pytorch_lightning.metrics.sklearns.MeanTweedieDeviance(power=0, reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Calculates the mean Tweedie deviance regression loss

Warning

Every metric call will cause a GPU synchronization, which may slow down your code

Example

>>> y_pred = torch.tensor([2, 0.5, 1, 4])
>>> y_true = torch.tensor([0.5, 0.5, 2., 2.])
>>> metric = MeanTweedieDeviance()
>>> metric(y_pred, y_true)
tensor([1.8125])

Parameters

power¶ (float) –
tweedie power parameter:
- power < 0 : Extreme stable distribution. Requires: y_pred > 0.
- power = 0 : Normal distribution, output corresponds to mean_squared_error. y_true and y_pred can be any real numbers.
- power = 1 : Poisson distribution. Requires: y_true >= 0 and y_pred > 0.
- 1 < power < 2 : Compound Poisson distribution. Requires: y_true >= 0 and y_pred > 0.
- power = 2 : Gamma distribution. Requires: y_true > 0 and y_pred > 0.
- power = 3 : Inverse Gaussian distribution. Requires: y_true > 0 and y_pred > 0.
- otherwise : Positive stable distribution. Requires: y_true > 0 and y_pred > 0.
reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(y_pred, y_true, sample_weight=None)[source]¶

Parameters

y_pred¶ (ndarray) – Estimated target values
y_true¶ (ndarray) – Ground truth (correct) target values.
sample_weight¶ (Optional[ndarray]) – Sample weights.

Returns

Mean Tweedie deviance
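
The role of `power` can be illustrated with a per-sample unit deviance that switches formula by case. This sketch follows the standard Tweedie unit deviance formulas; the function name is hypothetical and, as a simplification, the input requirements listed under `power` are not validated.

```python
import math


def mean_tweedie_deviance(y_pred, y_true, power=0.0):
    """Mean Tweedie unit deviance; inputs are assumed to satisfy the
    requirements for the chosen power (they are not checked here)."""
    total = 0.0
    for mu, y in zip(y_pred, y_true):
        if power == 0:    # Normal: squared error
            dev = (y - mu) ** 2
        elif power == 1:  # Poisson, with y * log(y / mu) taken as 0 at y == 0
            log_term = y * math.log(y / mu) if y > 0 else 0.0
            dev = 2.0 * (log_term - y + mu)
        elif power == 2:  # Gamma
            dev = 2.0 * (math.log(mu / y) + y / mu - 1.0)
        else:             # General case
            p = power
            dev = 2.0 * (max(y, 0.0) ** (2 - p) / ((1 - p) * (2 - p))
                         - y * mu ** (1 - p) / (1 - p)
                         + mu ** (2 - p) / (2 - p))
        total += dev
    return total / len(y_true)


y_pred, y_true = [2, 0.5, 1, 4], [0.5, 0.5, 2., 2.]
print(mean_tweedie_deviance(y_pred, y_true))                     # 1.8125
print(round(mean_tweedie_deviance(y_pred, y_true, power=1), 4))  # 0.9034
```

With `power=0` and `power=1` the sketch reproduces the MeanTweedieDeviance and MeanPoissonDeviance example values; with `power=2` and the MeanGammaDeviance inputs it gives approximately 1.0569.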

class pytorch_lightning.metrics.sklearns.MedianAbsoluteError(multioutput='uniform_average', reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Calculates the median absolute error

Warning

Every metric call will cause a GPU synchronization, which may slow down your code

Example

>>> y_pred = torch.tensor([2.5, 0.0, 2, 8])
>>> y_true = torch.tensor([3, -0.5, 2, 7])
>>> metric = MedianAbsoluteError()
>>> metric(y_pred, y_true)
tensor([0.5000])

Parameters

multioutput¶ (Union[str, List[float], None]) – either one of the strings [‘raw_values’, ‘uniform_average’] or an array with shape (n_outputs,) that defines how multiple output values should be aggregated.
reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(y_pred, y_true)[source]¶

Parameters

y_pred¶ (ndarray) – Estimated target values
y_true¶ (ndarray) – Ground truth (correct) target values.

Returns

Median absolute error

class pytorch_lightning.metrics.sklearns.Precision(labels=None, pos_label=1, average='macro', reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Compute the precision. The precision is the ratio tp / (tp + fp) where tp is the number of true positives and fp the number of false positives. The precision is intuitively the ability of the classifier not to label as positive a sample that is negative. The best value is 1 and the worst value is 0.

Example

>>> y_pred = torch.tensor([0, 1, 2, 3])
>>> y_true = torch.tensor([0, 1, 2, 2])
>>> metric = Precision()
>>> metric(y_pred, y_true)
tensor([0.7500])

Parameters

labels¶ (Optional[Sequence]) – Integer array of labels.
pos_label¶ (Union[str, int]) – The class to report if average='binary'.
average¶ (Optional[str]) –
This parameter is required for multiclass/multilabel targets. If None, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data:
- 'binary': Only report results for the class specified by pos_label. This is applicable only if targets (y_{true,pred}) are binary.
- 'micro': Calculate metrics globally by counting the total true positives, false negatives and false positives.
- 'macro': Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.
- 'weighted': Calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label). This alters ‘macro’ to account for label imbalance; it can result in an F-score that is not between precision and recall.
- 'samples': Calculate metrics for each instance, and find their average (only meaningful for multilabel classification where this differs from accuracy_score()).
Note that if pos_label is given in binary classification with average != ‘binary’, only that positive class is reported. This behavior is deprecated and will change in version 0.18.
reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(y_pred, y_true, sample_weight=None)[source]¶

Parameters

y_pred¶ (ndarray) – Estimated targets as returned by a classifier.
y_true¶ (ndarray) – Ground truth (correct) target values.
sample_weight¶ (Optional[ndarray]) – Sample weights.

Return type

Union[ndarray, float]

Returns

Precision of the positive class in binary classification or weighted average of the precision of each class for the multiclass task.
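
The 'macro' average used by default can be sketched in plain Python: compute tp / (tp + fp) for every label that occurs in the predictions or targets, then take the unweighted mean. `macro_precision_sketch` is a hypothetical helper, not the library's implementation, and it assumes that a label which is never predicted scores 0.

```python
def macro_precision_sketch(y_pred, y_true):
    """Unweighted mean of per-class precision tp / (tp + fp) (hypothetical helper)."""
    labels = sorted(set(y_pred) | set(y_true))
    per_class = []
    for label in labels:
        tp = sum(1 for p, t in zip(y_pred, y_true) if p == label and t == label)
        fp = sum(1 for p, t in zip(y_pred, y_true) if p == label and t != label)
        # Assumption: a label that is never predicted contributes 0.
        per_class.append(tp / (tp + fp) if tp + fp else 0.0)
    return sum(per_class) / len(per_class)


# Same values as the example above: per-class precision is [1, 1, 1, 0].
print(macro_precision_sketch([0, 1, 2, 3], [0, 1, 2, 2]))  # 0.75
```

Label 3 is predicted once but never correct, so it pulls the mean down from 1 to 0.75.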

class pytorch_lightning.metrics.sklearns.PrecisionRecallCurve(pos_label=1, reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Compute precision-recall pairs for different probability thresholds

Note

This implementation is restricted to the binary classification task.

The precision is the ratio tp / (tp + fp), where tp is the number of true positives and fp the number of false positives. Intuitively, the precision is the ability of the classifier not to label as positive a sample that is negative.

The recall is the ratio tp / (tp + fn), where tp is the number of true positives and fn the number of false negatives. Intuitively, the recall is the ability of the classifier to find all the positive samples.

The last precision and recall values are 1. and 0. respectively and do not have a corresponding threshold. This ensures that the graph starts on the y axis (recall 0).

Parameters

pos_label¶ (Union[str, int]) – The label of the positive class.
reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(probas_pred, y_true, sample_weight=None)[source]¶

Parameters

probas_pred¶ (ndarray) – Estimated probabilities or decision function.
y_true¶ (ndarray) – Ground truth (correct) target values.
sample_weight¶ (Optional[ndarray]) – Sample weights.

Returns

precision: Precision values such that element i is the precision of predictions with score >= thresholds[i] and the last element is 1.
recall: Decreasing recall values such that element i is the recall of predictions with score >= thresholds[i] and the last element is 0.
thresholds: Increasing thresholds on the decision function used to compute precision and recall.
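
The three return values can be illustrated with a plain-Python sketch that evaluates precision and recall at every distinct score and then appends the final (1, 0) point. `precision_recall_curve_sketch` is a hypothetical helper, not the library's implementation; it assumes at least one positive sample and keeps every distinct threshold, which the wrapped scikit-learn function may not do in every version.

```python
def precision_recall_curve_sketch(probas_pred, y_true, pos_label=1):
    """Precision/recall at each distinct score threshold (hypothetical helper)."""
    thresholds = sorted(set(probas_pred))  # increasing thresholds
    n_pos = sum(1 for t in y_true if t == pos_label)
    precision, recall = [], []
    for thr in thresholds:
        pairs = [(s, t) for s, t in zip(probas_pred, y_true) if s >= thr]
        tp = sum(1 for _, t in pairs if t == pos_label)
        fp = len(pairs) - tp
        precision.append(tp / (tp + fp))
        recall.append(tp / n_pos)
    # Final point without a corresponding threshold.
    precision.append(1.0)
    recall.append(0.0)
    return precision, recall, thresholds


p, r, th = precision_recall_curve_sketch([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
print(r)   # [1.0, 1.0, 0.5, 0.5, 0.0]
print(th)  # [0.1, 0.35, 0.4, 0.8]
```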

class pytorch_lightning.metrics.sklearns.R2Score(multioutput='uniform_average', reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Calculates the R^2 score, also known as the coefficient of determination

Warning

Every metric call will cause a GPU synchronization, which may slow down your code

Example

>>> y_pred = torch.tensor([2.5, 0.0, 2, 8])
>>> y_true = torch.tensor([3, -0.5, 2, 7])
>>> metric = R2Score()
>>> metric(y_pred, y_true)
tensor([0.9486])

Parameters

multioutput¶ (Union[str, List[float], None]) – either one of the strings [‘raw_values’, ‘uniform_average’, ‘variance_weighted’] or an array with shape (n_outputs,) that defines how multiple output values should be aggregated.
reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(y_pred, y_true, sample_weight=None)[source]¶

Parameters

y_pred¶ (ndarray) – Estimated target values
y_true¶ (ndarray) – Ground truth (correct) target values.
sample_weight¶ (Optional[ndarray]) – Sample weights.

Returns

R^2 score
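
For a single output without sample weights, the R^2 score is 1 - SS_res / SS_tot, where SS_res is the sum of squared residuals and SS_tot is the sum of squared deviations of the targets from their mean. The following plain-Python sketch reproduces the example above; `r2_score_sketch` is a hypothetical helper, not the library's implementation, and it assumes the targets are not all identical (SS_tot > 0).

```python
def r2_score_sketch(y_pred, y_true):
    """R^2 = 1 - SS_res / SS_tot for a single output (hypothetical helper)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for p, t in zip(y_pred, y_true))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Same values as the example above: SS_res = 1.5, SS_tot = 29.1875.
print(round(r2_score_sketch([2.5, 0.0, 2, 8], [3, -0.5, 2, 7]), 4))  # 0.9486
```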

class pytorch_lightning.metrics.sklearns.ROC(pos_label=1, reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Compute Receiver operating characteristic (ROC)

Note

This implementation is restricted to the binary classification task.

Warning

Every metric call will cause a GPU synchronization, which may slow down your code

Example

>>> y_pred = torch.tensor([0, 1, 2, 3])
>>> y_true = torch.tensor([0, 1, 2, 2])
>>> metric = ROC()
>>> fps, tps = metric(y_pred, y_true)
>>> fps
tensor([0.0000, 0.3333, 0.6667, 0.6667, 1.0000])
>>> tps
tensor([0., 0., 0., 1., 1.])

References

[1] Wikipedia entry for the Receiver operating characteristic

Parameters

pos_label¶ (Union[str, int]) – The label of the positive class.
reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(y_score, y_true, sample_weight=None)[source]¶

Parameters

y_score¶ (ndarray) – Target scores, can either be probability estimates of the positive class or confidence values.
y_true¶ (ndarray) – Ground truth (correct) target values.
sample_weight¶ (Optional[ndarray]) – Sample weights.

Returns

fpr: Increasing false positive rates such that element i is the false positive rate of predictions with score >= thresholds[i].
tpr: Increasing true positive rates such that element i is the true positive rate of predictions with score >= thresholds[i].
thresholds: Decreasing thresholds on the decision function used to compute fpr and tpr. thresholds[0] represents no instances being predicted and is arbitrarily set to max(y_score) + 1.
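
The return values can be illustrated with a plain-Python sketch that sweeps the thresholds from max(y_score) + 1 downwards and counts false and true positives at each step. `roc_sketch` is a hypothetical helper, not the library's implementation; it keeps every distinct threshold and assumes both classes are present. With pos_label=1, only targets equal to 1 count as positive, and under that reading the sketch reproduces the values in the example above.

```python
def roc_sketch(y_score, y_true, pos_label=1):
    """False/true positive rates over decreasing thresholds (hypothetical helper)."""
    n_pos = sum(1 for t in y_true if t == pos_label)
    n_neg = len(y_true) - n_pos
    # thresholds[0] = max(y_score) + 1 means no instance is predicted positive.
    thresholds = [max(y_score) + 1] + sorted(set(y_score), reverse=True)
    fpr, tpr = [], []
    for thr in thresholds:
        tp = sum(1 for s, t in zip(y_score, y_true) if s >= thr and t == pos_label)
        fp = sum(1 for s, t in zip(y_score, y_true) if s >= thr and t != pos_label)
        fpr.append(fp / n_neg)
        tpr.append(tp / n_pos)
    return fpr, tpr, thresholds


fpr, tpr, thresholds = roc_sketch([0, 1, 2, 3], [0, 1, 2, 2])
print([round(v, 4) for v in fpr])  # [0.0, 0.3333, 0.6667, 0.6667, 1.0]
print(tpr)                         # [0.0, 0.0, 0.0, 1.0, 1.0]
print(thresholds)                  # [4, 3, 2, 1, 0]
```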

class pytorch_lightning.metrics.sklearns.Recall(labels=None, pos_label=1, average='macro', reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM)[source]¶

Bases: pytorch_lightning.metrics.sklearns.SklearnMetric

Compute the recall. The recall is the ratio tp / (tp + fn), where tp is the number of true positives and fn the number of false negatives. Intuitively, the recall is the ability of the classifier to find all the positive samples. The best value is 1 and the worst value is 0.

Example

>>> y_pred = torch.tensor([0, 1, 2, 3])
>>> y_true = torch.tensor([0, 1, 2, 2])
>>> metric = Recall()
>>> metric(y_pred, y_true)
tensor([0.6250])

Parameters

labels¶ (Optional[Sequence]) – Integer array of labels.
pos_label¶ (Union[str, int]) – The class to report if average='binary'.
average¶ (Optional[str]) –
This parameter is required for multiclass/multilabel targets. If None, the scores for each class are returned. Otherwise, this determines the type of averaging performed on the data:
- 'binary': Only report results for the class specified by pos_label. This is applicable only if targets (y_{true,pred}) are binary.
- 'micro': Calculate metrics globally by counting the total true positives, false negatives and false positives.
- 'macro': Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.
- 'weighted': Calculate metrics for each label, and find their average, weighted by support (the number of true instances for each label). This alters ‘macro’ to account for label imbalance; it can result in an F-score that is not between precision and recall.
- 'samples': Calculate metrics for each instance, and find their average (only meaningful for multilabel classification where this differs from accuracy_score()).
Note that if pos_label is given in binary classification with average != ‘binary’, only that positive class is reported. This behavior is deprecated and will change in version 0.18.
reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.

forward(y_pred, y_true, sample_weight=None)[source]¶

Parameters

y_pred¶ (ndarray) – Estimated targets as returned by a classifier.
y_true¶ (ndarray) – Ground truth (correct) target values.
sample_weight¶ (Optional[ndarray]) – Sample weights.

Return type

Union[ndarray, float]

Returns

Recall of the positive class in binary classification or weighted average of the recall of each class for the multiclass task.
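
The 'macro' average used by default can be sketched in plain Python: compute tp / (tp + fn) for every label that occurs in the predictions or targets, then take the unweighted mean. `macro_recall_sketch` is a hypothetical helper, not the library's implementation, and it assumes that a label with no true instances scores 0.

```python
def macro_recall_sketch(y_pred, y_true):
    """Unweighted mean of per-class recall tp / (tp + fn) (hypothetical helper)."""
    labels = sorted(set(y_pred) | set(y_true))
    per_class = []
    for label in labels:
        tp = sum(1 for p, t in zip(y_pred, y_true) if t == label and p == label)
        fn = sum(1 for p, t in zip(y_pred, y_true) if t == label and p != label)
        # Assumption: a label with no true instances contributes 0.
        per_class.append(tp / (tp + fn) if tp + fn else 0.0)
    return sum(per_class) / len(per_class)


# Same values as the example above: per-class recall is [1, 1, 0.5, 0].
print(macro_recall_sketch([0, 1, 2, 3], [0, 1, 2, 2]))  # 0.625
```

Label 2 is found only once out of two true instances, and label 3 never occurs in the targets, which gives (1 + 1 + 0.5 + 0) / 4 = 0.625.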

class pytorch_lightning.metrics.sklearns.SklearnMetric(metric_name, reduce_group=torch.distributed.group.WORLD, reduce_op=torch.distributed.ReduceOp.SUM, **kwargs)[source]¶

Bases: pytorch_lightning.metrics.metric.NumpyMetric

Bridge between PyTorch Lightning and scikit-learn metrics

Warning

Every metric call will cause a GPU synchronization, which may slow down your code

Note

The order of targets and predictions may be different from the order typically used in PyTorch

Parameters

metric_name¶ (str) – the name of the metric function to import from sklearn.metrics and compute
reduce_group¶ (Any) – the process group for DDP reduces (only needed for DDP training). Defaults to all processes (world)
reduce_op¶ (Any) – the operation to perform during reduction within DDP (only needed for DDP training). Defaults to sum.
**kwargs¶ – additional keyword arguments (will be forwarded to metric call)

forward(*args, **kwargs)[source]¶

Carries out the actual metric computation

Parameters

*args¶ – Positional arguments forwarded to metric call (should already be converted to numpy)
**kwargs¶ – Keyword arguments forwarded to metric call (should already be converted to numpy)

Return type

Union[ndarray, int, float]

Returns

the metric value (will be converted to a tensor by the base class)
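
The bridge pattern described here, looking a function up by name and forwarding stored keyword arguments on every call, can be sketched as follows. `NamedMetricSketch` and its `module_name` parameter are hypothetical and not part of the library; the standard-library `statistics` module stands in for scikit-learn only so that the sketch runs without third-party dependencies, and the tensor/numpy conversion and DDP reduction are left out.

```python
import importlib


class NamedMetricSketch:
    """Resolve a function by name and forward stored kwargs (hypothetical class)."""

    def __init__(self, metric_name, module_name="statistics", **kwargs):
        self.metric_name = metric_name
        # Assumption: a stand-in module; the documented class uses scikit-learn.
        self.module_name = module_name
        self.metric_kwargs = kwargs

    @property
    def metric_fn(self):
        # Import lazily and look the function up by its name.
        module = importlib.import_module(self.module_name)
        return getattr(module, self.metric_name)

    def forward(self, *args, **kwargs):
        # Call-time arguments are combined with the kwargs given at construction.
        return self.metric_fn(*args, **kwargs, **self.metric_kwargs)


metric = NamedMetricSketch("quantiles", n=4)
print(metric.forward([1, 2, 3, 4, 5, 6, 7]))  # [2.0, 4.0, 6.0]
```

Storing the keyword arguments at construction time keeps each call site down to the data arguments, which is why subclasses such as Precision can fix options like average once in their constructor.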

property metric_fn[source]¶