Logit outlier scores

Use this page when you want to turn classifier outputs into one confidence score per row, especially when the model prediction is not itself a directly interpretable risk score.

Logit-derived outlier score functions for confidence and OOD monitoring.

These post-hoc methods are intended for pre-trained classifiers and return outlier scores that can be used to rank inputs by in-distribution confidence.
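As a quick end-to-end sketch, the snippet below turns a batch of raw logits into one score per row and ranks rows by in-distribution confidence. It is a standalone NumPy re-implementation of the documented LogitGap formula (the helper name `gap_score` is illustrative, not part of the package API):

```python
import numpy as np

def gap_score(logits):
    """Standalone sketch of the documented LogitGap score:
    top logit minus the mean of the remaining logits."""
    logits = np.asarray(logits, dtype=np.float64)
    n_classes = logits.shape[1]
    top = np.max(logits, axis=1)
    mean_rest = (np.sum(logits, axis=1) - top) / (n_classes - 1)
    return top - mean_rest

logits = np.array([
    [5.0, 1.0, 0.5],   # dominant class -> large gap, likely ID
    [2.0, 2.1, 1.9],   # flat profile   -> small gap, likely OOD
])
scores = gap_score(logits)
print(scores)               # [4.25 0.15]
# Rank rows from most to least in-distribution.
print(np.argsort(-scores))  # [0 1]
```

Sorting by descending score then gives a ranking from most to least in-distribution, which is the typical monitoring use of these functions.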

References

Liang, J., Hou, R., Hu, M., Chang, H., Shan, S., & Chen, X. (2025). Revisiting Logit Distributions for Reliable Out-of-Distribution Detection. arXiv:2510.20134v1. https://arxiv.org/html/2510.20134v1

logit_gap(logits)

LogitGap OOD outlier score.

Compute the average gap between the largest logit and the remaining logits for each sample.

Intuitively, in-distribution samples tend to have a dominant class logit, while out-of-distribution samples often have flatter logit profiles.

The outlier-scoring function is defined as:

.. math::

    S_{\text{LogitGap}}(x; f) = \frac{1}{K-1}
    \sum_{j=2}^{K} (z'_1 - z'_j)

where :math:`z'_1` is the maximum logit and :math:`z'_j` are the logits sorted in descending order. Higher outlier scores indicate higher confidence for in-distribution samples.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `logits` | `NDArray` | Array of shape `(n_samples, n_classes)` containing raw logits from a pre-trained classification model. | *required* |

Returns:

| Type | Description |
| --- | --- |
| `NDArray` | Array of shape `(n_samples,)` containing OOD outlier scores. Higher scores indicate higher likelihood of being in-distribution. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If `logits` is not a finite 2D array with at least two classes. |

See Also

max_logit : Simple baseline using only the maximum logit value.

Notes

The implementation uses :math:`\max_k z_k - \frac{1}{K-1}\sum_{j \ne k^*} z_j`, where :math:`k^*` is the index of the maximum logit. This is equivalent to the formulation in [1]_.
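This equivalence between the sorted-logits definition and the max-minus-mean-of-rest implementation can be checked numerically with a small standalone NumPy sketch (independent of the package):

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(size=(4, 5))  # 4 samples, K = 5 classes
K = z.shape[1]

# Definition: average gap between the top sorted logit and the rest.
z_sorted = np.sort(z, axis=1)[:, ::-1]
gap_sorted = np.mean(z_sorted[:, :1] - z_sorted[:, 1:], axis=1)

# Implementation: max logit minus the mean of the non-max logits.
z_max = np.max(z, axis=1)
gap_impl = z_max - (np.sum(z, axis=1) - z_max) / (K - 1)

print(np.allclose(gap_sorted, gap_impl))  # True
```

The identity holds because the non-max logits are exactly the sorted logits after the first, so their means agree.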

References

.. [1] Liang, Jiachen, et al. "Revisiting Logit Distributions for Reliable Out-of-Distribution Detection." The Thirty-ninth Annual Conference on Neural Information Processing Systems. 2025.

Examples:

>>> logits = np.array([[5.0, 1.0, 0.5], [2.0, 2.1, 1.9]])
>>> np.round(logit_gap(logits), 2)
array([4.25, 0.15], dtype=float32)
>>> logits = np.array([[5.0, 1.0, 0.5], [2.0, 2.1, 1.9]])
>>> outlier_scores = logit_gap(logits)
>>> print(outlier_scores)
[4.25 0.15]
Source code in src/samesame/logit_scores.py
def logit_gap(logits: NDArray) -> NDArray:
    """LogitGap OOD outlier score.

    Compute the average gap between the largest logit and the remaining logits
    for each sample.

    Intuitively, in-distribution samples tend to have a dominant class logit,
    while out-of-distribution samples often have flatter logit profiles.

    The outlier-scoring function is defined as:

    .. math::

        S_{\\text{LogitGap}}(x; f) = \\frac{1}{K-1}
        \\sum_{j=2}^{K} (z'_1 - z'_j)

    where :math:`z'_1` is the maximum logit and :math:`z'_j` are logits
    sorted in descending order. Higher outlier scores indicate higher confidence
    for ID samples.

    Parameters
    ----------
    logits : NDArray
        Array of shape (n_samples, n_classes) containing raw logits from
        a pre-trained classification model.

    Returns
    -------
    NDArray
        Array of shape (n_samples,) containing OOD outlier scores. Higher scores
        indicate higher likelihood of being in-distribution.

    Raises
    ------
    ValueError
        If ``logits`` is not a finite 2D array with at least two classes.

    See Also
    --------
    max_logit : Simple baseline using only the maximum logit value.

    Notes
    -----
    The implementation uses
    :math:`\\max_k z_k - \\frac{1}{K-1}\\sum_{j \\ne k^*} z_j`, where
    :math:`k^*` is the index of the maximum logit. This is equivalent to the
    formulation in [1]_.

    References
    ----------
    .. [1] Liang, Jiachen, et al.
       "Revisiting Logit Distributions for Reliable Out-of-Distribution Detection."
       The Thirty-ninth Annual Conference on Neural Information Processing Systems.
       2025.

    Examples
    --------
    >>> logits = np.array([[5.0, 1.0, 0.5], [2.0, 2.1, 1.9]])
    >>> np.round(logit_gap(logits), 2)
    array([4.25, 0.15], dtype=float32)
    >>> logits = np.array([[5.0, 1.0, 0.5], [2.0, 2.1, 1.9]])
    >>> outlier_scores = logit_gap(logits)  # doctest: +SKIP
    >>> print(outlier_scores)
    [4.25 0.15]
    """
    logits = _validate_logits(logits)
    n_classes = logits.shape[1]
    max_logits = np.max(logits, axis=1)
    mean_rest = (np.sum(logits, axis=1) - max_logits) / (n_classes - 1)
    return max_logits - mean_rest

max_logit(logits)

MaxLogit OOD outlier score (baseline method).

Compute the maximum logit for each sample.

This is a simple baseline that uses only the top class logit and ignores the rest of the logit vector.

The outlier-scoring function is defined as:

.. math::

    S_{\text{MaxLogit}}(x; f) = \max_k z_k

where :math:`z_k` is the logit for class :math:`k`. Higher outlier scores indicate higher confidence for the predicted class.
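To see the limitation of using only the top logit, here is a small standalone NumPy sketch (not using the package) in which two rows share the same maximum logit: MaxLogit scores them identically, while a gap-based score still separates the peaked row from the flat one:

```python
import numpy as np

logits = np.array([
    [4.0, 0.0, 0.0],   # peaked profile
    [4.0, 3.9, 3.8],   # flat profile, same maximum
])

max_scores = np.max(logits, axis=1)  # MaxLogit: identical for both rows
K = logits.shape[1]
gap_scores = max_scores - (np.sum(logits, axis=1) - max_scores) / (K - 1)

print(max_scores)  # both rows get the same score
print(gap_scores)  # the peaked row scores far higher than the flat row
```

This is why MaxLogit serves mainly as a baseline against richer logit-distribution scores.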

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `logits` | `NDArray` | Array of shape `(n_samples, n_classes)` containing raw logits from a pre-trained classification model. | *required* |

Returns:

| Type | Description |
| --- | --- |
| `NDArray` | Array of shape `(n_samples,)` containing OOD outlier scores (maximum logits). Higher scores indicate higher confidence, but with limited discriminative power for OOD detection compared to LogitGap. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If `logits` is not a finite 2D array with at least two classes. |

See Also

logit_gap : Improved OOD detection using logit gap.

Notes

MaxLogit is primarily a baseline comparator for richer logit-distribution methods such as :func:`logit_gap` [1]_.

References

.. [1] Liang, Jiachen, et al. "Revisiting Logit Distributions for Reliable Out-of-Distribution Detection." The Thirty-ninth Annual Conference on Neural Information Processing Systems. 2025.

Examples:

>>> logits = np.array([[5.0, 1.0, 0.5], [2.0, 2.1, 1.9]])
>>> max_logit(logits)
array([5. , 2.1], dtype=float32)
>>> logits = np.array([[5.0, 1.0, 0.5], [2.0, 2.1, 1.9]])
>>> outlier_scores = max_logit(logits)
>>> print(outlier_scores)
[5.  2.1]
Source code in src/samesame/logit_scores.py
def max_logit(logits: NDArray) -> NDArray:
    """MaxLogit OOD outlier score (baseline method).

    Compute the maximum logit for each sample.

    This is a simple baseline that uses only the top class logit and ignores
    the rest of the logit vector.

    The outlier-scoring function is defined as:

    .. math::

        S_{\\text{MaxLogit}}(x; f) = \\max_k z_k

    where :math:`z_k` is the logit for class k. Higher outlier scores indicate
    higher confidence for the predicted class.

    Parameters
    ----------
    logits : NDArray
        Array of shape (n_samples, n_classes) containing raw logits from
        a pre-trained classification model.

    Returns
    -------
    NDArray
        Array of shape (n_samples,) containing OOD outlier scores (maximum logits).
        Higher scores indicate higher confidence, but with limited
        discriminative power for OOD detection compared to LogitGap.

    Raises
    ------
    ValueError
        If ``logits`` is not a finite 2D array with at least two classes.

    See Also
    --------
    logit_gap : Improved OOD detection using logit gap.

    Notes
    -----
    MaxLogit is primarily a baseline comparator for richer logit-distribution
    methods such as :func:`logit_gap` [1]_.

    References
    ----------
    .. [1] Liang, Jiachen, et al.
       "Revisiting Logit Distributions for Reliable Out-of-Distribution Detection."
       The Thirty-ninth Annual Conference on Neural Information Processing Systems.
       2025.

    Examples
    --------
    >>> logits = np.array([[5.0, 1.0, 0.5], [2.0, 2.1, 1.9]])
    >>> max_logit(logits)
    array([5. , 2.1], dtype=float32)
    >>> logits = np.array([[5.0, 1.0, 0.5], [2.0, 2.1, 1.9]])
    >>> outlier_scores = max_logit(logits)  # doctest: +SKIP
    >>> print(outlier_scores)
    [5.  2.1]
    """
    logits = _validate_logits(logits)
    return np.max(logits, axis=1)