Logit outlier scores

Use this page when you want to turn classifier outputs into one confidence score per row, especially when the model prediction is not itself a directly interpretable risk score.

Logit-derived outlier score functions for confidence and OOD monitoring.

These post-hoc methods are intended for pre-trained classifiers and return outlier scores that can be used to rank inputs by in-distribution confidence.
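As a quick end-to-end sketch, the snippet below turns a batch of raw logits into one score per row and ranks rows by in-distribution confidence. It is a standalone NumPy re-implementation of the documented LogitGap formula (the helper name `gap_score` is illustrative, not part of the package API):

```python
import numpy as np

def gap_score(logits):
    """Standalone sketch of the documented LogitGap score:
    top logit minus the mean of the remaining logits."""
    logits = np.asarray(logits, dtype=np.float64)
    n_classes = logits.shape[1]
    top = np.max(logits, axis=1)
    mean_rest = (np.sum(logits, axis=1) - top) / (n_classes - 1)
    return top - mean_rest

logits = np.array([
    [5.0, 1.0, 0.5],   # dominant class -> large gap, likely ID
    [2.0, 2.1, 1.9],   # flat profile   -> small gap, likely OOD
])
scores = gap_score(logits)
print(scores)               # [4.25 0.15]
# Rank rows from most to least in-distribution.
print(np.argsort(-scores))  # [0 1]
```

Sorting by descending score then gives a ranking from most to least in-distribution, which is the typical monitoring use of these functions.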

References

Liang, J., Hou, R., Hu, M., Chang, H., Shan, S., & Chen, X. (2025). Revisiting Logit Distributions for Reliable Out-of-Distribution Detection. arXiv:2510.20134v1. https://arxiv.org/html/2510.20134v1

logit_gap(logits)

LogitGap OOD outlier score.

Compute the average gap between the largest logit and the remaining logits for each sample.

Intuitively, in-distribution samples tend to have a dominant class logit, while out-of-distribution samples often have flatter logit profiles.

The outlier-scoring function is defined as:

.. math::

    S_{\text{LogitGap}}(x; f) = \frac{1}{K-1}
    \sum_{j=2}^{K} (z'_1 - z'_j)

where :math:`z'_1` is the maximum logit and :math:`z'_j` are the logits sorted in descending order. Higher outlier scores indicate higher confidence for in-distribution samples.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `logits` | `NDArray` | Array of shape `(n_samples, n_classes)` containing raw logits from a pre-trained classification model. | *required* |

Returns:

| Type | Description |
| --- | --- |
| `NDArray` | Array of shape `(n_samples,)` containing OOD outlier scores. Higher scores indicate higher likelihood of being in-distribution. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If `logits` is not a finite 2D array with at least two classes. |

See Also

max_logit : Simple baseline using only the maximum logit value.

Notes

The implementation uses :math:`\max_k z_k - \frac{1}{K-1}\sum_{j \ne k^*} z_j`, where :math:`k^*` is the index of the maximum logit. This is equivalent to the formulation in [1]_.
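This equivalence between the sorted-logits definition and the max-minus-mean-of-rest implementation can be checked numerically with a small standalone NumPy sketch (independent of the package):

```python
import numpy as np

rng = np.random.default_rng(0)
z = rng.normal(size=(4, 5))  # 4 samples, K = 5 classes
K = z.shape[1]

# Definition: average gap between the top sorted logit and the rest.
z_sorted = np.sort(z, axis=1)[:, ::-1]
gap_sorted = np.mean(z_sorted[:, :1] - z_sorted[:, 1:], axis=1)

# Implementation: max logit minus the mean of the non-max logits.
z_max = np.max(z, axis=1)
gap_impl = z_max - (np.sum(z, axis=1) - z_max) / (K - 1)

print(np.allclose(gap_sorted, gap_impl))  # True
```

The identity holds because the non-max logits are exactly the sorted logits after the first, so their means agree.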

References

.. [1] Liang, Jiachen, et al. "Revisiting Logit Distributions for Reliable Out-of-Distribution Detection." The Thirty-ninth Annual Conference on Neural Information Processing Systems. 2025.

Examples:

>>> logits = np.array([[5.0, 1.0, 0.5], [2.0, 2.1, 1.9]])
>>> np.round(logit_gap(logits), 2)
array([4.25, 0.15], dtype=float32)
>>> logits = np.array([[5.0, 1.0, 0.5], [2.0, 2.1, 1.9]])
>>> outlier_scores = logit_gap(logits)
>>> print(outlier_scores)
[4.25 0.15]
Source code in src/samesame/logit_scores.py
def logit_gap(logits: NDArray) -> NDArray:
    """LogitGap OOD outlier score.

    Compute the average gap between the largest logit and the remaining logits
    for each sample.

    Intuitively, in-distribution samples tend to have a dominant class logit,
    while out-of-distribution samples often have flatter logit profiles.

    The outlier-scoring function is defined as:

    .. math::

        S_{\\text{LogitGap}}(x; f) = \\frac{1}{K-1}
        \\sum_{j=2}^{K} (z'_1 - z'_j)

    where :math:`z'_1` is the maximum logit and :math:`z'_j` are logits
    sorted in descending order. Higher outlier scores indicate higher confidence
    for ID samples.

    Parameters
    ----------
    logits : NDArray
        Array of shape (n_samples, n_classes) containing raw logits from
        a pre-trained classification model.

    Returns
    -------
    NDArray
        Array of shape (n_samples,) containing OOD outlier scores. Higher scores
        indicate higher likelihood of being in-distribution.

    Raises
    ------
    ValueError
        If ``logits`` is not a finite 2D array with at least two classes.

    See Also
    --------
    max_logit : Simple baseline using only the maximum logit value.

    Notes
    -----
    The implementation uses
    :math:`\\max_k z_k - \\frac{1}{K-1}\\sum_{j \\ne k^*} z_j`, where
    :math:`k^*` is the index of the maximum logit. This is equivalent to the
    formulation in [1]_.

    References
    ----------
    .. [1] Liang, Jiachen, et al.
       "Revisiting Logit Distributions for Reliable Out-of-Distribution Detection."
       The Thirty-ninth Annual Conference on Neural Information Processing Systems.
       2025.

    Examples
    --------
    >>> logits = np.array([[5.0, 1.0, 0.5], [2.0, 2.1, 1.9]])
    >>> np.round(logit_gap(logits), 2)
    array([4.25, 0.15], dtype=float32)
    >>> logits = np.array([[5.0, 1.0, 0.5], [2.0, 2.1, 1.9]])
    >>> outlier_scores = logit_gap(logits)  # doctest: +SKIP
    >>> print(outlier_scores)
    [4.25 0.15]
    """
    logits = _validate_logits(logits)
    n_classes = logits.shape[1]
    max_logits = np.max(logits, axis=1)
    mean_rest = (np.sum(logits, axis=1) - max_logits) / (n_classes - 1)
    return max_logits - mean_rest

max_logit(logits)

MaxLogit OOD outlier score (baseline method).

Compute the maximum logit for each sample.

This is a simple baseline that uses only the top class logit and ignores the rest of the logit vector.

The outlier-scoring function is defined as:

.. math::

    S_{\text{MaxLogit}}(x; f) = \max_k z_k

where :math:`z_k` is the logit for class :math:`k`. Higher outlier scores indicate higher confidence for the predicted class.
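To see the limitation of using only the top logit, here is a small standalone NumPy sketch (not using the package) in which two rows share the same maximum logit: MaxLogit scores them identically, while a gap-based score still separates the peaked row from the flat one:

```python
import numpy as np

logits = np.array([
    [4.0, 0.0, 0.0],   # peaked profile
    [4.0, 3.9, 3.8],   # flat profile, same maximum
])

max_scores = np.max(logits, axis=1)  # MaxLogit: identical for both rows
K = logits.shape[1]
gap_scores = max_scores - (np.sum(logits, axis=1) - max_scores) / (K - 1)

print(max_scores)  # both rows get the same score
print(gap_scores)  # the peaked row scores far higher than the flat row
```

This is why MaxLogit serves mainly as a baseline against richer logit-distribution scores.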

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `logits` | `NDArray` | Array of shape `(n_samples, n_classes)` containing raw logits from a pre-trained classification model. | *required* |

Returns:

| Type | Description |
| --- | --- |
| `NDArray` | Array of shape `(n_samples,)` containing OOD outlier scores (maximum logits). Higher scores indicate higher confidence, but with limited discriminative power for OOD detection compared to LogitGap. |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If `logits` is not a finite 2D array with at least two classes. |

See Also

logit_gap : Improved OOD detection using logit gap.

Notes

MaxLogit is primarily a baseline comparator for richer logit-distribution methods such as :func:`logit_gap` [1]_.

References

.. [1] Liang, Jiachen, et al. "Revisiting Logit Distributions for Reliable Out-of-Distribution Detection." The Thirty-ninth Annual Conference on Neural Information Processing Systems. 2025.

Examples:

>>> logits = np.array([[5.0, 1.0, 0.5], [2.0, 2.1, 1.9]])
>>> max_logit(logits)
array([5. , 2.1], dtype=float32)
>>> logits = np.array([[5.0, 1.0, 0.5], [2.0, 2.1, 1.9]])
>>> outlier_scores = max_logit(logits)
>>> print(outlier_scores)
[5.  2.1]
Source code in src/samesame/logit_scores.py
def max_logit(logits: NDArray) -> NDArray:
    """MaxLogit OOD outlier score (baseline method).

    Compute the maximum logit for each sample.

    This is a simple baseline that uses only the top class logit and ignores
    the rest of the logit vector.

    The outlier-scoring function is defined as:

    .. math::

        S_{\\text{MaxLogit}}(x; f) = \\max_k z_k

    where :math:`z_k` is the logit for class k. Higher outlier scores indicate
    higher confidence for the predicted class.

    Parameters
    ----------
    logits : NDArray
        Array of shape (n_samples, n_classes) containing raw logits from
        a pre-trained classification model.

    Returns
    -------
    NDArray
        Array of shape (n_samples,) containing OOD outlier scores (maximum logits).
        Higher scores indicate higher confidence, but with limited
        discriminative power for OOD detection compared to LogitGap.

    Raises
    ------
    ValueError
        If ``logits`` is not a finite 2D array with at least two classes.

    See Also
    --------
    logit_gap : Improved OOD detection using logit gap.

    Notes
    -----
    MaxLogit is primarily a baseline comparator for richer logit-distribution
    methods such as :func:`logit_gap` [1]_.

    References
    ----------
    .. [1] Liang, Jiachen, et al.
       "Revisiting Logit Distributions for Reliable Out-of-Distribution Detection."
       The Thirty-ninth Annual Conference on Neural Information Processing Systems.
       2025.

    Examples
    --------
    >>> logits = np.array([[5.0, 1.0, 0.5], [2.0, 2.1, 1.9]])
    >>> max_logit(logits)
    array([5. , 2.1], dtype=float32)
    >>> logits = np.array([[5.0, 1.0, 0.5], [2.0, 2.1, 1.9]])
    >>> outlier_scores = max_logit(logits)  # doctest: +SKIP
    >>> print(outlier_scores)
    [5.  2.1]
    """
    logits = _validate_logits(logits)
    return np.max(logits, axis=1)