ood
Functions for Out-of-Distribution (OOD) detection.
These post-hoc OOD detection methods apply to pre-trained supervised models and leverage information from the entire logit space to improve separability between in-distribution (ID) and out-of-distribution (OOD) samples.
References
Liang, J., Hou, R., Hu, M., Chang, H., Shan, S., & Chen, X. (2025). Revisiting Logit Distributions for Reliable Out-of-Distribution Detection. arXiv:2510.20134v1. https://arxiv.org/html/2510.20134v1
logit_gap(logits)
LogitGap OOD detection score.
Computes the average gap between the maximum logit and remaining logits for each sample. This method leverages the observation that in-distribution (ID) samples tend to have higher maximum logits with lower non-maximum logits, while out-of-distribution (OOD) samples exhibit flatter logit distributions.
The scoring function is defined as:
.. math::

   S_{\text{LogitGap}}(x; f) = \frac{1}{K-1} \sum_{j=2}^{K} (z'_1 - z'_j)

where :math:`z'_1` is the maximum logit and :math:`z'_j` are the logits sorted in descending order. Higher scores indicate higher confidence for ID samples.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `logits` | NDArray | Array of shape (n_samples, n_classes) containing raw logits from a pre-trained classification model. | required |
Returns:

| Type | Description |
|---|---|
| NDArray | Array of shape (n_samples,) containing OOD scores. Higher scores indicate higher likelihood of being in-distribution. |
See Also
max_logit : Simple baseline using only the maximum logit value.
Notes
The LogitGap method is motivated by the observation that ID samples exhibit more pronounced logit distributions (higher maximum logit with lower non-maximum logits), while OOD samples show flatter distributions (smaller gaps between logits).
The implementation computes the average gap efficiently by calculating the difference between the maximum logit and the mean of all other logits: max_logit - mean(other_logits). This is mathematically equivalent to the definition in Equation (4).
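The max-minus-mean identity described above can be sketched in NumPy. This is a minimal re-derivation of the score from the formula on this page, not the package's actual source; the name `logit_gap_score` is chosen here to distinguish it from the documented function:

```python
import numpy as np

def logit_gap_score(logits):
    """Average gap between the maximum logit and all other logits.

    Uses the identity: mean gap = max_logit - mean(other logits),
    which avoids sorting the full logit vector.
    """
    logits = np.asarray(logits, dtype=float)
    k = logits.shape[1]                       # number of classes K
    top = logits.max(axis=1)                  # z'_1, the maximum logit
    # sum of the non-maximum logits = row sum minus the maximum
    other_mean = (logits.sum(axis=1) - top) / (k - 1)
    return top - other_mean

scores = logit_gap_score([[5.0, 1.0, 0.5], [2.0, 2.1, 1.9]])
```

On the inputs from the Examples section this reproduces the documented scores of 4.25 and 0.15.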
This method requires no additional training or calibration and can be applied as a post-hoc scoring function to any pre-trained classification model. It demonstrates superior performance compared to the MaxLogit baseline across various benchmark datasets.
The function expects raw logits (pre-softmax values) rather than probabilities. Using logits directly preserves more information about the model's confidence distribution.
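As a usage sketch for post-hoc scoring: an OOD decision is typically made by thresholding the scores. The threshold below is illustrative; in practice it would be calibrated on held-out ID validation scores (e.g. at a fixed true-positive rate):

```python
import numpy as np

# Scores as a post-hoc detector such as logit_gap might produce
scores = np.array([4.25, 3.80, 0.15])

# Illustrative threshold; real deployments calibrate this on ID validation data
threshold = 1.0
is_id = scores >= threshold  # True -> treat sample as in-distribution
```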
References
.. [1] Liang, Jiachen, et al. "Revisiting Logit Distributions for Reliable Out-of-Distribution Detection." The Thirty-ninth Annual Conference on Neural Information Processing Systems. 2025.
Examples:
>>> logits = np.array([[5.0, 1.0, 0.5], [2.0, 2.1, 1.9]])
>>> scores = logit_gap(logits)
>>> print(scores)
[4.25 0.15]
Source code in src/samesame/ood.py
max_logit(logits)
MaxLogit OOD detection score (baseline method).
Computes the maximum logit value for each sample. This is a simple baseline that uses only the most confident prediction, disregarding information from other classes.
The scoring function is defined as:
.. math::

   S_{\text{MaxLogit}}(x; f) = \max_k z_k

where :math:`z_k` is the logit for class :math:`k`. Higher scores indicate higher confidence for the predicted class.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `logits` | NDArray | Array of shape (n_samples, n_classes) containing raw logits from a pre-trained classification model. | required |
Returns:

| Type | Description |
|---|---|
| NDArray | Array of shape (n_samples,) containing OOD scores (maximum logits). Higher scores indicate higher confidence, but with limited discriminative power for OOD detection compared to LogitGap. |
See Also
logit_gap : Improved OOD detection using logit gap.
Notes
MaxLogit is included as a baseline for comparison. The paper demonstrates that LogitGap achieves significantly better OOD detection performance by leveraging information from all logits rather than just the maximum.
This method has several limitations:

- It only uses information from the top predicted class.
- It ignores the distribution of non-maximum logits.
- It provides limited discriminative power between ID and OOD samples.
MaxLogit is conceptually similar to Maximum Softmax Probability (MSP) but operates directly on logits rather than probabilities. Both methods are widely used baselines in OOD detection literature.
Despite its simplicity, MaxLogit serves as a reasonable baseline and requires no additional computation beyond extracting the maximum value from the logit vector. It can be useful for computational efficiency when more sophisticated methods are not necessary.
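For comparison, the baseline reduces to a single reduction over the class axis. This is a minimal sketch consistent with the formula above, not the package's source; `max_logit_score` is an illustrative name:

```python
import numpy as np

def max_logit_score(logits):
    """MaxLogit baseline: the largest raw logit per sample."""
    return np.asarray(logits, dtype=float).max(axis=1)

scores = max_logit_score([[5.0, 1.0, 0.5], [2.0, 2.1, 1.9]])
```

On the inputs from the Examples section this yields 5.0 and 2.1, matching the documented scores.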
References
.. [1] Liang, Jiachen, et al. "Revisiting Logit Distributions for Reliable Out-of-Distribution Detection." The Thirty-ninth Annual Conference on Neural Information Processing Systems. 2025.
Examples:
>>> logits = np.array([[5.0, 1.0, 0.5], [2.0, 2.1, 1.9]])
>>> scores = max_logit(logits)
>>> print(scores)
[5.  2.1]
Source code in src/samesame/ood.py