Changes from all commits (31 commits)
050928c  :fire: Remove baseline code (o-laurent, Jul 29, 2025)
e6ed38e  :white_check_mark: Adapt test (o-laurent, Jul 29, 2025)
9af8282  :book: Add one paper using TU (o-laurent, Jul 31, 2025)
7582564  :book: Add one paper using TU (o-laurent, Jul 31, 2025)
552a5c3  :lipstick: Show val NLL in probabilistic regression (o-laurent, Jul 31, 2025)
bfc2bcd  📚 Remove MCDropout class from docs (082T, Aug 1, 2025)
db241be  Merge pull request #212 from 082T/autosummary-mcdropout-warning-fix (o-laurent, Aug 1, 2025)
a528da3  :lipstick: Small improvement for the EMA (o-laurent, Aug 2, 2025)
8aebd5f  :shirt: Format (o-laurent, Aug 2, 2025)
ecb4729  :sparkles: Add support for the Gamma distribution (o-laurent, Aug 3, 2025)
a441d2d  :bug: Add forgotten mlps to all (o-laurent, Aug 5, 2025)
d4a4f6e  :lipstick: remove output plot type (o-laurent, Aug 15, 2025)
bbfdf2a  Merge branch 'dev' of github.com:ENSTA-U2IS-AI/torch-uncertainty into… (alafage, Aug 22, 2025)
a95b64a  :white_check_mark: Fix CalibrationError failure test (alafage, Aug 22, 2025)
f5564e8  📚 Fix cropped plot axis text (#55) (082T, Aug 7, 2025)
13a4e4f  Merge pull request #216 from 082T/docs-plot-tight-layout (alafage, Aug 25, 2025)
1937ca4  :wrench: Attempt fixing `Server certificate verification failed` (alafage, Aug 25, 2025)
d8b8621  Merge branch 'dev' of github.com:ENSTA-U2IS-AI/torch-uncertainty into… (alafage, Aug 25, 2025)
8587f53  :bug: Temporary fix for UCI dataset url domain not working anymore (alafage, Aug 25, 2025)
2255238  :wrench: update build-doc workflow file (alafage, Aug 25, 2025)
aa27f6c  :art: Rename attribute `model` to `core_model` in all wrappers (alafage, Aug 26, 2025)
bd0d1c6  :wrench: Update CIFAR10 experiments (script + config files) (alafage, Aug 26, 2025)
ebb43c5  Merge branch 'dev' into rm-baselines (alafage, Aug 26, 2025)
b3e883f  :art: Fix outdated config files for all experiments (alafage, Sep 1, 2025)
253bcfa  :hammer: Make title and labels customizable in `CalibrationError.plot()` (alafage, Sep 22, 2025)
ba1af87  :bug: Fix image download in `tutorial_corruption.py` (alafage, Sep 22, 2025)
0477de9  :bug: Fix `GammaConvNd` and add tests for Gamma distribution layers (alafage, Sep 22, 2025)
c058f8d  :white_check_mark: Improve coverage (alafage, Sep 22, 2025)
ddda6db  :hammer: Update segmentation experiment config files (alafage, Oct 2, 2025)
ee3fbae  :white_check_mark: Improve `deep_ensembles` test coverage (alafage, Oct 6, 2025)
bf0f8c6  Merge pull request #196 from ENSTA-U2IS-AI/rm-baselines (alafage, Oct 6, 2025)
.github/workflows/build-docs.yml (2 changes: 1 addition & 1 deletion)
@@ -14,7 +14,7 @@ env:
 
 jobs:
   documentation:
-    runs-on: self-hosted
+    runs-on: [self-hosted]
     steps:
       - uses: actions/checkout@v4
 
README.md (1 change: 1 addition & 0 deletions)
@@ -105,6 +105,7 @@ Check out all our tutorials at [torch-uncertainty.github.io/auto_tutorials](http
 
 The following projects use TorchUncertainty:
 
+- _Towards Understanding and Quantifying Uncertainty for Text-to-Image Generation_ - [CVPR 2025](https://openaccess.thecvf.com/content/CVPR2025/papers/Franchi_Towards_Understanding_and_Quantifying_Uncertainty_for_Text-to-Image_Generation_CVPR_2025_paper.pdf)
 - _Towards Understanding Why Label Smoothing Degrades Selective Classification and How to Fix It_ - [ICLR 2025](https://arxiv.org/abs/2403.14715)
 - _A Symmetry-Aware Exploration of Bayesian Neural Network Posteriors_ - [ICLR 2024](https://arxiv.org/abs/2310.08287)
 
auto_tutorial_source/Bayesian_Methods/tutorial_bayesian.py (3 changes: 1 addition & 2 deletions)
@@ -16,8 +16,7 @@
 For more information on Bayesian Neural Networks, we refer to the following resources:
 
 - Weight Uncertainty in Neural Networks `ICML2015 <https://arxiv.org/pdf/1505.05424.pdf>`_
-- Hands-on Bayesian Neural Networks - a Tutorial for Deep Learning Users `IEEE Computational Intelligence Magazine
-  <https://arxiv.org/pdf/2007.06823.pdf>`_
+- Hands-on Bayesian Neural Networks - a Tutorial for Deep Learning Users `IEEE Computational Intelligence Magazine <https://arxiv.org/pdf/2007.06823.pdf>`_
 
 Training a Bayesian LeNet using TorchUncertainty models and Lightning
 ---------------------------------------------------------------------
(file path not shown in the diff view)
@@ -146,4 +146,5 @@
 # ----------
 #
 # [1] Hendrycks, D., & Gimpel, K. (2016). A baseline for detecting misclassified and out-of-distribution examples in neural networks. In ICLR 2017.
+#
 # [2] Hendrycks, D., Basart, S., Mazeika, M., Zou, A., Kwon, J., Mostajabi, M., ... & Song, D. (2019). Scaling out-of-distribution detection for real-world settings. In ICML 2022.
auto_tutorial_source/Data_Augmentation/tutorial_corruption.py (17 changes: 15 additions & 2 deletions)
@@ -14,6 +14,7 @@
 """
 
 # %%
+from pathlib import Path
 from urllib import request
 
 import matplotlib.pyplot as plt
@@ -30,8 +31,20 @@
 
 
 def download_img(url, i):
-    request.urlretrieve(url, f"tmp_{i}.png")  # noqa: S310
-    return Image.open(f"tmp_{i}.png").convert("RGB")
+    # Create a request with proper headers to avoid 403 Forbidden error
+    if not url.startswith(("http:", "https:")):
+        raise ValueError("URL must start with 'http:' or 'https:'")
+
+    req = request.Request(  # noqa: S310
+        url,
+        headers={
+            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
+        },
+    )
+    filename = Path(f"tmp_{i}.png")
+    with request.urlopen(req) as response, filename.open("wb") as f:  # noqa: S310
+        f.write(response.read())
+    return Image.open(filename).convert("RGB")
 
 
 images_ds = [download_img(url, i) for i, url in enumerate(urls)]
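
The fix above swaps `urlretrieve` for an explicit `Request` carrying a browser-like User-Agent, since some image hosts answer 403 Forbidden to urllib's default agent. A minimal standalone sketch of the same pattern, with a placeholder URL:

from urllib import request

req = request.Request(
    "https://example.com/image.png",  # placeholder URL, not from the tutorial
    headers={"User-Agent": "Mozilla/5.0"},  # some hosts reject urllib's default agent
)
with request.urlopen(req) as response:
    data = response.read()  # raw bytes, ready to write to disk or decode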
auto_tutorial_source/Ensemble_Methods/tutorial_from_de_to_pe.py (14 changes: 12 additions & 2 deletions)
@@ -3,7 +3,11 @@
 Improved Ensemble parameter-efficiency with Packed-Ensembles
 ============================================================
 
-*This tutorial is adapted from a notebook part of a lecture given at the `Helmholtz AI Conference <https://haicon24.de/>`_ by Sebastian Starke, Peter Steinbach, Gianni Franchi, and Olivier Laurent.*
+*This tutorial is adapted from a notebook part of a lecture given at the* |conference|_ *by Sebastian Starke, Peter Steinbach, Gianni Franchi, and Olivier Laurent.*
+
+.. _conference: https://haicon24.de/
+
+.. |conference| replace:: *Helmholtz AI Conference*
 
 In this notebook will work on the MNIST dataset that was introduced by Corinna Cortes, Christopher J.C. Burges, and later modified by Yann LeCun in the foundational paper:
 
@@ -12,6 +16,7 @@
 The MNIST dataset consists of 70 000 images of handwritten digits from 0 to 9. The images are grayscale and 28x28-pixel sized. The task is to classify the images into their respective digits. The dataset can be automatically downloaded using the `torchvision` library.
 
 In this notebook, we will train a model and an ensemble on this task and evaluate their performance. The performance will consist in the following metrics:
+
 - Accuracy: the proportion of correctly classified images,
 - Brier score: a measure of the quality of the predicted probabilities,
 - Calibration error: a measure of the calibration of the predicted probabilities,
@@ -174,13 +179,16 @@ def optim_recipe(model, lr_mult: float = 1.0):
 # This table provides a lot of information:
 #
 # **OOD Detection: Binary Classification MNIST vs. FashionMNIST**
+#
 # - AUPR/AUROC/FPR95: Measures the quality of the OOD detection. The higher the better for AUPR and AUROC, the lower the better for FPR95.
 #
 # **Calibration: Reliability of the Predictions**
+#
 # - ECE: Expected Calibration Error. The lower the better.
 # - aECE: Adaptive Expected Calibration Error. The lower the better. (~More precise version of the ECE)
 #
 # **Classification Performance**
+#
 # - Accuracy: The ratio of correctly classified images. The higher the better.
 # - Brier: The quality of the predicted probabilities (Mean Squared Error of the predictions vs. ground-truth). The lower the better.
 # - Negative Log-Likelihood: The value of the loss on the test set. The lower the better.
@@ -236,7 +244,7 @@ def optim_recipe(model, lr_mult: float = 1.0):
 # We need to multiply the learning rate by 2 to account for the fact that we have 2 models
 # in the ensemble and that we average the loss over all the predictions.
 #
-# #### Downloading the pre-trained models
+# **Downloading the pre-trained models**
 #
 # We have put the pre-trained models on Hugging Face that you can download with the utility function
 # "hf_hub_download" imported just below. These models are trained for 75 epochs and are therefore not
@@ -393,9 +401,11 @@ def forward(self, x: torch.Tensor) -> torch.Tensor:
 # In constrast to calibration, the values of the confidence scores are not important, only the order of the scores. *Ideally, the best model will order all the correct predictions first, and all the incorrect predictions last.* In this case, there will be a threshold so that all the predictions above the threshold are correct, and all the predictions below the threshold are incorrect.
 #
 # In TorchUncertainty, we look at 3 different metrics for selective classification:
+#
 # - **AURC**: The area under the Risk (% of errors) vs. Coverage (% of classified samples) curve. This curve expresses how the risk of the model evolves as we increase the coverage (the proportion of predictions that are above the selection threshold). This metric will be minimized by a model able to perfectly separate the correct and incorrect predictions.
 #
 # The following metrics are computed at a fixed risk and coverage level and that have practical interests. The idea of these metrics is that you can set the selection threshold to achieve a certain level of risk and coverage, as required by the technical constraints of your application:
+#
 # - **Coverage at 5% Risk**: The proportion of predictions that are above the selection threshold when it is set for the risk to egal 5%. Set the risk threshold to your application constraints. The higher the better.
 # - **Risk at 80% Coverage**: The proportion of errors when the coverage is set to 80%. Set the coverage threshold to your application constraints. The lower the better.
 #
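
Since the hunk above describes the coverage/risk metrics only in prose, here is an illustrative plain-PyTorch sketch of coverage at 5% risk. TorchUncertainty ships its own metric classes for this; the function below is not the library's implementation:

import torch

def coverage_at_risk(confidences: torch.Tensor, correct: torch.Tensor, max_risk: float = 0.05) -> float:
    # Rank predictions from most to least confident.
    order = confidences.argsort(descending=True)
    errors = 1.0 - correct[order].float()
    # Cumulative risk after accepting the k most confident predictions, for every k.
    risk = errors.cumsum(0) / torch.arange(1, errors.numel() + 1)
    within = (risk <= max_risk).nonzero()
    # Largest coverage whose risk stays within the budget (0.0 if none does).
    return 0.0 if within.numel() == 0 else (within.max().item() + 1) / errors.numel()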
auto_tutorial_source/Post_Hoc_Methods/tutorial_scaler.py (2 changes: 2 additions & 0 deletions)
@@ -101,6 +101,7 @@
 # We also compute and plot the top-label calibration figure. We see that the
 # model is not well calibrated.
 fig, ax = ece.plot()
+fig.tight_layout()
 fig.show()
 
 # %%
@@ -143,6 +144,7 @@
 # that the model is now better calibrated. If the temperature is greater than 1,
 # the final model is less confident than before.
 fig, ax = ece.plot()
+fig.tight_layout()
 fig.show()
 
 # %%
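
As background for this tutorial's subject: temperature scaling divides the logits by a scalar T fitted on validation data, so T > 1 flattens the softmax and lowers confidence while T < 1 sharpens it. A minimal sketch of the idea, not the scaler class used in the tutorial:

import torch

def temperature_scale(logits: torch.Tensor, temperature: float) -> torch.Tensor:
    # T > 1 spreads probability mass (less confident); T < 1 concentrates it.
    return torch.softmax(logits / temperature, dim=-1)

# Example: confident logits become noticeably softer with T = 2.
probs = temperature_scale(torch.tensor([[4.0, 1.0, 0.0]]), temperature=2.0)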
docs/source/api.rst (63 changes: 0 additions & 63 deletions)
@@ -52,68 +52,6 @@ Pixelwise Regression
 
    PixelRegressionRoutine
 
-Baselines
----------
-
-.. warning::
-
-   The baselines will soon be removed from the library to avoid confusion with the routines.
-
-TorchUncertainty provide lightning-based models that can be easily trained and evaluated.
-These models inherit from the routines and are specifically designed to benchmark
-different methods in similar settings, here with constant architectures.
-
-.. currentmodule:: torch_uncertainty.baselines.classification
-
-Classification
-^^^^^^^^^^^^^^
-
-.. autosummary::
-   :toctree: generated/
-   :nosignatures:
-   :template: class.rst
-
-   ResNetBaseline
-   VGGBaseline
-   WideResNetBaseline
-
-.. currentmodule:: torch_uncertainty.baselines.regression
-
-Regression
-^^^^^^^^^^
-
-.. autosummary::
-   :toctree: generated/
-   :nosignatures:
-   :template: class.rst
-
-   MLPBaseline
-
-.. currentmodule:: torch_uncertainty.baselines.segmentation
-
-Segmentation
-^^^^^^^^^^^^
-
-.. autosummary::
-   :toctree: generated/
-   :nosignatures:
-   :template: class.rst
-
-   DeepLabBaseline
-   SegFormerBaseline
-
-.. currentmodule:: torch_uncertainty.baselines.depth
-
-Monocular Depth Estimation
-^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-.. autosummary::
-   :toctree: generated/
-   :nosignatures:
-   :template: class.rst
-
-   BTSBaseline
-
 Layers
 ------
 
@@ -221,7 +159,6 @@ Classes
    BatchEnsemble
    CheckpointCollector
    EMA
-   MCDropout
    StochasticModel
    SWA
    SWAG
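
With the Baselines section gone, the routines become the supported entry point. A hypothetical sketch of the replacement pattern, with argument names mirrored from the updated YAML configs in this PR rather than taken from the docs:

from torch.nn import CrossEntropyLoss
from torch_uncertainty.models.classification import batched_resnet
from torch_uncertainty.routines import ClassificationRoutine

# Build the architecture explicitly, then hand it to the task routine;
# this replaces the removed ResNetBaseline(version="batched", ...) pattern.
model = batched_resnet(in_channels=3, num_classes=10, arch=18, num_estimators=4, style="cifar")
routine = ClassificationRoutine(model=model, num_classes=10, loss=CrossEntropyLoss(), is_ensemble=True)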
docs/source/cli_guide.rst (4 changes: 4 additions & 0 deletions)
@@ -4,6 +4,10 @@ CLI Guide
 Introduction to the Lightning CLI
 ---------------------------------
 
+.. warning::
+
+   Deprecated: This guide needs to be updated to reflect the latest changes (removal of the torch_uncertainty.baselines module, etc.)
+
 The Lightning CLI tool eases the implementation of a CLI to instanciate models to train and evaluate them on
 some data. The CLI tool is a wrapper around the ``Trainer`` class and provides a set of subcommands to train
 and test a ``LightningModule`` on a ``LightningDataModule``. To better match our needs, we created an inherited
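
For readers new to the pattern this guide covers, the core of a Lightning CLI entry point is small. A generic sketch using plain LightningCLI, not TorchUncertainty's inherited CLI class:

from lightning.pytorch.cli import LightningCLI

def main():
    # Provides fit/test/... subcommands and parses --config YAML files whose
    # class_path/init_args entries are resolved into model and datamodule objects.
    LightningCLI()

if __name__ == "__main__":
    main()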
experiments/classification/cifar10/configs/resnet.yaml (30 changes: 0 additions & 30 deletions)
This file was deleted.
experiments/classification/cifar10/configs/resnet18/batched.yaml (37 changes: 24 additions & 13 deletions)
@@ -1,4 +1,3 @@
-# lightning.pytorch==2.1.3
 seed_everything: false
 eval_after_fit: true
 trainer:
@@ -23,22 +22,34 @@ trainer:
         patience: 1000
         check_finite: true
 model:
+  model:
+    class_path: torch_uncertainty.models.classification.batched_resnet
+    init_args:
+      in_channels: 3
+      num_classes: 10
+      arch: 18
+      num_estimators: 4
+      style: cifar
   num_classes: 10
-  in_channels: 3
   loss: CrossEntropyLoss
-  version: batched
-  arch: 18
-  style: cifar
-  num_estimators: 4
+  is_ensemble: true
+  format_batch_fn:
+    class_path: torch_uncertainty.transforms.RepeatTarget
+    init_args:
+      num_repeats: 4
 data:
   root: ./data
   batch_size: 128
 optimizer:
-  lr: 0.05
-  momentum: 0.9
-  weight_decay: 5e-4
+  class_path: torch.optim.SGD
+  init_args:
+    lr: 0.05
+    momentum: 0.9
+    weight_decay: 5e-4
 lr_scheduler:
-  milestones:
-    - 25
-    - 50
-  gamma: 0.1
+  class_path: torch.optim.lr_scheduler.MultiStepLR
+  init_args:
+    milestones:
+      - 25
+      - 50
+    gamma: 0.1
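
The reworked config moves the model, optimizer, and scheduler to the class_path/init_args convention. Conceptually, each pair is just a dotted import path plus keyword arguments; an illustrative resolver, not the actual jsonargparse machinery used by the CLI:

from importlib import import_module

def instantiate(class_path: str, init_args: dict):
    # "torch.optim.SGD" -> import torch.optim, fetch SGD, call it with init_args.
    module_name, _, class_name = class_path.rpartition(".")
    cls = getattr(import_module(module_name), class_name)
    return cls(**init_args)

# Example mirroring the YAML above (in practice the optimizer also needs the
# model's parameters as its first argument, which the CLI supplies):
# optimizer = instantiate("torch.optim.SGD", {"lr": 0.05, "momentum": 0.9, "weight_decay": 5e-4})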
(new file; path not shown in the diff view)
@@ -0,0 +1,62 @@
+# lightning.pytorch==2.1.3
+seed_everything: false
+eval_after_fit: true
+trainer:
+  accelerator: gpu
+  devices: 1
+  precision: 16-mixed
+  max_epochs: 75
+  logger:
+    class_path: lightning.pytorch.loggers.TensorBoardLogger
+    init_args:
+      save_dir: logs/resnet18
+      name: deep_ensembles
+      default_hp_metric: false
+  callbacks:
+    - class_path: torch_uncertainty.callbacks.TUClsCheckpoint
+    - class_path: lightning.pytorch.callbacks.LearningRateMonitor
+      init_args:
+        logging_interval: step
+    - class_path: lightning.pytorch.callbacks.EarlyStopping
+      init_args:
+        monitor: val/cls/Acc
+        patience: 1000
+        check_finite: true
+model:
+  model:
+    class_path: torch_uncertainty.models.deep_ensembles
+    init_args:
+      core_models:
+        class_path: torch_uncertainty.models.classification.resnet
+        init_args:
+          in_channels: 3
+          num_classes: 10
+          arch: 18
+          style: cifar
+      num_estimators: 4
+      task: classification
+      # eventually you can pass the checkpoints of standard resnet18 models here
+      # ckpt_paths: [path/to/ckpt1, path/to/ckpt2, path/to/ckpt3, path/to/ckpt4]
+  num_classes: 10
+  loss: CrossEntropyLoss
+  is_ensemble: true
+  format_batch_fn:
+    class_path: torch_uncertainty.transforms.RepeatTarget
+    init_args:
+      num_repeats: 4
+data:
+  root: ./data
+  batch_size: 128
+optimizer:
+  class_path: torch.optim.SGD
+  init_args:
+    lr: 0.2 # initial learning rate times 4 (num_estimators)
+    momentum: 0.9
+    weight_decay: 5e-4
+lr_scheduler:
+  class_path: torch.optim.lr_scheduler.MultiStepLR
+  init_args:
+    milestones:
+      - 25
+      - 50
+    gamma: 0.1
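
Two details of this new config are worth noting: the commented-out `ckpt_paths` can load already-trained members instead of training from scratch, and the learning rate is scaled by the ensemble size because the loss is averaged over all members' predictions. A sketch of the same construction in Python, with argument names taken from the YAML above and therefore possibly incomplete relative to the full `deep_ensembles` signature:

from torch_uncertainty.models import deep_ensembles
from torch_uncertainty.models.classification import resnet

# Four independent CIFAR-style ResNet-18s wrapped as one ensemble model.
ensemble = deep_ensembles(
    resnet(in_channels=3, num_classes=10, arch=18, style="cifar"),
    num_estimators=4,
    task="classification",
)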