🚀 feat(model): Updated SuperSimpleNet to latest version (#3036)

blaz-r · rajeshgangireddy · web-flow · commit 045d116b4a42 · 2025-10-23T16:20:47.000+02:00
* Fix squeeze on 1dim score

Signed-off-by: blaz.rolih <blaz.rolih@fri.uni-lj.si>

* Add option to train without masks

Signed-off-by: blaz.rolih <blaz.rolih@fri.uni-lj.si>

* Add JIMS separate feat extension

Signed-off-by: blaz.rolih <blaz.rolih@fri.uni-lj.si>

* Update docs for JIMS extension

Signed-off-by: blaz.rolih <blaz.rolih@fri.uni-lj.si>

* Remove unused get_params method

Signed-off-by: blaz.rolih <blaz.rolih@fri.uni-lj.si>

* Add unit tests for SSN

Signed-off-by: blaz.rolih <blaz.rolih@fri.uni-lj.si>

* Rename vars and update metrics in readme

Signed-off-by: blaz.rolih <blaz.rolih@fri.uni-lj.si>

---------

Signed-off-by: blaz.rolih <blaz.rolih@fri.uni-lj.si>
Co-authored-by: Rajesh Gangireddy <rajesh.gangireddy@intel.com>
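
Note on "Fix squeeze on 1dim score": the sketch below illustrates why an argument-less `squeeze()` can be a problem for a batch of one sample, and why `reshape(-1)` keeps the score one-dimensional. It is a minimal stand-alone reproduction, not the model code: the `fc` layer, the feature width of 8, and the random `pooled` tensor are assumptions made for illustration.

```python
import torch
from torch import nn

# Hypothetical stand-in for the final classification layer (cls_fc in the diff).
fc = nn.Linear(8, 1)

squeezed_shapes = {}
reshaped_shapes = {}
for batch_size in (1, 4):
    # Pooled descriptors of shape [B, C, 1, 1], squeezed before the linear layer.
    # With B == 1 the batch dimension is squeezed away as well, leaving shape [C].
    pooled = torch.randn(batch_size, 8, 1, 1).squeeze()
    out = fc(pooled)  # [1] for B == 1, [B, 1] otherwise

    squeezed_shapes[batch_size] = tuple(out.squeeze().shape)
    reshaped_shapes[batch_size] = tuple(out.reshape(-1).shape)

print(squeezed_shapes)  # {1: (), 4: (4,)}
print(reshaped_shapes)  # {1: (1,), 4: (4,)}
```

With `squeeze()` a single-sample batch yields a 0-dim score, while `reshape(-1)` always yields a score of shape `[B]`.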
diff --git a/docs/source/images/supersimplenet/architecture.png b/docs/source/images/supersimplenet/architecture.png
diff --git a/docs/source/markdown/guides/reference/models/image/index.md b/docs/source/markdown/guides/reference/models/image/index.md
@@ -113,7 +113,7 @@ Student-Teacher Feature Pyramid Matching for Unsupervised Anomaly Detection
 :link: ./supersimplenet
 :link-type: doc
 
-SuperSimpleNet: Unifying Unsupervised and Supervised Learning for Fast and Reliable Surface Defect Detection
+SuperSimpleNet: A Unified Surface Defect Detection Model for all Supervision Regimes
 :::
 
 :::{grid-item-card} {material-regular}`model_training;1.5em` U-Flow
diff --git a/src/anomalib/models/image/supersimplenet/README.md b/src/anomalib/models/image/supersimplenet/README.md
@@ -1,6 +1,10 @@
-# SuperSimpleNet: Unifying Unsupervised and Supervised Learning for Fast and Reliable Surface Defect Detection
+# SuperSimpleNet
 
-This is an implementation of the [SuperSimpleNet](https://arxiv.org/pdf/2408.03143) paper, based on the [official code](https://github.com/blaz-r/SuperSimpleNet).
+This is an implementation of SuperSimpleNet, based on the [official code](https://github.com/blaz-r/SuperSimpleNet).
+
+The model was first presented at ICPR 2024: [SuperSimpleNet: Unifying Unsupervised and Supervised Learning for Fast and Reliable Surface Defect Detection](https://arxiv.org/abs/2408.03143)
+
+An extension was later published in JIMS 2025: [No Label Left Behind: A Unified Surface Defect Detection Model for all Supervision Regimes](https://link.springer.com/article/10.1007/s10845-025-02680-8)
 
 Model Type: Segmentation
 
@@ -11,7 +15,7 @@ feature extractor with upscaling, feature adaptor, feature-level synthetic anoma
 segmentation-detection module.
 
 A ResNet-like feature extractor first extracts features, which are then upscaled and
-average-pooled to capture neighboring context. Features are further refined for anomaly detection task in the adaptor module.
+average-pooled to capture neighboring context. Features are (optionally) further refined for the anomaly detection task in the adaptor module.
 During training, synthetic anomalies are generated at the feature level by adding Gaussian noise to regions defined by the
 binary Perlin noise mask. The perturbed features are then fed into the segmentation-detection
 module, which produces the anomaly map and the anomaly score. During inference, anomaly generation is skipped, and the model
@@ -24,6 +28,9 @@ This implementation supports both unsupervised and supervised setting, but Anoma
 
 ![SuperSimpleNet architecture](/docs/source/images/supersimplenet/architecture.png "SuperSimpleNet architecture")
 
+Currently, the only difference between the ICPR and JIMS code is the `adapt_cls_features` flag, which controls whether the features used by the classification head are adapted.
+For the ICPR version this is set to True (i.e. the classification head receives adapted features); for the JIMS version it is False (which is also the default).
+
 ## Usage
 
 `anomalib train --model SuperSimpleNet --data MVTecAD --data.category <category>`
@@ -36,29 +43,29 @@ This implementation supports both unsupervised and supervised setting, but Anoma
 >
 > It is recommended to train the model for 300 epochs with batch size of 32 to achieve stable training with random anomaly generation. Training with lower parameter values will still work, but might not yield the optimal results.
 >
-> For supervised learning, refer to the [official code](https://github.com/blaz-r/SuperSimpleNet).
+> For weakly, mixed and fully supervised training, refer to the [official code](https://github.com/blaz-r/SuperSimpleNet).
 
 ## MVTecAD AD results
 
 The following results were obtained using this Anomalib implementation trained for 300 epochs with seed 0, default params, and batch size 32.
 
-|            | **Image AUROC** | **Pixel AUPRO** |
-| ---------- | :-------------: | :-------------: |
-| Bottle     |      1.000      |      0.903      |
-| Cable      |      0.981      |      0.901      |
-| Capsule    |      0.989      |      0.931      |
-| Carpet     |      0.985      |      0.929      |
-| Grid       |      0.994      |      0.930      |
-| Hazelnut   |      0.994      |      0.943      |
-| Leather    |      1.000      |      0.970      |
-| Metal_nut  |      0.995      |      0.920      |
-| Pill       |      0.962      |      0.936      |
-| Screw      |      0.912      |      0.947      |
-| Tile       |      0.994      |      0.854      |
-| Toothbrush |      0.908      |      0.860      |
-| Transistor |      1.000      |      0.907      |
-| Wood       |      0.987      |      0.858      |
-| Zipper     |      0.995      |      0.928      |
-| Average    |      0.980      |      0.914      |
-
-For other results on VisA, SensumSODF, and KSDD2, refer to the [paper](https://arxiv.org/pdf/2408.03143).
+| Category    | AUROC (ICPR) | AUROC (JIMS) | AUPRO (ICPR) | AUPRO (JIMS) |
+| ----------- | :----------: | :----------: | :----------: | :----------: |
+| Bottle      |    1.000     |    1.000     |    0.903     |    0.911     |
+| Cable       |    0.981     |    0.951     |    0.901     |    0.893     |
+| Capsule     |    0.989     |    0.992     |    0.931     |    0.919     |
+| Carpet      |    0.985     |    0.974     |    0.929     |    0.935     |
+| Grid        |    0.994     |    0.998     |    0.930     |    0.938     |
+| Hazelnut    |    0.994     |    0.999     |    0.943     |    0.939     |
+| Leather     |    1.000     |    1.000     |    0.970     |    0.974     |
+| Metal_nut   |    0.995     |    0.993     |    0.920     |    0.925     |
+| Pill        |    0.962     |    0.980     |    0.936     |    0.943     |
+| Screw       |    0.912     |    0.854     |    0.947     |    0.946     |
+| Tile        |    0.994     |    0.992     |    0.854     |    0.825     |
+| Toothbrush  |    0.908     |    0.908     |    0.860     |    0.854     |
+| Transistor  |    1.000     |    1.000     |    0.907     |    0.916     |
+| Wood        |    0.987     |    0.991     |    0.858     |    0.872     |
+| Zipper      |    0.995     |    0.999     |    0.928     |    0.944     |
+| **Average** |  **0.980**   |  **0.975**   |  **0.914**   |  **0.916**   |
+
+For other results on VisA, SensumSODF, and KSDD2, refer to the [paper](https://link.springer.com/article/10.1007/s10845-025-02680-8).
diff --git a/src/anomalib/models/image/supersimplenet/anomaly_generator.py b/src/anomalib/models/image/supersimplenet/anomaly_generator.py
@@ -88,34 +88,39 @@ def generate_perlin(self, batches: int, height: int, width: int) -> torch.Tensor
 
     def forward(
         self,
-        features: torch.Tensor,
-        mask: torch.Tensor,
+        input_features: torch.Tensor | None,
+        adapted_features: torch.Tensor,
+        masks: torch.Tensor,
         labels: torch.Tensor,
-    ) -> tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
+    ) -> tuple[torch.Tensor, torch.Tensor, torch.Tensor, torch.Tensor]:
         """Generate anomaly on features using thresholded perlin noise and Gaussian noise.
 
         Also update GT masks and labels with new anomaly information.
 
         Args:
-            features (torch.Tensor): input features.
-            mask (torch.Tensor): GT masks.
+            input_features (torch.Tensor | None): non-adapted input features. Set to None if only adapted features are needed.
+            adapted_features (torch.Tensor): adapted input features.
+            masks (torch.Tensor): GT masks.
             labels (torch.Tensor): GT labels.
 
         Returns:
-            perturbed features, updated GT masks and labels.
+            perturbed input features (None if input_features is None), perturbed adapted features, updated GT masks and labels.
         """
-        b, _, h, w = features.shape
+        b, _, h, w = masks.shape
 
         # duplicate
-        features = torch.cat((features, features))
-        mask = torch.cat((mask, mask))
+        adapted_features = torch.cat((adapted_features, adapted_features))
+        mask = torch.cat((masks, masks))
         labels = torch.cat((labels, labels))
+        # extended ssn case where cls gets non-adapted
+        if input_features is not None:
+            input_features = torch.cat((input_features, input_features))
 
         noise = torch.normal(
             mean=self.noise_mean,
             std=self.noise_std,
-            size=features.shape,
-            device=features.device,
+            size=adapted_features.shape,
+            device=adapted_features.device,
             requires_grad=False,
         )
 
@@ -126,15 +131,15 @@ def forward(
             1,
             h,
             w,
-            device=features.device,
+            device=adapted_features.device,
             requires_grad=False,
         )
 
         # no overlap: don't apply to already anomalous regions (mask=1 -> bad)
         noise_mask = noise_mask * (1 - mask)
 
         # shape of noise is [B * 2, 1, H, W]
-        perlin_mask = self.generate_perlin(b * 2, h, w).to(features.device)
+        perlin_mask = self.generate_perlin(b * 2, h, w).to(adapted_features.device)
         # only apply where perlin mask is 1
         noise_mask = noise_mask * perlin_mask
 
@@ -150,6 +155,7 @@ def forward(
         labels = torch.where(labels > 0, torch.ones_like(labels), torch.zeros_like(labels))
 
         # apply masked noise
-        perturbed = features + noise * noise_mask
+        perturbed_adapt = adapted_features + noise * noise_mask
+        perturbed_feat = input_features + noise * noise_mask if input_features is not None else None
 
-        return perturbed, mask, labels
+        return perturbed_feat, perturbed_adapt, mask, labels
diff --git a/src/anomalib/models/image/supersimplenet/lightning_model.py b/src/anomalib/models/image/supersimplenet/lightning_model.py
@@ -1,7 +1,12 @@
-# Copyright (C) 2024 Intel Corporation
+# Copyright (C) 2024-2025 Intel Corporation
 # SPDX-License-Identifier: Apache-2.0
 
-"""SuperSimpleNet: Unifying Unsupervised and Supervised Learning for Fast and Reliable Surface Defect Detection.
+"""SuperSimpleNet.
+
+ICPR 2024 -
+SuperSimpleNet: Unifying Unsupervised and Supervised Learning for Fast and Reliable Surface Defect Detection.
+
+JIMS 2025 - No Label Left Behind: A Unified Surface Defect Detection Model for all Supervision Regimes
 
 This module implements the SuperSimpleNet model for surface defect / anomaly detection.
 SuperSimpleNet is a simple yet strong discriminative model consisting of a pretrained feature extractor with upscaling,
@@ -25,9 +30,13 @@
 
 
 Paper:
-    Title: SuperSimpleNet: Unifying Unsupervised and Supervised Learning for Fast and Reliable Surface Defect Detection.
+    Original: SuperSimpleNet:
+    Unifying Unsupervised and Supervised Learning for Fast and Reliable Surface Defect Detection.
     URL: https://arxiv.org/pdf/2408.03143
 
+    Extension: No Label Left Behind: A Unified Surface Defect Detection Model for all Supervision Regimes
+    URL: https://link.springer.com/article/10.1007/s10845-025-02680-8
+
 Notes:
     This implementation supports both unsupervised and supervised setting,
     but Anomalib currently supports only unsupervised learning.
@@ -64,6 +73,7 @@ class Supersimplenet(AnomalibModule):
         backbone (str): backbone name. IMPORTANT! use only backbones with torchvision V1 weights ending on ".tv".
         layers (list[str]): backbone layers utilised
         supervised (bool): whether the model will be trained in supervised mode. False by default (unsupervised).
+        adapt_cls_features (bool): whether to adapt classification features (ICPR - True, JIMS - False (default)).
         pre_processor (PreProcessor | bool, optional): Pre-processor instance or
             flag to use default. Defaults to ``True``.
         post_processor (PostProcessor | bool, optional): Post-processor instance
@@ -80,6 +90,7 @@ def __init__(
         backbone: str = "wide_resnet50_2.tv_in1k",  # IMPORTANT: use .tv weights, not tv2
         layers: list[str] = ["layer2", "layer3"],  # noqa: B006
         supervised: bool = False,
+        adapt_cls_features: bool = False,
         pre_processor: PreProcessor | bool = True,
         post_processor: PostProcessor | bool = True,
         evaluator: Evaluator | bool = True,
@@ -105,6 +116,7 @@ def __init__(
             backbone=backbone,
             layers=layers,
             stop_grad=stop_grad,
+            adapt_cls_features=adapt_cls_features,
         )
         self.loss = SSNLoss()
 
diff --git a/src/anomalib/models/image/supersimplenet/torch_model.py b/src/anomalib/models/image/supersimplenet/torch_model.py
@@ -4,7 +4,7 @@
 # SPDX-License-Identifier: MIT
 #
 # Modified
-# Copyright (C) 2024 Intel Corporation
+# Copyright (C) 2024-2025 Intel Corporation
 # SPDX-License-Identifier: Apache-2.0
 
 """PyTorch model for the SuperSimpleNet model implementation.
@@ -19,7 +19,6 @@
 import torch
 import torch.nn.functional as F  # noqa: N812
 from torch import nn
-from torch.nn import Parameter
 
 from anomalib.data import InferenceBatch
 from anomalib.models.components import GaussianBlur2d, TimmFeatureExtractor
@@ -36,6 +35,7 @@ class SupersimplenetModel(nn.Module):
         backbone (str): backbone name. IMPORTANT! use only backbones with torchvision V1 weights ending on ".tv".
         layers (list[str]): backbone layers utilised
         stop_grad (bool): whether to stop gradient from class. to seg. head.
+        adapt_cls_features (bool): whether to adapt classification features (ICPR - True, JIMS - False (default)).
     """
 
     def __init__(
@@ -44,11 +44,13 @@ def __init__(
         backbone: str = "wide_resnet50_2.tv_in1k",  # IMPORTANT: use .tv weights, not tv2
         layers: list[str] = ["layer2", "layer3"],  # noqa: B006
         stop_grad: bool = True,
+        adapt_cls_features: bool = False,
     ) -> None:
         super().__init__()
         self.feature_extractor = UpscalingFeatureExtractor(backbone=backbone, layers=layers)
 
         channels = self.feature_extractor.get_channels_dim()
+        self.adapt_cls_features = adapt_cls_features
         self.adaptor = FeatureAdapter(channels)
         self.segdec = SegmentationDetectionModule(channel_dim=channels, stop_grad=stop_grad)
         self.anomaly_generator = AnomalyGenerator(noise_mean=0, noise_std=0.015, threshold=perlin_threshold)
@@ -80,23 +82,52 @@ def forward(
         adapted = self.adaptor(features)
 
         if self.training:
-            masks = self.downsample_mask(masks, *features.shape[-2:])
+            if masks is None:
+                if labels is not None and labels.any():
+                    msg = "Training with anomalous samples without GT masks is currently not supported!"
+                    raise RuntimeError(msg)
+                b, _, h, w = features.shape
+                masks = torch.zeros((b, 1, h, w), dtype=torch.float32, device=features.device)
+            else:
+                masks = self.downsample_mask(masks, *features.shape[-2:])
             # make linter happy :)
             if labels is not None:
                 labels = labels.type(torch.float32)
 
-            features, masks, labels = self.anomaly_generator(
-                adapted,
-                masks,
-                labels,
-            )
-
-            anomaly_map, anomaly_score = self.segdec(features)
+            if self.adapt_cls_features:
+                # ICPR SuperSimpleNet - add noise to adapted only (since non-adapted are not used)
+                _, noised_adapt, masks, labels = self.anomaly_generator(
+                    input_features=None,
+                    adapted_features=adapted,
+                    masks=masks,
+                    labels=labels,
+                )
+                seg_feats = noised_adapt
+                cls_feats = noised_adapt
+            else:
+                # extension of SuperSimpleNet - add (same) noise to adapted and features
+                noised_feat, noised_adapt, masks, labels = self.anomaly_generator(
+                    input_features=features,
+                    adapted_features=adapted,
+                    masks=masks,
+                    labels=labels,
+                )
+                seg_feats = noised_adapt
+                cls_feats = noised_feat
+
+            anomaly_map, anomaly_score = self.segdec(seg_features=seg_feats, cls_features=cls_feats)
             return anomaly_map, anomaly_score, masks, labels
 
-        anomaly_map, anomaly_score = self.segdec(adapted)
+        seg_feats = adapted
+        # ICPR SuperSimpleNet - cls and seg both use adapted feat, JIMS extension SuperSimpleNet - adapt only seg feats
+        cls_feats = adapted if self.adapt_cls_features else features
+
+        anomaly_map, anomaly_score = self.segdec(seg_features=seg_feats, cls_features=cls_feats)
         anomaly_map = self.anomaly_map_generator(anomaly_map, final_size=output_size)
 
+        anomaly_score = anomaly_score.sigmoid()
+        anomaly_map = anomaly_map.sigmoid()
+
         return InferenceBatch(anomaly_map=anomaly_map, pred_score=anomaly_score)
 
     @staticmethod
@@ -296,33 +327,24 @@ def __init__(
 
         self.apply(init_weights)
 
-    def get_params(self) -> tuple[list[Parameter], list[Parameter]]:
-        """Get segmentation and classification head parameters.
-
-        Returns:
-            seg. head parameters and class. head parameters.
-        """
-        seg_params = list(self.seg_head.parameters())
-        dec_params = list(self.cls_conv.parameters()) + list(self.cls_fc.parameters())
-        return seg_params, dec_params
-
-    def forward(self, features: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
+    def forward(self, seg_features: torch.Tensor, cls_features: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
         """Predict anomaly map and anomaly score.
 
         Args:
-            features: adapted features.
+            seg_features: segmentation head features.
+            cls_features: classification head features.
 
         Returns:
             predicted anomaly map and score.
         """
         # get anomaly map from seg head
-        ano_map = self.seg_head(features)
+        ano_map = self.seg_head(seg_features)
 
         map_dec_copy = ano_map
         if self.stop_grad:
             map_dec_copy = map_dec_copy.detach()
         # dec conv layer takes feat + map
-        mask_cat = torch.cat((features, map_dec_copy), dim=1)
+        mask_cat = torch.cat((cls_features, map_dec_copy), dim=1)
         dec_out = self.cls_conv(mask_cat)
 
         # conv block result pooling
@@ -340,7 +362,7 @@ def forward(self, features: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
 
         # final dec layer: conv channel max and avg and map max and avg
         dec_cat = torch.cat((dec_max, dec_avg, map_max, map_avg), dim=1).squeeze()
-        ano_score = self.cls_fc(dec_cat).squeeze()
+        ano_score = self.cls_fc(dec_cat).reshape(-1)
 
         return ano_map, ano_score
 
diff --git a/tests/unit/models/image/supersimplenet/test_model.py b/tests/unit/models/image/supersimplenet/test_model.py