
supervision-0.26.0

Released by @soumik12345 · 16 Jul 00:43 · d8de58d

Warning

supervision-0.26.0 drops Python 3.8 support and upgrades the codebase to Python 3.9 syntax.

Tip

Our docs page now has a fresh look, consistent with the documentation of all Roboflow open-source projects. (#1858)

🚀 Added

  • Added support for creating sv.KeyPoints objects from ViTPose and ViTPose++ inference results via sv.KeyPoints.from_transformers. (#1788)

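    A minimal sketch of the new loader, assuming the transformers VitPose checkpoint and post-processing flow (the model name, box format, and result indexing below are illustrative):

    import torch
    import supervision as sv
    from PIL import Image
    from transformers import AutoProcessor, VitPoseForPoseEstimation

    image = Image.open("person.jpg")
    # one (x, y, w, h) box per detected person; here the whole frame
    person_boxes = [[0, 0, image.width, image.height]]

    processor = AutoProcessor.from_pretrained("usyd-community/vitpose-base-simple")
    model = VitPoseForPoseEstimation.from_pretrained("usyd-community/vitpose-base-simple")

    inputs = processor(image, boxes=[person_boxes], return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    results = processor.post_process_pose_estimation(outputs, boxes=[person_boxes])

    key_points = sv.KeyPoints.from_transformers(results[0])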
  • Added support for the IOS (Intersection over Smallest) overlap metric that measures how much of the smaller object is covered by the larger one in sv.Detections.with_nms, sv.Detections.with_nmm, sv.box_iou_batch, and sv.mask_iou_batch. (#1774)

    import numpy as np
    import supervision as sv
    
    boxes_true = np.array([
        [100, 100, 200, 200],
        [300, 300, 400, 400]
    ])
    boxes_detection = np.array([
        [150, 150, 250, 250],
        [320, 320, 420, 420]
    ])
    
    sv.box_iou_batch(
        boxes_true=boxes_true, 
        boxes_detection=boxes_detection, 
        overlap_metric=sv.OverlapMetric.IOU
    )
    
    # array([[0.14285714, 0.        ],
    #        [0.        , 0.47058824]])
    
    sv.box_iou_batch(
        boxes_true=boxes_true, 
        boxes_detection=boxes_detection, 
        overlap_metric=sv.OverlapMetric.IOS
    )
    
    # array([[0.25, 0.  ],
    #        [0.  , 0.64]])
  • Added sv.box_iou that efficiently computes the Intersection over Union (IoU) between two individual bounding boxes. (#1874)
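    A minimal sketch, reusing the boxes from the batch example above (positional arguments assumed):

    import numpy as np
    import supervision as sv

    box_true = np.array([100, 100, 200, 200])
    box_detection = np.array([150, 150, 250, 250])

    sv.box_iou(box_true, box_detection)
    # 0.14285714...  (2500 px² intersection / 17500 px² union)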

  • Added support for limiting the number of processed frames and displaying a progress bar in sv.process_video. (#1816)
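    A minimal sketch; max_frames and show_progress are assumed names for the new frame-limit and progress-bar options, so check the docs for the exact signature:

    import numpy as np
    import supervision as sv

    def callback(frame: np.ndarray, index: int) -> np.ndarray:
        # per-frame processing goes here; return the frame to write
        return frame

    sv.process_video(
        source_path="input.mp4",
        target_path="output.mp4",
        callback=callback,
        max_frames=100,      # assumed parameter: stop after 100 frames
        show_progress=True,  # assumed parameter: display a progress bar
    )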

  • Added the sv.xyxy_to_xcycarh function to convert bounding box coordinates from (x_min, y_min, x_max, y_max) format to (center x, center y, aspect ratio, height) format, where the aspect ratio is width / height. (#1823)
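    A minimal sketch, assuming the function accepts an (N, 4) array like the other converters:

    import numpy as np
    import supervision as sv

    xyxy = np.array([[100, 100, 200, 300]])
    sv.xyxy_to_xcycarh(xyxy)
    # array([[150. , 200. ,   0.5, 200. ]])
    # center (150, 200), aspect ratio 100 / 200 = 0.5, height 200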

  • Added sv.xyxy_to_xywh function to convert bounding box coordinates from (x_min, y_min, x_max, y_max) format to (x, y, width, height) format. (#1788)
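    A minimal sketch, using the same box as above (an (N, 4) array is assumed):

    import numpy as np
    import supervision as sv

    xyxy = np.array([[100, 100, 200, 300]])
    sv.xyxy_to_xywh(xyxy)
    # array([[100., 100., 100., 200.]])  # top-left (100, 100), width 100, height 200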

🌱 Changed

  • sv.LabelAnnotator now supports the smart_position parameter to automatically keep labels within frame boundaries, and the max_line_length parameter to control text wrapping for long or multi-line labels. (#1820)

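    A minimal sketch of the two new parameters (the wrap unit for max_line_length is assumed to be characters):

    import numpy as np
    import supervision as sv

    image = np.zeros((480, 640, 3), dtype=np.uint8)
    detections = sv.Detections(xyxy=np.array([[5, 5, 120, 80]]))

    label_annotator = sv.LabelAnnotator(
        smart_position=True,  # nudge labels back inside frame boundaries
        max_line_length=20,   # wrap long label text (assumed: characters per line)
    )
    annotated = label_annotator.annotate(
        scene=image.copy(),
        detections=detections,
        labels=["a long multi-word label that will be wrapped"],
    )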
  • sv.LabelAnnotator now supports non-string labels. (#1825)
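    A minimal sketch; non-string values (here a float confidence) are rendered as text:

    import numpy as np
    import supervision as sv

    annotated = sv.LabelAnnotator().annotate(
        scene=np.zeros((100, 100, 3), dtype=np.uint8),
        detections=sv.Detections(xyxy=np.array([[10, 10, 60, 60]])),
        labels=[0.97],  # no str() conversion needed anymore
    )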

  • sv.Detections.from_vlm now supports parsing bounding boxes and segmentation masks from responses generated by Google Gemini models. You can test Gemini prompting, result parsing, and visualization with Supervision using this example notebook. (#1792)

    import supervision as sv

    gemini_response_text = """```json
        [
            {"box_2d": [543, 40, 728, 200], "label": "cat", "id": 1},
            {"box_2d": [653, 352, 820, 522], "label": "dog", "id": 2}
        ]
    ```"""

    detections = sv.Detections.from_vlm(
        sv.VLM.GOOGLE_GEMINI_2_5,
        gemini_response_text,
        resolution_wh=(1000, 1000),
        classes=['cat', 'dog'],
    )

    detections.xyxy
    # array([[543., 40., 728., 200.], [653., 352., 820., 522.]])

    detections.data
    # {'class_name': array(['cat', 'dog'], dtype='<U26')}

    detections.class_id
    # array([0, 1])
  • sv.Detections.from_vlm now supports parsing bounding boxes from responses generated by Moondream. (#1878)

    import supervision as sv
    
    moondream_result = {
        'objects': [
            {
                'x_min': 0.5704046934843063,
                'y_min': 0.20069346576929092,
                'x_max': 0.7049859315156937,
                'y_max': 0.3012596592307091
            },
            {
                'x_min': 0.6210969910025597,
                'y_min': 0.3300672620534897,
                'x_max': 0.8417936339974403,
                'y_max': 0.4961046129465103
            }
        ]
    }
    
    detections = sv.Detections.from_vlm(
        sv.VLM.MOONDREAM,
        moondream_result,
        resolution_wh=(3072, 4080),
    )
    
    detections.xyxy
    # array([[1752.28,  818.82, 2165.72, 1229.14],
    #        [1908.01, 1346.67, 2585.99, 2024.11]])
  • sv.Detections.from_vlm now supports parsing bounding boxes from responses generated by Qwen2.5-VL. You can test Qwen2.5-VL prompting, result parsing, and visualization with Supervision using this example notebook. (#1709)

    import supervision as sv
    
    qwen_2_5_vl_result = """```json
    [
        {"bbox_2d": [139, 768, 315, 954], "label": "cat"},
        {"bbox_2d": [366, 679, 536, 849], "label": "dog"}
    ]
    ```"""
    
    detections = sv.Detections.from_vlm(
        sv.VLM.QWEN_2_5_VL,
        qwen_2_5_vl_result,
        input_wh=(1000, 1000),
        resolution_wh=(1000, 1000),
        classes=['cat', 'dog'],
    )
    
    detections.xyxy
    # array([[139., 768., 315., 954.], [366., 679., 536., 849.]])
    
    detections.class_id
    # array([0, 1])
    
    detections.data
    # {'class_name': array(['cat', 'dog'], dtype='<U10')}
  • Significantly improved the speed of HSV color mapping in sv.HeatMapAnnotator, achieving approximately 28x faster performance on 1920x1080 frames. (#1786)

🔧 Fixed

  • Supervision’s sv.MeanAveragePrecision is now fully aligned with pycocotools, the official COCO evaluation tool, ensuring accurate and standardized metrics. (#1834)

    import supervision as sv
    from supervision.metrics import MeanAveragePrecision
    
    predictions = sv.Detections(...)
    targets = sv.Detections(...)
    
    map_metric = MeanAveragePrecision()
    map_metric.update(predictions, targets).compute()
    
    # Average Precision (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.464
    # Average Precision (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.637
    # Average Precision (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.203
    # Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.284
    # Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.497
    # Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.629

Tip

The updated mAP implementation enabled us to build an updated version of the Computer Vision Model Leaderboard.

  • Fixed #1767: sv.Detections.data is no longer lost when filtering detections.
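    A minimal sketch of the fixed behaviour:

    import numpy as np
    import supervision as sv

    detections = sv.Detections(
        xyxy=np.array([[10, 10, 50, 50], [20, 20, 60, 60]]),
        confidence=np.array([0.9, 0.4]),
        data={"track_id": np.array([7, 8])},
    )

    filtered = detections[detections.confidence > 0.5]
    filtered.data
    # {'track_id': array([7])}  # custom data now survives filtering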

⚠️ Deprecated

❌ Removed

  • The sv.DetectionDataset.images property has been removed in supervision-0.26.0. Please loop over images with for path, image, annotation in dataset: instead, as that does not require loading all images into memory (see the sketch after this list).
  • Constructing sv.DetectionDataset with the images parameter as Dict[str, np.ndarray] was deprecated and has been removed in supervision-0.26.0. Please pass a list of image paths (List[str]) instead.
  • The name sv.BoundingBoxAnnotator is deprecated and has been removed in supervision-0.26.0. It has been renamed to sv.BoxAnnotator.
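A minimal migration sketch for the removals above (the YOLO loader arguments are illustrative):

    import supervision as sv

    dataset = sv.DetectionDataset.from_yolo(
        images_directory_path="images",
        annotations_directory_path="annotations",
        data_yaml_path="data.yaml",
    )

    # Iterate lazily instead of reading the removed `images` property:
    for path, image, annotations in dataset:
        ...

    # sv.BoundingBoxAnnotator is gone; use the renamed class:
    box_annotator = sv.BoxAnnotator()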

🏆 Contributors

@onuralpszr (Onuralp SEZER), @SkalskiP (Piotr Skalski), @SunHao-AI (Hao Sun), @rafaelpadilla (Rafael Padilla), @Ashp116 (Ashp116), @capjamesg (James Gallagher), @blakeburch (Blake Burch), @hidara2000 (hidara2000), @Armaggheddon (Alessandro Brunello), @soumik12345 (Soumik Rakshit).