Description
Video Explainability:
Video Explainability aims to clarify how YOLO models detect and interpret objects across sequences of video frames, enhancing transparency and trust. It provides insights into the decision-making process, with natural language explanations making the model’s reasoning more accessible.
Why it is required:
For image-based applications, we already have reasoning and evaluation metrics to understand the AI output. A similar kind of explainability needs to be derived for video.
• The video needs to be extracted into individual image frames.
• The reasoning, verification, and evaluation features, along with the object detection verification results, need to be populated for every frame.
• Redundant frames can be ignored, but the explanation data should be coupled back to the original video like a subtitle track.
• The major issue with this feature is latency.
• We need a streaming API that assesses each frame as it is processed and provides the insights in textual form (see the sketches after this list).
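A minimal sketch of the first three points, assuming OpenCV and the Ultralytics YOLO package; the weight file `yolov8n.pt`, the file names, and the frame-difference threshold are placeholders rather than part of the current project:

```python
# Sketch: extract frames, run per-frame YOLO detection, skip near-duplicate frames,
# and couple the textual explanations back to the original video as an .srt track.
import cv2
from ultralytics import YOLO

def to_srt_timestamp(seconds: float) -> str:
    # SRT timestamps use HH:MM:SS,mmm
    ms = int((seconds - int(seconds)) * 1000)
    h, rem = divmod(int(seconds), 3600)
    m, s = divmod(rem, 60)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def explain_video(video_path: str, srt_path: str, diff_threshold: float = 8.0):
    model = YOLO("yolov8n.pt")                 # assumed weights; any supported YOLO model
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 25.0
    prev_gray, frame_idx, cue_idx, entries = None, 0, 1, []

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Skip redundant frames: if the frame barely changed, the previous cue still applies.
        if prev_gray is not None and cv2.absdiff(gray, prev_gray).mean() < diff_threshold:
            frame_idx += 1
            continue
        prev_gray = gray

        result = model(frame, verbose=False)[0]
        labels = [f"{model.names[int(c)]} ({float(conf):.2f})"
                  for c, conf in zip(result.boxes.cls, result.boxes.conf)]
        text = ("Detected: " + ", ".join(labels)) if labels else "No objects detected"

        start = frame_idx / fps
        end = (frame_idx + 1) / fps            # naive one-frame cue duration
        entries.append(f"{cue_idx}\n{to_srt_timestamp(start)} --> {to_srt_timestamp(end)}\n{text}\n")
        cue_idx += 1
        frame_idx += 1

    cap.release()
    with open(srt_path, "w") as f:
        f.write("\n".join(entries))

explain_video("input_video.mp4", "explanations.srt")
```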
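For the latency and streaming points, one possible shape for the streaming API is a newline-delimited-JSON endpoint that pushes each frame's insight as soon as it is computed rather than waiting for the whole video. This is only a sketch assuming FastAPI; the `/explain` route, its query parameter, and the weights are hypothetical:

```python
# Sketch: stream per-frame detection insights as NDJSON while the video is still processing.
import json
import cv2
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from ultralytics import YOLO

app = FastAPI()
model = YOLO("yolov8n.pt")                     # assumed weights

def frame_insights(video_path: str):
    cap = cv2.VideoCapture(video_path)
    frame_idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        result = model(frame, verbose=False)[0]
        detections = [
            {"label": model.names[int(c)], "confidence": round(float(conf), 3)}
            for c, conf in zip(result.boxes.cls, result.boxes.conf)
        ]
        # Emit one JSON line per processed frame as soon as it is ready.
        yield json.dumps({"frame": frame_idx, "detections": detections}) + "\n"
        frame_idx += 1
    cap.release()

@app.get("/explain")
def explain(video_path: str):
    return StreamingResponse(frame_insights(video_path), media_type="application/x-ndjson")
```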
Pipeline Activities Forecasted:
- Expanding Task Support: Future enhancements will focus on extending support beyond detection to include classification, segmentation, tracking, pose estimation, and Oriented Bounding Box (OBB) detection, enabling a more comprehensive object detection system.
- Broader Framework Compatibility: Efforts will be made to resolve configuration constraints and expand support for additional object detection frameworks, such as Faster R-CNN, RetinaNet, EfficientDet, and Detectron2, beyond the currently supported YOLOv8 and YOLOv5su models.