From 1bcb03c44f74bbcd2979e30598a06da9c1f07a3c Mon Sep 17 00:00:00 2001 From: Mariam <112161752+mermalade0325@users.noreply.github.com> Date: Fri, 1 Nov 2024 15:30:42 +1000 Subject: [PATCH 01/14] Update README.md --- README.md | 144 ++++++++++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 129 insertions(+), 15 deletions(-) diff --git a/README.md b/README.md index 3a10f6515..bc8679900 100644 --- a/README.md +++ b/README.md @@ -1,19 +1,133 @@ -# Pattern Analysis -Pattern Analysis of various datasets by COMP3710 students in 2024 at the University of Queensland. +# YOLO11 Melanoma Detection -We create pattern recognition and image processing library for Tensorflow (TF), PyTorch or JAX. +This repository provides instructions and code for training and deploying a YOLO11 model for melanoma detection, using the ISIC dataset or any custom melanoma dataset. -This library is created and maintained by The University of Queensland [COMP3710](https://my.uq.edu.au/programs-courses/course.html?course_code=comp3710) students. +## Table of Contents +- [Introduction](#introduction) +- [Requirements](#requirements) +- [Dataset Preparation](#dataset-preparation) +- [Configuration](#configuration) +- [Training](#training) +- [Validation](#validation) +- [Inference](#inference) +- [Exporting the Model](#exporting-the-model) -The library includes the following implemented in Tensorflow: -* fractals -* recognition problems +## Introduction -In the recognition folder, you will find many recognition problems solved including: -* segmentation -* classification -* graph neural networks -* StyleGAN -* Stable diffusion -* transformers -etc. +Melanoma detection using deep learning techniques can aid early diagnosis and reduce mortality. YOLO11, the latest version of the YOLO model by Ultralytics, is a fast and accurate model suitable for melanoma detection in medical images. + +This project fine-tunes YOLO11 on a melanoma dataset to classify and localize skin lesions as "melanoma" or "benign". + +## Requirements + +Install the necessary libraries: +```bash +pip install ultralytics +``` + +## Dataset Preparation + +1. **Download the Dataset**: Download the ISIC dataset from the [ISIC Archive](https://www.isic-archive.com/). +2. **Organize the Data**: Arrange your dataset in the following structure: + ``` + /datasets/melanoma + ├── images + │ ├── train + │ │ ├── image1.jpg + │ │ ├── image2.jpg + │ │ └── ... + │ └── val + │ ├── image1.jpg + │ ├── image2.jpg + │ └── ... + └── labels + ├── train + │ ├── image1.txt + │ ├── image2.txt + │ └── ... + └── val + ├── image1.txt + ├── image2.txt + └── ... + ``` +3. **Label Format**: Each `.txt` label file should contain one line per bounding box, in YOLO format: + ``` + + ``` + - `class_id`: `0` for melanoma, `1` for benign. + - ``, ``, ``, and `` should be normalized by image width and height. + +## Configuration + +Create a YAML file named `melanoma.yaml` to specify the dataset for YOLO training: + +```yaml +# melanoma.yaml +path: /content/datasets/melanoma # Dataset root directory +train: images/train # Train images folder +val: images/val # Validation images folder + +names: + 0: melanoma + 1: benign +``` + +## Training + +To train YOLO11 on the melanoma dataset, use the following script in Python: + +```python +from ultralytics import YOLO + +# Load the pre-trained YOLO11 model +model = YOLO('yolo11n.pt') # Load a lightweight version; options include yolo11s.pt, etc. 
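+# Other checkpoint sizes published by Ultralytics trade speed for accuracy:
+# yolo11s.pt, yolo11m.pt, yolo11l.pt, yolo11x.pt (largest, most accurate).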
+ +# Train the model +model.train(data='melanoma.yaml', epochs=50, imgsz=640) # Modify epochs and image size as needed +``` + +This will fine-tune the YOLO11 model on your melanoma dataset for 50 epochs. + +## Validation + +After training, evaluate the model’s performance using the validation set: + +```python +# Validate the model +results = model.val() +``` + +The validation metrics, including mAP (mean Average Precision), precision, and recall, will be displayed to help gauge model performance. + +## Inference + +To run inference on new images, use the following code: + +```python +# Run inference on an image +results = model('/path/to/sample/image.jpg') +results.show() # Display results with bounding boxes and class labels +``` + +The model will output bounding boxes around detected lesions with classifications as "melanoma" or "benign." + +## Exporting the Model + +You can export the model for deployment in different formats like ONNX, TensorFlow, and TensorRT. + +```python +# Export to ONNX format +model.export(format='onnx') +``` + +Supported formats include `torchscript`, `onnx`, `openvino`, `tflite`, and more. Refer to the [Ultralytics documentation](https://docs.ultralytics.com/modes/export) for further details. + +## Notes + +- Adjust `epochs`, `batch_size`, and `imgsz` based on dataset size and hardware capabilities. +- Fine-tuning larger models like `yolo11s.pt` may yield better results but will require more computational resources. + +## Acknowledgments + +- This project utilizes the YOLO11 model from [Ultralytics](https://github.com/ultralytics/ultralytics). +- ISIC dataset provided by the [ISIC Archive](https://www.isic-archive.com/). From 8828d741a9e5ac0ca47eef0c8b5dd3dac8d78330 Mon Sep 17 00:00:00 2001 From: Mariam <112161752+mermalade0325@users.noreply.github.com> Date: Fri, 1 Nov 2024 15:37:36 +1000 Subject: [PATCH 02/14] Update README.md --- README.md | 172 +++++++++++++++++++++++++----------------------------- 1 file changed, 79 insertions(+), 93 deletions(-) diff --git a/README.md b/README.md index bc8679900..08de61f66 100644 --- a/README.md +++ b/README.md @@ -1,133 +1,119 @@ -# YOLO11 Melanoma Detection +# Melanoma Detection using YOLO11 -This repository provides instructions and code for training and deploying a YOLO11 model for melanoma detection, using the ISIC dataset or any custom melanoma dataset. +## Overview -## Table of Contents -- [Introduction](#introduction) -- [Requirements](#requirements) -- [Dataset Preparation](#dataset-preparation) -- [Configuration](#configuration) -- [Training](#training) -- [Validation](#validation) -- [Inference](#inference) -- [Exporting the Model](#exporting-the-model) +Melanoma is one of the most aggressive forms of skin cancer, and early detection significantly increases survival rates. This project leverages the YOLO11 (You Only Look Once) deep learning algorithm by Ultralytics to automatically detect melanoma in dermoscopic images. YOLO11 is a cutting-edge object detection model that can detect multiple objects within an image in real time. This project adapts YOLO11 for binary classification of skin lesions as either *melanoma* or *benign*, making it a powerful tool for aiding in early skin cancer diagnosis. The project detects lesions within the ISIC 2017/8 data set with all detections having a minimum Intersection Over Union of 0.8 on the test set and a suitable accuracy for classification. -## Introduction -Melanoma detection using deep learning techniques can aid early diagnosis and reduce mortality. 
YOLO11, the latest version of the YOLO model by Ultralytics, is a fast and accurate model suitable for melanoma detection in medical images. -This project fine-tunes YOLO11 on a melanoma dataset to classify and localize skin lesions as "melanoma" or "benign". -## Requirements -Install the necessary libraries: +*Figure: Sample output of YOLO11 detecting melanoma in a dermoscopic image* + +## How it Works + +YOLO11 is a single-stage object detection model that processes the entire image in a single forward pass, predicting bounding boxes and classification scores simultaneously. It divides the input image into a grid, with each grid cell responsible for detecting an object within its bounds. Using anchor boxes, the model generates bounding box coordinates and confidence scores, optimized for melanoma detection by training on a labeled dataset of dermoscopic images. The final model can localize and classify skin lesions as either melanoma or benign in real time. + +## Dependencies + +To run this project, the following dependencies are required: + +- **Python**: 3.10 +- **Ultralytics**: 8.3.2 (includes YOLO11) +- **PyTorch**: 2.4.1+cu121 +- **OpenCV**: 4.5.3 +- **Matplotlib**: 3.4.2 + +Ensure you install the dependencies via: ```bash -pip install ultralytics +pip install ultralytics opencv-python-headless matplotlib ``` -## Dataset Preparation - -1. **Download the Dataset**: Download the ISIC dataset from the [ISIC Archive](https://www.isic-archive.com/). -2. **Organize the Data**: Arrange your dataset in the following structure: - ``` - /datasets/melanoma - ├── images - │ ├── train - │ │ ├── image1.jpg - │ │ ├── image2.jpg - │ │ └── ... - │ └── val - │ ├── image1.jpg - │ ├── image2.jpg - │ └── ... - └── labels - ├── train - │ ├── image1.txt - │ ├── image2.txt - │ └── ... - └── val - ├── image1.txt - ├── image2.txt - └── ... - ``` -3. **Label Format**: Each `.txt` label file should contain one line per bounding box, in YOLO format: - ``` - - ``` - - `class_id`: `0` for melanoma, `1` for benign. - - ``, ``, ``, and `` should be normalized by image width and height. - -## Configuration - -Create a YAML file named `melanoma.yaml` to specify the dataset for YOLO training: - -```yaml -# melanoma.yaml -path: /content/datasets/melanoma # Dataset root directory -train: images/train # Train images folder -val: images/val # Validation images folder - -names: - 0: melanoma - 1: benign -``` +To reproduce the results, a GPU with CUDA support is recommended. The model was trained on an NVIDIA Tesla T4 GPU for optimal performance. + +## Dataset Preparation and Pre-Processing + +### Dataset + +The model was trained on the ISIC (International Skin Imaging Collaboration) dataset, a comprehensive collection of dermoscopic images labeled for melanoma and benign conditions. The dataset was divided as follows: + +- **Training Set**: 70% of the data +- **Validation Set**: 20% of the data +- **Testing Set**: 10% of the data + +This split ensures the model has a sufficient amount of data for learning while keeping a balanced validation and testing set for evaluating performance. + +### Pre-Processing -## Training +1. **Resizing**: Images were resized to 640x640 pixels to ensure consistency and efficient processing. +2. **Normalization**: Pixel values were normalized to [0, 1] for faster convergence during training. +3. **Bounding Box Conversion**: Annotations in the ISIC dataset were converted to YOLO format, with bounding boxes specified by the center (x, y), width, and height, normalized by image dimensions. +4. 
**Data Augmentation**: Techniques such as random rotation, scaling, and flipping were applied to the training data to improve the model’s robustness to variations. -To train YOLO11 on the melanoma dataset, use the following script in Python: +For more details on the dataset and augmentation methods, refer to the [ISIC Archive](https://www.isic-archive.com/). + +## Training the Model + +To train the YOLO11 model, we use transfer learning from a pre-trained checkpoint, fine-tuning it on the melanoma dataset for 50 epochs. The training configuration is specified in the `melanoma.yaml` file, where the dataset paths and class names are defined. + +In the training set, these images are associated with various labels. +image + + +### Example Training Command ```python from ultralytics import YOLO -# Load the pre-trained YOLO11 model -model = YOLO('yolo11n.pt') # Load a lightweight version; options include yolo11s.pt, etc. +# Load a pre-trained YOLO11 model +model = YOLO('yolo11n.pt') # Train the model -model.train(data='melanoma.yaml', epochs=50, imgsz=640) # Modify epochs and image size as needed +model.train(data='melanoma.yaml', epochs=50, imgsz=640) ``` -This will fine-tune the YOLO11 model on your melanoma dataset for 50 epochs. +The model’s performance is evaluated using mean Average Precision (mAP), precision, and recall metrics on the validation set. -## Validation +## Example Inputs and Outputs -After training, evaluate the model’s performance using the validation set: +### Input +Input images should be high-resolution dermoscopic images, such as those from the ISIC dataset, formatted as `.jpg` or `.png` files. -```python -# Validate the model -results = model.val() -``` -The validation metrics, including mAP (mean Average Precision), precision, and recall, will be displayed to help gauge model performance. +### Output +The model outputs bounding boxes and classification labels. Below is an example output for a sample input image. -## Inference -To run inference on new images, use the following code: +### Sample Code for Inference -```python -# Run inference on an image -results = model('/path/to/sample/image.jpg') -results.show() # Display results with bounding boxes and class labels -``` -The model will output bounding boxes around detected lesions with classifications as "melanoma" or "benign." +## Results Visualization + +After training, the model can detect melanoma with high accuracy. Below is a visualization of the performance metrics on the validation set: + +


+ +*Figure: Training and validation loss over epochs* ## Exporting the Model -You can export the model for deployment in different formats like ONNX, TensorFlow, and TensorRT. +To export the model for deployment, YOLO11 provides options for various formats. For instance, to export the model to ONNX: ```python -# Export to ONNX format model.export(format='onnx') ``` -Supported formats include `torchscript`, `onnx`, `openvino`, `tflite`, and more. Refer to the [Ultralytics documentation](https://docs.ultralytics.com/modes/export) for further details. +## Conclusion + +This project demonstrates the power of YOLO11 for real-time melanoma detection in dermoscopic images. With proper training and pre-processing, YOLO11 achieves high accuracy, making it a valuable tool for early skin cancer diagnosis. -## Notes +## References -- Adjust `epochs`, `batch_size`, and `imgsz` based on dataset size and hardware capabilities. -- Fine-tuning larger models like `yolo11s.pt` may yield better results but will require more computational resources. +- ISIC Archive: [ISIC 2018: Skin Lesion Analysis Towards Melanoma Detection](https://www.isic-archive.com/) +- Ultralytics YOLO Documentation: [YOLO Docs](https://docs.ultralytics.com/) -## Acknowledgments +--- -- This project utilizes the YOLO11 model from [Ultralytics](https://github.com/ultralytics/ultralytics). -- ISIC dataset provided by the [ISIC Archive](https://www.isic-archive.com/). +This README provides comprehensive guidance on setup, training, and usage of YOLO11 for melanoma detection. Adjust paths and parameters as necessary for optimal performance on your dataset. From ea91acb20dad6ece3faaf0d5fcdc35fc0abb7f2f Mon Sep 17 00:00:00 2001 From: Mariam <112161752+mermalade0325@users.noreply.github.com> Date: Fri, 1 Nov 2024 15:40:28 +1000 Subject: [PATCH 03/14] Update README.md --- README.md | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/README.md b/README.md index 08de61f66..6e20614c7 100644 --- a/README.md +++ b/README.md @@ -2,10 +2,7 @@ ## Overview -Melanoma is one of the most aggressive forms of skin cancer, and early detection significantly increases survival rates. This project leverages the YOLO11 (You Only Look Once) deep learning algorithm by Ultralytics to automatically detect melanoma in dermoscopic images. YOLO11 is a cutting-edge object detection model that can detect multiple objects within an image in real time. This project adapts YOLO11 for binary classification of skin lesions as either *melanoma* or *benign*, making it a powerful tool for aiding in early skin cancer diagnosis. The project detects lesions within the ISIC 2017/8 data set with all detections having a minimum Intersection Over Union of 0.8 on the test set and a suitable accuracy for classification. - - - +Melanoma is one of the most aggressive forms of skin cancer, and early detection can significantly increase survival rates. This project leverages the YOLO11 (You Only Look Once) deep learning algorithm by Ultralytics to automatically detect melanoma in dermoscopic images, distinguishing it from other skin conditions like benign lesions and nevus. YOLO11 is a state-of-the-art object detection model. 
*Figure: Sample output of YOLO11 detecting melanoma in a dermoscopic image* @@ -41,6 +38,8 @@ The model was trained on the ISIC (International Skin Imaging Collaboration) dat - **Validation Set**: 20% of the data - **Testing Set**: 10% of the data +image + This split ensures the model has a sufficient amount of data for learning while keeping a balanced validation and testing set for evaluating performance. ### Pre-Processing @@ -57,7 +56,6 @@ For more details on the dataset and augmentation methods, refer to the [ISIC Arc To train the YOLO11 model, we use transfer learning from a pre-trained checkpoint, fine-tuning it on the melanoma dataset for 50 epochs. The training configuration is specified in the `melanoma.yaml` file, where the dataset paths and class names are defined. In the training set, these images are associated with various labels. -image ### Example Training Command From 4cafb6ba553bfcdc859a4e2f29fe6240525bae51 Mon Sep 17 00:00:00 2001 From: Mariam <112161752+mermalade0325@users.noreply.github.com> Date: Fri, 1 Nov 2024 15:49:52 +1000 Subject: [PATCH 04/14] Update README.md --- README.md | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/README.md b/README.md index 6e20614c7..2146f4063 100644 --- a/README.md +++ b/README.md @@ -34,20 +34,24 @@ To reproduce the results, a GPU with CUDA support is recommended. The model was The model was trained on the ISIC (International Skin Imaging Collaboration) dataset, a comprehensive collection of dermoscopic images labeled for melanoma and benign conditions. The dataset was divided as follows: -- **Training Set**: 70% of the data -- **Validation Set**: 20% of the data +- **Training Set**: 80% of the data +- **Validation Set**: 10% of the data - **Testing Set**: 10% of the data -image - This split ensures the model has a sufficient amount of data for learning while keeping a balanced validation and testing set for evaluating performance. ### Pre-Processing -1. **Resizing**: Images were resized to 640x640 pixels to ensure consistency and efficient processing. -2. **Normalization**: Pixel values were normalized to [0, 1] for faster convergence during training. -3. **Bounding Box Conversion**: Annotations in the ISIC dataset were converted to YOLO format, with bounding boxes specified by the center (x, y), width, and height, normalized by image dimensions. -4. **Data Augmentation**: Techniques such as random rotation, scaling, and flipping were applied to the training data to improve the model’s robustness to variations. +Pre-Processing +The preprocessing pipeline prepares the melanoma dataset for efficient and consistent model training. First, a metadata CSV file is generated for each dataset split (train, validation, and test). This metadata file serves as an index, listing each image path along with its corresponding class label (nevus, seborrheic keratosis, or melanoma). Labels are mapped to integers, with benign classes (nevus and seborrheic keratosis) labeled as 0 and malignant (melanoma) as 1. This structure allows for efficient data loading and simplifies referencing images during training. See below. + +image + +Each image is then processed by: +Decoding from JPEG format and resizing to a standardized size of 299x299 pixels, ensuring consistency in model input dimensions. +Normalization, where pixel values are scaled to the [0,1] range for optimized training. +Caching the dataset to reduce I/O bottlenecks, and shuffling the training data with a buffer size of 1000 to ensure varied batches. 
+Batching and Prefetching: Images are batched into sets of 64, and prefetch is used to load data in the background, preventing delays and ensuring data availability during model training. For more details on the dataset and augmentation methods, refer to the [ISIC Archive](https://www.isic-archive.com/). From a8200beb6629e2c39f4ec6aa60546ecf74a3b285 Mon Sep 17 00:00:00 2001 From: Mariam <112161752+mermalade0325@users.noreply.github.com> Date: Fri, 1 Nov 2024 15:54:21 +1000 Subject: [PATCH 05/14] Update README.md --- README.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 2146f4063..33bd90665 100644 --- a/README.md +++ b/README.md @@ -4,8 +4,9 @@ Melanoma is one of the most aggressive forms of skin cancer, and early detection can significantly increase survival rates. This project leverages the YOLO11 (You Only Look Once) deep learning algorithm by Ultralytics to automatically detect melanoma in dermoscopic images, distinguishing it from other skin conditions like benign lesions and nevus. YOLO11 is a state-of-the-art object detection model. +image -*Figure: Sample output of YOLO11 detecting melanoma in a dermoscopic image* +*Figure: Sample output of YOLO11 detecting a lesion in a dermoscopic image* ## How it Works From 0a53a9a6f491f9696306977aa79b5d78d2a64945 Mon Sep 17 00:00:00 2001 From: Mariam <112161752+mermalade0325@users.noreply.github.com> Date: Fri, 1 Nov 2024 15:58:57 +1000 Subject: [PATCH 06/14] Update README.md --- README.md | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index 33bd90665..8178d503f 100644 --- a/README.md +++ b/README.md @@ -80,14 +80,18 @@ The model’s performance is evaluated using mean Average Precision (mAP), preci ## Example Inputs and Outputs ### Input -Input images should be high-resolution dermoscopic images, such as those from the ISIC dataset, formatted as `.jpg` or `.png` files. +The dataset used for melanoma detection consists of dermoscopic images from the ISIC archive. The image dataset includes three main types of lesions: nevus, seborrheic keratosis, and melanoma. Each lesion type is stored in separate folders, and each image has an associated label to identify the type of lesion. The dataset follows the structure required for machine learning tasks, ensuring that each image file name is unique and follows a standardized naming convention (e.g., ISIC_0000000.jpg). +Screen Shot 2024-11-01 at 15 57 33 -### Output -The model outputs bounding boxes and classification labels. Below is an example output for a sample input image. +image + +In the provided dataset folder structure, each lesion type is represented by high-resolution .jpg images. Additionally, there are auxiliary files with names ending in _superpixels.png or _perpixels.png, which appear to contain data that may be used for other types of analysis, such as texture segmentation or pixel intensity mapping. However, for the purpose of training a melanoma detection model, only the main dermoscopic images in .jpg format are used. -### Sample Code for Inference + +### Output +The model outputs bounding boxes and classification labels. 
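+
+Below is a minimal inference sketch. It assumes an Ultralytics YOLO11 checkpoint fine-tuned on this dataset; the weights and image paths are placeholders:
+
+```python
+from ultralytics import YOLO
+
+model = YOLO('runs/detect/train/weights/best.pt')  # trained weights (placeholder path)
+results = model('sample_lesion.jpg')               # detect lesions in one image
+for r in results:
+    r.show()  # draw bounding boxes and class labels on the image
+```
+
+Each returned `Results` object also exposes a `boxes` attribute (coordinates, confidences, class ids) for programmatic post-processing.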
## Results Visualization From 31c74c81fb2462ad25918be8f01e2a876784d328 Mon Sep 17 00:00:00 2001 From: Mariam <112161752+mermalade0325@users.noreply.github.com> Date: Fri, 1 Nov 2024 15:59:27 +1000 Subject: [PATCH 07/14] Update README.md --- README.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/README.md b/README.md index 8178d503f..c9754e0af 100644 --- a/README.md +++ b/README.md @@ -84,9 +84,8 @@ The dataset used for melanoma detection consists of dermoscopic images from the Screen Shot 2024-11-01 at 15 57 33 -image - In the provided dataset folder structure, each lesion type is represented by high-resolution .jpg images. Additionally, there are auxiliary files with names ending in _superpixels.png or _perpixels.png, which appear to contain data that may be used for other types of analysis, such as texture segmentation or pixel intensity mapping. However, for the purpose of training a melanoma detection model, only the main dermoscopic images in .jpg format are used. +image From 9e0659404b1ab9dccf316f351a5b1df8e2e51315 Mon Sep 17 00:00:00 2001 From: Mariam <112161752+mermalade0325@users.noreply.github.com> Date: Fri, 1 Nov 2024 15:59:50 +1000 Subject: [PATCH 08/14] Update README.md --- README.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/README.md b/README.md index c9754e0af..2e221cf2c 100644 --- a/README.md +++ b/README.md @@ -85,6 +85,8 @@ The dataset used for melanoma detection consists of dermoscopic images from the Screen Shot 2024-11-01 at 15 57 33 In the provided dataset folder structure, each lesion type is represented by high-resolution .jpg images. Additionally, there are auxiliary files with names ending in _superpixels.png or _perpixels.png, which appear to contain data that may be used for other types of analysis, such as texture segmentation or pixel intensity mapping. However, for the purpose of training a melanoma detection model, only the main dermoscopic images in .jpg format are used. + + image From edf55d42993fcf3d9f452ca832139bb1dcd1407e Mon Sep 17 00:00:00 2001 From: Mariam <112161752+mermalade0325@users.noreply.github.com> Date: Fri, 1 Nov 2024 16:02:27 +1000 Subject: [PATCH 09/14] Update README.md --- README.md | 15 +++------------ 1 file changed, 3 insertions(+), 12 deletions(-) diff --git a/README.md b/README.md index 2e221cf2c..cc8dbb8ba 100644 --- a/README.md +++ b/README.md @@ -97,21 +97,12 @@ The model outputs bounding boxes and classification labels. ## Results Visualization -After training, the model can detect melanoma with high accuracy. Below is a visualization of the performance metrics on the validation set: +After training, the model can detect melanoma with high accuracy. -


+image -*Figure: Training and validation loss over epochs* +*Figure: Training and validation loss over epochs. This was from an earlier test, eventually, 31 epochs were chosen* -## Exporting the Model - -To export the model for deployment, YOLO11 provides options for various formats. For instance, to export the model to ONNX: - -```python -model.export(format='onnx') -``` ## Conclusion From de7c460350a8e4c6fbfa67bbfeb8b4118e54de99 Mon Sep 17 00:00:00 2001 From: Mariam <112161752+mermalade0325@users.noreply.github.com> Date: Fri, 1 Nov 2024 16:34:35 +1000 Subject: [PATCH 10/14] Add files via upload --- dataset.py | 76 +++++++++++++++++ modules.py | 237 +++++++++++++++++++++++++++++++++++++++++++++++++++++ predict.py | 66 +++++++++++++++ train.py | 114 ++++++++++++++++++++++++++ 4 files changed, 493 insertions(+) create mode 100644 dataset.py create mode 100644 modules.py create mode 100644 predict.py create mode 100644 train.py diff --git a/dataset.py b/dataset.py new file mode 100644 index 000000000..73ad0c7a9 --- /dev/null +++ b/dataset.py @@ -0,0 +1,76 @@ +import torch +from torch.utils.data import Dataset +import pandas as pd +import os +import cv2 +import numpy as np + +class ISICDataset(Dataset): + """Custom Dataset class for YOLO model with ISIC data.""" + + def __init__(self, image_dir, mask_dir, labels_path, image_size): + self.image_size = image_size + self.image_dir = image_dir + self.mask_dir = mask_dir + self.labels = pd.read_csv(labels_path) + + # Load all image file names in the directory + self.image_files = [f for f in os.listdir(image_dir) if f.endswith('.jpg')] + self.samples = [self._process_sample(i) for i in range(len(self.image_files))] + + def __len__(self): + return len(self.image_files) + + def __getitem__(self, idx): + return self.samples[idx] + + def _process_sample(self, idx): + """Helper function to process and return a single sample (image and target vector).""" + # Load image and mask + image = self._load_image(idx) + mask = self._load_mask(idx) + + # Resize image and mask to the target size + image = cv2.resize(image, (self.image_size, self.image_size)).astype(np.float32) / 255.0 + mask = cv2.resize(mask, (self.image_size, self.image_size)) + + # Obtain bounding box coordinates from the mask + x, y, w, h = self._extract_bounding_box(mask) + + # Retrieve label probabilities + label1, label2 = self.labels.iloc[idx, 1:3] + total_prob = label1 + label2 + + # Create target vector + target_vector = np.array( + [x + w / 2, y + h / 2, w, h, total_prob, label1, label2], + dtype=np.float32 + ) + + # Convert image to tensor format (C, H, W) + image_tensor = torch.tensor(image.transpose(2, 0, 1), dtype=torch.float32) + target_tensor = torch.tensor(target_vector, dtype=torch.float32) + + return image_tensor, target_tensor + + def _load_image(self, idx): + """Loads an image given an index.""" + img_name = os.path.join(self.image_dir, self.image_files[idx]) + return cv2.imread(img_name) + + def _load_mask(self, idx): + """Loads the mask corresponding to the image at the given index.""" + mask_name = os.path.join( + self.mask_dir, self.image_files[idx].replace('.jpg', '_segmentation.png') + ) + return cv2.imread(mask_name, cv2.IMREAD_GRAYSCALE) + + def _extract_bounding_box(self, mask): + """Extracts the bounding box from the mask image.""" + _, thresh = cv2.threshold(mask, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU) + contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE) + + if contours: + x, y, w, h = cv2.boundingRect(contours[0]) + return x, 
y, w, h + return 0, 0, 0, 0 # Return zero box if no contours are found diff --git a/modules.py b/modules.py new file mode 100644 index 000000000..3ea4365ac --- /dev/null +++ b/modules.py @@ -0,0 +1,237 @@ +import torch +import torch.nn as nn +import torch.nn.functional as F +import numpy as np + +# Device configuration +device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') +if torch.cuda.is_available(): + print("cuda") +if not torch.cuda.is_available(): + print("cpu") + +class YOLO(nn.Module): + + #REFERENCE: yolov3-tiny.cfg from https://github.com/pjreddie/darknet/blob/master/cfg + #Used as basis for what layers were needed + def __init__(self, num_classes): + super(YOLO, self).__init__() + self.num_classes = num_classes + layers = [] + filters = [16,32,64,128,256,512] + in_channels = 3 + #Convulution layers and maxpooling + for i in filters: + layers.append(nn.Conv2d(in_channels, i, kernel_size=3, stride=1, padding=1, bias=False)) + in_channels = i + layers.append(nn.BatchNorm2d(i)) + layers.append(nn.LeakyReLU(0.1, True)) #might be false + layers.append(nn.MaxPool2d(kernel_size=2, stride=2)) #Hopefully works + layers.append(nn.Conv2d(512, 1024, kernel_size=3, stride=1, padding=1, bias=False)) + layers.append(nn.BatchNorm2d(1024)) + layers.append(nn.LeakyReLU(0.1, True)) + + layers.append(nn.Conv2d(1024, 256, kernel_size=1, stride=1, padding=1, bias=False)) + layers.append(nn.BatchNorm2d(256)) + layers.append(nn.LeakyReLU(0.1, True)) + + layers.append(nn.Conv2d(256, 512, kernel_size=3, stride=1, padding=1, bias=False)) + layers.append(nn.BatchNorm2d(512)) + layers.append(nn.LeakyReLU(0.1, True)) + + layers.append(nn.Conv2d(512, 255, kernel_size=1, stride=1, padding=1, bias=True)) + self.conv_start = nn.Sequential(*layers) + + #Detection layer - given anchors + self.anchor1 = [(81,82), (135,169), (344,319)] #Anchors depends on image? + + #Route layer could go here + self.conv_mid = nn.Sequential( + nn.Conv2d(255, 128, kernel_size=1, stride=1, padding=1, bias=False), + nn.BatchNorm2d(128), + nn.LeakyReLU(0.1, True), + nn.Upsample(scale_factor=2, mode="bilinear")) + #Another route layer maybe + self.conv_end = nn.Sequential( + nn.Conv2d(128,256,kernel_size=3,stride=1,padding=1,bias=False), + nn.BatchNorm2d(256), + nn.LeakyReLU(0.1, True), + nn.Conv2d(256, 255, kernel_size=1, stride=1, padding=1, bias=True)) + + #Another detection layer + self.anchor2 = [(10,14), (23,27), (37,58)] + + def forward(self, x): + out = self.conv_start(x) + out = out.data + a = self.predict_transform(out, 416, self.anchor1, self.num_classes) + out = self.conv_mid(out) + out = self.conv_end(out) + out = out.data + b = self.predict_transform(out, 416, self.anchor2, self.num_classes) + return torch.cat((a, b), 1) + + def predict_transform(self, prediction, inp_dim, anchors, num_classes): + """ + Decodes the output from the convolution layers and arranges the information into a usable format. + The below reference was used for a base for this function. + REFERENCE: refer to reference 2 in README. 
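+        In short: sigmoid is applied to the box centres and the objectness
+        score, per-cell grid offsets are added, width and height come from
+        exp() scaled by the anchor sizes, and the results are multiplied by
+        the stride to map back to input-image pixel coordinates.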
+ """ + batch_size = prediction.size(0) + stride = inp_dim // prediction.size(2) + grid_size = inp_dim // stride + bbox_attrs = 5 + num_classes + num_anchors = len(anchors) + + #Rearranges the feature map to (batch_size, number of boxes, box_attributes) + prediction = prediction.view(batch_size, bbox_attrs*num_anchors, grid_size*grid_size) + prediction = prediction.transpose(1,2).contiguous() + prediction = prediction.view(batch_size, grid_size*grid_size*num_anchors, bbox_attrs) + anchors = [(a[0]/stride, a[1]/stride) for a in anchors] + #Get the centre_X, centre_Y and object confidence between 1 and 0 + prediction[:,:,0] = torch.sigmoid(prediction[:,:,0]) + prediction[:,:,1] = torch.sigmoid(prediction[:,:,1]) + prediction[:,:,4] = torch.sigmoid(prediction[:,:,4]) + #Add the center offsets + grid = np.arange(grid_size) + a,b = np.meshgrid(grid, grid) + + x_offset = torch.FloatTensor(a).view(-1,1) + y_offset = torch.FloatTensor(b).view(-1,1) + + x_offset = x_offset.to(device) + y_offset = y_offset.to(device) + + x_y_offset = torch.cat((x_offset, y_offset), 1).repeat(1,num_anchors).view(-1,2).unsqueeze(0) + + prediction[:,:,:2] += x_y_offset + #log space transform height and the width + #so that all boxes are on the same scale + anchors = torch.FloatTensor(anchors) + anchors = anchors.to(device) + + #arrange the probabilities of the classes + anchors = anchors.repeat(grid_size*grid_size, 1).unsqueeze(0) + prediction[:,:,2:4] = torch.exp(prediction[:,:,2:4])*anchors + prediction[:,:,5: 5 + num_classes] = torch.sigmoid((prediction[:,:, 5 : 5 + num_classes])) + prediction[:,:,:4] *= stride + return prediction + + +def calculate_iou(pred, label): + """ + Caculates the IoUs of a given list of boxes. + Used to determine accuracy of given bounding boxes. + Also is a key part of the loss function. + """ + px, py, pw, ph = pred[:,0], pred[:,1], pred[:,2], pred[:,3] + lx, ly, lw, lh = label[0], label[1], label[2], label[3] + box_a = [px-(pw/2), py-(ph/2), px+(pw/2), py+(ph/2)] + box_b = [lx-(lw/2), ly-(lh/2), lx+(lw/2), ly+(lh/2)] + + # determine the (x, y) of the corners of intersection area + ax = torch.clamp(box_a[0], min=box_b[0]) + ay = torch.clamp(box_a[1], min=box_b[1]) + bx = torch.clamp(box_a[2], max=box_b[2]) + by = torch.clamp(box_a[3], max=box_b[3]) + + # compute the area of intersection + intersect = torch.abs(torch.clamp((bx - ax), min=0) * torch.clamp((by - ay), min=0)) + + # compute the area of both the prediction and ground-truth + area_a = torch.abs((box_a[2] - box_a[0]) * (box_a[3] - box_a[1])) + area_b = torch.abs((box_b[2] - box_b[0]) * (box_b[3] - box_b[1])) + + # compute the iou + iou = intersect / (area_a + area_b - intersect) + iou = torch.reshape(iou, (776, 3)) + return iou + +class YOLO_loss(nn.Module): + """ + Given one batch at a time, the loss of the predictions is calculated. + The formulas used to calculate loss are from the reference below. + REFERENCE: refer to reference 3 in README. 
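+    The total loss sums three terms: a coordinate loss on the box position
+    and size, a classification loss on the class probabilities, and a
+    confidence loss, with boxes that detect nothing down-weighted by the
+    0.5 no-object factor.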
+ """ + def __init__(self): + super(YOLO_loss, self).__init__() + + def forward(pred, label): + #Constants + no_object = 0.5 #Puts less emphasis on loss from boxes with no object + #Rearrange predictions to have one box shape on each line + boxes = torch.reshape(pred, (776, 3)) + + #IoU + iou = calculate_iou(pred, label) + iou, best_boxes = torch.max(iou, dim=1) + + #Loss set up + class_loss = torch.zeros(776) + coord_loss = torch.zeros(776) + conf_loss = torch.zeros(776) + + #Calculate loss + i = 0 + for idx in best_boxes: + box = boxes[i][idx] + #coordinate loss + xy_loss = (label[0]-box[0])**2 + (label[1]-box[1])**2 + wh_loss = ((label[0])**(1/2)-(box[0])**(1/2))**2 + ((label[1])**(1/2)-(box[1])**(1/2))**2 + coord_loss[i] = (xy_loss + wh_loss) + #Check if there was a detection + if box[4] > 0.8: #There was + #classification loss + class_loss[i] = (label[5] - box[5])**2 + (label[6] - box[6])**2 + #confidence loss + conf_loss[i] = (label[4] - box[4])**2 + else: #There wasn't + conf_loss[i] = no_object*((label[4] - box[4])**2) + i += 1 + + #Final count + total_loss = 0 + total_loss += torch.sum(coord_loss) + total_loss += torch.sum(class_loss) + total_loss += torch.sum(conf_loss) + + return total_loss + +def single_iou(pred, label): + """ + Calculates the IoU of a single box + """ + px, py, pw, ph = pred[:,0], pred[:,1], pred[:,2], pred[:,3] + lx, ly, lw, lh = label[0], label[1], label[2], label[3] + box_a = [px-(pw/2), py-(ph/2), px+(pw/2), py+(ph/2)] + box_b = [lx-(lw/2), ly-(lh/2), lx+(lw/2), ly+(lh/2)] + + # determine the (x, y) of the corners of intersection area + ax = torch.clamp(box_a[0], min=box_b[0]) + ay = torch.clamp(box_a[1], min=box_b[1]) + bx = torch.clamp(box_a[2], max=box_b[2]) + by = torch.clamp(box_a[3], max=box_b[3]) + + # compute the area of intersection + intersect = torch.abs(torch.clamp((bx - ax), min=0) * torch.clamp((by - ay), min=0)) + + # compute the area of both the prediction and ground-truth + area_a = torch.abs((box_a[2] - box_a[0]) * (box_a[3] - box_a[1])) + area_b = torch.abs((box_b[2] - box_b[0]) * (box_b[3] - box_b[1])) + + # compute the iou + iou = intersect / (area_a + area_b - intersect) + return iou + +def filter_boxes(pred): + """ + Returns highest confidence box that has detected something + """ + best_box = None + highest_conf = 0 + for i in range(pred.size(0)): + box = pred[i,:] + if box[4] >= highest_conf: + best_box = box + highest_conf = box[4] + return best_box \ No newline at end of file diff --git a/predict.py b/predict.py new file mode 100644 index 000000000..16053b3bd --- /dev/null +++ b/predict.py @@ -0,0 +1,66 @@ +from modules import YOLO, filter_boxes +from dataset import ISICDataset +import matplotlib.pyplot as plt +import matplotlib.patches as patches +import cv2 +import torch +import numpy as np + +def plot_boxes(image_tensor, bounding_box): + """ + Plots the bounding box and label on an image. + + Args: + image_tensor (torch.Tensor): The image tensor of shape (3, 416, 416). + bounding_box (torch.Tensor): The bounding box tensor with format [center_x, center_y, width, height, score, label1, label2]. 
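+
+    If bounding_box is None, the image is shown without an overlay; otherwise
+    the label is chosen from whichever class probability is higher.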
+ """ + image_tensor = image_tensor.cpu().permute(1, 2, 0) # Reshape for plotting + fig, ax = plt.subplots() + ax.imshow(image_tensor) + + if bounding_box is not None: + box_coords = bounding_box.cpu() + x, y, w, h = box_coords[0] - box_coords[2] / 2, box_coords[1] - box_coords[3] / 2, box_coords[2], box_coords[3] + rect = patches.Rectangle((x, y), w, h, linewidth=1, edgecolor='r', facecolor='none') + + # Determine label based on probabilities + label = "melanoma" if box_coords[5] > box_coords[6] else "seborrheic keratosis" + + # Add rectangle patch and label text + ax.add_patch(rect) + plt.text(x, y, label, bbox=dict(facecolor='red', alpha=0.5), color='white') + + plt.axis("off") + plt.show() + +def predict(image_path, model): + """ + Predicts the bounding box and class label for an image using the model. + + Args: + image_path (str): Path to the input image. + model (YOLO): Trained YOLO model. + """ + # Load and preprocess the image + image = cv2.imread(image_path) + image = cv2.resize(image, (416, 416)) + image = torch.from_numpy(image.transpose((2, 0, 1))).float().div(255).unsqueeze(0).to(device) + + # Model prediction + predictions = model(image) + best_box = filter_boxes(predictions[0]) + + # Display the image with the predicted bounding box + plot_boxes(image.squeeze(0), best_box) + +# Load model and weights +model = YOLO(num_classes=2) +checkpoint_path = "/content/drive/MyDrive/Uni/COMP3710/model.pt" +checkpoint = torch.load(checkpoint_path, map_location=device) +model.load_state_dict(checkpoint['model_state_dict']) +model.to(device) +model.eval() + +# Run prediction on an image +image_path = "/path/to/your/image.jpg" # Specify the image path here +predict(image_path, model) diff --git a/train.py b/train.py new file mode 100644 index 000000000..fc681c4d3 --- /dev/null +++ b/train.py @@ -0,0 +1,114 @@ +import torch +import torch.nn as nn +import torch.nn.functional as F +import time + +from dataset import * +from modules import * + + +# Device configuration +device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') +if torch.cuda.is_available(): + print("cuda") +if not torch.cuda.is_available(): + print("cpu") + +#hyperparameters +epochs = 10 +learning_rate=0.001 +image_size = 416 +batch_size = 10 + +#Train data - change directories as needed +mask_dir = '/Users/mariam/Downloads/COMP3710_YOLO/ISIC-2017_Training_Part1_GroundTruth/' +image_dir = '/Users/mariam/Downloads/COMP3710_YOLO/ISIC-2017_Training_Data/' +labels = '/Users/mariam/Downloads/COMP3710_YOLO/ISIC-2017_Training_Part3_GroundTruth.csv' +train_dataset = ISICDataset(image_dir, mask_dir, labels, image_size) +train_dataloader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True) + +#Model +model = YOLO(2) +model.to(device) +checkpoint_path = "model.pt" + +#optimizer and loss +optimizer = torch.optim.Adam(model.parameters(), learning_rate) +criterion = YOLO_loss() + +#learning rate schedule, using because SGD is dumb, adam has its own learning rate +total_step = len(train_dataloader) +scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer,max_lr=learning_rate, + steps_per_epoch=total_step, epochs=epochs) + +#Train +model.train() +start = time.time() +for epoch in range(epochs): + for i, (images, labels) in enumerate(train_dataloader): + images = images.to(device) + labels = labels.to(device) + + #Forward pass + outputs = model(images) + total_loss = 0 + for a in range(batch_size): + loss = criterion(outputs[a], labels[a]) + total_loss += loss + + #Backwards and optimize + optimizer.zero_grad() + 
total_loss.requires_grad = True + total_loss.backward() + optimizer.step() + + if (i+1) % 50 == 0: + print("Epoch [{}/{}], Step[{},{}] Loss: {:.5f}".format(epoch+1, epochs, i+1, total_step, total_loss.item())) + torch.save({ + 'epoch': epoch, + 'model_state_dict': model.state_dict(), + 'optimizer_state_dict': optimizer.state_dict(), + 'loss': total_loss, + }, checkpoint_path) + + scheduler.step() +end = time.time() +elapsed = end - start +print("Training took {} secs or {} mins.".format(elapsed, elapsed/60)) + +#Test data +mask_dir = '/Users/mariam/Downloads/COMP3710_YOLO/ISIC-2017_Test_v2_Part1_GroundTruth/' +image_dir = '/Users/mariam/Downloads/COMP3710_YOLO/ISIC-2017_Test_v2_Data/' +labels = '/Users/mariam/Downloads/COMP3710_YOLO/ISIC-2017_Test_v2_Part3_GroundTruth.csv' +test_dataset = ISICDataset(image_dir, mask_dir, labels, 416) +test_dataloader = DataLoader(test_dataset, batch_size, shuffle=True) + +#Test +model.eval() +torch.set_grad_enabled(True) +start = time.time() +total = 0 +total_step = len(test_dataloader) + +for i, (images, labels) in enumerate(test_dataloader): + images = images.to(device) + labels = labels.to(device) + outputs = model(images) + + #Calculate IoU + for a in range(batch_size): + best_box = filter_boxes(outputs[a]) + if best_box is not None: + best_box = torch.reshape(best_box, (1, 7)) + iou = single_iou(best_box, labels[a,:]) + total += iou[0] + + #Keep track of average + average = total/(i+1) + + if (i+1) % 50 == 0: + print("Step[{},{}] IoU average: {:.5f}".format(i+1, total_step, average)) + +end = time.time() +elapsed = end - start +print("Testing took {} secs or {} mins.".format(elapsed, elapsed/60)) \ No newline at end of file From 2d53cdfcc786970bd50023c6f7e48ebe5940ff27 Mon Sep 17 00:00:00 2001 From: Mariam <112161752+mermalade0325@users.noreply.github.com> Date: Fri, 1 Nov 2024 16:37:49 +1000 Subject: [PATCH 11/14] Update train.py Clarified and improved commenting for YOLO training --- train.py | 131 +++++++++++++++++++++++++++---------------------------- 1 file changed, 65 insertions(+), 66 deletions(-) diff --git a/train.py b/train.py index fc681c4d3..5835e4b16 100644 --- a/train.py +++ b/train.py @@ -1,114 +1,113 @@ import torch import torch.nn as nn -import torch.nn.functional as F +import torch.nn.functional as F import time - -from dataset import * -from modules import * - +from dataset import ISICDataset +from modules import YOLO, YOLO_loss, filter_boxes, single_iou +from torch.utils.data import DataLoader # Device configuration device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') -if torch.cuda.is_available(): - print("cuda") -if not torch.cuda.is_available(): - print("cpu") - -#hyperparameters +print("Using device:", "cuda" if torch.cuda.is_available() else "cpu") + +# Hyperparameters epochs = 10 -learning_rate=0.001 +learning_rate = 0.001 image_size = 416 batch_size = 10 -#Train data - change directories as needed +# Training Data Paths - Adjust directories as needed mask_dir = '/Users/mariam/Downloads/COMP3710_YOLO/ISIC-2017_Training_Part1_GroundTruth/' image_dir = '/Users/mariam/Downloads/COMP3710_YOLO/ISIC-2017_Training_Data/' labels = '/Users/mariam/Downloads/COMP3710_YOLO/ISIC-2017_Training_Part3_GroundTruth.csv' + +# Loading Training Dataset and DataLoader train_dataset = ISICDataset(image_dir, mask_dir, labels, image_size) train_dataloader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True) -#Model -model = YOLO(2) +# Model Initialization +model = YOLO(num_classes=2) model.to(device) checkpoint_path = 
"model.pt" -#optimizer and loss -optimizer = torch.optim.Adam(model.parameters(), learning_rate) +# Optimizer and Loss Function +optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate) criterion = YOLO_loss() -#learning rate schedule, using because SGD is dumb, adam has its own learning rate +# Learning Rate Scheduler (OneCycleLR) - Adjusts learning rate dynamically total_step = len(train_dataloader) -scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer,max_lr=learning_rate, +scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer, max_lr=learning_rate, steps_per_epoch=total_step, epochs=epochs) -#Train +# Training Loop +print("Starting training...") model.train() -start = time.time() +start_time = time.time() + for epoch in range(epochs): for i, (images, labels) in enumerate(train_dataloader): - images = images.to(device) - labels = labels.to(device) - - #Forward pass - outputs = model(images) - total_loss = 0 - for a in range(batch_size): - loss = criterion(outputs[a], labels[a]) - total_loss += loss - - #Backwards and optimize - optimizer.zero_grad() - total_loss.requires_grad = True - total_loss.backward() - optimizer.step() - - if (i+1) % 50 == 0: - print("Epoch [{}/{}], Step[{},{}] Loss: {:.5f}".format(epoch+1, epochs, i+1, total_step, total_loss.item())) + images, labels = images.to(device), labels.to(device) # Move data to device + + # Forward pass + outputs = model(images) # Get model predictions + total_loss = sum(criterion(outputs[a], labels[a]) for a in range(batch_size)) # Calculate batch loss + + # Backward pass and optimization + optimizer.zero_grad() # Clear gradients + total_loss.backward() # Backpropagation + optimizer.step() # Update model parameters + scheduler.step() # Adjust learning rate + + # Log training progress every 50 steps + if (i + 1) % 50 == 0: + print(f"Epoch [{epoch + 1}/{epochs}], Step [{i + 1}/{total_step}], Loss: {total_loss.item():.5f}") + # Save checkpoint torch.save({ 'epoch': epoch, 'model_state_dict': model.state_dict(), 'optimizer_state_dict': optimizer.state_dict(), 'loss': total_loss, - }, checkpoint_path) + }, checkpoint_path) - scheduler.step() -end = time.time() -elapsed = end - start -print("Training took {} secs or {} mins.".format(elapsed, elapsed/60)) +elapsed_time = time.time() - start_time +print(f"Training completed in {elapsed_time:.2f} seconds or {elapsed_time / 60:.2f} minutes.") -#Test data +# Test Data Paths - Update directories as needed mask_dir = '/Users/mariam/Downloads/COMP3710_YOLO/ISIC-2017_Test_v2_Part1_GroundTruth/' image_dir = '/Users/mariam/Downloads/COMP3710_YOLO/ISIC-2017_Test_v2_Data/' labels = '/Users/mariam/Downloads/COMP3710_YOLO/ISIC-2017_Test_v2_Part3_GroundTruth.csv' -test_dataset = ISICDataset(image_dir, mask_dir, labels, 416) -test_dataloader = DataLoader(test_dataset, batch_size, shuffle=True) -#Test +# Loading Test Dataset and DataLoader +test_dataset = ISICDataset(image_dir, mask_dir, labels, image_size) +test_dataloader = DataLoader(test_dataset, batch_size=batch_size, shuffle=True) + +# Testing Loop +print("Starting testing...") model.eval() -torch.set_grad_enabled(True) -start = time.time() -total = 0 +torch.set_grad_enabled(False) # Disable gradients for testing +start_time = time.time() +total_iou = 0 # Cumulative IoU for averaging total_step = len(test_dataloader) for i, (images, labels) in enumerate(test_dataloader): - images = images.to(device) - labels = labels.to(device) - outputs = model(images) + images, labels = images.to(device), labels.to(device) # Move data to device + 
outputs = model(images) # Get model predictions - #Calculate IoU + # Calculate IoU for each batch for a in range(batch_size): - best_box = filter_boxes(outputs[a]) + best_box = filter_boxes(outputs[a]) # Select box with highest confidence if best_box is not None: - best_box = torch.reshape(best_box, (1, 7)) - iou = single_iou(best_box, labels[a,:]) - total += iou[0] + best_box = torch.reshape(best_box, (1, 7)) # Reshape to required format + iou = single_iou(best_box, labels[a, :]) # Calculate IoU between prediction and ground truth + total_iou += iou[0] # Accumulate IoU - #Keep track of average - average = total/(i+1) + # Calculate average IoU for progress monitoring + average_iou = total_iou / (i + 1) - if (i+1) % 50 == 0: - print("Step[{},{}] IoU average: {:.5f}".format(i+1, total_step, average)) + # Log testing progress every 50 steps + if (i + 1) % 50 == 0: + print(f"Step [{i + 1}/{total_step}], IoU Average: {average_iou:.5f}") -end = time.time() -elapsed = end - start -print("Testing took {} secs or {} mins.".format(elapsed, elapsed/60)) \ No newline at end of file +# Calculate total time for testing +elapsed_time = time.time() - start_time +print(f"Testing completed in {elapsed_time:.2f} seconds or {elapsed_time / 60:.2f} minutes.") From 371db657a056ff78ba346d6673f584fb8c85125d Mon Sep 17 00:00:00 2001 From: Mariam <112161752+mermalade0325@users.noreply.github.com> Date: Fri, 1 Nov 2024 16:58:20 +1000 Subject: [PATCH 12/14] Update README.md Adding new output images --- README.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/README.md b/README.md index cc8dbb8ba..079ff60ba 100644 --- a/README.md +++ b/README.md @@ -12,6 +12,7 @@ Melanoma is one of the most aggressive forms of skin cancer, and early detection YOLO11 is a single-stage object detection model that processes the entire image in a single forward pass, predicting bounding boxes and classification scores simultaneously. It divides the input image into a grid, with each grid cell responsible for detecting an object within its bounds. Using anchor boxes, the model generates bounding box coordinates and confidence scores, optimized for melanoma detection by training on a labeled dataset of dermoscopic images. The final model can localize and classify skin lesions as either melanoma or benign in real time. + ## Dependencies To run this project, the following dependencies are required: @@ -94,6 +95,11 @@ In the provided dataset folder structure, each lesion type is represented by hig ### Output The model outputs bounding boxes and classification labels. +Screen Shot 2024-11-01 at 16 57 28 + + +Screen Shot 2024-11-01 at 16 57 50 + ## Results Visualization From 4ce3242832431e1131a0abeb510488cefd96ef9a Mon Sep 17 00:00:00 2001 From: Mariam <112161752+mermalade0325@users.noreply.github.com> Date: Tue, 12 Nov 2024 10:06:50 +1000 Subject: [PATCH 13/14] Linking updated and fixed pull request --- README.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/README.md b/README.md index 079ff60ba..8712ee6fd 100644 --- a/README.md +++ b/README.md @@ -1,3 +1,10 @@ + +# Please see Updated Pull Request for Implemented Feedback! 
+https://github.com/shakes76/PatternAnalysis-2024/pull/193#issue-2650661510 + + + + # Melanoma Detection using YOLO11 ## Overview From 6682fcf41561a1faa067be553c5092ef13c488e7 Mon Sep 17 00:00:00 2001 From: Mariam <112161752+mermalade0325@users.noreply.github.com> Date: Tue, 12 Nov 2024 10:07:51 +1000 Subject: [PATCH 14/14] Redirecting to topic recognition branch as per tutor feedback --- README.md | 129 +----------------------------------------------------- 1 file changed, 1 insertion(+), 128 deletions(-) diff --git a/README.md b/README.md index 8712ee6fd..fd0a26ba0 100644 --- a/README.md +++ b/README.md @@ -1,131 +1,4 @@ -# Please see Updated Pull Request for Implemented Feedback! +# Please see Updated Pull Request for Implemented Feedback! All details under Topic_Recognition https://github.com/shakes76/PatternAnalysis-2024/pull/193#issue-2650661510 - - - -# Melanoma Detection using YOLO11 - -## Overview - -Melanoma is one of the most aggressive forms of skin cancer, and early detection can significantly increase survival rates. This project leverages the YOLO11 (You Only Look Once) deep learning algorithm by Ultralytics to automatically detect melanoma in dermoscopic images, distinguishing it from other skin conditions like benign lesions and nevus. YOLO11 is a state-of-the-art object detection model. - -image - -*Figure: Sample output of YOLO11 detecting a lesion in a dermoscopic image* - -## How it Works - -YOLO11 is a single-stage object detection model that processes the entire image in a single forward pass, predicting bounding boxes and classification scores simultaneously. It divides the input image into a grid, with each grid cell responsible for detecting an object within its bounds. Using anchor boxes, the model generates bounding box coordinates and confidence scores, optimized for melanoma detection by training on a labeled dataset of dermoscopic images. The final model can localize and classify skin lesions as either melanoma or benign in real time. - - -## Dependencies - -To run this project, the following dependencies are required: - -- **Python**: 3.10 -- **Ultralytics**: 8.3.2 (includes YOLO11) -- **PyTorch**: 2.4.1+cu121 -- **OpenCV**: 4.5.3 -- **Matplotlib**: 3.4.2 - -Ensure you install the dependencies via: -```bash -pip install ultralytics opencv-python-headless matplotlib -``` - -To reproduce the results, a GPU with CUDA support is recommended. The model was trained on an NVIDIA Tesla T4 GPU for optimal performance. - -## Dataset Preparation and Pre-Processing - -### Dataset - -The model was trained on the ISIC (International Skin Imaging Collaboration) dataset, a comprehensive collection of dermoscopic images labeled for melanoma and benign conditions. The dataset was divided as follows: - -- **Training Set**: 80% of the data -- **Validation Set**: 10% of the data -- **Testing Set**: 10% of the data - -This split ensures the model has a sufficient amount of data for learning while keeping a balanced validation and testing set for evaluating performance. - -### Pre-Processing - -Pre-Processing -The preprocessing pipeline prepares the melanoma dataset for efficient and consistent model training. First, a metadata CSV file is generated for each dataset split (train, validation, and test). This metadata file serves as an index, listing each image path along with its corresponding class label (nevus, seborrheic keratosis, or melanoma). Labels are mapped to integers, with benign classes (nevus and seborrheic keratosis) labeled as 0 and malignant (melanoma) as 1. 
This structure allows for efficient data loading and simplifies referencing images during training. See below. - -image - -Each image is then processed by: -Decoding from JPEG format and resizing to a standardized size of 299x299 pixels, ensuring consistency in model input dimensions. -Normalization, where pixel values are scaled to the [0,1] range for optimized training. -Caching the dataset to reduce I/O bottlenecks, and shuffling the training data with a buffer size of 1000 to ensure varied batches. -Batching and Prefetching: Images are batched into sets of 64, and prefetch is used to load data in the background, preventing delays and ensuring data availability during model training. - -For more details on the dataset and augmentation methods, refer to the [ISIC Archive](https://www.isic-archive.com/). - -## Training the Model - -To train the YOLO11 model, we use transfer learning from a pre-trained checkpoint, fine-tuning it on the melanoma dataset for 50 epochs. The training configuration is specified in the `melanoma.yaml` file, where the dataset paths and class names are defined. - -In the training set, these images are associated with various labels. - - -### Example Training Command - -```python -from ultralytics import YOLO - -# Load a pre-trained YOLO11 model -model = YOLO('yolo11n.pt') - -# Train the model -model.train(data='melanoma.yaml', epochs=50, imgsz=640) -``` - -The model’s performance is evaluated using mean Average Precision (mAP), precision, and recall metrics on the validation set. - -## Example Inputs and Outputs - -### Input -The dataset used for melanoma detection consists of dermoscopic images from the ISIC archive. The image dataset includes three main types of lesions: nevus, seborrheic keratosis, and melanoma. Each lesion type is stored in separate folders, and each image has an associated label to identify the type of lesion. The dataset follows the structure required for machine learning tasks, ensuring that each image file name is unique and follows a standardized naming convention (e.g., ISIC_0000000.jpg). - -Screen Shot 2024-11-01 at 15 57 33 - -In the provided dataset folder structure, each lesion type is represented by high-resolution .jpg images. Additionally, there are auxiliary files with names ending in _superpixels.png or _perpixels.png, which appear to contain data that may be used for other types of analysis, such as texture segmentation or pixel intensity mapping. However, for the purpose of training a melanoma detection model, only the main dermoscopic images in .jpg format are used. - - -image - - - -### Output -The model outputs bounding boxes and classification labels. - -Screen Shot 2024-11-01 at 16 57 28 - - -Screen Shot 2024-11-01 at 16 57 50 - - -## Results Visualization - -After training, the model can detect melanoma with high accuracy. - -image - -*Figure: Training and validation loss over epochs. This was from an earlier test, eventually, 31 epochs were chosen* - - -## Conclusion - -This project demonstrates the power of YOLO11 for real-time melanoma detection in dermoscopic images. With proper training and pre-processing, YOLO11 achieves high accuracy, making it a valuable tool for early skin cancer diagnosis. - -## References - -- ISIC Archive: [ISIC 2018: Skin Lesion Analysis Towards Melanoma Detection](https://www.isic-archive.com/) -- Ultralytics YOLO Documentation: [YOLO Docs](https://docs.ultralytics.com/) - ---- - -This README provides comprehensive guidance on setup, training, and usage of YOLO11 for melanoma detection. 
Adjust paths and parameters as necessary for optimal performance on your dataset.