shakes76 · DaveD123321 · Nov 14, 2022
diff --git a/recognition/s4627234_3710project/README.MD b/recognition/s4627234_3710project/README.MD
@@ -0,0 +1,159 @@
+
+# Using a U-Net to segment the ISIC dataset
+
+
+## Author
+Name: Jingming Dai 
+
+Student number: s4627234 / 46272346
+
+This project was completed for COMP3710
+
+
+
+## Description
+This project uses the Improved UNet to segment the ISIC dataset with a minimum Dice similarity coefficient of 0.8 for all labels on the test set. 
+
+In image segmentation tasks, especially medical image segmentation, U-Net is one of the most successful methods. U-Net connects an encoder (down-sampling) path to a decoder (up-sampling) path. This project trains the model with the Dice loss. Compared with the standard U-Net, the improved model gives better segmentation and a higher Dice similarity coefficient.
+
+
+## Data set description:
+The ISIC package includes six folders: the Training, Test and Validation images, and a Ground Truth folder for each of them. After downloading the files, put them into a newly created data folder. The directory must be arranged as follows.
+
+* data
+    * data_ISIC
+        * ISIC-2017_Training_Data
+        * ISIC-2017_Training_Part1_GroundTruth
+        * ISIC-2017_Test_v2_Data
+        * ISIC-2017_Test_v2_Part1_GroundTruth
+        * ISIC-2017_Validation_Data
+        * ISIC-2017_Validation_Part1_GroundTruth
+
+![image](./images/data_image_example.png)
+
+
+## How it works:
+
+### UNet:
+The structure of U-Net is shown in the following figure. The left part can be regarded as an encoder (down), and the right part as a decoder (up).
+
+![image](./images/UNet.png)
+
+The encoder has four sub-modules. Each sub-module contains two convolution layers and is followed by a down-sampling layer implemented by MaxPool2D. 
+
+The decoder consists of four sub-modules, and the resolution is increased step by step by up-sampling until it matches the resolution of the input image. Each decoder sub-module uses two 3x3 un-padded convolutions. After the four up-sampling steps, a final 1x1 convolution with a sigmoid activation function is applied.
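The four down-sampling and four up-sampling steps described above can be traced with a few lines of Python. This is only a sketch: the function name `trace_resolutions` is hypothetical, and it assumes padded ("same") convolutions so that only pooling and up-sampling change the resolution. With un-padded 3x3 convolutions, each convolution would additionally trim two pixels per dimension.

```python
def trace_resolutions(input_size=256, depth=4):
    """Return the spatial sizes seen along the encoder and the decoder.

    Assumes padded ('same') convolutions, so only the 2x2 max-pooling and
    the 2x up-sampling steps change the resolution.
    """
    encoder = [input_size]
    for _ in range(depth):
        encoder.append(encoder[-1] // 2)  # each MaxPool2D halves the size

    decoder = [encoder[-1]]
    for _ in range(depth):
        decoder.append(decoder[-1] * 2)  # each up-sampling step doubles it
    return encoder, decoder


encoder_sizes, decoder_sizes = trace_resolutions()
print(encoder_sizes)  # [256, 128, 64, 32, 16]
print(decoder_sizes)  # [16, 32, 64, 128, 256]
```

For the 256x256 images used in this project, the decoder ends at the same resolution as the input, which is what lets the final 1x1 convolution produce a mask of the same size as the image.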
+
+
+### Improved_UNet : 
+The algorithm is a modified version of UNet proposed by Isensee and colleagues; the improved UNet architecture is shown below. The UNet() and UNet_imp() functions in modules.py build the original and the improved UNet model respectively. The improved model follows the architecture below, except that the final "softmax" is replaced by a "sigmoid" so that the output is a one-channel mask.
+
+![image](./images/model_imp.png)
+
+#### Important parts of improved UNet:
+
+All the convolution, context, element-wise-sum in encoding are integrated into the up_imp() function.
+
+Part of upsampling, concatenate, and localization in decoding are integrated into the down_imp() function.
+
+__Context module:__
+Built from 2 convolution layers (all except the first one with stride 2) with a dropout layer (rate 0.3) in between.
+
+__Upsampling module:__
+It consists of an upsampling layer and a convolution layer.
+
+__Localization module:__
+Completed by a 3x3 convolution and a 1x1 convolution.
+
+Instance normalization and Leaky ReLU are used throughout the architecture. The model is compiled with the Dice coefficient loss as the loss function and the Dice coefficient as the metric.
+
+
+### Dice Similarity Coefficient (DSC):
+The Dice similarity coefficient is a set similarity measure, usually used to calculate the similarity of two samples. In this model it is used as the basis of the loss function and to evaluate the predicted segmentation masks.
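For two sets X and Y the coefficient is 2|X ∩ Y| / (|X| + |Y|), which is 1 for identical sets and 0 for disjoint ones. The NumPy sketch below illustrates this for masks; it is not the implementation used in this project (the model code is not part of this excerpt), and the names `dice_coefficient`, `dice_loss` and the `smooth` term that avoids division by zero are assumptions made for the example.

```python
import numpy as np


def dice_coefficient(y_true, y_pred, smooth=1e-6):
    """Dice similarity coefficient between two masks with values in [0, 1]."""
    y_true = np.asarray(y_true, dtype=np.float64).ravel()
    y_pred = np.asarray(y_pred, dtype=np.float64).ravel()
    intersection = np.sum(y_true * y_pred)
    return (2.0 * intersection + smooth) / (np.sum(y_true) + np.sum(y_pred) + smooth)


def dice_loss(y_true, y_pred):
    """Loss that decreases as the overlap between the two masks increases."""
    return 1.0 - dice_coefficient(y_true, y_pred)


truth = np.array([[1, 1, 0, 0]])
pred = np.array([[1, 0, 0, 0]])
print(round(dice_coefficient(truth, pred), 3))  # 0.667
```

In the example, one pixel overlaps and the masks contain two and one foreground pixels, so the coefficient is 2 * 1 / (2 + 1).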
+
+
+## Install: 
+``` 
+git clone https://github.com/shakes76/PatternFlow.git 
+```
+
+
+## Using: 
+!!! Before running, you must set path_data in dataset.py to the path of the data folder.
+
+When using the data for the first time, call dataset.load_dataset(data_reshape = True). This resizes all images, which come in non-uniform sizes, to 256x256 and saves them in a new folder "data_Reshaped" inside the "data" folder. The resulting layout is as follows:
+
+* data_Reshaped
+    * Train
+    * Train_GT
+    * Test
+    * Test_GT
+    * Val
+    * Val_GT
+
+!!! Always remember to use (data_reshape = False) if reshaped data already exists
+
+__Run train.py to train and save the model (the model is saved in the current folder):__
+(The exact command may need to be changed for your environment)
+```
+python3.9 train.py
+```
+
+__Run predict.py to load the model and predict masks for the validation data:__
+(The exact command may need to be changed for your environment)
+```
+python3.9 predict.py
+```
+
+## Testing and Conclusion: 
+The train and test data are used for training and testing the model.
+
+Running for 10 epochs gives the results below (this is a very low number of epochs;
+increase it in train.py for a better result):
+
+### model DSC:
+
+![image](./images/DSC.png)
+
+
+### model loss:
+
+![image](./images/loss.png)
+
+
+Evaluating the model on the validation dataset gives the following DSC:
+tf.Tensor(0.83824617, shape=(), dtype=float32)
+
+The images folder contains the original image, predicted mask and ground truth comparison for the first 50 images. You can change which images are output by changing number_list in predict.py.
+
+### Some example images:
+![image](./images/output0.png)
+![image](./images/output5.png)
+![image](./images/output10.png)
+![image](./images/output13.png)
+![image](./images/output37.png)
+
+
+
+## Packages:
+os, cv2, skimage, tensorflow-macos_version_2.9.2
+
+SimpleITK_version_2.2.0
+-- SimpleITK is a simplified interface to the Insight Toolkit (ITK) for image registration and segmentation
+(http://simpleitk.org/)
+
+numpy_version_1.23.1
+-- NumPy is the fundamental package for array computing with Python.
+(https://www.numpy.org)
+
+pandas_version_1.4.4
+-- Powerful data structures for data analysis, time series, and statistics
+(https://pandas.pydata.org)
+
+matplotlib_version_3.5.2
+-- Python plotting package
+(https://matplotlib.org)
+
+
+## Reference
+
+Isensee, F., Kickingereder, P., Wick, W., Bendszus, M., & Maier-Hein, K. H. (2018, February 28). Brain tumor segmentation and radiomics survival prediction: Contribution to the BRATS 2017 challenge. arXiv.org. Retrieved October 19, 2022, from https://arxiv.org/abs/1802.10508v1 
diff --git a/recognition/s4627234_3710project/commit-history1.png b/recognition/s4627234_3710project/commit-history1.png
diff --git a/recognition/s4627234_3710project/commit-history2.png b/recognition/s4627234_3710project/commit-history2.png
diff --git a/recognition/s4627234_3710project/commit_history3.png b/recognition/s4627234_3710project/commit_history3.png
diff --git a/recognition/s4627234_3710project/dataset.py b/recognition/s4627234_3710project/dataset.py
@@ -0,0 +1,168 @@
+
+import os
+import cv2 as cv
+import SimpleITK as sitk
+import numpy as np
+import pandas as pd
+import matplotlib
+import matplotlib.pyplot as plt
+
+import skimage
+from skimage import io
+
+# ISIC data format, using the 2017 ISIC data
+
+# The original ISIC images come in many different sizes, so we
+# need to resize them to a common size for ML.
+# If data_reshape is True, the program creates a new data folder in
+# the given directory and resizes all the ISIC images to the given size
+# (the .csv files stay in their original location and are not changed).
+
+"""
+    Change the path to the path of the folder that contains the ISIC folder, 
+    e.g. if the folder path is ./data/data_ISIC, use path_data = "./data"
+"""
+path_data = "/Users/davedai/Desktop/MySolution/data"
+
+def create_data(data_from, data_images, data_to, img_size = 256):
+    """ Resize the given images and save them as .png files in the given directory.
+        The new images have the given image size.
+
+    Args:
+        data_from (String): source directory
+        data_images (list): list of image file names
+        data_to (String): destination directory (with trailing separator)
+        img_size (int, optional): width and height of the resized images. Defaults to 256.
+    """
+    for i in data_images:     
+        img=sitk.ReadImage(os.path.join(data_from,i))
+        img_array=sitk.GetArrayFromImage(img)
+        new_array=cv.resize(img_array,(img_size,img_size))
+        data_name = i[:-4] # removing last four (.jpg/.png/...)
+
+        io.imsave(data_to + data_name + '.png', new_array)
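The output name above relies on a four-character extension (`i[:-4]`) and on `data_to` ending with a path separator. A slightly more defensive variant, sketched here with only the standard library, could build the path with `os.path.splitext` and `os.path.join`. The helper name `output_png_path` and the example file name are hypothetical and only serve as an illustration.

```python
import os


def output_png_path(data_to, filename):
    """Build the .png output path for an input image file name."""
    # splitext drops the extension whatever its length ('.jpg', '.jpeg', ...)
    stem, _ext = os.path.splitext(filename)
    # join inserts a separator only if data_to does not already end with one
    return os.path.join(data_to, stem + '.png')


print(output_png_path('data_Reshaped/Train', 'ISIC_0000000.jpg'))
```

With such a helper the save call would become `io.imsave(output_png_path(data_to, i), new_array)`, and the destination folders could be passed with or without a trailing slash.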
+
+
+def reshape_data():
+    """ 
+        Resize all images to the given size and save them in the new folder
+    """
+    # training image path
+    train_path = path_data + '/data_ISIC/ISIC-2017_Training_Data/'
+    train = [fn for fn in os.listdir(train_path) if fn.endswith('jpg')]
+    train.sort()
+
+    # training ground truth path
+    train_path_gt = path_data + '/data_ISIC/ISIC-2017_Training_Part1_GroundTruth/'
+    train_gt = [fn for fn in os.listdir(train_path_gt) if fn.endswith('png')]
+    train_gt.sort()
+
+    # test image path
+    test_path = path_data + '/data_ISIC/ISIC-2017_Test_v2_Data'
+    test = [fn for fn in os.listdir(test_path) if fn.endswith('jpg')]
+    test.sort()
+
+    # test ground truth images
+    test_path_gt = path_data + '/data_ISIC/ISIC-2017_Test_v2_Part1_GroundTruth'
+    test_gt = [fn for fn in os.listdir(test_path_gt) if fn.endswith('png')]
+    test_gt.sort()
+
+    # validation image path
+    val_path = path_data + '/data_ISIC/ISIC-2017_Validation_Data'
+    val = [fn for fn in os.listdir(val_path) if fn.endswith('jpg')]
+    val.sort()
+
+    # validation ground truth path
+    val_path_gt = path_data + '/data_ISIC/ISIC-2017_Validation_Part1_GroundTruth'
+    val_gt = [fn for fn in os.listdir(val_path_gt) if fn.endswith('png')]
+    val_gt.sort()
+
+    if not os.path.exists(path_data + '/data_Reshaped'):
+        os.mkdir(path_data + '/data_Reshaped/')
+        os.mkdir(path_data + '/data_Reshaped/Train')
+        os.mkdir(path_data + '/data_Reshaped/Train_GT')
+        os.mkdir(path_data + '/data_Reshaped/Test')
+        os.mkdir(path_data + '/data_Reshaped/Test_GT')
+        os.mkdir(path_data + '/data_Reshaped/Val')
+        os.mkdir(path_data + '/data_Reshaped/Val_GT')
+
+    create_data(train_path, train, (path_data + '/data_Reshaped/Train/'))
+    create_data(train_path_gt, train_gt, (path_data + '/data_Reshaped/Train_GT/'))
+    create_data(test_path, test, (path_data + '/data_Reshaped/Test/'))
+    create_data(test_path_gt, test_gt, (path_data + '/data_Reshaped/Test_GT/'))
+    create_data(val_path, val, (path_data + '/data_Reshaped/Val/'))
+    create_data(val_path_gt, val_gt, (path_data + '/data_Reshaped/Val_GT/'))
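The existence check above only creates the sub-folders when `data_Reshaped` itself is missing, so a partially created folder tree would not be completed. One possible alternative, shown as a sketch rather than as the project's code, is `os.makedirs` with `exist_ok=True`. The helper name `make_reshaped_folders` and the `SUBFOLDERS` constant are hypothetical.

```python
import os
import tempfile

SUBFOLDERS = ('Train', 'Train_GT', 'Test', 'Test_GT', 'Val', 'Val_GT')


def make_reshaped_folders(path_data):
    """Create data_Reshaped and its six sub-folders if they are missing."""
    root = os.path.join(path_data, 'data_Reshaped')
    for name in SUBFOLDERS:
        # exist_ok=True makes the call a no-op for folders that already exist
        os.makedirs(os.path.join(root, name), exist_ok=True)
    return root


# Demonstration in a temporary directory so no real data folder is touched
with tempfile.TemporaryDirectory() as demo_dir:
    root = make_reshaped_folders(demo_dir)
    make_reshaped_folders(demo_dir)  # calling it again does not raise
    created = sorted(os.listdir(root))

print(created)  # ['Test', 'Test_GT', 'Train', 'Train_GT', 'Val', 'Val_GT']
```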
+
+
+def load_data(csv, path_image, path_image_gt):
+    """ Load the images and their masks from the given paths in the order given by the csv file.
+        The csv file is only used to get the names of the images.
+
+    Args:
+        csv (pandas.core.frame.DataFrame): csv file read with pandas
+        path_image (String): image path 
+        path_image_gt (String): image ground truth path
+
+    Returns:
+        numpy.ndarray, numpy.ndarray: the images and their masks
+    """
+    x, y = [], []
+    for _, i in csv.iterrows():
+        image = sitk.ReadImage(path_image + i.iloc[0]+'.png')  # first csv column holds the image name
+        image_array_ = sitk.GetArrayFromImage(image)
+        image_array = image_array_/255.0  # scale pixel values to [0, 1]
+        x.append(image_array)
+
+        mask_ = cv.imread(path_image_gt + i.iloc[0]+'_segmentation.png')
+        mask = mask_/255.0  # scale mask values to [0, 1]
+        y.append(mask)
+
+    return np.array(x), np.array(y)
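`cv.imread` reads an image as three channels by default, so each mask returned above has shape (height, width, 3), and masks resized with interpolation may contain values strictly between 0 and 1. If a binary one-channel mask is wanted, a conversion along the following lines could be applied. This is a sketch under stated assumptions: the helper name `to_binary_mask` and the 0.5 threshold are hypothetical, and the three channels are assumed to be identical copies of a grayscale mask.

```python
import numpy as np


def to_binary_mask(mask, threshold=0.5):
    """Collapse a (H, W, 3) mask in [0, 1] to a binary (H, W, 1) mask."""
    mask = np.asarray(mask, dtype=np.float32)
    if mask.ndim == 3:
        mask = mask[..., 0]  # assumes all channels hold the same grayscale mask
    binary = (mask > threshold).astype(np.float32)
    return binary[..., np.newaxis]  # add the channel axis back


# A 2x2 example mask replicated over three channels
example = np.stack([np.array([[0.0, 0.2], [0.7, 1.0]])] * 3, axis=-1)
result = to_binary_mask(example)
print(result.shape)  # (2, 2, 1)
```

Whether this step is needed depends on the output shape of the model and on how the loss is computed, neither of which is shown in this file.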
+
+
+def load_dataset(data_reshape = False):
+    """ Load the dataset. If the data needs to be reshaped (data_reshape = True), reshape it first.
+
+    Args:
+        data_reshape (bool, optional): reshape the data if True. Defaults to False.
+
+    Returns:
+        numpy.ndarray, numpy.ndarray, numpy.ndarray, numpy.ndarray:
+            the images and their masks for the training and testing data 
+    """
+    if data_reshape: 
+        reshape_data()
+
+    train_csv = pd.read_csv(path_data + '/data_ISIC/ISIC-2017_Training_Data/ISIC-2017_Training_Data_metadata.csv')
+    test_csv = pd.read_csv(path_data + '/data_ISIC/ISIC-2017_Test_v2_Data/ISIC-2017_Test_v2_Data_metadata.csv')
+
+    path_train = path_data + '/data_Reshaped/Train/'
+    path_train_gt = path_data + '/data_Reshaped/Train_GT/'
+
+    path_test = path_data + '/data_Reshaped/Test/'
+    path_test_gt = path_data + '/data_Reshaped/Test_GT/'
+
+    train_x, train_y = load_data(train_csv, path_train, path_train_gt)
+    test_x, test_y = load_data(test_csv, path_test, path_test_gt)
+
+    return train_x, train_y, test_x, test_y
+
+
+
+
+def load_val():
+    """ Load the validation dataset
+
+    Returns:
+        numpy.ndarray, numpy.ndarray: the images and their masks for the validation data 
+    """
+    val_csv = pd.read_csv(path_data + '/data_ISIC/ISIC-2017_Validation_Data/ISIC-2017_Validation_Data_metadata.csv')
+
+
+    path_val = path_data + '/data_Reshaped/Val/'
+    path_val_gt = path_data + '/data_Reshaped/Val_GT/'
+
+    val_x, val_y = load_data(val_csv, path_val, path_val_gt)
+
+    return val_x, val_y
+
diff --git a/recognition/s4627234_3710project/images/DSC.png b/recognition/s4627234_3710project/images/DSC.png
diff --git a/recognition/s4627234_3710project/images/UNet.png b/recognition/s4627234_3710project/images/UNet.png
diff --git a/recognition/s4627234_3710project/images/data_image_example.png b/recognition/s4627234_3710project/images/data_image_example.png
diff --git a/recognition/s4627234_3710project/images/loss.png b/recognition/s4627234_3710project/images/loss.png
diff --git a/recognition/s4627234_3710project/images/model256.png b/recognition/s4627234_3710project/images/model256.png
diff --git a/recognition/s4627234_3710project/images/model_imp.png b/recognition/s4627234_3710project/images/model_imp.png
diff --git a/recognition/s4627234_3710project/images/output0.png b/recognition/s4627234_3710project/images/output0.png
diff --git a/recognition/s4627234_3710project/images/output1.png b/recognition/s4627234_3710project/images/output1.png
diff --git a/recognition/s4627234_3710project/images/output10.png b/recognition/s4627234_3710project/images/output10.png
diff --git a/recognition/s4627234_3710project/images/output11.png b/recognition/s4627234_3710project/images/output11.png
diff --git a/recognition/s4627234_3710project/images/output12.png b/recognition/s4627234_3710project/images/output12.png
diff --git a/recognition/s4627234_3710project/images/output13.png b/recognition/s4627234_3710project/images/output13.png
diff --git a/recognition/s4627234_3710project/images/output14.png b/recognition/s4627234_3710project/images/output14.png
diff --git a/recognition/s4627234_3710project/images/output15.png b/recognition/s4627234_3710project/images/output15.png
diff --git a/recognition/s4627234_3710project/images/output16.png b/recognition/s4627234_3710project/images/output16.png
diff --git a/recognition/s4627234_3710project/images/output17.png b/recognition/s4627234_3710project/images/output17.png
diff --git a/recognition/s4627234_3710project/images/output18.png b/recognition/s4627234_3710project/images/output18.png
diff --git a/recognition/s4627234_3710project/images/output19.png b/recognition/s4627234_3710project/images/output19.png
diff --git a/recognition/s4627234_3710project/images/output2.png b/recognition/s4627234_3710project/images/output2.png
diff --git a/recognition/s4627234_3710project/images/output20.png b/recognition/s4627234_3710project/images/output20.png
diff --git a/recognition/s4627234_3710project/images/output21.png b/recognition/s4627234_3710project/images/output21.png
diff --git a/recognition/s4627234_3710project/images/output22.png b/recognition/s4627234_3710project/images/output22.png
diff --git a/recognition/s4627234_3710project/images/output23.png b/recognition/s4627234_3710project/images/output23.png
diff --git a/recognition/s4627234_3710project/images/output24.png b/recognition/s4627234_3710project/images/output24.png
diff --git a/recognition/s4627234_3710project/images/output25.png b/recognition/s4627234_3710project/images/output25.png
diff --git a/recognition/s4627234_3710project/images/output26.png b/recognition/s4627234_3710project/images/output26.png
diff --git a/recognition/s4627234_3710project/images/output27.png b/recognition/s4627234_3710project/images/output27.png
diff --git a/recognition/s4627234_3710project/images/output28.png b/recognition/s4627234_3710project/images/output28.png
diff --git a/recognition/s4627234_3710project/images/output29.png b/recognition/s4627234_3710project/images/output29.png
diff --git a/recognition/s4627234_3710project/images/output3.png b/recognition/s4627234_3710project/images/output3.png
diff --git a/recognition/s4627234_3710project/images/output30.png b/recognition/s4627234_3710project/images/output30.png
diff --git a/recognition/s4627234_3710project/images/output31.png b/recognition/s4627234_3710project/images/output31.png
diff --git a/recognition/s4627234_3710project/images/output32.png b/recognition/s4627234_3710project/images/output32.png
diff --git a/recognition/s4627234_3710project/images/output33.png b/recognition/s4627234_3710project/images/output33.png
diff --git a/recognition/s4627234_3710project/images/output34.png b/recognition/s4627234_3710project/images/output34.png
diff --git a/recognition/s4627234_3710project/images/output35.png b/recognition/s4627234_3710project/images/output35.png
diff --git a/recognition/s4627234_3710project/images/output36.png b/recognition/s4627234_3710project/images/output36.png
diff --git a/recognition/s4627234_3710project/images/output37.png b/recognition/s4627234_3710project/images/output37.png
diff --git a/recognition/s4627234_3710project/images/output38.png b/recognition/s4627234_3710project/images/output38.png
diff --git a/recognition/s4627234_3710project/images/output39.png b/recognition/s4627234_3710project/images/output39.png
diff --git a/recognition/s4627234_3710project/images/output4.png b/recognition/s4627234_3710project/images/output4.png
diff --git a/recognition/s4627234_3710project/images/output40.png b/recognition/s4627234_3710project/images/output40.png
diff --git a/recognition/s4627234_3710project/images/output41.png b/recognition/s4627234_3710project/images/output41.png
diff --git a/recognition/s4627234_3710project/images/output42.png b/recognition/s4627234_3710project/images/output42.png
diff --git a/recognition/s4627234_3710project/images/output43.png b/recognition/s4627234_3710project/images/output43.png
diff --git a/recognition/s4627234_3710project/images/output44.png b/recognition/s4627234_3710project/images/output44.png
diff --git a/recognition/s4627234_3710project/images/output45.png b/recognition/s4627234_3710project/images/output45.png
diff --git a/recognition/s4627234_3710project/images/output46.png b/recognition/s4627234_3710project/images/output46.png
diff --git a/recognition/s4627234_3710project/images/output47.png b/recognition/s4627234_3710project/images/output47.png
diff --git a/recognition/s4627234_3710project/images/output48.png b/recognition/s4627234_3710project/images/output48.png
diff --git a/recognition/s4627234_3710project/images/output49.png b/recognition/s4627234_3710project/images/output49.png
diff --git a/recognition/s4627234_3710project/images/output5.png b/recognition/s4627234_3710project/images/output5.png
diff --git a/recognition/s4627234_3710project/images/output6.png b/recognition/s4627234_3710project/images/output6.png
diff --git a/recognition/s4627234_3710project/images/output7.png b/recognition/s4627234_3710project/images/output7.png
diff --git a/recognition/s4627234_3710project/images/output8.png b/recognition/s4627234_3710project/images/output8.png
diff --git a/recognition/s4627234_3710project/images/output9.png b/recognition/s4627234_3710project/images/output9.png