From f51ff283570f3986c97a209ffd3d2ff74d12a378 Mon Sep 17 00:00:00 2001 From: Matt McDonnell <85379302+bushlemon@users.noreply.github.com> Date: Thu, 13 Oct 2022 08:36:46 +1000 Subject: [PATCH 01/16] Created boilerplate files and folders Created file structure. Files currently empty --- .gitignore | 4 +- recognition/ISICs_UNet/README.md | 125 ++++++------------ recognition/XUE4645768/README.md | 46 +++++++ .../s4640439_siamese_network/README.MD | 1 + .../s4640439_siamese_network/dataset.py | 1 + .../s4640439_siamese_network/modules.py | 1 + .../s4640439_siamese_network/predict.py | 1 + recognition/s4640439_siamese_network/train.py | 1 + 8 files changed, 92 insertions(+), 88 deletions(-) create mode 100644 recognition/s4640439_siamese_network/README.MD create mode 100644 recognition/s4640439_siamese_network/dataset.py create mode 100644 recognition/s4640439_siamese_network/modules.py create mode 100644 recognition/s4640439_siamese_network/predict.py create mode 100644 recognition/s4640439_siamese_network/train.py diff --git a/.gitignore b/.gitignore index 92459a9d2f..78d62bebcc 100644 --- a/.gitignore +++ b/.gitignore @@ -129,4 +129,6 @@ dmypy.json .vscode/ # no tracking mypy config file -mypy.ini \ No newline at end of file +mypy.ini +recognition/XUE4645768/README.md +recognition/ISICs_UNet/README.md diff --git a/recognition/ISICs_UNet/README.md b/recognition/ISICs_UNet/README.md index 788ea17b79..f2c009212e 100644 --- a/recognition/ISICs_UNet/README.md +++ b/recognition/ISICs_UNet/README.md @@ -1,101 +1,52 @@ -# Segment the ISICs data set with the U-net +# Segmenting ISICs with U-Net -## Project Overview -This project aim to solve the segmentation of skin lesian (ISIC2018 data set) using the U-net, with all labels having a minimum Dice similarity coefficient of 0.7 on the test set[Task 3]. 
+COMP3710 Report recognition problem 3 (Segmenting ISICs data set with U-Net) solved in TensorFlow -## ISIC2018 -![ISIC example](imgs/example.jpg) +Created by Christopher Bailey (45576430) -Skin Lesion Analysis towards Melanoma Detection +## The problem and algorithm +The problem solved by this program is binary segmentation of the ISICs skin lesion data set. Segmentation is a way to label pixels in an image according to some grouping, in this case lesion or non-lesion. This translates images of skin to masks representing areas of concern for skin lesions. -Task found in https://challenge2018.isic-archive.com/ +U-Net is a form of autoencoder where the downsampling path is expected to learn the features of the image and the upsampling path learns how to recreate the masks. Long skip connections between downpooling and upsampling layers are utilised to overcome the bottleneck in traditional autoencoders allowing feature representations to be recreated. +## How it works +A four layer padded U-Net is used, preserving skin features and mask resolution. The implementation utilises Adam as the optimizer and implements Dice distance as the loss function as this appeared to give quicker convergence than other methods (eg. binary cross-entropy). -## U-net -![UNet](imgs/uent.png) +The utilised metric is a Dice coefficient implementation. My initial implementation appeared faulty and was replaced with a 3rd party implementation which appears correct. 3 epochs was observed to be generally sufficient to observe Dice coefficients of 0.8+ on test datasets but occasional non-convergence was observed and could be curbed by increasing the number of epochs. Visualisation of predictions is also implemented and shows reasonable correspondence. Orange bandaids represent an interesting challenge for the implementation as presented. -U-net is one of the popular image segmentation architectures used mostly in biomedical purposes. 
The name UNet is because it’s architecture contains a compressive path and an expansive path which can be viewed as a U shape. This architecture is built in such a way that it could generate better results even for a less number of training data sets. +### Training, validation and testing split +Training, validation and testing uses a respective 60:20:20 split, a commonly assumed starting point suggested by course staff. U-Net in particular was developed to work "with very few training images" (Ronneberger et al, 2015) The input data for this problem consists of 2594 images and masks. This split appears to provide satisfactory results. -## Data Set Structure +## Using the model +### Dependencies required +* Python3 (tested with 3.8) +* TensorFlow 2.x (tested with 2.3) +* glob (used to load filenames) +* matplotlib (used for visualisations, tested with 3.3) -data set folder need to be stored in same directory with structure same as below -```bash -ISIC2018 - |_ ISIC2018_Task1-2_Training_Input_x2 - |_ ISIC_0000000 - |_ ISIC_0000001 - |_ ... - |_ ISIC2018_Task1_Training_GroundTruth_x2 - |_ ISIC_0000000_segmentation - |_ ISIC_0000001_segmentation - |_ ... -``` +### Parameter tuning +The model was developed on a GTX 1660 TI (6GB VRAM) and certain values (notably batch size and image resolution) were set lower than might otherwise be ideal on more capable hardware. This is commented in the relevant code. -## Dice Coefficient +### Running the model +The model is executed via the main.py script. -The Sørensen–Dice coefficient is a statistic used to gauge the similarity of two samples. 
+### Example output +Given a batch size of 1 and 3 epochs the following output was observed on a single run: +Era | Loss | Dice coefficient +--- | ---- | ---------------- +Epoch 1 | 0.7433 | 0.2567 +Epoch 2 | 0.3197 | 0.6803 +Epoch 3 | 0.2657 | 0.7343 +Testing | 0.1820 | 0.8180 -Further information in https://en.wikipedia.org/wiki/S%C3%B8rensen%E2%80%93Dice_coefficient -## Dependencies +### Figure 1 - example visualisation plot +Skin images in left column, true mask middle, predicted mask right column +![Visualisation of predictions](visual.png) -- python 3 -- tensorflow 2.1.0 -- pandas 1.1.4 -- numpy 1.19.2 -- matplotlib 3.3.2 -- scikit-learn 0.23.2 -- pillow 8.0.1 - - -## Usages - -- Run `train.py` for training the UNet on ISIC data. -- Run `evaluation.py` for evaluation and case present. - -## Advance - -- Modify `setting.py` for custom setting, such as different batch size. -- Modify `unet.py` for custom UNet, such as different kernel size. - -## Algorithm - -- data set: - - The data set we used is the training set of ISIC 2018 challenge data which has segmentation labels. - - Training: Validation: Test = 1660: 415: 519 = 0.64: 0.16 : 0.2 (Training: Test = 4: 1 and in Training, further split 4: 1 for Training: Validation) - - Training data augmentations: rescale, rotate, shift, zoom, grayscale -- model: - - Original UNet with padding which can keep the shape of input and output same. - - The first convolutional layers has 16 output channels. - - The activation function of all convolutional layers is ELU. - - Without batch normalization layers. - - The inputs is (384, 512, 1) - - The output is (384, 512, 1) after sigmoid activation. - - Optimizer: Adam, lr = 1e-4 - - Loss: dice coefficient loss - - Metrics: accuracy & dice coefficient - -## Results - -Evaluation dice coefficient is 0.805256724357605. 
- -plot of train/valid Dice coefficient: - -![img](imgs/train_and_valid_dice_coef.png) - -case present: - -![case](imgs/case%20present.png) - -## Reference -Manna, S. (2020). K-Fold Cross Validation for Deep Learning using Keras. [online] Medium. Available at: https://medium.com/the-owl/k-fold-cross-validation-in-keras-3ec4a3a00538 [Accessed 24 Nov. 2020]. - -zhixuhao (2020). zhixuhao/unet. [online] GitHub. Available at: https://github.com/zhixuhao/unet. - -GitHub. (n.d.). NifTK/NiftyNet. [online] Available at: https://github.com/NifTK/NiftyNet/blob/a383ba342e3e38a7ad7eed7538bfb34960f80c8d/niftynet/layer/loss_segmentation.py [Accessed 24 Nov. 2020]. - -Team, K. (n.d.). Keras documentation: Losses. [online] keras.io. Available at: https://keras.io/api/losses/#creating-custom-losses [Accessed 24 Nov. 2020]. - -262588213843476 (n.d.). unet.py. [online] Gist. Available at: https://gist.github.com/abhinavsagar/fe0c900133cafe93194c069fe655ef6e [Accessed 24 Nov. 2020]. - -Stack Overflow. (n.d.). python - Disable Tensorflow debugging information. [online] Available at: https://stackoverflow.com/questions/35911252/disable-tensorflow-debugging-information [Accessed 24 Nov. 2020]. +## References +Segments of code in this assignment were used from or based on the following sources: +1. COMP3710-demo-code.ipynb from Guest Lecture +1. https://www.tensorflow.org/tutorials/load_data/images +1. https://www.tensorflow.org/guide/gpu +1. Karan Jakhar (2019) https://medium.com/@karan_jakhar/100-days-of-code-day-7-84e4918cb72c diff --git a/recognition/XUE4645768/README.md b/recognition/XUE4645768/README.md index 36250adaa3..94bc1848c0 100644 --- a/recognition/XUE4645768/README.md +++ b/recognition/XUE4645768/README.md @@ -53,6 +53,52 @@ python gcn.py Warning: Please pay attention to whether the data path is correct when you run the gcn.py. 
+# Training + +Learning rate = 0.01 +Weight decay = 0.005 + +For 200 epochs: +```Epoch 000: Loss 0.2894, TrainAcc 0.9126, ValAcc 0.8954 +Epoch 001: Loss 0.2880, TrainAcc 0.9126, ValAcc 0.895 +Epoch 002: Loss 0.2866, TrainAcc 0.9126, ValAcc 0.8961 +Epoch 003: Loss 0.2853, TrainAcc 0.9132, ValAcc 0.8961 +Epoch 004: Loss 0.2839, TrainAcc 0.9137, ValAcc 0.8961 +Epoch 005: Loss 0.2826, TrainAcc 0.9141, ValAcc 0.8963 +Epoch 006: Loss 0.2813, TrainAcc 0.9146, ValAcc 0.8956 +Epoch 007: Loss 0.2800, TrainAcc 0.9146, ValAcc 0.8956 +Epoch 008: Loss 0.2788, TrainAcc 0.9146, ValAcc 0.8959 +Epoch 009: Loss 0.2775, TrainAcc 0.9146, ValAcc 0.8970 +Epoch 010: Loss 0.2763, TrainAcc 0.915, ValAcc 0.8974 +Epoch 011: Loss 0.2751, TrainAcc 0.915, ValAcc 0.8972 +Epoch 012: Loss 0.2739, TrainAcc 0.915, ValAcc 0.8976 +Epoch 013: Loss 0.2727, TrainAcc 0.9157, ValAcc 0.8979 +Epoch 014: Loss 0.2716, TrainAcc 0.9157, ValAcc 0.8983 +Epoch 015: Loss 0.2704, TrainAcc 0.9161, ValAcc 0.8990 +Epoch 016: Loss 0.2693, TrainAcc 0.9168, ValAcc 0.8988 +Epoch 017: Loss 0.2682, TrainAcc 0.9181, ValAcc 0.8990 +Epoch 018: Loss 0.2671, TrainAcc 0.9179, ValAcc 0.8990 +Epoch 019: Loss 0.2660, TrainAcc 0.9179, ValAcc 0.8992 +Epoch 020: Loss 0.2650, TrainAcc 0.9188, ValAcc 0.8996 +...... 
+Epoch 190: Loss 0.1623, TrainAcc 0.9553, ValAcc 0.9134 +Epoch 191: Loss 0.1619, TrainAcc 0.9555, ValAcc 0.9134 +Epoch 192: Loss 0.1615, TrainAcc 0.9555, ValAcc 0.9132 +Epoch 193: Loss 0.1611, TrainAcc 0.9557, ValAcc 0.9130 +Epoch 194: Loss 0.1607, TrainAcc 0.9562, ValAcc 0.9130 +Epoch 195: Loss 0.1603, TrainAcc 0.9559, ValAcc 0.9130 +Epoch 196: Loss 0.1599, TrainAcc 0.9562, ValAcc 0.9126 +Epoch 197: Loss 0.1595, TrainAcc 0.9562, ValAcc 0.9123 +Epoch 198: Loss 0.1591, TrainAcc 0.9562, ValAcc 0.9123 +Epoch 199: Loss 0.1587, TrainAcc 0.9562, ValAcc 0.9123``` + +For test accuracy:around 0.9 + +# TSNE +For the test:iteration=500, with lower dimension to 2 + + + ```python diff --git a/recognition/s4640439_siamese_network/README.MD b/recognition/s4640439_siamese_network/README.MD new file mode 100644 index 0000000000..30404ce4c5 --- /dev/null +++ b/recognition/s4640439_siamese_network/README.MD @@ -0,0 +1 @@ +TODO \ No newline at end of file diff --git a/recognition/s4640439_siamese_network/dataset.py b/recognition/s4640439_siamese_network/dataset.py new file mode 100644 index 0000000000..f87f5c14cb --- /dev/null +++ b/recognition/s4640439_siamese_network/dataset.py @@ -0,0 +1 @@ +# TODO \ No newline at end of file diff --git a/recognition/s4640439_siamese_network/modules.py b/recognition/s4640439_siamese_network/modules.py new file mode 100644 index 0000000000..f87f5c14cb --- /dev/null +++ b/recognition/s4640439_siamese_network/modules.py @@ -0,0 +1 @@ +# TODO \ No newline at end of file diff --git a/recognition/s4640439_siamese_network/predict.py b/recognition/s4640439_siamese_network/predict.py new file mode 100644 index 0000000000..f87f5c14cb --- /dev/null +++ b/recognition/s4640439_siamese_network/predict.py @@ -0,0 +1 @@ +# TODO \ No newline at end of file diff --git a/recognition/s4640439_siamese_network/train.py b/recognition/s4640439_siamese_network/train.py new file mode 100644 index 0000000000..f87f5c14cb --- /dev/null +++ 
b/recognition/s4640439_siamese_network/train.py @@ -0,0 +1 @@ +# TODO \ No newline at end of file From abfbd5d589bc240f8d3a9c1363b9c0f225fd2c80 Mon Sep 17 00:00:00 2001 From: Matt McDonnell <85379302+bushlemon@users.noreply.github.com> Date: Thu, 13 Oct 2022 11:36:30 +1000 Subject: [PATCH 02/16] Created image pre-processing Created a function to load the images from a directory and convert to a numpy array --- .../s4640439_siamese_network/dataset.py | 56 ++++++++++++++++++- .../s4640439_siamese_network/modules.py | 4 +- 2 files changed, 58 insertions(+), 2 deletions(-) diff --git a/recognition/s4640439_siamese_network/dataset.py b/recognition/s4640439_siamese_network/dataset.py index f87f5c14cb..719c62868c 100644 --- a/recognition/s4640439_siamese_network/dataset.py +++ b/recognition/s4640439_siamese_network/dataset.py @@ -1 +1,55 @@ -# TODO \ No newline at end of file +import numpy as np +from PIL import Image +import os +import time + +# Data has already been separated into training and test data +AD_TEST_PATH = "E:/ADNI/AD_NC/test/AD/" +AD_TRAIN_PATH = "E:/ADNI/AD_NC/train/AD/" +NC_TEST_PATH = "E:/ADNI/AD_NC/test/NC/" +NC_TRAIN_PATH = "E:/ADNI/AD_NC/train/NC/" + +# image constants +WIDTH = 256 +HEIGHT = 240 +CHANNELS = 1 + +PRE_PROC_DATA_SAVE_LOC = "E:/ADNI/Processed" + +def load_data(directory_path, prefix): + save_path = os.path.join(PRE_PROC_DATA_SAVE_LOC, f"{prefix}_preprocessed.npy") + + if not os.path.isfile(save_path): + start = time.time() + print("Processing data for file", save_path) + + data = [] + + for filename in os.listdir(directory_path): + path = os.path.join(directory_path, filename) + + img = Image.open(path) + img_arr = np.asarray(img).astype(np.float32) + + # normalise + img_arr = img_arr / 127.5 - 1 + data.append(img_arr) + + data = np.reshape(data, (-1, HEIGHT, WIDTH, CHANNELS)) + + print("Saving data") + np.save(save_path, data) + + elapsed = time.time() - start + print (f'Image preprocess time: {elapsed}') + + else: + print("Loading 
preprocessed data") + data = np.load(save_path) + + return data + +#training_data_positive = load_data(AD_TRAIN_PATH, "ad_train") +#training_data_negative = load_data(NC_TRAIN_PATH, "nc_train") +#testing_data_positive = load_data(AD_TEST_PATH, "ad_test") +#testing_data_negative = load_data(NC_TEST_PATH, "nc_test") \ No newline at end of file diff --git a/recognition/s4640439_siamese_network/modules.py b/recognition/s4640439_siamese_network/modules.py index f87f5c14cb..935117c883 100644 --- a/recognition/s4640439_siamese_network/modules.py +++ b/recognition/s4640439_siamese_network/modules.py @@ -1 +1,3 @@ -# TODO \ No newline at end of file +# Generate Siamese model + +# Generate binary classifier \ No newline at end of file From e0769e3f551daf7ee52e5ae4ab154b914464633c Mon Sep 17 00:00:00 2001 From: Matt McDonnell <85379302+bushlemon@users.noreply.github.com> Date: Tue, 18 Oct 2022 09:30:43 +1000 Subject: [PATCH 03/16] Defined model generator functions - Created the functions necessary for building the NNs that will be used for training and classification. 
- Siamese model needs to be tweaked for performance and output size - Binary classifier needs to be built as it only has the barebones --- .../s4640439_siamese_network/modules.py | 48 ++++++++++++++++++- 1 file changed, 46 insertions(+), 2 deletions(-) diff --git a/recognition/s4640439_siamese_network/modules.py b/recognition/s4640439_siamese_network/modules.py index 935117c883..83bb26dde8 100644 --- a/recognition/s4640439_siamese_network/modules.py +++ b/recognition/s4640439_siamese_network/modules.py @@ -1,3 +1,47 @@ -# Generate Siamese model +import tensorflow as tf +from tensorflow.keras.models import Sequential +from tensorflow.keras.layers import Conv2D, LeakyReLU, Flatten, MaxPool2d -# Generate binary classifier \ No newline at end of file +IMAGE_SIZE = (240,256,1) +ALPHA = 0.2 + +def build_siamese(): + """ + Generate Siamese model + This model needs to be a CNN that reduces an image to a vector + """ + model = Sequential() + + model.add(Conv2D(32, kernel_size=3, input_shape=IMAGE_SIZE)) + model.add(LeakyReLU(alpha=ALPHA)) + + model.add(MaxPool2d(pool_size=(2,2), strides=(1, 1))) + + model.add(Conv2D(64, kernel_size=3)) + model.add(LeakyReLU(alpha=ALPHA)) + + model.add(MaxPool2d(pool_size=(2,2), strides=(1, 1))) + + model.add(Conv2D(128, kernel_size=3)) + model.add(LeakyReLU(alpha=ALPHA)) + + model.add(MaxPool2d(pool_size=(2,2), strides=(1, 1))) + + model.add(Flatten()) + + return model + + + +def build_binary(): + """ + Generate binary classifier + This model needs to be a binary classifier that takes an output vector from + siamese model and converts it into a single value in the range [0,1] + """ + + # TODO: define layers of model + + model = Sequential() + + return model \ No newline at end of file From 31f021541b21323bbbb49ae3753443f34a6b96aa Mon Sep 17 00:00:00 2001 From: Matt McDonnell <85379302+bushlemon@users.noreply.github.com> Date: Tue, 18 Oct 2022 09:44:42 +1000 Subject: [PATCH 04/16] Added docstrings to load_data function Added documentation to 
load_data function in dataset.py --- .../s4640439_siamese_network/dataset.py | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/recognition/s4640439_siamese_network/dataset.py b/recognition/s4640439_siamese_network/dataset.py index 719c62868c..f6ee3190ec 100644 --- a/recognition/s4640439_siamese_network/dataset.py +++ b/recognition/s4640439_siamese_network/dataset.py @@ -16,15 +16,30 @@ PRE_PROC_DATA_SAVE_LOC = "E:/ADNI/Processed" -def load_data(directory_path, prefix): +def load_data(directory_path: str, prefix: str) -> np.ndarray: + """ + Processes and saves image data as a numpy array. + + Attempts to find pre-processed data and load it from a save. + If a save cannot be found, processes the data. + + Parameters: + - directory_path: Path to folder containing images to process + - prefix: String representing data type. Used for save filename + + Returns: + - processed image dataset as numpy array. + """ save_path = os.path.join(PRE_PROC_DATA_SAVE_LOC, f"{prefix}_preprocessed.npy") if not os.path.isfile(save_path): + # save cannot be found start = time.time() print("Processing data for file", save_path) data = [] + # loop through and process images for filename in os.listdir(directory_path): path = os.path.join(directory_path, filename) @@ -44,6 +59,7 @@ def load_data(directory_path, prefix): print (f'Image preprocess time: {elapsed}') else: + # save found print("Loading preprocessed data") data = np.load(save_path) From 176701884cf46d43bd7d7d94c4701079ea1cc0f2 Mon Sep 17 00:00:00 2001 From: Matt McDonnell <85379302+bushlemon@users.noreply.github.com> Date: Tue, 18 Oct 2022 18:27:13 +1000 Subject: [PATCH 05/16] Began implementing training code Created train_step function Various other documentation updates --- .../s4640439_siamese_network/README.MD | 11 ++++- .../s4640439_siamese_network/dataset.py | 4 ++ .../s4640439_siamese_network/modules.py | 24 ++++++++++ .../s4640439_siamese_network/predict.py | 15 ++++++- 
recognition/s4640439_siamese_network/train.py | 44 ++++++++++++++++++- 5 files changed, 95 insertions(+), 3 deletions(-) diff --git a/recognition/s4640439_siamese_network/README.MD b/recognition/s4640439_siamese_network/README.MD index 30404ce4c5..e156db5911 100644 --- a/recognition/s4640439_siamese_network/README.MD +++ b/recognition/s4640439_siamese_network/README.MD @@ -1 +1,10 @@ -TODO \ No newline at end of file +Requirements +1. The readme file should contain a title, a description of the algorithm and the problem that it solves +(approximately a paragraph), how it works in a paragraph and a figure/visualisation. +2. It should also list any dependencies required, including versions and address reproduciblility of results, +if applicable. +3. provide example inputs, outputs and plots of your algorithm +4. The read me file should be properly formatted using GitHub markdown +5. Describe any specific pre-processing you have used with references if any. Justify your training, validation +and testing splits of the data. + diff --git a/recognition/s4640439_siamese_network/dataset.py b/recognition/s4640439_siamese_network/dataset.py index f6ee3190ec..8fbf82c5df 100644 --- a/recognition/s4640439_siamese_network/dataset.py +++ b/recognition/s4640439_siamese_network/dataset.py @@ -3,6 +3,10 @@ import os import time +""" +Containing the data loader for loading and preprocessing your data. +""" + # Data has already been separated into training and test data AD_TEST_PATH = "E:/ADNI/AD_NC/test/AD/" AD_TRAIN_PATH = "E:/ADNI/AD_NC/train/AD/" diff --git a/recognition/s4640439_siamese_network/modules.py b/recognition/s4640439_siamese_network/modules.py index 83bb26dde8..992fbbddb4 100644 --- a/recognition/s4640439_siamese_network/modules.py +++ b/recognition/s4640439_siamese_network/modules.py @@ -5,6 +5,11 @@ IMAGE_SIZE = (240,256,1) ALPHA = 0.2 +""" +Containing the source code of the components of your model. +Each component must be implementated as a class or a function. 
+""" + def build_siamese(): """ Generate Siamese model @@ -31,6 +36,25 @@ def build_siamese(): return model +def siamese_loss(x0, x1, y: int) -> float: + """ + Custom loss function for siamese network. + + Takes two vectors, then calculates their distance. + + Vectors of the same class are rewarded for being close and punished for being far away. + Vectors of different classes are punished for being close and rewarded for being far away. + + Parameters: + - x0 -- first vector + - x1 -- second vector + - y -- integer representing whether or not the two vectors are from the same class + + Returns: + - loss value + """ + # TODO + return 0 def build_binary(): diff --git a/recognition/s4640439_siamese_network/predict.py b/recognition/s4640439_siamese_network/predict.py index f87f5c14cb..318940a32f 100644 --- a/recognition/s4640439_siamese_network/predict.py +++ b/recognition/s4640439_siamese_network/predict.py @@ -1 +1,14 @@ -# TODO \ No newline at end of file +""" +Showing example usage of your trained model +Print out any results and / or provide visualisations where applicable +""" + +from train import * +from dataset import * + +""" +TODO: +Load saved binary classification model from train.py +Load test data using dataset.py +Test the data and print/plot results +""" \ No newline at end of file diff --git a/recognition/s4640439_siamese_network/train.py b/recognition/s4640439_siamese_network/train.py index f87f5c14cb..2a870ed7f6 100644 --- a/recognition/s4640439_siamese_network/train.py +++ b/recognition/s4640439_siamese_network/train.py @@ -1 +1,43 @@ -# TODO \ No newline at end of file +import tensorflow as tf + +from modules import * +from dataset import * + +""" +Containing the source code for training, validating, testing and saving your model. +The model should be imported from “modules.py” and the data loader should be imported from “dataset.py” +Make sure to plot the losses and metrics during training. 
+ +""" + +@tf.function +def train_step(siamese, siamese_optimiser, images1, images2, same_class: bool): + """ + Executes one step of training the siamese model. + Backpropagates to update weightings. + + Parameters: + - siamese -- the siamese network + - siamese_optimiser -- the optimiser which will be used for backprop + - images1, images2 -- batch of image data which is either positive or negative + - same_class -- flag representing whether the two sets of images are of the same class + + Returns: + - loss value from this training step + + """ + with tf.GradientTape() as siamese_tape: + + x0 = siamese(images1, training=True) + x1 = siamese(images2, training=True) + y = int(same_class) + + loss = siamese_loss(x0, x1, y) + + siamese_gradients = siamese_tape.gradient(\ + loss, siamese.trainable_variables) + + siamese_optimiser.apply_gradients(zip( + siamese_gradients, siamese.trainable_variables)) + + return loss \ No newline at end of file From c422b405cc054b2a7658956a18ab6995d4e15e5a Mon Sep 17 00:00:00 2001 From: Matt McDonnell <85379302+bushlemon@users.noreply.github.com> Date: Tue, 18 Oct 2022 18:49:42 +1000 Subject: [PATCH 06/16] Created main method in train.py Further built out functionality of train.py Laid groundwork for README --- .../s4640439_siamese_network/README.MD | 14 +++++++++++ recognition/s4640439_siamese_network/train.py | 24 ++++++++++++++++++- 2 files changed, 37 insertions(+), 1 deletion(-) diff --git a/recognition/s4640439_siamese_network/README.MD b/recognition/s4640439_siamese_network/README.MD index e156db5911..be89b9d102 100644 --- a/recognition/s4640439_siamese_network/README.MD +++ b/recognition/s4640439_siamese_network/README.MD @@ -8,3 +8,17 @@ if applicable. 5. Describe any specific pre-processing you have used with references if any. Justify your training, validation and testing splits of the data. 
+# Siamese Networks for Alzheimer's Disease Classification Using MRI Images + +## Description and Problem + +## How the Algorithm Works + +## Results + +## Running the Code +### Dependencies + +### Dataset and Pre-processing + +### Example Usage \ No newline at end of file diff --git a/recognition/s4640439_siamese_network/train.py b/recognition/s4640439_siamese_network/train.py index 2a870ed7f6..8b4949bac4 100644 --- a/recognition/s4640439_siamese_network/train.py +++ b/recognition/s4640439_siamese_network/train.py @@ -10,6 +10,10 @@ """ +EPOCHS = 100 +BATCH_SIZE = 32 +BUFFER_SIZE = 20000 + @tf.function def train_step(siamese, siamese_optimiser, images1, images2, same_class: bool): """ @@ -40,4 +44,22 @@ def train_step(siamese, siamese_optimiser, images1, images2, same_class: bool): siamese_optimiser.apply_gradients(zip( siamese_gradients, siamese.trainable_variables)) - return loss \ No newline at end of file + return loss + +def main(): + # get training data + training_data_positive = load_data(AD_TRAIN_PATH, "ad_train") + training_data_negative = load_data(NC_TRAIN_PATH, "nc_train") + + # convert to tensors + train_data_pos = tf.data.Dataset.from_tensor_slices(training_data_positive + ).shuffle(BUFFER_SIZE, reshuffle_each_iteration=True).batch(BATCH_SIZE) + train_data_neg = tf.data.Dataset.from_tensor_slices(training_data_negative + ).shuffle(BUFFER_SIZE, reshuffle_each_iteration=True).batch(BATCH_SIZE) + + # build models + siamese_model = build_siamese() + binary_classifier = build_binary() + +if __name__ == "__main__": + train() \ No newline at end of file From 0a0fa3753387db38ed5e83d05c199f1434f6309e Mon Sep 17 00:00:00 2001 From: Matt McDonnell <85379302+bushlemon@users.noreply.github.com> Date: Tue, 18 Oct 2022 20:02:12 +1000 Subject: [PATCH 07/16] Further implementation in training pipeline - Moved loss function from modules to train - Defined training function for siamese - Fleshed out main function --- recognition/s4640439_siamese_network/train.py | 76 
++++++++++++++++++- 1 file changed, 74 insertions(+), 2 deletions(-) diff --git a/recognition/s4640439_siamese_network/train.py b/recognition/s4640439_siamese_network/train.py index 8b4949bac4..6ddc38a4ca 100644 --- a/recognition/s4640439_siamese_network/train.py +++ b/recognition/s4640439_siamese_network/train.py @@ -2,6 +2,8 @@ from modules import * from dataset import * +import time +import os """ Containing the source code for training, validating, testing and saving your model. @@ -9,11 +11,32 @@ Make sure to plot the losses and metrics during training. """ - EPOCHS = 100 BATCH_SIZE = 32 BUFFER_SIZE = 20000 +MODEL_SAVE_DIR = "E:/ADNI/models" + +def siamese_loss(x0, x1, y: int) -> float: + """ + Custom loss function for siamese network. + + Takes two vectors, then calculates their distance. + + Vectors of the same class are rewarded for being close and punished for being far away. + Vectors of different classes are punished for being close and rewarded for being far away. + + Parameters: + - x0 -- first vector + - x1 -- second vector + - y -- integer representing whether or not the two vectors are from the same class + + Returns: + - loss value + """ + # TODO + return 0 + @tf.function def train_step(siamese, siamese_optimiser, images1, images2, same_class: bool): """ @@ -46,6 +69,44 @@ def train_step(siamese, siamese_optimiser, images1, images2, same_class: bool): return loss +def train_siamese_model(model, optimiser, pos_dataset, neg_dataset, epochs): + start = time.time() + print("Beginning Siamese Network Training") + + for epoch in range(epochs): + epoch_start = time.time() + + i = 0 + for pos_batch, neg_batch in zip(pos_dataset, neg_dataset): + # alternate between same-same training and same-diff training + if i % 2 == 0: + # same training + same_class = True + + #split batches + pos1, pos2 = tf.split(pos_batch, num_or_size_splits=2) + neg1, neg2 = tf.split(neg_batch, num_or_size_splits=2) + + pos_loss = train_step(model, optimiser, pos1, pos2 , 
same_class) + neg_loss = train_step(model, optimiser, neg1, neg2 , same_class) + + else: + # diff training + same_class = False + diff_loss = train_step(model, optimiser, pos_batch, neg_batch, same_class) + + i += 1 + + epoch_elapsed = time.time() - epoch_start + print(f"Epoch {i} - training time: {epoch_elapsed}") + + elapsed = time.time() - start + print(f"Siamese Network Training Completed in {elapsed}") + + +def train_binary_classifier(model, optimiser, siamese_model, pos_dataset, neg_dataset, epochs): + pass + def main(): # get training data training_data_positive = load_data(AD_TRAIN_PATH, "ad_train") @@ -60,6 +121,17 @@ def main(): # build models siamese_model = build_siamese() binary_classifier = build_binary() + + siamese_optimiser = tf.keras.optimizers.Adam(1.5e-4,0.5) + classifier_optimiser = tf.keras.optimizers.Adam(1.5e-4,0.5) + + train_siamese_model(siamese_model, siamese_optimiser, train_data_pos, train_data_neg, EPOCHS) + train_binary_classifier(binary_classifier, classifier_optimiser, siamese_model, + train_data_pos, train_data_neg, EPOCHS) + + siamese_model.save(os.path.join(MODEL_SAVE_DIR, "siamese_model.h5")) + binary_classifier.save(os.path.join(MODEL_SAVE_DIR, "binary_model.h5")) + if __name__ == "__main__": - train() \ No newline at end of file + main() \ No newline at end of file From cc3adf7cb2a0bca9ce4369f2c86afc1170c9a679 Mon Sep 17 00:00:00 2001 From: Matt McDonnell <85379302+bushlemon@users.noreply.github.com> Date: Thu, 20 Oct 2022 09:53:40 +1000 Subject: [PATCH 08/16] Implemented Binary Classifier - Created binary classifier function in modules.py - Created train_binary_classifier function in train.py --- .../s4640439_siamese_network/modules.py | 31 ++++++------------- recognition/s4640439_siamese_network/train.py | 29 ++++++++++++++--- 2 files changed, 33 insertions(+), 27 deletions(-) diff --git a/recognition/s4640439_siamese_network/modules.py b/recognition/s4640439_siamese_network/modules.py index 992fbbddb4..4b81419c8f 100644 --- 
a/recognition/s4640439_siamese_network/modules.py +++ b/recognition/s4640439_siamese_network/modules.py @@ -1,10 +1,12 @@ import tensorflow as tf from tensorflow.keras.models import Sequential -from tensorflow.keras.layers import Conv2D, LeakyReLU, Flatten, MaxPool2d +from tensorflow.keras.layers import Conv2D, LeakyReLU, Flatten, MaxPool2d, Dense IMAGE_SIZE = (240,256,1) ALPHA = 0.2 +SIAMESE_OUTPUT_SHAPE = (512,) + """ Containing the source code of the components of your model. Each component must be implementated as a class or a function. @@ -36,27 +38,6 @@ def build_siamese(): return model -def siamese_loss(x0, x1, y: int) -> float: - """ - Custom loss function for siamese network. - - Takes two vectors, then calculates their distance. - - Vectors of the same class are rewarded for being close and punished for being far away. - Vectors of different classes are punished for being close and rewarded for being far away. - - Parameters: - - x0 -- first vector - - x1 -- second vector - - y -- integer representing whether or not the two vectors are from the same class - - Returns: - - loss value - """ - # TODO - return 0 - - def build_binary(): """ Generate binary classifier @@ -68,4 +49,10 @@ def build_binary(): model = Sequential() + model.add(Dense(32, input_shape=SIAMESE_OUTPUT_SHAPE, activation="relu")) + model.add(Dense(8, activation="relu")) + model.add(Dense(1, activation="sigmoid")) + + model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"]) + return model \ No newline at end of file diff --git a/recognition/s4640439_siamese_network/train.py b/recognition/s4640439_siamese_network/train.py index 6ddc38a4ca..6e63b665e0 100644 --- a/recognition/s4640439_siamese_network/train.py +++ b/recognition/s4640439_siamese_network/train.py @@ -21,6 +21,8 @@ def siamese_loss(x0, x1, y: int) -> float: """ Custom loss function for siamese network. + Based on contrastive loss. + Takes two vectors, then calculates their distance. 
Vectors of the same class are rewarded for being close and punished for being far away. @@ -103,9 +105,28 @@ def train_siamese_model(model, optimiser, pos_dataset, neg_dataset, epochs): elapsed = time.time() - start print(f"Siamese Network Training Completed in {elapsed}") +def train_binary_classifier(model, siamese_model, pos_dataset, neg_dataset, epochs): + start = time.time() + print("Beginning Binary Classifier Training") + + for epoch in range(epochs): + epoch_start = time.time() + + for pos_batch, neg_batch in zip(pos_dataset, neg_dataset): + transformed_pos = siamese_model(pos_batch, training=False) + transformed_neg = siamese_model(neg_batch, training=False) + + pos_labels = tf.ones_like(transformed_pos) + neg_labels = tf.zeros_like(transformed_neg) + + model.fit(transformed_pos, pos_labels) + model.fit(transformed_neg, neg_labels) -def train_binary_classifier(model, optimiser, siamese_model, pos_dataset, neg_dataset, epochs): - pass + epoch_elapsed = time.time() - epoch_start + print(f"Epoch {i} - training time: {epoch_elapsed}") + + elapsed = time.time() - start + print(f"Binary Classifier Training Completed in {elapsed}") def main(): # get training data @@ -123,11 +144,9 @@ def main(): binary_classifier = build_binary() siamese_optimiser = tf.keras.optimizers.Adam(1.5e-4,0.5) - classifier_optimiser = tf.keras.optimizers.Adam(1.5e-4,0.5) train_siamese_model(siamese_model, siamese_optimiser, train_data_pos, train_data_neg, EPOCHS) - train_binary_classifier(binary_classifier, classifier_optimiser, siamese_model, - train_data_pos, train_data_neg, EPOCHS) + train_binary_classifier(binary_classifier, siamese_model, train_data_pos, train_data_neg, EPOCHS) siamese_model.save(os.path.join(MODEL_SAVE_DIR, "siamese_model.h5")) binary_classifier.save(os.path.join(MODEL_SAVE_DIR, "binary_model.h5")) From b650559195dd8569b6890faad72f3603005d83fd Mon Sep 17 00:00:00 2001 From: Matt McDonnell <85379302+bushlemon@users.noreply.github.com> Date: Thu, 20 Oct 2022 12:55:34 
+1000 Subject: [PATCH 09/16] Defined siamese loss function - Defined siamese loss function - Other minor changes --- .../s4640439_siamese_network/modules.py | 5 +--- recognition/s4640439_siamese_network/train.py | 28 +++++++++++-------- 2 files changed, 18 insertions(+), 15 deletions(-) diff --git a/recognition/s4640439_siamese_network/modules.py b/recognition/s4640439_siamese_network/modules.py index 4b81419c8f..863271e0a5 100644 --- a/recognition/s4640439_siamese_network/modules.py +++ b/recognition/s4640439_siamese_network/modules.py @@ -44,15 +44,12 @@ def build_binary(): This model needs to be a binary classifier that takes an output vector from siamese model and converts it into a single value in the range [0,1] """ - - # TODO: define layers of model - model = Sequential() model.add(Dense(32, input_shape=SIAMESE_OUTPUT_SHAPE, activation="relu")) model.add(Dense(8, activation="relu")) model.add(Dense(1, activation="sigmoid")) - model.compile(loss="binary_crossentropy", optimiser="adam", metrics=["accuracy"]) + model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"]) return model \ No newline at end of file diff --git a/recognition/s4640439_siamese_network/train.py b/recognition/s4640439_siamese_network/train.py index 6e63b665e0..06032546fc 100644 --- a/recognition/s4640439_siamese_network/train.py +++ b/recognition/s4640439_siamese_network/train.py @@ -11,13 +11,14 @@ Make sure to plot the losses and metrics during training. """ -EPOCHS = 100 +EPOCHS = 40 BATCH_SIZE = 32 BUFFER_SIZE = 20000 +MARGIN = 0.2 MODEL_SAVE_DIR = "E:/ADNI/models" -def siamese_loss(x0, x1, y: int) -> float: +def siamese_loss(x0, x1, label: int, margin: float) -> float: """ Custom loss function for siamese network. @@ -29,15 +30,20 @@ def siamese_loss(x0, x1, y: int) -> float: Vectors of different classes are punished for being close and rewarded for being far away. 
Parameters: - - x0 -- first vector - - x1 -- second vector - - y -- integer representing whether or not the two vectors are from the same class + - x0 -- batch of vectors + - x1 -- batch of vectors + - label -- whether or not the two vectors are from the same class. 1 = yes, 0 = no Returns: - loss value """ - # TODO - return 0 + dist = tf.reduce_sum(tf.square(x0 - x1), 1) + dist_sqrt = tf.sqrt(dist) + + loss = label * dist + (1 - label) * tf.square(tf.maximum(0., margin - dist_sqrt)) + loss = 0.5 * tf.reduce_mean(loss) + + return loss @tf.function def train_step(siamese, siamese_optimiser, images1, images2, same_class: bool): @@ -59,9 +65,9 @@ def train_step(siamese, siamese_optimiser, images1, images2, same_class: bool): x0 = siamese(images1, training=True) x1 = siamese(images2, training=True) - y = int(same_class) + label = int(same_class) - loss = siamese_loss(x0, x1, y) + loss = siamese_loss(x0, x1, label, MARGIN) siamese_gradients = siamese_tape.gradient(\ loss, siamese.trainable_variables) @@ -100,7 +106,7 @@ def train_siamese_model(model, optimiser, pos_dataset, neg_dataset, epochs): i += 1 epoch_elapsed = time.time() - epoch_start - print(f"Epoch {i} - training time: {epoch_elapsed}") + print(f"Epoch {epoch} - training time: {epoch_elapsed}") elapsed = time.time() - start print(f"Siamese Network Training Completed in {elapsed}") @@ -123,7 +129,7 @@ def train_binary_classifier(model, siamese_model, pos_dataset, neg_dataset, epoc model.fit(transformed_neg, neg_labels) epoch_elapsed = time.time() - epoch_start - print(f"Epoch {i} - training time: {epoch_elapsed}") + print(f"Epoch {epoch} - training time: {epoch_elapsed}") elapsed = time.time() - start print(f"Binary Classifier Training Completed in {elapsed}") From c9530afae75f32c3dad91c236741e069644b6e97 Mon Sep 17 00:00:00 2001 From: Matt McDonnell <85379302+bushlemon@users.noreply.github.com> Date: Thu, 20 Oct 2022 17:10:58 +1000 Subject: [PATCH 10/16] Finished first pass of training - Got training working 
for both models - Model tuning still required - Starting to fill out predict.py --- .../s4640439_siamese_network/predict.py | 16 ++++++++++++- recognition/s4640439_siamese_network/train.py | 23 +++++++++++-------- 2 files changed, 29 insertions(+), 10 deletions(-) diff --git a/recognition/s4640439_siamese_network/predict.py b/recognition/s4640439_siamese_network/predict.py index 318940a32f..d472818752 100644 --- a/recognition/s4640439_siamese_network/predict.py +++ b/recognition/s4640439_siamese_network/predict.py @@ -11,4 +11,18 @@ Load saved binary classification model from train.py Load test data using dataset.py Test the data and print/plot results -""" \ No newline at end of file +""" +def load_test_data(): + pass + +def load_classifier(): + pass + +def load_siamese(): + pass + +def main(): + pass + +if __name__ == "__main__": + main() \ No newline at end of file diff --git a/recognition/s4640439_siamese_network/train.py b/recognition/s4640439_siamese_network/train.py index 06032546fc..f118d90169 100644 --- a/recognition/s4640439_siamese_network/train.py +++ b/recognition/s4640439_siamese_network/train.py @@ -12,7 +12,7 @@ """ EPOCHS = 40 -BATCH_SIZE = 32 +BATCH_SIZE = 128 BUFFER_SIZE = 20000 MARGIN = 0.2 @@ -75,7 +75,7 @@ def train_step(siamese, siamese_optimiser, images1, images2, same_class: bool): siamese_optimiser.apply_gradients(zip( siamese_gradients, siamese.trainable_variables)) - return loss + return loss def train_siamese_model(model, optimiser, pos_dataset, neg_dataset, epochs): start = time.time() @@ -84,8 +84,14 @@ def train_siamese_model(model, optimiser, pos_dataset, neg_dataset, epochs): for epoch in range(epochs): epoch_start = time.time() - i = 0 + i = 1 for pos_batch, neg_batch in zip(pos_dataset, neg_dataset): + if i % 20 == 0: + print("-----------------------") + print("Batch number", i, "complete") + print(f"{i} batches completed in {time.time() - epoch_start}") + print(f"Avg batch time: {(time.time() - epoch_start) / i}") + # alternate 
between same-same training and same-diff training if i % 2 == 0: # same training @@ -122,8 +128,8 @@ def train_binary_classifier(model, siamese_model, pos_dataset, neg_dataset, epoc transformed_pos = siamese_model(pos_batch, training=False) transformed_neg = siamese_model(neg_batch, training=False) - pos_labels = tf.ones_like(transformed_pos) - neg_labels = tf.zeros_like(transformed_neg) + pos_labels = tf.ones_like(transformed_pos[1]) + neg_labels = tf.zeros_like(transformed_neg[1]) model.fit(transformed_pos, pos_labels) model.fit(transformed_neg, neg_labels) @@ -141,15 +147,15 @@ def main(): # convert to tensors train_data_pos = tf.data.Dataset.from_tensor_slices(training_data_positive - ).shuffle(BUFFER_SIZE, reshuffle_each_iteration=True).batch(BATCH_SIZE) + ).shuffle(BUFFER_SIZE, reshuffle_each_iteration=True).batch(BATCH_SIZE, drop_remainder=True) train_data_neg = tf.data.Dataset.from_tensor_slices(training_data_negative - ).shuffle(BUFFER_SIZE, reshuffle_each_iteration=True).batch(BATCH_SIZE) + ).shuffle(BUFFER_SIZE, reshuffle_each_iteration=True).batch(BATCH_SIZE, drop_remainder=True) # build models siamese_model = build_siamese() binary_classifier = build_binary() - siamese_optimiser = tf.keras.optimizers.Adam(1.5e-4,0.5) + siamese_optimiser = tf.keras.optimizers.Adam(0.05) train_siamese_model(siamese_model, siamese_optimiser, train_data_pos, train_data_neg, EPOCHS) train_binary_classifier(binary_classifier, siamese_model, train_data_pos, train_data_neg, EPOCHS) @@ -157,6 +163,5 @@ def main(): siamese_model.save(os.path.join(MODEL_SAVE_DIR, "siamese_model.h5")) binary_classifier.save(os.path.join(MODEL_SAVE_DIR, "binary_model.h5")) - if __name__ == "__main__": main() \ No newline at end of file From c043eb9d7fabcb10368c7985c88fdf78c86ae008 Mon Sep 17 00:00:00 2001 From: Matt McDonnell <85379302+bushlemon@users.noreply.github.com> Date: Fri, 21 Oct 2022 09:55:51 +1000 Subject: [PATCH 11/16] Edited Binary Classifier - changed structure of binary NN - changed 
binary classifier training function to feed in entire dataset rather than pre-batched --- .../s4640439_siamese_network/modules.py | 5 ++-- recognition/s4640439_siamese_network/train.py | 23 ++++++++----------- 2 files changed, 12 insertions(+), 16 deletions(-) diff --git a/recognition/s4640439_siamese_network/modules.py b/recognition/s4640439_siamese_network/modules.py index 863271e0a5..26320feed2 100644 --- a/recognition/s4640439_siamese_network/modules.py +++ b/recognition/s4640439_siamese_network/modules.py @@ -46,8 +46,9 @@ def build_binary(): """ model = Sequential() - model.add(Dense(32, input_shape=SIAMESE_OUTPUT_SHAPE, activation="relu")) - model.add(Dense(8, activation="relu")) + model.add(Dense(64, input_shape=SIAMESE_OUTPUT_SHAPE, activation="relu")) + model.add(Dense(16, activation="relu")) + model.add(Dense(4, activation="relu")) model.add(Dense(1, activation="sigmoid")) model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"]) diff --git a/recognition/s4640439_siamese_network/train.py b/recognition/s4640439_siamese_network/train.py index f118d90169..7a5cec9f2d 100644 --- a/recognition/s4640439_siamese_network/train.py +++ b/recognition/s4640439_siamese_network/train.py @@ -117,25 +117,20 @@ def train_siamese_model(model, optimiser, pos_dataset, neg_dataset, epochs): elapsed = time.time() - start print(f"Siamese Network Training Completed in {elapsed}") -def train_binary_classifier(model, siamese_model, pos_dataset, neg_dataset, epochs): +def train_binary_classifier(model, siamese_model, training_data_positive, training_data_negative): start = time.time() print("Beginning Binary Classifier Training") - for epoch in range(epochs): - epoch_start = time.time() + pos_labels = np.ones(training_data_positive.shape[0]) + neg_labels = np.zeros(training_data_negative.shape[0]) - for pos_batch, neg_batch in zip(pos_dataset, neg_dataset): - transformed_pos = siamese_model(pos_batch, training=False) - transformed_neg = 
siamese_model(neg_batch, training=False) + pos_embeddings = siamese_model.predict(training_data_positive) + neg_embeddings = siamese_model.predict(training_data_negative) - pos_labels = tf.ones_like(transformed_pos[1]) - neg_labels = tf.zeros_like(transformed_neg[1]) + embeddings = np.concatenate((pos_embeddings, neg_embeddings)) + labels = np.concatenate((pos_labels, neg_labels)) - model.fit(transformed_pos, pos_labels) - model.fit(transformed_neg, neg_labels) - - epoch_elapsed = time.time() - epoch_start - print(f"Epoch {epoch} - training time: {epoch_elapsed}") + model.fit(embeddings, labels, epochs=EPOCHS, batch_size=BATCH_SIZE) elapsed = time.time() - start print(f"Binary Classifier Training Completed in {elapsed}") @@ -158,7 +153,7 @@ def main(): siamese_optimiser = tf.keras.optimizers.Adam(0.05) train_siamese_model(siamese_model, siamese_optimiser, train_data_pos, train_data_neg, EPOCHS) - train_binary_classifier(binary_classifier, siamese_model, train_data_pos, train_data_neg, EPOCHS) + train_binary_classifier(binary_classifier, siamese_model, training_data_positive, training_data_negative) siamese_model.save(os.path.join(MODEL_SAVE_DIR, "siamese_model.h5")) binary_classifier.save(os.path.join(MODEL_SAVE_DIR, "binary_model.h5")) From f770e8b7d0c53618bb636cdc0da73aa2a8336a9c Mon Sep 17 00:00:00 2001 From: Matt McDonnell <85379302+bushlemon@users.noreply.github.com> Date: Fri, 21 Oct 2022 19:01:11 +1000 Subject: [PATCH 12/16] Minor tweaks and documentation additions --- .../s4640439_siamese_network/dataset.py | 11 ++-- .../s4640439_siamese_network/modules.py | 45 +++++++++++----- .../s4640439_siamese_network/predict.py | 23 ++++---- recognition/s4640439_siamese_network/train.py | 54 ++++++++++++++++--- 4 files changed, 96 insertions(+), 37 deletions(-) diff --git a/recognition/s4640439_siamese_network/dataset.py b/recognition/s4640439_siamese_network/dataset.py index 8fbf82c5df..03e3412b7f 100644 --- a/recognition/s4640439_siamese_network/dataset.py +++ 
b/recognition/s4640439_siamese_network/dataset.py @@ -13,13 +13,13 @@ NC_TEST_PATH = "E:/ADNI/AD_NC/test/NC/" NC_TRAIN_PATH = "E:/ADNI/AD_NC/train/NC/" +PRE_PROC_DATA_SAVE_LOC = "E:/ADNI/Processed" + # image constants WIDTH = 256 HEIGHT = 240 CHANNELS = 1 -PRE_PROC_DATA_SAVE_LOC = "E:/ADNI/Processed" - def load_data(directory_path: str, prefix: str) -> np.ndarray: """ Processes and saves image data as a numpy array. @@ -67,9 +67,4 @@ def load_data(directory_path: str, prefix: str) -> np.ndarray: print("Loading preprocessed data") data = np.load(save_path) - return data - -#training_data_positive = load_data(AD_TRAIN_PATH, "ad_train") -#training_data_negative = load_data(NC_TRAIN_PATH, "nc_train") -#testing_data_positive = load_data(AD_TEST_PATH, "ad_test") -#testing_data_negative = load_data(NC_TEST_PATH, "nc_test") \ No newline at end of file + return data \ No newline at end of file diff --git a/recognition/s4640439_siamese_network/modules.py b/recognition/s4640439_siamese_network/modules.py index 26320feed2..0582f6c37c 100644 --- a/recognition/s4640439_siamese_network/modules.py +++ b/recognition/s4640439_siamese_network/modules.py @@ -1,6 +1,7 @@ import tensorflow as tf from tensorflow.keras.models import Sequential -from tensorflow.keras.layers import Conv2D, LeakyReLU, Flatten, MaxPool2d, Dense +from tensorflow.keras.layers import Conv2D, LeakyReLU, Flatten, MaxPooling2D +from tensorflow.keras.layers import BatchNormalization, Dropout, Dense IMAGE_SIZE = (240,256,1) ALPHA = 0.2 @@ -19,30 +20,50 @@ def build_siamese(): """ model = Sequential() - model.add(Conv2D(32, kernel_size=3, input_shape=IMAGE_SIZE)) - model.add(LeakyReLU(alpha=ALPHA)) + model.add(Conv2D(32, kernel_size=3, strides=2, input_shape=IMAGE_SIZE, + padding="same")) + model.add(LeakyReLU(alpha=0.2)) - model.add(MaxPool2d(pool_size=(2,2), strides=(1, 1))) + model.add(Dropout(0.25)) + model.add(Conv2D(32, kernel_size=3, strides=2, padding="same")) + model.add(BatchNormalization(momentum=0.8)) + 
model.add(LeakyReLU(alpha=0.2)) - model.add(Conv2D(64, kernel_size=3)) - model.add(LeakyReLU(alpha=ALPHA)) + model.add(MaxPooling2D((2, 2))) - model.add(MaxPool2d(pool_size=(2,2), strides=(1, 1))) + model.add(Dropout(0.25)) + model.add(Conv2D(64, kernel_size=3, strides=2, padding="same")) + model.add(BatchNormalization(momentum=0.8)) + model.add(LeakyReLU(alpha=0.2)) - model.add(Conv2D(128, kernel_size=3)) - model.add(LeakyReLU(alpha=ALPHA)) + model.add(Dropout(0.25)) + model.add(Conv2D(64, kernel_size=3, strides=2, padding="same")) + model.add(BatchNormalization(momentum=0.8)) + model.add(LeakyReLU(alpha=0.2)) - model.add(MaxPool2d(pool_size=(2,2), strides=(1, 1))) + model.add(MaxPooling2D((2, 2))) + model.add(Dropout(0.25)) + model.add(Conv2D(128, kernel_size=3, strides=2, padding="same")) + model.add(BatchNormalization(momentum=0.8)) + model.add(LeakyReLU(alpha=0.2)) + + model.add(Dropout(0.25)) + model.add(Conv2D(128, kernel_size=3, strides=2, padding="same")) + model.add(BatchNormalization(momentum=0.8)) + model.add(LeakyReLU(alpha=0.2)) + + model.add(Dropout(0.25)) model.add(Flatten()) + model.add(LeakyReLU(alpha=0.2)) return model def build_binary(): """ Generate binary classifier - This model needs to be a binary classifier that takes an output vector from - siamese model and converts it into a single value in the range [0,1] + This model needs to be a binary classifier that takes an output embedding from + siamese model and converts it into a single value in the range [0,1] for classification """ model = Sequential() diff --git a/recognition/s4640439_siamese_network/predict.py b/recognition/s4640439_siamese_network/predict.py index d472818752..4747e323b8 100644 --- a/recognition/s4640439_siamese_network/predict.py +++ b/recognition/s4640439_siamese_network/predict.py @@ -2,7 +2,7 @@ Showing example usage of your trained model Print out any results and / or provide visualisations where applicable """ - +import tensorflow as tf from train import * from 
dataset import * @@ -12,17 +12,22 @@ Load test data using dataset.py Test the data and print/plot results """ -def load_test_data(): - pass - -def load_classifier(): - pass -def load_siamese(): - pass +TEST_DATA_POSITIVE_LOC = "ad_test" +TEST_DATA_NEGATIVE_LOC = "nc_test" def main(): - pass + # load testing data + test_data_positive = load_data(AD_TEST_PATH, TEST_DATA_POSITIVE_LOC) + test_data_negative = load_data(NC_TEST_PATH, TEST_DATA_NEGATIVE_LOC) + + # load models + siamese_model = tf.keras.models.load_model(os.path.join(MODEL_SAVE_DIR, "siamese_model.h5")) + binary_model = tf.keras.models.load_model(os.path.join(MODEL_SAVE_DIR, "binary_model.h5")) + + + + results = binary_model.evaluate() if __name__ == "__main__": main() \ No newline at end of file diff --git a/recognition/s4640439_siamese_network/train.py b/recognition/s4640439_siamese_network/train.py index 7a5cec9f2d..2b60be658c 100644 --- a/recognition/s4640439_siamese_network/train.py +++ b/recognition/s4640439_siamese_network/train.py @@ -11,8 +11,8 @@ Make sure to plot the losses and metrics during training. """ -EPOCHS = 40 -BATCH_SIZE = 128 +EPOCHS = 100 +BATCH_SIZE = 64 BUFFER_SIZE = 20000 MARGIN = 0.2 @@ -30,8 +30,7 @@ def siamese_loss(x0, x1, label: int, margin: float) -> float: Vectors of different classes are punished for being close and rewarded for being far away. Parameters: - - x0 -- batch of vectors - - x1 -- batch of vectors + - x0, x1 -- batch of vectors. Shape: (batch size, embedding size) - label -- whether or not the two vectors are from the same class. 
1 = yes, 0 = no Returns: @@ -55,7 +54,8 @@ def train_step(siamese, siamese_optimiser, images1, images2, same_class: bool): - siamese -- the siamese network - siamese_optimiser -- the optimiser which will be used for backprop - images1, images2 -- batch of image data which is either positive or negative - - same_class -- flag representing whether the two sets of images are of the same class + shape: (batch size, width, height, number of channels) + - same_class -- bool flag representing whether the two sets of images are of the same class Returns: - loss value from this the training step @@ -63,6 +63,7 @@ def train_step(siamese, siamese_optimiser, images1, images2, same_class: bool): """ with tf.GradientTape() as siamese_tape: + # convert images to embeddings x0 = siamese(images1, training=True) x1 = siamese(images2, training=True) label = int(same_class) @@ -77,7 +78,19 @@ def train_step(siamese, siamese_optimiser, images1, images2, same_class: bool): return loss -def train_siamese_model(model, optimiser, pos_dataset, neg_dataset, epochs): +def train_siamese_model(model, optimiser, pos_dataset, neg_dataset, epochs) -> None: + """ + Trains the siamese model. + + Alternates between training images of the same class then images of different classes. 
+ + Parameters: + - model -- the siamese model to train + - optimiser -- the optimiser used for back propagation + - pos_dataset, neg_dataset -- pre-batched tensorflow dataset + - epochs -- number of epochs to train for + """ + start = time.time() print("Beginning Siamese Network Training") @@ -117,16 +130,29 @@ def train_siamese_model(model, optimiser, pos_dataset, neg_dataset, epochs): elapsed = time.time() - start print(f"Siamese Network Training Completed in {elapsed}") -def train_binary_classifier(model, siamese_model, training_data_positive, training_data_negative): +def train_binary_classifier(model, siamese_model, training_data_positive, training_data_negative) -> None: + """ + Trains the binary classifier used to classify the images into one of the two classes. + + Converts raw data to embeddings then fits the model. + + Parameters: + - model -- the binary classification model to train + - siamese_model -- the pre-trained siamese model used to generate embeddings + - training_data_positive, training_data_negative -- raw image data + """ start = time.time() print("Beginning Binary Classifier Training") + # generate labels - 1: positive, 0: negative pos_labels = np.ones(training_data_positive.shape[0]) neg_labels = np.zeros(training_data_negative.shape[0]) + # convert image data to embeddings pos_embeddings = siamese_model.predict(training_data_positive) neg_embeddings = siamese_model.predict(training_data_negative) + # merge positive and negative datasets embeddings = np.concatenate((pos_embeddings, neg_embeddings)) labels = np.concatenate((pos_labels, neg_labels)) @@ -136,11 +162,20 @@ def train_binary_classifier(model, siamese_model, training_data_positive, traini print(f"Binary Classifier Training Completed in {elapsed}") def main(): + """ + Trains the models + + Loads training data using dataset.py + Generates the models using modules.py + Uses functions defined above to train the models + Saves the models for later prediction + """ + # get training 
data training_data_positive = load_data(AD_TRAIN_PATH, "ad_train") training_data_negative = load_data(NC_TRAIN_PATH, "nc_train") - # convert to tensors + # convert to tensors for siamese training train_data_pos = tf.data.Dataset.from_tensor_slices(training_data_positive ).shuffle(BUFFER_SIZE, reshuffle_each_iteration=True).batch(BATCH_SIZE, drop_remainder=True) train_data_neg = tf.data.Dataset.from_tensor_slices(training_data_negative @@ -150,11 +185,14 @@ def main(): siamese_model = build_siamese() binary_classifier = build_binary() + # create optimiser for siamese model siamese_optimiser = tf.keras.optimizers.Adam(0.05) + # train the models train_siamese_model(siamese_model, siamese_optimiser, train_data_pos, train_data_neg, EPOCHS) train_binary_classifier(binary_classifier, siamese_model, training_data_positive, training_data_negative) + # save the models siamese_model.save(os.path.join(MODEL_SAVE_DIR, "siamese_model.h5")) binary_classifier.save(os.path.join(MODEL_SAVE_DIR, "binary_model.h5")) From fe0d94ed66b9915527670f35a2288e2b92bbe3c3 Mon Sep 17 00:00:00 2001 From: Matt McDonnell <85379302+bushlemon@users.noreply.github.com> Date: Fri, 21 Oct 2022 19:50:51 +1000 Subject: [PATCH 13/16] Various readme and documentation updates - Updated documentation - Added PCA plot --- .../s4640439_siamese_network/README.MD | 30 +++++++++++++++++- .../s4640439_siamese_network/images/PCA.png | Bin 0 -> 31589 bytes 2 files changed, 29 insertions(+), 1 deletion(-) create mode 100644 recognition/s4640439_siamese_network/images/PCA.png diff --git a/recognition/s4640439_siamese_network/README.MD b/recognition/s4640439_siamese_network/README.MD index be89b9d102..600038b505 100644 --- a/recognition/s4640439_siamese_network/README.MD +++ b/recognition/s4640439_siamese_network/README.MD @@ -12,13 +12,41 @@ and testing splits of the data. 
## Description and Problem + + ## How the Algorithm Works ## Results +Unfortunately, as of the current version, this implementation has failed to construct a suitable classifier. + +All attempts at training a Binary Classifier using my Siamese Model to generate embeddings led to the Classifier getting stuck at 51% accuracy (which simply reflects the class balance of the data). + +I tried many different model structures and tweaked various hyperparameters, but was not able to get the Siamese Model to generate embeddings good enough to classify with. + +This leads to the Binary Classifier quickly getting stuck in a local minimum, unable to differentiate between the classes from the embeddings alone. + +Running principal component analysis revealed that the issue was with the Siamese Model. Taking the two principal components with the highest variance and plotting them for each embedding resulted in the following scatter plot: +![PCA graph for first two principal components of data embeddings](images/PCA.png) + +As can be seen, just considering the two components with the highest variance, there is a major overlap between the two classes. Thus, it is no wonder that the Binary Classifier was unable to separate the data. + +I attempted many different tweaks to my Siamese Model in order to try to improve the embeddings. + +Techniques attempted include: +* Trying various batch sizes in the range (32, 128) +* Trying various numbers of epochs in the range (30, 100) +* Changing structure of model - size of convolutions, number of convolutions, strides, max padding +* Changing the margin in the loss function in the range (0.1, 0.5) + +If more time were available, I would try to train the Siamese Model using various other loss functions instead, for example Triplet Loss. 
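The Triplet Loss idea can be illustrated with a small, framework-free sketch. This is not code from this repository: the function names and the default margin of 0.2 are assumptions made for illustration, and a real training run would compute the same quantity on TensorFlow tensors so that gradients can flow. Each triplet consists of an anchor embedding, a positive embedding from the same class and a negative embedding from the other class; the loss is zero once the negative is at least `margin` further from the anchor than the positive.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Loss for one triplet: max(0, d(anchor, positive) - d(anchor, negative) + margin).

    The margin value is an illustrative assumption, not a tuned setting.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def batch_triplet_loss(anchors, positives, negatives, margin=0.2):
    """Mean triplet loss over a batch of (anchor, positive, negative) triplets."""
    losses = [
        triplet_loss(a, p, n, margin)
        for a, p, n in zip(anchors, positives, negatives)
    ]
    return sum(losses) / len(losses)
```

Unlike a pairwise loss, each update here compares a same-class distance against a different-class distance directly, which is one reason it is sometimes tried when pairwise training produces overlapping embeddings.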
## Running the Code ### Dependencies +Requires Python version 3.9 or above (for type hinting) +Requires tensorflow version 2.8.2 or above +Requires numpy version 1.21.3 or above ### Dataset and Pre-processing +Original dataset sourced from: [ADNI dataset for Alzheimer's disease](http://adni.loni.usc.edu/) +Pre-processed dataset (used in this project) available from: [UQ Blackboard](https://cloudstor.aarnet.edu.au/plus/s/L6bbssKhUoUdTSI) -### Example Usage \ No newline at end of file +### Example Usage diff --git a/recognition/s4640439_siamese_network/images/PCA.png b/recognition/s4640439_siamese_network/images/PCA.png new file mode 100644 index 0000000000000000000000000000000000000000..aa33ac7a66db67c2ee6026790b208646688b65ee GIT binary patch literal 31589 [binary image data omitted]
z)LZ(>^6*20b4YRfc8b7K%eaEVpX~}*@QKW6H=Wg@tODPA`qvT62fU4RcbJcr_?BxF zxPx5$FGO|9sFOCHY`OKBg`{M|9BGGbYrs-(Epemw-Y<2mJnt z)bGKx-?>p<0iEy;F4bLS>B#(Fxx)C@k#FD+kEz`w$>D$S`IL{cITl z_}8)>Jy+DqP_R)R-tN!#FvM~kRzrKetlxYY#gdpPaS9M)lF8=K>{c;v#!PlxzbPV0e)Fm5r_d`vLISt2yIM6j0^^+)+gx(1M-*b z^2=}2eh8uPtMswsESp%>YO30B>qA;X5~=wBp|(>Fb3ZPA%bmOc|DvEI9twm>3=7g1 z$CtG^zf)MfqQZL9!a`N#?dJ>#4x`;U`oL!SeC}#=AGi6BK*4l$l*Fl3fn!1*C1a2` z?XtDKLY(-o2Ustq0HL6P6S@1{JwcSnJ0_mFrg+3>wq{DU)Cx~z<-m`nC^G0GsT1yd zWD7bx$PzORL#Y!zA;o-F4!Uk0HeHQ%?WHfDn1M-D z3W#-aoA`@RKaWVl$0V`yg4%M+6H8uyPtbrDQ-1EbMhB){i4$mP?Um}Bq1)RT8pzY4 z)b0T5-e3n#4qv}vmF7wD>+Fl0&I<8alL6-AFQC;}u<-4U5osTby~7vN`~$4Cc#r^gZG)%o&Z z0BXJ8BU?7`$-ro9$-BTPK*cMv5UjxW`mg54On&v+DVBs|3MbJc0AogC<0_Eb+(8*+-XjSEZe*^|x~?5?^vjhUO(H^zarbN)_rYC0|=dW63!b)4Uevjh^t) z_M*D?>^+6}HZhCDB>9=@T>~@Ku;53cx)VzsSFMa^uO6YYh2F|7{#|A!BOOzszNM4d zDi-Df<7J}yC7F#?(WTShhLmbr92Uf7bw@5>1hSnzeLCd33 zWzoSbr~bmXt1+D(V`rKFk$3CFGd>(1Hl?*If!a2|mx|7Njj>mL6PEuJy7}snJmR1} z?~fuxu#~#O6em8Oag995p}gdy2lxU%n0L=UR0lH&b!LKEduI`@Uphf56>stfT<&r9 zRmVxeoWeEaZDcpj?&9EJ@8h2>Q{P{_2;tIDQ8!k)?rowZP^fUt2qcz$TX)4f(q<_(k|4QjfZz1~t_TVNyDSfW` z0C8ZzmpD>8XfHw#$t)tmWb@HLHs$dRhy)fj@KJvY`-GN#i88SX=b|v?b_moWB&+&t zBxAz%UV4GmoznNlPfF{a$CU?0JPLM>`}nBbcTI1LFdxt(zMHwhYR6_yvtN4N61a^Z7+)tBq8k< zn?605M(5z!=d3otSD1o}`iit3z~wK`r`?7Gu1}q2KLcWRg9?CparEH)7QNE{8)6gJ>nAt9G}pa?K!SJ$(Ui#yrnYLgWvPs}zW6;kziIVO5C z>ejJ*_cMp3V{M!^tKHc!Fy*QoW<+aLTxyp$1w;N-Q2aWw*T>&?S7G3ua4{Yk8_O|pB6$zmjGXbtZB^BAG&dhNa+{=3R!$!NST^n_& zE1_AW0tm!Yha@cXyW_n!in|`N_=Ojy+KnIw9u%3ds22fPziiB{FCgTYqspCXukxFy zYLV&mnjj^XUktZFE9FTxCu1G3(0V2~!i2>}vncL=jCae-8>94df#I$Tb(8HeQRhyf zLxN^Qko~R(iP(pO>zrqI4!8o-9#DJ~$2nZOgrjDspBLv?jdRTSSe8^5|4F}5Tfsa$G)ZqUl$INv9rV9J?UPIx)Z(%yXm~6G+$g% zqI=Qt67D8bb!?)y^zf*ZT2d8l@tC8*D7kg$#EF`a@Y@;BzHOkdS*sDxKY6>T1Wz^4 z1*zs|M_Gb^G}`yOt5(+;;2(qb!-%3R189-}GWnJIKC%BqbTSsfjh`~d*qnwha?vyJ zr#&F=d9@MLFbDt}w;(V}t!ZWxDJ8d5av;Nz@m~s9@!+M$#c1mjdG*1+C-M)3sY~T6?vw5S`3$Ae77#}pO`*+cB}=w_Tw^c%c|TB_ 
zED^3_aqIJxa_<2=Je_ToE;s*A2h*jgmTX_7mR86K6$38Zy1B+n;3SY~8lKXeyF38T z2e?-4m#sL)@_ZYI$mII~`iLvtYS5%IYaD9k3# z;d*0c1JE4~n#YPg#MDb-J;{Xt+58yI@C7xe`tZJ`(5o&JvaUPu(6_~n-@NuLK3*TC z^ZPPrCqOF^ib5Hk_p!*xA$>XT+)p@==&LJ{VUym-j-D}ui$Z2Hb^Wzh1h(&u??yuMK4_~-M z*vXT>C~9u`#=$XJWMy5bsq;(#rglH@NQ7>2uy%9fOSm){(q7HQc>o?*)~7$D2!T@L zP=KFoF2=ra1M|mvh>|jf5%B#Th?$TjI%`$G+B6ctqd+`8xR!eft?%==gPxy?mQcN? z1c>$ycO+vum4Jto2A{iqKYW4?MD6)JYZvb^erYc{DPEcu9yQGKTDl>CVUErEe4d3u&=jDQ4z-JWN`3!CFJhqhF0*giXP>8jON2_rj3=T%zO$m)xnC^hB#{_pb;%AqFl9frz;$b=s*fvWBsVHF;c z{@8w3>eBtekufAfQ9XTi<(;r`4Zj@M$Cbu=nP#L&xVCIAd|8l!kuEFgzq9~J_w|9{ zhyun9X$1(0UFD7hPU1PrKBUOUFi9h!?lLjG_vK675)rErxYDtJ3+Qy z?%HFit@=ZGbC#AL9T|5HU77b{U%Q>We6hpE%P(e;A1wJ>$L(uS&DL-7k@&7|vk|GO zo~ZsH5C!mb7Um_){~}hXEb-h)@^{@Y!R|d4DfP5Tq|6qH^^4;eHbZWnx+y()HUvK$7 zMX9+m4#3neQ67m)qAjqrMe9}GjRQ|nFwAk=kps8|ke2V9Tj64pgXqm+k~macj=FCK zQ@E(r&v~<1beOv>B$lL-A2P&_pABO{yw3?rTNx>_ZspN0EGM)9vI>w#pV`dK!7L;e zLAAYOVtA%a4?S64rJRNon8o@Vl8~Qm8U7^2PZp!XYVW`E zpUN8<{$Z$MmLVsAj9iLKDzF0>C$N#Rl;=Cc2GFK}cCn_GIx7Z5hv`a&WD>1k!m{*1 zE*KOYKV|?ly4IU0(cHH(xsb=GZp$-OrzD+j=G<1qu(V`OKd6WFKv7i?&<8gPu1$Y? z;dm%TT1)TgL$w5#PwEilmVe-}f8@`SR5*4hS0J_I`Sfo{l0S#$;Z|YnV6}tVh_^5T z@teO4BR!Bo)lQYE$u9u~SI#7$n+GTfpOYHSTkXV<~GCXBeKgMfF=Urv#(3qK?8c-$ToA|*iWJ@9DVUX5xa1j}<2(Rb*q zZ2N;pJSv0e(kC~l2?M4+!OUBZ)XtDd-1DPz>yIyG(0f5RZ~-_tFnaBW&#t##-bDS? zD9sDbajlIN9Tib$N8tf)KIOG|W0{Be zs+k$C9;?fv|0beJfIo`aHhz#Hvn-1Fed_0NfuTV~I@6u3^OXP9*jq%hLIv^pSNVft5(jgr~NeCiHcSv_giZDvUFrc(_j&w7?0CNxT`~Ahb|J=KlYq6H+ z%sG4S^X$E!+Mg%zIP|57X}L76T@|ro$(l~q4d((h_K?78=SC&y3jaOQ<&)9E1)s?lg^1PL3)&Lhjj-`^Tzi?#mD zO=!3n2x1a%kSYP$bQ*U-r=NEX_>HbqHKh~9+SjBM;jrL}@s=bJc8 ze!(NG%2Iy{?QFI=w~}6}!_OU?1Yh^DC*H}d*+^KmWn8-lFFVF{(2Hl}bDn#5J=Xs; z!T8Qcoda2a5r4MK(pXX@w0f2ol-!#|Ta z(|U?K`C+${vw)@`@#14lF~5Y(rm*HB1#iRl1PUw0;h|*(wpM61Td>+On*f?3IM`uI zF)*6{L~vs7BlHV(f%HB)1#_sl@r>!r4)@dZIDjC~ZyKmtEcQUIL9f5j;Q?Hc?x_E+ zOfWLgUN;8=YgYbp3%yr+5+$6q_4A3F{~WK^_NYg0N8tDNpv6s)Z!0;^=3-;}hvJc? 
z?j+5p1{(i;+Qwe2NYhNHIKW{MZ^{oA%h@ux~6W!#oN!ir5eIJ$y*=tshg zCJuB{`a;(~D_1mb{)?5&Y^2Dg^jo!X-{3#Wj`*nDl^!|s)k8;mU#kF~-}$%xEorlt zQxil)-o5o!*>7OZQf^O7BOi37Ilw;|>DXBM%xH;-a){HOVl@Y4`uDajLDQ*N;zn=M zC0X7p;Z|3FO0oW?ZDC42EPObU``FR7SPv}iM(8=UE&JyqqkwW{G(-{kVtT1HqXpg_ zf2|n<(@2ptn8PM`dH>j&0$ED1Phuu>?t1&zQD=hXRoh7k>1S*zJuETh=4#4|>@fyU zbn~v*lLHEb@C^Kk@_p1gZ>V|C-F!Su#$L%H4@sfybiSUOUyZyb>Ri``^DTzHl}tD+ z&dEjkzREe%NU6yCQk(Y!6Qo^^$0P7AE#`&u2c7Cp>^6}BFS->P#23xY^di@#SI<4e zAEz_CF!;6m;h{vi=TQmwn;ELhJy!M<8}#3dbg1KDl9x>LWg zUu_578P~&s@8Co1zH@pMTeF1kZ@Kl!cCv4^#ao2;xf=*W7<+4p6q7+GLrYM$|Fh=} znc7=UkOZw=pE!RTxJR$#Ds5c&`IlYs}#6D+h{(hiy<;2*7K)9ZCV1O_rT(YuP1EMe!*P0&;xRNWQ!$ z90WrOx9(|YO8<;n_i9>`gcW_n9u#kOngHIO$*JERO{wp+${KE@N45*;;vxyDH$JW$ zce<}oRXR56>hEb>)L)YeKFhM5Ym`Nbs9%d{09_Am>#Ae&zkWR2?BbN_rwJm`bNIW9 z8r|bqQCnfU5DEySZyT3Is%&TRJSwsCB32w?vI?^k#uvf}W07g~cjFuj+HRPLW^XJ> zOyKNP-?+#UG)G-&86$tKO5CoHP~ln7a}InrCs}JKn$T9hTU%$^Zxkxpak2_ODX%4e94*i*3C1v19k!p%@4^^QGpk=M%`2<3H$?ypfJx_fH_(a5?{ zQR`(aqW7vcuMXw|Sf6(b2U5JEt^`0b)?J{=Grwzv=%-sRVq#P%pWuYDiZ~0tGPBJP z8A@iJo#T`fc$P@SnKz9mM7GHyDopX zBL68Up-ji&c{(kfT_0$KolEyKge+@D0i45y;F%d@WGgM_lnH}M=;ujUIx>4Ku?lK< zx(BAMwn{u#&{W{$*d`+5fbC(dtV6?8y!$8Mvc9A0QZGIx%Vu}U>5nQS@5HBmI(ASm zZn`ebCfXF)89g*)!Hl)Oz`#O`<@${TcCAiXI@-)dD^)(55`Z|${*3urMTw>1i6`6_ zwCgs%I4%4}zw~P8_wCdP@+jU+R#nxC+Q)!pK z6Z|m?F)s5X>EgrwH)qE;sJ#8#`_V0a@R!;H`-tF37O{}`(O7mPFkq_ar1Fks zJ4J6xm7fMs8!9w~Cw}ma{y|hsEm`JJYY7?42pM5PyqY{b??0CCKbH>$(Pq|IaD{l1nfhm`Qby)RJmHC5~ zYD^rzQ;8JR6Hrvq_-ec|#&Zlb;SFvKxLqT2^7&5tUfam|?k_VIz7je!WItZ7=K9bN3}wg;x!b9LR?WagVU-YJJ^WTEE*4x;JMm5N-=8Y^r*^~~NIn4lW^Kk#?u=U~y zT|JSl4XcJpKQwBB+ODWs&S=JTQD%VRq)GXBPY5LnF4d(w{vaSdmXwCd||fh#OQNyc)~$IyilYRF>we347;PrL1R(l`z)) z*{gO2w)BfJAp=h#$zx9f9n#{ZgSqs+w(d|S7?M;pMFsA+=^nc_?_m+W!wQPE0&Rcn8u?m@ye}*XyZx#@dt@=5(*EJZvYoSn?@B8giMP^U z(I@O7hM&9~we9xgFx~o*inrv(CF9<|fB#S@g%*Ty8r(dqS(?Kdlk6j_z9?kG=E-$) zYEp@lnOsR%O9UR^Sa1~&BppUQw3ty5^O2fL~*3kNtkw(FxnC<4Pds2 zuYOR@SrP;18*-t!7z0dR@$$#L)Nvs;B@~}jGdpY@;X9us3mL-0*mmKTqpw(%qSKH* 
z<6?{y^PZ9FxzF&h7K1g`chhG`&rzW-sEaC!slIUqY)0-d;0;X zhH;c(j4R`1_wVBI$1s82!-fug%Ql(|U2mSr=>UHo&l^(Lg|?*0WqUw<`}^nJZCYIE zWb|%yr}_IL&F&Vl#`i8-Ws@^4;}D^4;-Dc|rRFr%*lU%DOy;-XH=VIx$nEw{DcqAF zX!8{Wz+`bl$7ibE7|@YI-8^$Al(#(4OqYnD`p83<-l^w1CXm&DonPLdaLdJQcaay5 zAHG?lJiWZgaSN1uXzHssle=*c0G z; z?O?v|9oo(m5PcmJgNfAY$ariwp;56V{%&-?~9Q5PfU^+dbqGx<_^lEq7J z*)Fr^ixIc4-;t|v{!lGWAp$ZyF!f>~%3n79qmf|>&hU0`Xh*yYI?bn>QAT*6VHu~( zWKi~YUwEZv>EQ$({~VLz85=wY5zIEk!z1!=duM;`nE5mE_F%fDhVub|UCmi$b;9_` zL?2AK@Mt)Xmv38;qeRc4fB4zG>6yD*rl@Nl*lxx?SA8|X&_#_V^Iq+^sp2({-_I5P z($7&DyF9k7kG} zse#L?e;B#SnRHEfbO`+%>H_w;=4sYeQyJN!?IZtM6GNPrS6Aro)6~i&9p}dZpa0=T z{i(vbHby@oNRd){)o$mO;AKR*&b6-}zoc!vL*RwS+ab}{WW|l~PtQ`9dJv3mN7kFN!GKu*`YG~+IQ)qMz5Uhek<6C&41hNQWv~cqfr(piWWK}{ zq~!hnGd$)Td||qSw|CMCWF9X?pK+FQ{E(bXV(&FbGmNes3NsS}v+-pu> zr-^t!ZGnM9XQ2>(ye`k3@k;e!%VXP|=T?3XLeIH-Bw53BT{sUP!yaqop?6Q6{cW@k zg3Z4)kJXH3a&x@aE?3i1QZ3#)NyP%NQ`1v3Jv&qG{q&OqtNqX4tkUz?gOac`2(o_L z&@YnWS-dem4JnU(V)Pg(i^rJvZ*RL-#IeeenCNkMm~}T{CcsfEko8?(gy&L_QIqy~ za~EP1sTW#0EHOX@eCH;Lp9EzXymERfd9Uxm6)3k+6`^S{`b6HYpM70iC_YbkN}Q!M z+#ro<-itKztLH3ei3B{QVA$%NTCx?UWA>(euA}&+G*$bt#TP*J2VN{HE~LEW344M2 z4OCpj6Z7zpfq{*pz{T@RM##J>YV!1n59G!+yiQ5;*Ji?lfN1)HU~4q_nZc?4lr|+j z|5=0BlB9GGXZm>k0OP!_BSw|=lX!e^G(p$#vOjiAl1O`i=W?*Y;)A3HAT;b}WJQkJ zA2^e?Vz0QOP}NvbClO=!EQaOE`s!Og2iZR6jd_S))W9Q{Z3d`3YCS1ei4Sm-zeY{>-<*W%Kv%>wc?yusy zWY|M*9*700g;Unv{%bCQuu~>cy`&J0YIb%V%H(SQS{AVj@%X{#Pd30qf;LAs>`oH1 zbJ(PX$B0(D_?j*4MYSsU6Eh@WZ@t4@tw?HmRHsz>l|}9#L|EPU9Y_z zY?kHT!WB?C6{Y*_iL%p_Feudj^cvJJhD&K<9$cIkjZl0M+?My3&uzA-rLx7UWT0he zJTq`7GCsTjm+%rDcK_DP0^8T?P;1p}hEj=_)a9l8a~4}^Ixmu}Bbkj33WUtdewlKJ z)dCX4EfyyQn6LNNM9ZQ}@6tg5|8EPAQ_OxE{L{M-7Ub==($Ycj_vP0=RTB3DTP9?F zy!y}#U=&|byb8ms`XHJC)1Otj!1%Q}E7T``xf5%(`O}4*%&)SlEDjJ{z_(*G77-KA zPsG9Zg1uSfj~yP9ws_XH-WzgF%-$H$c(YZ7DOThw2Ft>;ZGURJ6_ffs3qojL*e;KGw*XczM&@SD1RIba2#=e=pTsoOyBA zP$_G2&849L<=5}6P{I_NG{1KORfJO_Uc<018nH}Z2Bnfe1@$wZgHq(YLW&o9uEA@DLW)~@fO;Y8J2eEX(bi30xm> 
zq~*59BTAPB56M(8d);N*c8rd;wCmTpS1!o8wh5k<B!TWf+n^At&^Vt4l4L$u!}|oxL%zLW|%BT>+8Jx z3AOg(xxO&KNXm}L{Ek#|p2co%xCsMwz#?DwanVv!R!C&s_mg$&N}Tf_HGwue2xN)(6m6vAn1Xw5mY-NKZMh{tqXG+S6CRbZ z$XeGvgJRvLZEt0NEqXA=A!f)E!;^eQe)?jL$b*!Ni3DLjnxs`}RQ%44ddOHhce+Ws zs)WmUYl=Qh5EU?@Fu%lHuzH&Q$9%KAte&mG-$$HyT6i$PwPcX?en$e~^0f%5Sr{xP zPVhL(wwxE6nZZ7lNs z%zU*BNx;zJ=Lgwx+^?)Bg4kkI-7Xjy4Egsp0+@dvpQO>MU*rc;lJ?~ITl(ID4q^$o zqkM7<4PypDMXT{dl6>o%E-vHkzJre$qWH54svI{g-&TJtyBp(aOpE#NZEQv1sh*t* z3~_%vq5R)+DT`OnaZo-`dw7H&Mb35E#W1BxWW*biM-K>o12Hdc15%Cj4eIFNaD!~a zibNnJ01z&?67~BApXWh(wZt|HkdZh^ES*I8)P9#m#>b!>K-#URyPEuh2G*;W8nX&; zuZ~MuPgh#-4b-(jTsXLt)F3`=umUA_TFiN%Q0Z!1(UjqonA5UJjxs8!))ZpU?UdEE z;5AP(Z#>^(Skk3xYpzEjib=wWUAmE;g{pU)sLYA|8Sq7ssc|o{i=906nl|ZJ?dvkN zN&~%!U7g8mFyOEoY1ibkdRq@@(QjijG=t*?+e-3c2n!NP07_<<35$?;y;Jor!UKbHD#D!tf2ze2XROcRyyNv~fUB)1~j%-)Lpr%O6ddS#}8fEdU# zMwP8Nzcx>ZECBIDmHDgQGKbg_pd}3a^=bV*HE#fTyBHBmfv(d-saNHbtugWTS;n#n zk}uw|*?y$#xyjSI1v%MIR=E6KIS!^O@EHjA$V#vLt>glHuKGK%52N+n`McFc zzDOHA>nh3Q&ofCn{qIEyB94A`nuX!RsARW)(tC+cAB>*G?* za5vT|oGGOd*=2HiBY?OW0?6It%vo2ae1Q5*pjTB9D>9;!pjw? zdr$|(TX`Wq&*}#-?+Fn(*dd8NLDQ!|PMsHA5%(C7MA>Ye29fGE@?iN73z`px(amfjp z2mE4N7f;9JU4dZ5qr8~(@ANdh>WKrELGjdN3Srmkilna@VsIYyQZ-bgvQ6MqwD*G{ zHvA+qQ<3P!=*gtfA66MePG);j$ZcWX3`_`92y+$ySkJb3tgB1A^a5F70$TP$HUBvC z@YTpRwt9voKAy={w?S}8lNtT2rciCGX zARi`=nIOCM(047CdrP?>9x!4i(sV|$JR0;Cf@Vj}Z_WJ?jnmP)KsX+LZB>mZwOUN^ z&IjTK_j(oeXC;sKjM5~G(yql{&`JYh7W038Hw)(KSf%9pWStf~ja`;K zr*s!NJP~$&=iavr+k19b8IMl&31UYuOE+rY5NCaKx6|Z|>bUqiqi2*FELLo278A2! 
zq~XOui;c{h3tAb2USWWJNRDt33w`i*Meqgb3NJdNP9AY_i4?{@z~aIrN3{kaa)}g3 zJF(TX=2D>oh=pZMl$4gR`1)FxB+BzV%A@(C_pDr?+}z1E?arffg`m|QY)@4jfr2So z4vvWO47Ojgt&B39BNt_1aT6>M*2|EDTQcFb&bo03fvx#a&ZTsPBNZ^^JL3fzxbWkT zijp^V1rdA5{9NhLW}vx-q5uhQ+HSzIzib)(j#jZF2G|}VEttRKDLdrEalVzPaV`pe z^ZmYSBu_X;V-u;eO{G)6D|ypdDR1xU<5mtY=KOkGpoBLv^TeBc!$e*5aYZ}8Nz?;i zl+-Uz_e8}^#(&~w7X%ZUUpxp}wIK%?%ojHMr?ZG2JdjZ4H%phAbaWDsH3sC{?GbHr5Pp&^d^9V zs{mLJfX!qqC)Mkp#+e^$M}_szf8Qa0Pu(x8+)>9A{C!p*Bh;~`_t7OqP0zr!MdgH3 zrZjA zyEekSL8S`!{gN`O`T-(h4kjF?obQCwqi1hp;RLqk-wJ;j0HdtBqu@cyL={)ZYN8$kyvOKg0LNrlx*htRg2)C&!57faC$6g`M-a?;{I0 zqx_vA=ny8qbr*`b(?9ao*MZhERLc0EV2rHnIL(t3HyTyZ9L5iXtN<|?;I)Qb2N<`K zWo5BtM#6Q+4K6<9NePq zMu3F=L;FB^38_Co_^$xBi4{?B^|^lUO9V(*$xJ)OO+1Z}MONM}uV#oSOQ1bLGDNX-QZ4=WEBQ zrvasde8jJEN$T&?Crr+*W3J{MJ6B`>{{5lI=#Km;^%FMT=(ft-M6=mjnsXb^F z8Vi>3g8~AdO5MOEN?K?s=^04v)t{WyngW#s($cEwxlT1|8e_Pd@F{qYL7mI>YC!B@ zm(lFrMD<@2);-cky;cP>pt(||pCntECpU8wG>9Ow&YiFr{$ewm4a z@8z==Rf0iS1TP((D3jj4lX*5=)bc-)-%6xy>w>#R`VOT|;?&diS@i@x&y) zN`J2bzx+*q_BfD8sT6SRNH2xhfKHxm6irTZaP$Ecw7$9`=fB4tJRvX+9hyph{#Y3# zR5fN*154rHPz4aKUk^qPq~rcBi%gH24eTLd2WHG-e;l7m-fY%%r)(?$GLafiK<^!E zODQQ6noOn?3(Sf~X{;RCwM#-QlazVSv~-K-V(bhhp_{}Wo0M`~q+%13@kF;0tyg<+C%YE^JClpjUROl_U(GRNu`l{6etaC;%}_ zT-LhSFMzKB5S?K@hB(h#h}D>2cNZ>@U?k!9kxTPqObUQ8=FOhYtpSw6=mfuk#DqFB z1E-?m770()KVxBFqLX0aXRlyW$n=bv>-20Xzz@;RniGWvdoR8%G48kf2y?5Z!((CaPcollp@g65h^WsTle1|cI)V~vW zhdc93$l2dlj!D3N{^4FcldK;fp1=V&LC-16^a4Kcj51l^!*S_m<#_qqdzYJTR|9Sz zf^$F?h>b(NPp4n8)*kLAsS7SI9DQbd0N^j%8o(S!ssFcr%H#jmPf`C5$9ORDzr0*_;0pqLAA-4_h_Pdmk&!9Jy#fEBV2Zs8d@r4)BM&WHSgk?8~;jNDxSQ7pAWL=KPI_1~G>fg>zk~k1Q#yxwb+kP4a+ydl!?<~P`L2B#;n&duU4-`C-HII5dUnqo zySs$FgT_H0L%IXQ#%TU>0V3ee{NH$gP8@bsDDWCt;zL%9&?zV{7<_kO0B_ImI#9L# z4(WOdm!*uqZ1#rld%_j5S90R8Aiq%0ZHsvOmIc8+Y$9~VqC|&+vC7XP)SaG}X zfqMY_Ez<|2-aE|`U*gD64KOX1W9Y?G#$vXoQBDj;>E8x~MUS3UxUm9tr;cbvIL8`3 zYD;TEi{9PitVP%s&(Pk0X^l%}^w8X!gVC6G?18y#P7oy5| z?6jK$B(KrW67FFw0gcmh0HieLg$e+^t;Vc<55KKh*rz9l?X5$dfb{;f?}F>1YEwfmu|W9v1P3h4r7?NXO@8iwhlV-#i;) 
zSp?|iN2j<D#{#-9)YVl@j-GiBn9w_&yGrpzt=@9thtDZ#Huhg}_T{xW^Ia=pg zHT}eVTv>YN^RL6`a=idkAs`eL3gKx}-OS@DnxE}j4Swc_Dg04e`z$lJVQlj(|DRVK z1R{0Iecc%<{o8*0zn{1UQyIfm5Q$=Rn~_?drOs@l=GXFc2}9$7uaVtEH-YWTyMApg z&|@gLvz;PJI)5z=x3yw{PPMfGf}M2)zEUO+>RZ0>UP->YbzI^0qxKKM?iAjit-Bm> z!QR@CM}Rs%oi*b!VB+6kUjBeM@gY%6nF}YV zAyAA0P{9V~ACW>X#h#{yCwszNS@(w(ZR^+n&DKpm%oX_vR3>N=B2J^Y<&JRzZ=8X&| zNcs~%Ec3J5;#2GNZ3EWH9R&N*obYcGmW<8#56wZC(x=vtU?O)K%l^&}ewpV)@Hyq5 zxw@+mRse(v<6$iOf2p`gLXl2KWzd@`yZPRht|y z;nflfA!2cP4B6Vy`+>ofU@NmF|{TGxsoy7cd&VqN9jPi^0qd=zIUu?EY- zoMSU@t;7QfFSEOUEW4}yefg6T-0~A9H|ku4uIXYGm!H2c!s(Vh=PGVy4?aB4-<8gJ z1LnR)C@#W6cJh78GK1OIbAU>Y&qC7u$}(@4U|XcCw^s4#wZ4TTU$)`0Gf0;?Sa?XF zZx-Mdk;z{$ntRU*J+LM%FO`k%9f~J-gIl1mw9Gjc@xTe# zU4G^V@eBE37|ZF{tr29_uKOE;{9??zD-&QIFNrm!!}grOw1YA;t0AbOyM1o3l;r-^ zjd_%$vfrn?NN2z3$TRKd=%c1%#r*BS!#_+G1G7^zPa>b|1N2r z`Z%K>t2O(INGIe5jkkAr*n~Zct8dQ~NC$*Qc8(E8j+QsC|F-a|2p{31JuvF{mV+aj zdH0WF@B9y8vu@Z&paicBbVwKRje-{A?n&pJ;YuD>?V@d`=h%#$vlYJcNSn?NXyLng z*OU3qH)|87p%f~OE^nOEn~-b4f*G+tb59LPt?pHCpmevbmZ@YoRJ%f;O3aqADPE82{VdhmQXCqF|`KEkrNr5y||gK z@1f+Y-X(57ceUFe$W#r^_|v;xz7av8l*Q|`caCj4-4KblS+=gaE>-YZ>3Nh6 z__2M!l-sk8pl-&a+D*wj!HFlTd0NFBfI7olV~JR>zG6|UOgV5xBkuf>!v~5cEJ!T3 z3?R|WW5pA(;HxKZtHU5LuC~GtmYS1oTzu zaloy~dCZ>NQWw;6Ma((0$gbxI)N-I#ZdnbU=yo6&$jjY4cFXl5WD7^SsR{rhJhiUv zBQT9TihAk_zbMWeJwJ5)m7VQw*)N2^9+O2_+S}%&Qj!ES_{~FtL%G8Mna*}xZ}2;1 zGmQ(X893t7y93f+-uJsaPeSO8Me5_6gJR;T;Ul<5kRwBp>fVvQmC)p4F7d5h%1j&Z zGEPVs@<{;0-AjN3DJULDbSW*4Vob!LT_v6B{5mMrlASJk%mOhOLLOt~zk}L`eqPgd zT(AMRV$Eck2p@f0t_6UvH0)QC*vg_!XecTa-3>kYX8EPWuSaTkbkEE$nVJq8(rZ?z z7hku=IcMqwULM#D&WQ4@0jhes2&|c#9b*;fu5R8s_`;j*=YPY%m`$^c@0o1C%+OtU zyVZj$7SJ`IT#6;gAVQqMi+)ukhU+!%e&Ih(@jcrgwowU3Y_N0|#zp_h>&9@C>?c2p zqI2!ZzVz#(_}c~)L&onS5wJPDiOOFeW!1Qn?qpBZI*uZ5!KRZ#7}B7!`~`ky?*))s zx{V-1(ZV$~b=h+ocOl4?bBt>2T2FW%ex8stV!;c1!QNR??7ODD5=37Y{Wh@6rm2nj z62p?Bh5SgelZ`X`G;H?|i`Bt$NLVWa$ac_qg-Mt)lLxj3>a;6+E`yLFZA7)b1(RJC zc^$NPwyN4z|H=7(R8Bx`7JK~P>K^Q`?hBIq-|5X)Xv}bo-lW$!L2rPP*cckBI-uI8 H*5UsPXef_S literal 0 HcmV?d00001 From 
ab6078622d24dd4f8530dc610d7c846fcc4031ae Mon Sep 17 00:00:00 2001 From: Matt McDonnell <85379302+bushlemon@users.noreply.github.com> Date: Fri, 21 Oct 2022 20:45:53 +1000 Subject: [PATCH 14/16] More additions to documentation --- .../s4640439_siamese_network/README.MD | 7 ++++- .../s4640439_siamese_network/dataset.py | 20 +++++++++--- .../s4640439_siamese_network/modules.py | 5 --- .../s4640439_siamese_network/predict.py | 31 +++++++++++-------- recognition/s4640439_siamese_network/train.py | 11 +++++-- 5 files changed, 47 insertions(+), 27 deletions(-) diff --git a/recognition/s4640439_siamese_network/README.MD b/recognition/s4640439_siamese_network/README.MD index 600038b505..234b17c474 100644 --- a/recognition/s4640439_siamese_network/README.MD +++ b/recognition/s4640439_siamese_network/README.MD @@ -15,6 +15,7 @@ and testing splits of the data. ## How the Algorithm Works + ## Results Unfortunately, as of the current version, this implementation has failed to construct a suitable classifier. @@ -49,4 +50,8 @@ Requires numpy version 1.21.3 or above Original dataset sourced from: [ADNI dataset for Alzheimer's disease](http://adni.loni.usc.edu/) Pre-processed dataset (used in this project) available from: [UQ Blackboard](https://cloudstor.aarnet.edu.au/plus/s/L6bbssKhUoUdTSI) -### Example Usage +### Instructions +1. Run dataset.py - being sure to adjust path constants to match your personal setup +2. Run modules.py - adjusting image size and embedding shape as necessary +3. Run train.py - being sure to change the model save directory constant to where you would like to save your models +4. 
Run predict.py - being sure to change the path names for the test data diff --git a/recognition/s4640439_siamese_network/dataset.py b/recognition/s4640439_siamese_network/dataset.py index 03e3412b7f..93892a2663 100644 --- a/recognition/s4640439_siamese_network/dataset.py +++ b/recognition/s4640439_siamese_network/dataset.py @@ -3,10 +3,6 @@ import os import time -""" -Containing the data loader for loading and preprocessing your data. -""" - # Data has already been separated into training and test data AD_TEST_PATH = "E:/ADNI/AD_NC/test/AD/" AD_TRAIN_PATH = "E:/ADNI/AD_NC/train/AD/" @@ -67,4 +63,18 @@ def load_data(directory_path: str, prefix: str) -> np.ndarray: print("Loading preprocessed data") data = np.load(save_path) - return data \ No newline at end of file + return data + +def main(): + """ + Performs the first loading and pre-processing of the data. + + load_data() saves the processed data to disk so that later runs do not need to repeat these computations. + """ + training_data_positive = load_data(AD_TRAIN_PATH, "ad_train") + training_data_negative = load_data(NC_TRAIN_PATH, "nc_train") + test_data_positive = load_data(AD_TEST_PATH, "ad_test") + test_data_negative = load_data(NC_TEST_PATH, "nc_test") # assumes NC_TEST_PATH is defined next to NC_TRAIN_PATH + +if __name__ == "__main__": + main() \ No newline at end of file diff --git a/recognition/s4640439_siamese_network/modules.py b/recognition/s4640439_siamese_network/modules.py index 0582f6c37c..07bb1756eb 100644 --- a/recognition/s4640439_siamese_network/modules.py +++ b/recognition/s4640439_siamese_network/modules.py @@ -8,11 +8,6 @@ SIAMESE_OUTPUT_SHAPE = (512,) -""" -Containing the source code of the components of your model. -Each component must be implementated as a class or a function. 
-""" - def build_siamese(): """ Generate Siamese model diff --git a/recognition/s4640439_siamese_network/predict.py b/recognition/s4640439_siamese_network/predict.py index 4747e323b8..5d4e1fca14 100644 --- a/recognition/s4640439_siamese_network/predict.py +++ b/recognition/s4640439_siamese_network/predict.py @@ -1,22 +1,15 @@ -""" -Showing example usage of your trained model -Print out any results and / or provide visualisations where applicable -""" import tensorflow as tf from train import * from dataset import * -""" -TODO: -Load saved binary classification model from train.py -Load test data using dataset.py -Test the data and print/plot results -""" - TEST_DATA_POSITIVE_LOC = "ad_test" TEST_DATA_NEGATIVE_LOC = "nc_test" def main(): + """ + Used to load pre-trained models and then evaluate them using previously unseen test data + """ + # load testing data test_data_positive = load_data(AD_TRAIN_PATH, TEST_DATA_POSITIVE_LOC) test_data_negative = load_data(NC_TRAIN_PATH, TEST_DATA_NEGATIVE_LOC) @@ -25,9 +18,21 @@ def main(): siamese_model = tf.keras.models.load_model(os.path.join(MODEL_SAVE_DIR, "siamese_model.h5")) binary_model = tf.keras.models.load_model(os.path.join(MODEL_SAVE_DIR, "binary_model.h5")) - + # generate labels - 1: positive, 0: negative + pos_labels = np.ones(test_data_positive.shape[0]) + neg_labels = np.zeros(test_data_negative.shape[0]) + + # convert image data to embeddings + pos_embeddings = siamese_model.predict(test_data_positive) + neg_embeddings = siamese_model.predict(test_data_negative) + + # merge positive and negative datasets + embeddings = np.concatenate((pos_embeddings, neg_embeddings)) + labels = np.concatenate((pos_labels, neg_labels)) + + results = binary_model.evaluate(embeddings, labels) - results = binary_model.evaluate() + print(results) if __name__ == "__main__": main() \ No newline at end of file diff --git a/recognition/s4640439_siamese_network/train.py b/recognition/s4640439_siamese_network/train.py index 
2b60be658c..9d4aac51ce 100644 --- a/recognition/s4640439_siamese_network/train.py +++ b/recognition/s4640439_siamese_network/train.py @@ -9,7 +9,6 @@ Containing the source code for training, validating, testing and saving your model. The model should be imported from “modules.py” and the data loader should be imported from “dataset.py” Make sure to plot the losses and metrics during training. - """ EPOCHS = 100 BATCH_SIZE = 64 @@ -18,6 +17,7 @@ MODEL_SAVE_DIR = "E:/ADNI/models" + def siamese_loss(x0, x1, label: int, margin: float) -> float: """ Custom loss function for siamese network. @@ -30,7 +30,7 @@ def siamese_loss(x0, x1, label: int, margin: float) -> float: Vectors of different classes are punished for being close and rewarded for being far away. Parameters: - - x0, x1 -- batch of vectors. Shape: (batch size, embedding size) + - x0, x1 -- tensor batch of vectors. Shape: (batch size, embedding size) - label -- whether or not the two vectors are from the same class. 1 = yes, 0 = no Returns: @@ -44,6 +44,7 @@ def siamese_loss(x0, x1, label: int, margin: float) -> float: return loss + @tf.function def train_step(siamese, siamese_optimiser, images1, images2, same_class: bool): """ @@ -78,6 +79,7 @@ def train_step(siamese, siamese_optimiser, images1, images2, same_class: bool): return loss + def train_siamese_model(model, optimiser, pos_dataset, neg_dataset, epochs) -> None: """ Trains the siamese model. @@ -130,6 +132,7 @@ def train_siamese_model(model, optimiser, pos_dataset, neg_dataset, epochs) -> N elapsed = time.time() - start print(f"Siamese Network Training Completed in {elapsed}") + def train_binary_classifier(model, siamese_model, training_data_positive, training_data_negative) -> None: """ Trains the binary classifier used to classify the images into one of the two classes. 
@@ -156,11 +159,13 @@ def train_binary_classifier(model, siamese_model, training_data_positive, traini embeddings = np.concatenate((pos_embeddings, neg_embeddings)) labels = np.concatenate((pos_labels, neg_labels)) - model.fit(embeddings, labels, epochs=EPOCHS, batch_size=BATCH_SIZE) + history = model.fit(embeddings, labels, epochs=EPOCHS, batch_size=BATCH_SIZE) elapsed = time.time() - start print(f"Binary Classifier Training Completed in {elapsed}") + return history + def main(): """ Trains the models From 3aa54294e1db6f80d154399f7533f1159f2c34ee Mon Sep 17 00:00:00 2001 From: Matt McDonnell <85379302+bushlemon@users.noreply.github.com> Date: Fri, 21 Oct 2022 20:55:16 +1000 Subject: [PATCH 15/16] Added principal component analysis to train.py - Also further documentation adjustments --- .../s4640439_siamese_network/README.MD | 3 + recognition/s4640439_siamese_network/train.py | 60 +++++++++++++++++-- 2 files changed, 58 insertions(+), 5 deletions(-) diff --git a/recognition/s4640439_siamese_network/README.MD b/recognition/s4640439_siamese_network/README.MD index 234b17c474..89a4a044ea 100644 --- a/recognition/s4640439_siamese_network/README.MD +++ b/recognition/s4640439_siamese_network/README.MD @@ -45,6 +45,9 @@ If more time were available, I would try to train the Siamese Model using variou Requires Python version 3.9 or above (for type hinting) Requires tensorflow version 2.8.2 or above Requires numpy version 1.21.3 or above +Requires matplotlib 3.4.3 or above +Requires pandas 1.3.2 or above +Requires scikit-learn 1.0.1 or above ### Dataset and Pre-processing Original dataset sourced from: [ADNI dataset for Alzheimer's disease](http://adni.loni.usc.edu/) diff --git a/recognition/s4640439_siamese_network/train.py b/recognition/s4640439_siamese_network/train.py index 9d4aac51ce..fc3b780981 100644 --- a/recognition/s4640439_siamese_network/train.py +++ b/recognition/s4640439_siamese_network/train.py @@ -4,12 +4,10 @@ from dataset import * import time import os 
+from sklearn.decomposition import PCA +import pandas as pd +import matplotlib.pyplot as plt -""" -Containing the source code for training, validating, testing and saving your model. -The model should be imported from “modules.py” and the data loader should be imported from “dataset.py” -Make sure to plot the losses and metrics during training. -""" EPOCHS = 100 BATCH_SIZE = 64 BUFFER_SIZE = 20000 @@ -166,6 +164,52 @@ def train_binary_classifier(model, siamese_model, training_data_positive, traini return history +def run_pca(siamese_model, training_data_positive, training_data_negative): + """ + Run Principal Component Analysis on the Siamese Model embeddings and plot the + first two principal components (the directions of highest variance). + + Code adapted from https://towardsdatascience.com/pca-using-python-scikit-learn-e653f8989e60 + + Parameters: + - siamese_model -- The model with which to generate the embeddings to perform PCA on + - training_data_positive, training_data_negative -- raw image data as numpy arrays + """ + + pos_labels = np.ones(training_data_positive.shape[0]) + neg_labels = np.zeros(training_data_negative.shape[0]) + + pos_embeddings = siamese_model.predict(training_data_positive) + neg_embeddings = siamese_model.predict(training_data_negative) + + embeddings = np.concatenate((pos_embeddings, neg_embeddings)) + labels = np.concatenate((pos_labels, neg_labels)) + + pca = PCA(n_components=2) + principal_components = pca.fit_transform(embeddings) + + principalDf = pd.DataFrame(data = principal_components + , columns = ['principal component 1', 'principal component 2']) + labelsDf = pd.DataFrame(labels, columns=["label"]) + finalDf = pd.concat([principalDf, labelsDf], axis = 1) + + # plot first two principal components + fig = plt.figure(figsize = (8,8)) + ax = fig.add_subplot(1,1,1) + ax.set_xlabel('Principal Component 1', fontsize = 15) + ax.set_ylabel('Principal Component 2', fontsize = 15) + ax.set_title('2 component PCA', fontsize = 20) + 
targets = [1.0, 0.0] + colors 
= ['r', 'g'] + for target, color in zip(targets,colors): + indicesToKeep = finalDf['label'] == target + ax.scatter(finalDf.loc[indicesToKeep, 'principal component 1'] + , finalDf.loc[indicesToKeep, 'principal component 2'] + , c = color + , s = 50) + ax.legend(targets) + ax.grid() + def main(): """ Trains the models @@ -195,11 +239,17 @@ def main(): # train the models train_siamese_model(siamese_model, siamese_optimiser, train_data_pos, train_data_neg, EPOCHS) + + # optionally, run principal component analysis on the siamese model to assess embeddings + run_pca(siamese_model, training_data_positive, training_data_negative) + train_binary_classifier(binary_classifier, siamese_model, training_data_positive, training_data_negative) # save the models siamese_model.save(os.path.join(MODEL_SAVE_DIR, "siamese_model.h5")) binary_classifier.save(os.path.join(MODEL_SAVE_DIR, "binary_model.h5")) + + if __name__ == "__main__": main() \ No newline at end of file From 1bc76598687a9e254f618926de11b405ff73d4b2 Mon Sep 17 00:00:00 2001 From: Matt McDonnell <85379302+bushlemon@users.noreply.github.com> Date: Fri, 21 Oct 2022 21:28:20 +1000 Subject: [PATCH 16/16] Finished Readme --- .../s4640439_siamese_network/README.MD | 20 ++++++++----------- 1 file changed, 8 insertions(+), 12 deletions(-) diff --git a/recognition/s4640439_siamese_network/README.MD b/recognition/s4640439_siamese_network/README.MD index 89a4a044ea..d13fdc1a32 100644 --- a/recognition/s4640439_siamese_network/README.MD +++ b/recognition/s4640439_siamese_network/README.MD @@ -1,20 +1,16 @@ -Requirements -1. The readme file should contain a title, a description of the algorithm and the problem that it solves -(approximately a paragraph), how it works in a paragraph and a figure/visualisation. -2. It should also list any dependencies required, including versions and address reproduciblility of results, -if applicable. -3. provide example inputs, outputs and plots of your algorithm -4. 
The read me file should be properly formatted using GitHub markdown -5. Describe any specific pre-processing you have used with references if any. Justify your training, validation -and testing splits of the data. - # Siamese Networks for Alzheimer's Disease Classification Using MRI Images ## Description and Problem - +This project aims to classify Alzheimer's disease from ADNI brain MRI images. Given a raw image, the algorithm predicts whether or not the pictured brain has Alzheimer's disease. This is done using a combination of a Siamese Neural Network and a Binary Classifier Neural Network. ## How the Algorithm Works +Siamese Networks are essentially a slightly modified version of a CNN. Two image samples are passed into the same model, one after the other, and the model transforms each image into a vector embedding. A distance is then computed between the two embeddings, using one of many possible distance metrics. Ideally, two images of the same class are close together and images of different classes are far apart. The loss function and optimiser then update the weights to move towards this ideal behaviour. + +After the Siamese Network is trained, you then have a model that maps images to vector embeddings, keeping similar images close together. + +You then use these embeddings to train a dense-layered Binary Classifier. +With luck, your Binary Classifier can then classify test images accurately by first converting them to embeddings and then assigning each one to either the positive or the negative class. ## Results Unfortunately, as of the current version, this implementation has failed to construct a suitable classifier. 
@@ -38,7 +34,7 @@ Techniques attempted include: * Changing structure of model - size of convolutions, number of convolutions, strides, max padding * Changing the margin in the loss function in the range (0.1, 0.5) -If more time were available, I would try to train the Siamese Model using various other loss functions instead, for example Triplet Loss. +If more time were available, I would try training the Siamese Model with other loss functions, for example Triplet Loss. I would also have liked to try different distance metrics for the embeddings. ## Running the Code ### Dependencies