27 commits
53064d0
Removing changes from clone
leahvillr Oct 6, 2022
07ee4e8
Initial Commit
leahvillr Oct 6, 2022
64bc02e
Added a main.py for use later, implemented the models required for, a…
leahvillr Oct 9, 2022
e1bb5c5
Create .gitignore
leahvillr Oct 14, 2022
0044e34
refactored modules into a directory
leahvillr Oct 14, 2022
206d180
Update environment.yml
leahvillr Oct 14, 2022
24ac2e5
update git ignore
leahvillr Oct 14, 2022
106f0c0
Fixed some issues within the models
leahvillr Oct 20, 2022
3fd9842
Create interface.ipynb and use to test the VQVAE and generate a datal…
leahvillr Oct 20, 2022
2d9bd64
update README
leahvillr Oct 20, 2022
0dcc5b8
add train and test functions for pixelcnn
leahvillr Oct 21, 2022
e11659d
add a generate samples function
leahvillr Oct 21, 2022
6019652
add dataloader for codebooks, add train test loops for pixelcnn
leahvillr Oct 21, 2022
d0dcd99
update datasets and pixelcnn
leahvillr Oct 21, 2022
c014c13
update interface and add ability to save VQVAE model including a work…
leahvillr Oct 21, 2022
b071652
updates to attempt to get pixel cnn working
leahvillr Oct 21, 2022
a4ab1cb
update the interface as well as add a graph for direct comparison of …
leahvillr Oct 21, 2022
cd78c5d
update readme and add images
leahvillr Oct 21, 2022
f7141b1
update readme and environment.yml
leahvillr Oct 21, 2022
bb25afe
remove unnecessary (unused) main.py
leahvillr Oct 21, 2022
9265dea
Merge branch 'topic-recognition' into topic-recognition
leahvillr Nov 18, 2022
4f96f0f
got rid of changes in other students commits
leahvillr Nov 18, 2022
c53ba0c
Merge branch 'topic-recognition' of github.com:alexvillr/PatternFlow …
leahvillr Nov 18, 2022
f6b5629
Hopefully have now gotten rid of changes from other student commits
leahvillr Nov 18, 2022
f7ae6d1
Update README.md
leahvillr Nov 18, 2022
29968ba
Fixing changes to other students work
leahvillr Nov 22, 2022
93f2b27
fixing this readme
leahvillr Nov 22, 2022
Binary file modified recognition/.DS_Store
Binary file not shown.
147 changes: 147 additions & 0 deletions recognition/45375325_VQVAE_for_image_creation/.gitignore
@@ -0,0 +1,147 @@
### Python template
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# PEP 582; used by e.g. github.com/David-OConnor/pyflow
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# data
/data
/data.zip

# mac things
/.DS_Store

4 changes: 4 additions & 0 deletions recognition/45375325_VQVAE_for_image_creation/.idea/misc.xml

7 changes: 7 additions & 0 deletions recognition/45375325_VQVAE_for_image_creation/.idea/other.xml

6 changes: 6 additions & 0 deletions recognition/45375325_VQVAE_for_image_creation/.idea/vcs.xml

116 changes: 116 additions & 0 deletions recognition/45375325_VQVAE_for_image_creation/README.MD
@@ -0,0 +1,116 @@
# VQ-VAE for creation of images using the OASIS Brain Dataset

***

This is our implementation of the Vector Quantised Variational Autoencoder (VQ-VAE) as described in the paper by members of DeepMind (1).

We used [this implementation](https://github.com/MishaLaskin/vqvae) by [MishaLaskin](https://github.com/MishaLaskin/) as inspiration to gain an understanding of how the code works.

***

## Usage

***

### Dependencies

- torch == 1.13.0.dev20220901
- torchvision == 0.14.0.dev20220901
- matplotlib == 3.6.0
- pillow == 9.2.0
- numpy == 1.23.3
- tqdm == 4.64.1
- scikit-learn == 1.1.2

You can also create a conda environment from the provided environment.yml with the command below (WARNING: this environment is one that I use for general work, and as such is bloated with libraries not necessary for this module):

```console
conda env create -f environment.yml
```

or you can regenerate the environment file from your current environment with

```console
conda env export > environment.yml
```

The main difference for this environment is that this script was created using the nightly version of pytorch, so that I could make use of the Apple Silicon GPU through MPS acceleration. If you want to use the normal version of pytorch from before MPS acceleration became available, delete the `'mps' if torch.has_mps else` from every definition of `DEVICE`. This should allow the code to function normally on a CUDA GPU.
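
For reference, a sketch of that device selection (the CUDA/CPU fallback is an assumption about the rest of the expression in the actual modules):

```python
import torch

# Pick MPS on Apple Silicon when available; otherwise fall back to CUDA
# or CPU. `torch.has_mps` requires a build with MPS support, such as the
# nightly this project uses.
DEVICE = torch.device(
    "mps" if torch.has_mps
    else "cuda" if torch.cuda.is_available()
    else "cpu"
)

# On a stable release without MPS, drop the first branch:
# DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
```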

### Reproducibility

To use this model for other datasets, place your data inside the data folder, following the pytorch documentation for producing a dataset using [ImageFolder](https://pytorch.org/vision/stable/generated/torchvision.datasets.ImageFolder.html#torchvision.datasets.ImageFolder).
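
A minimal sketch of that setup (the `data/train` layout, greyscale transform, and batch size are illustrative assumptions, not the exact pipeline in interface.ipynb):

```python
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Build a dataset from an ImageFolder-style directory tree
# (data/train/<class>/<image>) and wrap it in a DataLoader.
transform = transforms.Compose([
    transforms.Grayscale(),  # OASIS brain slices are single-channel
    transforms.ToTensor(),
])
train_data = datasets.ImageFolder("data/train", transform=transform)
train_loader = DataLoader(train_data, batch_size=32, shuffle=True)
```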

If you want to make use of the current model seen in this readme, skip the training and saving cells and simply load the model, then run the remaining cells in order to see the results. If you wish to keep the saved model and save a new one for your other datasets, adjust the name the model is saved under and run the save-model cell.
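
Saving and loading follow the usual pytorch pattern; a sketch, where `model` and the filename stand in for whatever the notebook's save-model cell actually uses:

```python
import torch

# Persist only the weights, then restore them onto the chosen device.
torch.save(model.state_dict(), "vqvae.pt")

model.load_state_dict(torch.load("vqvae.pt", map_location=DEVICE))
model.eval()  # inference mode for generating reconstructions
```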

***

## Training

***

### VQ-VAE

The Vector Quantised Variational Autoencoder (VQVAE) is a network that builds on the concept of an autoencoder: a model composed of two separate models, an encoder and a decoder. The encoder takes in an image and compresses the information down to a smaller vector known as the latent space. The decoder then takes the information from the latent space and regenerates the original image. This comes with some information loss, but that loss is often negligible when both the encoder and decoder have been trained properly.

The vector quantisation component then takes this latent space and turns its continuous values into discrete ones, creating a codebook in place of the latent space and training the decoder on this instead. This has been found to produce clearer images, reducing the information loss.

![image](./images/VQVAE-overview.png "VQVAE model overview")

The figure above shows the encoding process into a latent space, the quantisation step, and then the decoding process.

The quantisation step makes use of the L2 norm, with the full equation shown below, where $|| ... ||_2$ denotes the L2 norm:

![image](./images/q_z%20equation.png)
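
In text form, the equation in the image is the one-hot posterior from (1):

$$q(z = k \mid x) = \begin{cases} 1 & \text{for } k = \operatorname{argmin}_j \lVert z_e(x) - e_j \rVert_2 \\ 0 & \text{otherwise} \end{cases}$$

where $z_e(x)$ is the encoder output and the $e_j$ are the codebook embedding vectors.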

This gives us a quantised representation of all the features of an image. This idea has also been extended to 3D environments, as well as sound, in (1).

The models used for the encoder and decoder, as well as all other models, can be found in the `modules` directory.
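
As a rough sketch of how those pieces fit together (the constructor and the quantizer's return values here are assumptions, not the exact signatures in `modules`):

```python
import torch.nn as nn

class VQVAE(nn.Module):
    """Schematic VQ-VAE forward pass: encode, quantise, decode."""

    def __init__(self, encoder, quantizer, decoder):
        super().__init__()
        self.encoder = encoder      # image -> continuous latents z_e(x)
        self.quantizer = quantizer  # z_e(x) -> nearest codebook vectors z_q(x)
        self.decoder = decoder      # z_q(x) -> reconstructed image

    def forward(self, x):
        z_e = self.encoder(x)
        z_q, vq_loss = self.quantizer(z_e)  # assumed return values
        return self.decoder(z_q), vq_loss
```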

### Pixel CNN

Once an embedding space or codebook has been trained, we use a Pixel CNN to generate new codebooks; these generated codebooks are then passed to the decoder to generate unique images, as sketched below. Typically, the better trained the Pixel CNN, the more unique the items created within the image.
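
A sketch of that generation loop, under assumed shapes and interfaces (the real generate-samples function lives in the interface notebook):

```python
import torch

@torch.no_grad()
def generate_samples(pixelcnn, decoder, embedding, grid=(32, 32), n=8):
    # Autoregressively fill a grid of codebook indices, pixel by pixel.
    # The 32x32 grid, the index-valued input to the Pixel CNN, and the
    # (n, K, H, W) logit layout are assumptions for illustration.
    indices = torch.zeros(n, *grid, dtype=torch.long)
    for i in range(grid[0]):
        for j in range(grid[1]):
            logits = pixelcnn(indices)
            probs = logits[:, :, i, j].softmax(dim=-1)
            indices[:, i, j] = torch.multinomial(probs, 1).squeeze(-1)
    # Look up embeddings for the sampled indices and decode to images.
    z_q = embedding(indices).permute(0, 3, 1, 2)
    return decoder(z_q)
```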

The Pixel CNN model is defined as

![image](./images/PixelCNNOverview.png "Pixel CNN overview")

This involves 15 layers of the `MaskedGatedConv2d` module, whose mask forces each output pixel to depend only on the pixels above it and to its left.
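
A minimal sketch of that masking idea, without the gating and with illustrative sizes:

```python
import torch
import torch.nn as nn

class MaskedConv2d(nn.Conv2d):
    """Conv2d whose kernel is zeroed over 'future' pixels, so each output
    depends only on pixels above and to the left (the causal constraint
    behind MaskedGatedConv2d; the gated activation is omitted here)."""

    def __init__(self, mask_type, *args, **kwargs):
        super().__init__(*args, **kwargs)
        assert mask_type in ("A", "B")
        self.register_buffer("mask", torch.ones_like(self.weight))
        _, _, h, w = self.weight.shape
        # Mask everything right of centre on the centre row, and all rows
        # below it. Type "A" (first layer) also masks the centre pixel.
        self.mask[:, :, h // 2, w // 2 + (mask_type == "B"):] = 0
        self.mask[:, :, h // 2 + 1:] = 0

    def forward(self, x):
        self.weight.data *= self.mask
        return super().forward(x)
```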

Please find the implementations in the `modules` directory:

- `modules.decoder.py`
- `modules.encoder.py`
- `modules.quantizer.py`
- `modules.stack.py`
- `modules.vqvae.py`
- `modules.pixelcnn.py`

***

## Results (using OASIS brain dataset)

### VQ-VAE - results

The VQVAE was trained for 2 epochs and produced reconstructions quite similar to the original images, as can be seen below.

![image](./images/reconstruction-vs-original.png)

Some loss of quality and finer detail is visible in the reconstruction, but the overall structure of the image remains.

### Pixel CNN - results

Unfortunately, we were unable to get the Pixel CNN working in time and as such have no results to show. You can, however, see the progress made as well as the errors found.

***

## Future improvements

Obviously, it would be great to have a working Pixel CNN for generating new images, to truly test the VQ-VAE and see its limitations.

The VQ-VAE could also be trained for longer to see whether reconstruction quality increases, possibly recovering the finer details.

There are certainly further optimisations possible throughout this project, including supplying a less bloated environment file to work with.

## Sources

[1] van den Oord, A., Vinyals, O., & Kavukcuoglu, K. (2017). Neural Discrete Representation Learning. CoRR, abs/1711.00937. Retrieved from <https://arxiv.org/abs/1711.00937>