psu-rdmap/unet-compare

Computer vision tool using TensorFlow and Keras to train U-Net and U-Net++, with the optional use of EfficientNetB7 as a pretrained backbone, for binary segmentation of microscopy images.

Overview

This program implements U-Net architectures [1-3] for binary segmentation, where a model is trained to map each pixel in an image to the range $[0,1]$. It was designed with transmission electron microscopy (TEM) images in mind, where the output pixel values are probabilities of belonging to a defect structure ($1$ means defect, $0$ means non-defect). The code was developed for the ICCV paper [4].
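
For example, a predicted probability map can be binarized by thresholding, with 0.5 being a common convention (the threshold actually used downstream may differ); a minimal sketch with synthetic data:

import numpy as np

# Hypothetical probability map produced by a trained model (values in [0, 1])
probs = np.random.rand(512, 512).astype(np.float32)

# Binarize at 0.5: 1 = defect pixel, 0 = non-defect pixel
mask = (probs >= 0.5).astype(np.uint8)
print(mask.mean())   # fraction of pixels labelled as defect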

This project implements four types of configurable architectures:

  • U-Net
  • U-Net++
  • U-Net w/ EfficientNetB7 backbone
  • U-Net++ w/ EfficientNetB7 backbone

Models can either be generated by training, or they can be applied for inference. There is also a $k$-fold cross-validation feature where the dataset is divided into $k$ unique, non-overlapping folds and a different model is trained for each fold with the same hyperparameters. This permits a statistical assessment of model performance.
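
As an illustration of the cross-validation idea only (the program performs the splitting internally), the following sketch partitions a hypothetical list of image filenames into $k=5$ non-overlapping folds using scikit-learn:

from sklearn.model_selection import KFold

# Hypothetical dataset of 20 image filenames
filenames = [f"{i}.jpg" for i in range(1, 21)]

kfold = KFold(n_splits=5, shuffle=True, random_state=0)
for fold, (train_idx, val_idx) in enumerate(kfold.split(filenames)):
    train_files = [filenames[i] for i in train_idx]
    val_files = [filenames[i] for i in val_idx]
    # one model would be trained on train_files and evaluated on val_files
    print(f"fold {fold}: {len(train_files)} train / {len(val_files)} validation")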

The program operations, model architecture, and hyperparameters are configured via input files discussed in configs/README.md.

Installation

Currently, only training/inference on a single CUDA-enabled GPU is supported, but alternative setups may be considered by submitting an issue. Furthermore, the instructions provided here are for Linux environments only.

A Jupyter notebook version of this project, unet_compare.ipynb, also exists and can be used with cloud services. Instructions for using it on Google Colab are provided in the first cell. A lightweight GPU can be used for free, more powerful GPUs are available through the purchase of compute units, and priority access is offered through Colab's subscription service.

Conda is used to manage the project environment and install its dependencies. If Conda is not already installed, it can be installed via Miniconda.

cd ~
curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
# follow installation procedures
# restart shell to finish install

With Conda installed, the repository can be cloned and a virtual environment created.

git clone https://github.com/psu-rdmap/unet-compare.git
cd unet-compare
export CONDA_PKGS_DIRS=$PWD/env/conda_pkgs  
conda create -p env python=3.11
conda activate env/

Next, ensure the default Python and pip binaries correspond to the virtual environment.

which python   # /path/to/unet-compare/env/bin/python
which pip      # /path/to/unet-compare/env/bin/pip

Dependencies can then be installed.

conda install cudatoolkit==11.8.0
pip install --upgrade pip
pip install -r requirements.txt

The repository should now be installed. The following command checks which devices TensorFlow recognizes:

python scripts/check_devices.py
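
If you prefer to check interactively, an equivalent check can be run directly in a Python session (the script's exact output may differ):

import tensorflow as tf

# All devices TensorFlow can see (CPUs and CUDA-enabled GPUs)
print(tf.config.list_physical_devices())

# An empty list here means no GPU is visible to TensorFlow
print(tf.config.list_physical_devices('GPU'))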

Training & Inference

This section gives information about running the source code. The first step is to create a dataset with the correct structure.

1. Creating a dataset

Two training datasets are supplied in the data/ directory. Each dataset has the following structure:

data/dataset_name
├── images
│   ├── 1.ext
│   ├── 2.ext
│   └── ...
└── annotations
    ├── 1.ext
    ├── 2.ext
    └── ...

This format is for training and cross-validation. For inference, images can be directly placed in the dataset directory:

data/dataset_name
├── 1.ext
├── 2.ext
└── ...

Images are associated with their corresponding annotations by giving them the same filenames. All images must use one consistent file format, and likewise for all annotations; the two formats need not match. For example, all images could be .jpg files, while all annotations could be .png files.
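
As a quick sanity check, the following hypothetical sketch verifies that every image has a matching annotation (filenames are compared by stem, so extensions may differ):

from pathlib import Path

dataset = Path("data/dataset_name")   # hypothetical dataset directory
image_stems = {p.stem for p in (dataset / "images").iterdir() if p.is_file()}
ann_stems = {p.stem for p in (dataset / "annotations").iterdir() if p.is_file()}

unpaired = image_stems ^ ann_stems    # filenames present in only one directory
if unpaired:
    print("Unpaired files:", sorted(unpaired))
else:
    print("Every image has a matching annotation")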

2. Running the program

The next step is to create a configs input file in configs/ following the instructions in the configs/README.md file.

The script run.py validates the input configs and then either trains models or runs inference with trained ones.

python src/run.py configs/<configs_file>.json

If any configs are defined incorrectly, an error will be thrown. Otherwise, operation will commence!

When you are finished, deactivate the environment

conda deactivate

3. Results

Results will be placed in a generated directory with a unique name describing the run; alternatively, a specific name can be given in the configs input file.

If training, each results directory will contain the following

  • Copy of the input configs used
  • .keras model file containing model structure and weights
  • Model predictions for training and validation images
  • CSV file with loss and metrics calculated during training
  • Loss and metrics plots by epoch
  • Summaries of model weights and trainable layers (if enabled in configs)
  • Results from each fold (cross validation only)
  • Statistical plots of loss and metrics by epoch (cross validation only)

If inferencing, there will instead be

  • Copy of the input configs used
  • Model predictions

4. TEM Segmentation Post-Processing

As discussed in [4], segmentation is only an intermediate, but challenging, step for TEM image data. For post-processing, where defects are identified and their properties calculated, the script scripts/post_processor.py implements three defect detection algorithms: Convex Hull and Approximate Contour (CHAC) [5] for identifying convex grains, BubbleFinder [6] for identifying circular cavities, voids, or bubbles, and Bbox for fitting bounding boxes around contiguous white pixel regions. A rough sketch of the Bbox idea is given after the output list below.

post_processor.py accepts the following input parameters:

  • algorithm = CHAC | BubbleFinder | Bbox
  • seg_dir = Path to directory with input segmentation images relative to /path/to/unet-compare/
  • img_scale = Ratio describing the number of nanometers per pixel, assuming every image in seg_dir has the same magnification
  • histogram_bins = Three space-delimited linspace-like values start stop num describing the bins used for computing defect size histograms
  • tem_dir = (Optional) Path to directory with original TEM images to overlay the segmentations onto, relative to /path/to/unet-compare/
  • ann_dir = (Optional) Path to directory with ground truth annotation images, if they exist, for additional ML model performance evaluations relative to /path/to/unet-compare/
  • results_dir = (Optional) Override default path to directory for saving results relative to /path/to/unet-compare/

In return, it outputs the following:

  • Post-processed segmentation images, optionally overlaid onto their TEM image counterparts
  • CSV files with detected defects and their properties (size, location)
  • Defect property histogram
  • Text file with various information describing the run and its output
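
As a rough illustration of the Bbox idea only (not the repository's implementation), bounding boxes around contiguous white regions can be obtained with OpenCV, with sizes converted to nanometers via img_scale and binned with linspace-like values; all filenames and values below are hypothetical:

import cv2
import numpy as np

IMG_SCALE = 0.5   # hypothetical img_scale in nm per pixel

# Hypothetical segmentation output (white = defect)
seg = cv2.imread("prediction.png", cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(seg, 127, 255, cv2.THRESH_BINARY)

# Bounding boxes around contiguous white pixel regions
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
sizes_nm = [max(cv2.boundingRect(c)[2:]) * IMG_SCALE for c in contours]

# Size histogram analogous to histogram_bins = "start stop num"
bins = np.linspace(0, 50, 26)
counts, _ = np.histogram(sizes_nm, bins=bins)
print(counts)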

Utility Scripts

There are also some helpful Python scripts in scripts/ for performing various tasks.

CM2.py applies confusion matrix color mapping (CM2) to a set of model predictions, as discussed in [4], visualizing the prediction abilities of U-Net models (see the sketch after the parameter list below). It accepts the following input parameters:

  • tem_dir = Path to directory with training images corresponding to files in pred_dir relative to /path/to/unet-compare/
  • ann_dir = Path to directory with ground truth annotation images corresponding to files in pred_dir relative to /path/to/unet-compare/
  • pred_dir = Path to directory with model predictions relative to /path/to/unet-compare/
  • results_dir = Path to directory for saving CM2 images relative to /path/to/unet-compare/
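
The idea behind CM2 can be sketched as follows: each prediction pixel is classified as a true positive, false positive, false negative, or true negative against the ground truth and colored accordingly. The colors below are hypothetical placeholders, not necessarily the scheme used by CM2.py or [4]:

import numpy as np

# Hypothetical binary prediction and ground-truth masks (1 = defect, 0 = background)
pred = np.random.randint(0, 2, (512, 512), dtype=np.uint8)
gt = np.random.randint(0, 2, (512, 512), dtype=np.uint8)

# Color each pixel by its confusion-matrix class (colors are hypothetical)
cm2 = np.zeros((*pred.shape, 3), dtype=np.uint8)
cm2[(pred == 1) & (gt == 1)] = (0, 255, 0)   # true positive
cm2[(pred == 1) & (gt == 0)] = (255, 0, 0)   # false positive
cm2[(pred == 0) & (gt == 1)] = (0, 0, 255)   # false negative
# true negatives stay black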

downscale_images.py can be used to downscale TEM and annotation images (which are usually large 4K images) to a smaller size using a Lanczos filter. It accepts the following input parameters:

  • in_dir = Path to directory with input images relative to /path/to/unet-compare/
  • out_dir = Path to directory for saving downscaled images relative to /path/to/unet-compare/
  • resize_shape = Shape to downscale input images to, e.g. 1024,1024
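
A minimal sketch of the same operation using Pillow's Lanczos filter (illustrative only; the actual script may handle batches of files and output naming differently):

from PIL import Image

img = Image.open("input.png")                                # hypothetical large TEM image
small = img.resize((1024, 1024), Image.Resampling.LANCZOS)   # Lanczos filter (Pillow >= 9.1)
small.save("output.png")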

remove_models.py can be used to remove any .keras model files in a directory tree recursively. This is useful for exporting model performance and prediction data without also including the large model files. It accepts the following input parameter:

  • root_dir = Path to the start of a directory tree where .keras files will be removed
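
Conceptually, this amounts to a recursive glob for .keras files, e.g. (with a hypothetical root_dir value):

from pathlib import Path

root = Path("results")                 # hypothetical root_dir
for model_file in root.rglob("*.keras"):
    print(f"Removing {model_file}")
    model_file.unlink()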

References

[1] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional Networks for Biomedical Image Segmentation,” MICCAI 2015. https://arxiv.org/abs/1505.04597

[2] Z. Zhou, M. Rahman, N. Tajbakhsh, and J. Liang, “UNet++: A Nested U-Net Architecture for Medical Image Segmentation,” DLMIA 2018. https://arxiv.org/abs/1807.10165

[3] M. Tan and Q. V. Le, “EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks,” ICML 2019. https://arxiv.org/abs/1905.11946

[4] A. Ochoa, X. Xu, and X. Wang, "Improving U-Net Confidence on TEM Image Data with L2-Regularization, Transfer Learning, and Deep Fine-Tuning," ICCV-CV4MS 2025. https://arxiv.org/abs/2507.16779

[5] X. Xu, Z. Yu, W.-Y. Chen, A. Chen, A. Motta, and X. Wang, "Automated analysis of grain morphology in TEM images using convolutional neural network with CHAC algorithm," Journal of Nuclear Materials 2024. https://doi.org/10.1016/j.jnucmat.2023.154813

[6] X. Wang, K. Jin, C. Y. Wong, D. Chen, H. Bei, Y. Wang, M. Ziatdinov, W. J. Weber, Y. Zhang, J. Poplawsky, and K. L. More, "Understanding effects of chemical complexity on helium bubble formation in Ni-based concentrated solid solution alloys based on elemental segregation measurements," Journal of Nuclear Materials 2022. https://doi.org/10.1016/j.jnucmat.2022.153902
