This project implements U-Net architectures [1-3] for binary segmentation, where a model is trained to map each pixel in an image to the range [0, 1], interpreted as the probability that the pixel belongs to the foreground class. Four types of configurable architectures are implemented (a minimal sketch of the shared pattern follows the list):
- U-Net
- U-Net++
- U-Net w/ EfficientNetB7 backbone
- U-Net++ w/ EfficientNetB7 backbone
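All four variants share the same encoder-decoder pattern with skip connections. Below is a minimal, single-level sketch of that pattern in TensorFlow/Keras; it is illustrative only, and the input shape and filter counts are assumptions rather than the repository's actual model definitions.

```python
# Minimal single-level U-Net sketch (illustrative; not the repo's exact architecture)
import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, filters):
    """Two 3x3 convolutions, the basic U-Net building block."""
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return layers.Conv2D(filters, 3, padding="same", activation="relu")(x)

inputs = tf.keras.Input(shape=(256, 256, 1))             # assumed input shape
c1 = conv_block(inputs, 64)                              # encoder level
p1 = layers.MaxPooling2D()(c1)
b  = conv_block(p1, 128)                                 # bottleneck
u1 = layers.Conv2DTranspose(64, 2, strides=2, padding="same")(b)
d1 = conv_block(layers.Concatenate()([u1, c1]), 64)      # decoder level with skip connection
outputs = layers.Conv2D(1, 1, activation="sigmoid")(d1)  # per-pixel probability in [0, 1]
model = tf.keras.Model(inputs, outputs)
```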
Models can either be generated by training, or they can be applied for inference. There is also a cross-validation mode for evaluating model performance across folds.
The program operations, model architecture, and hyperparameters are configured via input files, as discussed in configs/README.md.
Currently, only training/inference on a single CUDA-enabled GPU is supported, but alternative setups may be considered by submitting an issue. Furthermore, the instructions provided here are for Linux environments only.
A Jupyter notebook version of this project, unet_compare.ipynb, also exists and can be used with cloud services. Instructions for using it on Google Colab are provided in the first cell. A lightweight GPU can be used for free, more powerful GPUs are available through the purchase of computational units, and priority access is offered through Colab's subscription service.
Conda is used to contain the project and install the dependencies. If Conda is not already installed, it can be installed using Miniconda.
cd ~
curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
# follow installation procedures
# restart shell to finish install
With Conda installed, the repository can be cloned and a virtual environment created.
git clone https://github.com/psu-rdmap/unet-compare.git
cd unet-compare
export CONDA_PKGS_DIRS=$PWD/env/conda_pkgs
conda create -p env python=3.11
conda activate env/
Next, ensure the default Python and pip binaries correspond to the virtual environment.
which python # /path/to/unet-compare/env/bin/python
which pip # /path/to/unet-compare/env/bin/pip
Dependencies can then be installed.
conda install cudatoolkit==11.8.0
pip install --upgrade pip
pip install -r requirements.txt
The repository should now be installed. The following script checks which devices TensorFlow recognizes:
python scripts/check_devices.py
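The script's exact contents may differ, but a minimal equivalent using TensorFlow's standard device API would look something like:

```python
# Minimal device check (a sketch; the actual scripts/check_devices.py may differ)
import tensorflow as tf

# Print every device TensorFlow recognizes; a CUDA-enabled GPU should appear as 'GPU'
for device in tf.config.list_physical_devices():
    print(device)

# Warn loudly if no GPU was found, since only single-GPU operation is supported
if not tf.config.list_physical_devices("GPU"):
    print("Warning: TensorFlow did not detect a CUDA-enabled GPU")
```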
This section describes how to run the source code. The first step is to create a dataset with the correct structure.
Two training datasets are supplied in the data/ directory. Each dataset has the following structure:
data/dataset_name
├── images
│ ├── 1.ext
│ ├── 2.ext
│ └── ...
└── annotations
├── 1.ext
├── 2.ext
└── ...
This format is for training and cross-validation. For inference, images can be directly placed in the dataset directory:
data/dataset_name
├── 1.ext
├── 2.ext
└── ...
Images are associated with their corresponding annotations by giving them the same filename. All images must use a consistent file format, and likewise for all annotations. For example, all images could be .jpg files, while all annotations could be .png files.
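In that case, images/1.jpg would be paired with annotations/1.png. A small hypothetical helper (not part of the repository) that verifies this pairing:

```python
# Hypothetical dataset check (not part of the repo): verifies image/annotation pairing
from pathlib import Path

def check_pairing(dataset_dir: str) -> None:
    root = Path(dataset_dir)
    image_stems = {p.stem for p in (root / "images").iterdir() if p.is_file()}
    ann_stems = {p.stem for p in (root / "annotations").iterdir() if p.is_file()}
    unpaired = image_stems ^ ann_stems  # filenames present on only one side
    if unpaired:
        raise ValueError(f"Unpaired files: {sorted(unpaired)}")

check_pairing("data/dataset_name")
```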
The next step is to create a configs input file in configs/, following the instructions in configs/README.md.
The script run.py is responsible for validating input configs, then training models or running inference with trained ones.
python src/run.py configs/<configs_file>.json
If any configs are defined incorrectly, an error will be thrown. Otherwise, the operation will commence!
When you are finished, deactivate the environment:
conda deactivate
Results will be placed in a generated directory with a unique name describing the run, or the directory can be given a specific name in the configs input file.
If training, each results directory will contain the following:
- Copy of the input configs used
- .keras model file containing the model structure and weights
- Model predictions for training and validation images
- CSV file with loss and metrics calculated during training
- Loss and metrics plots by epoch
- Summaries of model weights and trainable layers (if enabled in configs)
- Results from each fold (cross validation only)
- Statistical plots of loss and metrics by epoch (cross validation only)
If inferencing, the results directory will instead contain:
- Copy of the input configs used
- Model predictions
As discussed in [4], segmentation is only an intermediate, but challenging, step for TEM image data. For post-processing, where defects are identified and their properties calculated, the script scripts/post_processor.py implements three defect detection algorithms: Convex Hull and Approximate Contour (CHAC) [5] for identifying convex grains, BubbleFinder for identifying circular cavities, voids, or bubbles [6], and Bbox for fitting bounding boxes around contiguous white pixel regions (sketched below).
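As an illustration of the simplest of the three, the Bbox idea can be sketched with OpenCV; this is a conceptual example with a hypothetical input path, not the repository's implementation.

```python
# Conceptual Bbox sketch using OpenCV (not the repo's implementation)
import cv2

# Load a binary segmentation map (white pixels = detected defects); path is hypothetical
seg = cv2.imread("results/run_name/1.png", cv2.IMREAD_GRAYSCALE)
_, binary = cv2.threshold(seg, 127, 255, cv2.THRESH_BINARY)

# Each external contour is one contiguous white region; fit a bounding box around it
contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for contour in contours:
    x, y, w, h = cv2.boundingRect(contour)
    print(f"Defect at ({x}, {y}), bounding box {w}x{h} px")
```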
post_processor.py accepts the following input parameters:
- algorithm = CHAC | BubbleFinder | Bbox
- seg_dir = Path to directory with input segmentation images, relative to /path/to/unet-compare/
- img_scale = Ratio describing the number of nanometers per pixel, assuming every image in seg_dir has the same magnification
- histogram_bins = Three space-delimited linspace-like values start stop num describing the bins used for computing defect size histograms
- tem_dir = (Optional) Path to directory with the original TEM images to overlay results onto, relative to /path/to/unet-compare/
- ann_dir = (Optional) Path to directory with ground truth annotation images, if they exist, for additional ML model performance evaluations, relative to /path/to/unet-compare/
- results_dir = (Optional) Override of the default path to the directory for saving results, relative to /path/to/unet-compare/
In return, it outputs the following:
- Post-processed segmentation images, optionally overlaid onto their TEM image counterparts
- CSV files with detected defects and their properties (size, location)
- Defect property histogram
- Text file with various information describing the run and output
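To make the img_scale and histogram_bins conventions concrete, here is a hedged sketch of how such values might be interpreted; the example numbers and the choice of size metric are assumptions.

```python
# Hedged sketch of interpreting img_scale and histogram_bins (values are hypothetical)
import numpy as np

img_scale = 0.5                 # nanometers per pixel
start, stop, num = 0, 50, 26    # e.g. histogram_bins = "0 50 26"

sizes_px = np.array([12.0, 30.5, 44.2])    # placeholder defect sizes in pixels
sizes_nm = sizes_px * img_scale            # convert to nanometers using img_scale
bins = np.linspace(start, stop, int(num))  # linspace-like bin edges
counts, edges = np.histogram(sizes_nm, bins=bins)
print(counts)
```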
There are also some helpful Python scripts in scripts/ for performing various tasks.
CM2.py applies confusion matrix color mapping (CM2) to a set of model predictions, as discussed in [4], visualizing the prediction abilities of U-Net models. It accepts the following input parameters:
- tem_dir = Path to directory with training images corresponding to files in pred_dir, relative to /path/to/unet-compare/
- ann_dir = Path to directory with ground truth annotation images corresponding to files in pred_dir, relative to /path/to/unet-compare/
- pred_dir = Path to directory with model predictions, relative to /path/to/unet-compare/
- results_dir = Path to directory for saving CM2 images, relative to /path/to/unet-compare/
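The mapping itself can be sketched as follows; the specific color assignments here are assumptions for illustration, not necessarily the palette used by CM2.py.

```python
# Sketch of confusion-matrix color mapping (colors are assumptions, not CM2.py's exact palette)
import numpy as np

def cm2(pred: np.ndarray, ann: np.ndarray) -> np.ndarray:
    """Color each pixel of a binary prediction by its agreement with the annotation."""
    out = np.zeros((*pred.shape, 3), dtype=np.uint8)
    out[(pred == 1) & (ann == 1)] = (255, 255, 255)  # true positive: white
    out[(pred == 1) & (ann == 0)] = (255, 0, 0)      # false positive: red
    out[(pred == 0) & (ann == 1)] = (0, 0, 255)      # false negative: blue
    return out                                       # true negatives stay black
```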
downscale_images.py can be used to downscale TEM and annotation images (which are usually large 4K images) to a smaller size using a Lanczos filter. It accepts the following input parameters:
- in_dir = Path to directory with input images, relative to /path/to/unet-compare/
- out_dir = Path to directory for saving downscaled images, relative to /path/to/unet-compare/
- resize_shape = Shape to downscale input images to, like 1024,1024
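The core operation amounts to a Pillow resize with the Lanczos filter, roughly like the following sketch (paths and target size are placeholders):

```python
# Sketch of Lanczos downscaling with Pillow (paths and target size are placeholders)
from pathlib import Path
from PIL import Image

in_dir, out_dir = Path("data/raw"), Path("data/downscaled")
out_dir.mkdir(parents=True, exist_ok=True)

for path in in_dir.iterdir():
    img = Image.open(path)
    img.resize((1024, 1024), Image.LANCZOS).save(out_dir / path.name)
```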
remove_models.py can be used to recursively remove any .keras model files in a directory tree. This is useful for exporting model performance and prediction data without also including the large model files. It accepts the following input parameter:
- root_dir = Path to the start of a directory tree where .keras files will be removed
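Conceptually this is a recursive glob and delete, as in the sketch below; the root path is a placeholder, and since deletion is permanent, the sketch prints each file before removing it.

```python
# Sketch of recursive .keras removal (root path is a placeholder; deletion is permanent)
from pathlib import Path

root_dir = Path("results")
for model_file in root_dir.rglob("*.keras"):
    print(f"Removing {model_file}")
    model_file.unlink()
```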
[1] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional Networks for Biomedical Image Segmentation,” MICCAI 2015. https://arxiv.org/abs/1505.04597
[2] Z. Zhou, M. Rahman, N. Tajbakhsh, and J. Liang, “UNet++: A Nested U-Net Architecture for Medical Image Segmentation,” DLMIA 2018. https://arxiv.org/abs/1807.10165
[3] M. Tan and Q. V. Le, “EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks,” ICML 2019. https://arxiv.org/abs/1905.11946
[4] A. Ochoa, X. Xu, and X. Wang, "Improving U-Net Confidence on TEM Image Data with L2-Regularization, Transfer Learning, and Deep Fine-Tuning," ICCV-CV4MS 2025. https://arxiv.org/abs/2507.16779
[5] X. Xu, Z. Yu, W.-Y. Chen, A. Chen, A. Motta, and X. Wang, "Automated analysis of grain morphology in TEM images using convolutional neural network with CHAC algorithm," Journal of Nuclear Materials 2024. https://doi.org/10.1016/j.jnucmat.2023.154813
[6] X. Wang, K. Jin, C. Y. Wong, D. Chen, H. Bei, Y. Wang, M. Ziatdinov, W. J. Weber, Y. Zhang, J. Poplawsky, and K. L. More, "Understanding effects of chemical complexity on helium bubble formation in Ni-based concentrated solid solution alloys based on elemental segregation measurements," Journal of Nuclear Materials 2022. https://doi.org/10.1016/j.jnucmat.2022.153902