
Training and validation loss curves remain flat when trained on MVTec data  #63

@vishwanathvenkat

Description


Greetings,
I trained the network on the MVTec tile data (and tried various other categories as well). The details are as follows:

Command

python <Root folder>/train.py -d <data folder>/capsule -a mvtecCAE -l ssim -b 32
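The `-l ssim` flag selects an SSIM-based reconstruction loss. As a sanity check on what that loss should look like, here is a minimal NumPy sketch of the SSIM formula over a whole image (the repo presumably uses a windowed variant such as `tf.image.ssim`; the single-window version below only illustrates the math, and the `[0, 1]` input scaling is an assumption):

```python
import numpy as np

def ssim(x, y, max_val=1.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two images with values in [0, max_val].

    Production SSIM is computed over local windows and averaged; this
    global version is only meant to illustrate the formula.
    """
    c1, c2 = (k1 * max_val) ** 2, (k2 * max_val) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x**2 + mu_y**2 + c1) * (var_x + var_y + c2)
    )

def ssim_loss(x, y):
    """Loss form typically trained on: 1 - SSIM (0 for identical images)."""
    return 1.0 - ssim(x, y)

# Identical images give (near-)zero loss; a noisy copy gives a positive loss.
img = np.random.rand(64, 64)
noisy = np.clip(img + 0.1 * np.random.randn(64, 64), 0.0, 1.0)
```

Note that SSIM is only meaningful when the pixel range actually matches `max_val`; feeding, say, `[0, 255]` images while declaring `max_val=1.0` is a classic source of misleading loss values.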

config.py

ROT_ANGLE = 5
W_SHIFT_RANGE = 0.05
H_SHIFT_RANGE = 0.05
FILL_MODE = "nearest"
BRIGHTNESS_RANGE = [0.95, 1.05]
VAL_SPLIT = 0.2

# Learning Rate Finder parameters
START_LR = 1e-7
LR_MAX_EPOCHS = 10
LRF_DECREASE_FACTOR = 0.85

# Training parameters
EARLY_STOPPING = 12
REDUCE_ON_PLATEAU = 6

# Finetuning parameters
FINETUNE_SPLIT = 0.1
STEP_MIN_AREA = 5
START_MIN_AREA = 5
STOP_MIN_AREA = 1005
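The augmentation and split values at the top of `config.py` look like they map onto Keras's `ImageDataGenerator`. The keyword names below are the real Keras preprocessing API; whether the repo wires the config in exactly this way (and whether it rescales inputs to `[0, 1]`) is an assumption:

```python
# Values copied from config.py above.
ROT_ANGLE = 5
W_SHIFT_RANGE = 0.05
H_SHIFT_RANGE = 0.05
FILL_MODE = "nearest"
BRIGHTNESS_RANGE = [0.95, 1.05]
VAL_SPLIT = 0.2

# Hypothetical mapping onto keras.preprocessing.image.ImageDataGenerator
# keyword arguments (Keras is deliberately not imported here; this only
# shows which config knob would feed which generator parameter).
datagen_kwargs = dict(
    rotation_range=ROT_ANGLE,           # degrees of random rotation
    width_shift_range=W_SHIFT_RANGE,    # fraction of image width
    height_shift_range=H_SHIFT_RANGE,   # fraction of image height
    fill_mode=FILL_MODE,                # how newly exposed pixels are filled
    brightness_range=BRIGHTNESS_RANGE,  # random brightness multipliers
    validation_split=VAL_SPLIT,         # 20% of images held out for validation
    rescale=1.0 / 255,                  # assumed: pixels scaled to [0, 1]
)
```

If the SSIM loss expects `[0, 1]` inputs, the `rescale` factor here is the knob that has to agree with the loss's `max_val`.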

Environment (pip freeze)

appdirs==1.4.4
argon2-cffi==20.1.0
astunparse==1.6.3
async-generator==1.10
attrs==20.3.0
backcall==0.2.0
black==19.10b0
bleach==3.2.0
CacheControl==0.12.6
cachetools==4.2.1
cchardet==2.1.7
certifi==2020.12.5
cffi==1.14.5
chardet==4.0.0
click==7.1.2
colorama==0.4.3
contextlib2==0.6.0
cycler==0.10.0
decorator==4.4.2
distlib==0.3.0
distro==1.4.0
fastprogress==1.0.0
filelock==3.0.12
flatbuffers==1.12
fvcore==0.1.3.post20210226
gast==0.3.3
google-auth==1.27.0
google-auth-oauthlib==0.4.2
google-pasta==0.2.0
grpcio==1.32.0
h5py==2.10.0
html5lib==1.0.1
idna==2.10
imageio==2.9.0
iopath==0.1.4
ipaddr==2.2.0
ipython==7.21.0
ipython-genutils==0.2.0
jedi==0.18.0
jieba==0.42.1
joblib==1.0.1
Keras==2.4.3
keras-bert==0.86.0
keras-embed-sim==0.8.0
keras-layer-normalization==0.14.0
keras-multi-head==0.27.0
keras-pos-embd==0.11.0
keras-position-wise-feed-forward==0.6.0
Keras-Preprocessing==1.1.2
keras-self-attention==0.46.0
keras-transformer==0.38.0
kiwisolver==1.3.1
ktrain==0.25.4
langdetect==1.0.8
lockfile==0.12.2
Markdown==3.3.4
matplotlib==3.3.4
msgpack==0.6.2
networkx==2.5
numpy==1.20.1
oauthlib==3.1.0
opt-einsum==3.3.0
packaging==20.9
pandas==1.2.2
parso==0.8.1
pathspec==0.8.1
pep517==0.8.2
pexpect==4.8.0
pickleshare==0.7.5
Pillow==8.1.1
pkg-resources==0.0.0
portalocker==2.2.1
progress==1.5
prompt-toolkit==3.0.16
protobuf==3.15.3
ptyprocess==0.7.0
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser==2.20
Pygments==2.8.0
pyparsing==2.4.7
python-dateutil==2.8.1
pytoml==0.1.21
pytz==2021.1
PyWavelets==1.1.1
PyYAML==5.4.1
regex==2020.11.13
requests==2.25.1
requests-oauthlib==1.3.0
retrying==1.3.3
rsa==4.7.2
sacremoses==0.0.43
scikit-image==0.18.1
scikit-learn==0.23.2
scipy==1.6.1
sentencepiece==0.1.91
seqeval==0.0.19
six==1.15.0
SSIM-PIL==1.0.12
syntok==1.3.1
tabulate==0.8.9
tensorboard==2.4.1
tensorboard-plugin-wit==1.8.0
tensorflow-estimator==2.4.0
tensorflow-gpu==2.4.1
termcolor==1.1.0
threadpoolctl==2.1.0
tifffile==2021.2.26
tokenizers==0.9.3
toml==0.10.2
tqdm==4.58.0
traitlets==5.0.5
transformers==3.5.1
typed-ast==1.4.2
typing-extensions==3.7.4.3
urllib3==1.26.3
wcwidth==0.2.5
webencodings==0.5.1
Werkzeug==1.0.1
Whoosh==2.7.4
wrapt==1.12.1
yacs==0.1.8

Machine details

Ubuntu 20.04.2 LTS
Memory: 15.5 GB
GPU: NVIDIA Corporation GM107M [GeForce GTX 960M]

Issue

No significant learning is happening.

(figure: loss_plot — flat training and validation loss curves)

Sample output

(figure: crack_000_inspection — sample inspection output)

Additional info

(figure: lr_plot)
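The lr_plot above presumably comes from a learning-rate range test (the config's `START_LR` and the `ktrain` dependency point that way). A minimal sketch of the usual exponential sweep, assuming an upper bound and step count that are not in `config.py`:

```python
START_LR = 1e-7   # from config.py
END_LR = 1.0      # assumed upper bound of the sweep
STEPS = 100       # assumed number of sweep steps

# Exponential sweep from START_LR to END_LR: the standard shape of an
# LR range test. Loss is recorded at each step; the usable LR is read
# off the plot just before the loss blows up.
growth = (END_LR / START_LR) ** (1.0 / (STEPS - 1))
lrs = [START_LR * growth**i for i in range(STEPS)]
```

How `LRF_DECREASE_FACTOR = 0.85` enters (e.g. scaling the LR picked from the plot) is repo-specific and not shown here.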

I am getting similar results on all the other categories of the MVTec dataset as well.
I also have some related questions, e.g. why the validation loss is lower than the training loss.

I have a feeling I am not setting something up correctly.
Kindly help.
