Thanks for the code! I'm wondering why a stochastic network is used in the feature extractor, and why noise is added in StatisticsNet.