It is possible that [numba](https://numba.pydata.org/), which uses just-in-time (JIT) compilation, could speed up the `for` loop in the Hermite functions. In practice, however, the time bottleneck is usually the `moveaxis` call rather than the `for` loop.
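
As a rough illustration of what this could look like, the sketch below wraps a recurrence-style `for` loop in numba's `njit` decorator. The function name `hermite_functions_loop`, its signature, and the no-op fallback used when numba is not installed are all assumptions made for this example, not this project's actual code. It uses the standard three-term recurrence for orthonormal Hermite functions.

```python
import math

import numpy as np

try:
    from numba import njit  # optional dependency
except ImportError:
    # Hypothetical fallback: without numba, run the loop as plain Python.
    def njit(func):
        return func


@njit
def hermite_functions_loop(n_max, x):
    """Evaluate orthonormal Hermite functions psi_0..psi_n_max at 1-D float x.

    Illustrative only: the order index is kept on the first axis of the
    output, so this sketch needs no axis reordering afterwards.
    """
    out = np.empty((n_max + 1, x.size))
    # psi_0(x) = pi^(-1/4) * exp(-x^2 / 2)
    out[0] = math.pi ** -0.25 * np.exp(-0.5 * x * x)
    if n_max >= 1:
        # psi_1(x) = sqrt(2) * x * psi_0(x)
        out[1] = math.sqrt(2.0) * x * out[0]
    # psi_{n+1} = sqrt(2/(n+1)) * x * psi_n - sqrt(n/(n+1)) * psi_{n-1}
    for n in range(1, n_max):
        out[n + 1] = (math.sqrt(2.0 / (n + 1)) * x * out[n]
                      - math.sqrt(n / (n + 1)) * out[n - 1])
    return out
```

Calling `hermite_functions_loop(3, np.linspace(-1.0, 1.0, 5))` returns an array of shape `(4, 5)`. With numba, the first call includes compilation time, so any timing comparison should exclude it. Whether this helps overall depends on how much of the runtime the loop actually accounts for.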