Models
The models are independently implemented as single `.py` files and can be found under `nmtpytorch/models`.
A model implements a set of methods that can be seen in the basic NMT model. To implement a new model, you have two options:
- Derive your class from the `NMT` class (see `MNMTDecinit`).
- Copy `nmt.py` under a different filename and rewrite all the methods. This is suitable if your model is substantially different from the basic `NMT` model and there is no interest in deriving from it.
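A minimal sketch of the first option, deriving from the base class and overriding only what changes. So the snippet runs stand-alone, a stand-in replaces the real `nmtpytorch.models.nmt.NMT`; the attribute names inside `MyModel` are illustrative assumptions, not part of nmtpytorch's API.

```python
class NMT:
    """Stand-in for nmtpytorch's base NMT class (illustrative only)."""
    def __init__(self, opts):
        self.opts = opts


class MyModel(NMT):
    """A variant that reuses the base setup and overrides selectively."""
    def __init__(self, opts):
        super().__init__(opts)        # keep the base model's setup
        self.init_type = "visual"     # hypothetical extra attribute


model = MyModel({"model_type": "MyModel"})
```

The advantage of this route is that any method you do not override keeps the base model's behavior.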
After creating your model, add the necessary import into `nmtpytorch/models/__init__.py`. The class name of your model is what allows nmtpy to import and use it during training and inference.
To sum up:
- Implement your model as a class called `MyModel` under `nmtpytorch/models/mymodel.py`.
- Import it inside `nmtpytorch/models/__init__.py` as `from .mymodel import MyModel`.
- Create an experiment configuration file and set `model_type: MyModel` inside it.
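The last step might look like the fragment below. Only the `model_type` key comes from this page; the section name and the other option are illustrative assumptions about the configuration layout.

```ini
# Hypothetical experiment configuration fragment.
[train]
model_type: MyModel
seed: 1234
```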
A Conditional-GRU based NMT similar to the dl4mt-tutorial architecture.
Xu, Kelvin, et al. "Show, attend and tell: Neural image caption generation with visual attention." International Conference on Machine Learning. 2015.
Caglayan, Ozan, Loïc Barrault, and Fethi Bougares. "Multimodal attention for neural machine translation." arXiv preprint arXiv:1609.03976 (2016).
This model uses raw image files as inputs and implements an end-to-end pipeline with a CNN from torchvision.
A modification of the above model that is less memory-hungry, as it uses pre-extracted convolutional features instead of embedding the CNN inside the model.
Visually initialized conditional-GRU variant from:
Caglayan, Ozan, et al. "LIUM-CVC Submissions for WMT17 Multimodal Translation Task." Proceedings of the Second Conference on Machine Translation. 2017.
nmtpytorch is developed in the Informatics Lab at Le Mans University, France.