Hi,
Thank you for providing the fine-tuned models in the repository. I used the inference_alpaca.py code to evaluate the FLAN-T5-XL and FLAN-T5-large models on the simulation dataset. However, the F1 scores I am getting are lower than those reported in the repository. Could you tell me whether some setting needs to be changed?
These are the numbers I get when running the inference:
FLAN-T5-large (reported) | 57.3 | 50.1 | 70.5
FLAN-T5-large (obtained) | 53 | 49 | 57