[BUG] GPTQ-quantized OVIS 1B model yields poor performance & misaligned outputs in vLLM-0.9.1

https://huggingface.co/AIDC-AI/Ovis2-2B-GPTQ-Int4

The OVIS 1B model, quantized with the GPTQ code linked above, performs extremely poorly when run with vLLM-0.9.1 for accelerated inference, and its outputs are completely inconsistent with the expected results. What could be the reason?
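
One way to put a number on the mismatch is to run the same prompts through a reference setup and through vLLM, then compare the generated texts. Below is a minimal sketch, assuming greedy decoding and identical prompts on both sides. The helper names `output_similarity` and `summarize` are hypothetical, and the example strings are placeholders, not real model outputs.

```python
from difflib import SequenceMatcher


def output_similarity(reference: str, candidate: str) -> float:
    """Return a similarity ratio in [0, 1] over whitespace-separated tokens."""
    ref_tokens = reference.split()
    cand_tokens = candidate.split()
    return SequenceMatcher(None, ref_tokens, cand_tokens).ratio()


def summarize(pairs):
    """Aggregate similarity over (reference_output, vllm_output) pairs."""
    if not pairs:
        return {"mean_similarity": 0.0, "exact_match_rate": 0.0}
    scores = [output_similarity(ref, cand) for ref, cand in pairs]
    exact = sum(1 for ref, cand in pairs if ref.strip() == cand.strip())
    return {
        "mean_similarity": sum(scores) / len(scores),
        "exact_match_rate": exact / len(pairs),
    }


if __name__ == "__main__":
    # Placeholder strings standing in for real outputs from the two backends.
    pairs = [
        ("a cat sitting on a sofa", "a cat sitting on a sofa"),
        ("a red car on the street", "the the the the the the"),
    ]
    print(summarize(pairs))
```

A low `exact_match_rate` together with a low `mean_similarity` would point to a loading or kernel problem in the serving path rather than ordinary quantization loss.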