ggml : repack block_iq4_nlx8 (AVX) #14904

ggerganov · 2025-07-27T15:56:47Z

Repack 8x block_iq4_nl into block_iq4_nlx8 + add AVX implementation
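
For readers unfamiliar with the repacked layout, here is a minimal sketch of what interleaving 8 blocks into one could look like. It is an illustration under assumptions, not the code from this PR: the names `blk_iq4_nl`, `blk_iq4_nlx8` and `pack_x8` are hypothetical, `half_bits` is a stand-in for `ggml_half`, and the 8-byte interleave step is assumed.

```c
#include <stdint.h>
#include <string.h>

#define QK4_NL 32   /* quantized values per block */
#define NROWS  8    /* source blocks combined into one packed block */
#define CHUNK  8    /* bytes copied per interleave step (assumed) */

typedef uint16_t half_bits;  /* stand-in for ggml_half: raw fp16 bits */

/* One source block: a scale plus 32 four-bit indices (16 bytes). */
typedef struct { half_bits d; uint8_t qs[QK4_NL / 2]; } blk_iq4_nl;

/* Hypothetical packed block: 8 scales, then 8 x 16 interleaved bytes. */
typedef struct { half_bits d[NROWS]; uint8_t qs[NROWS * QK4_NL / 2]; } blk_iq4_nlx8;

/* Interleave 8 blocks (one from each of 8 rows) in CHUNK-byte pieces. */
static blk_iq4_nlx8 pack_x8(const blk_iq4_nl in[NROWS]) {
    blk_iq4_nlx8 out;
    for (int r = 0; r < NROWS; r++) {
        out.d[r] = in[r].d;
    }
    const int nchunks = (int) sizeof(out.qs) / CHUNK;  /* 128 / 8 = 16 */
    for (int i = 0; i < nchunks; i++) {
        const int src_row = i % NROWS;            /* cycle through the rows */
        const int src_off = (i / NROWS) * CHUNK;  /* first or second half   */
        memcpy(&out.qs[i * CHUNK], &in[src_row].qs[src_off], CHUNK);
    }
    return out;
}
```

With this layout, a kernel can load the same 8-byte slice of all 8 rows from contiguous memory, which is what makes the 8-row GEMV/GEMM path vector-friendly.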

Reuse the existing block_q4_0x8 GEMV/GEMM implementation (the logic is the same, just the lookup table for nibbles -> bytes is different)
Cleanup some UNUSED macros (not exhaustive)
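
To illustrate the "same logic, different lookup table" point, here is a scalar sketch of nibble decoding with a swappable table. It assumes plain Q4_0 semantics (a nibble `q` decodes to `q - 8`) and uses the IQ4_NL codebook values from ggml (`kvalues_iq4nl`). The function and table names are hypothetical, and the actual AVX kernels do this lookup on whole vectors rather than byte by byte.

```c
#include <stdint.h>

/* Q4_0: linear mapping, nibble q -> q - 8. */
static const int8_t q4_0_lut[16] = {
    -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7
};

/* IQ4_NL: non-linear codebook (kvalues_iq4nl in ggml). */
static const int8_t iq4nl_lut[16] = {
    -127, -104, -83, -65, -49, -35, -22, -10, 1, 13, 25, 38, 53, 69, 89, 113
};

/* Expand each packed byte into two signed bytes through the given table.
 * lo[i] comes from the low nibble of qs[i], hi[i] from the high nibble. */
static void decode_nibbles(const uint8_t *qs, int nbytes,
                           const int8_t lut[16], int8_t *lo, int8_t *hi) {
    for (int i = 0; i < nbytes; i++) {
        lo[i] = lut[qs[i] & 0x0F];
        hi[i] = lut[qs[i] >> 4];
    }
}
```

Everything downstream of the lookup (the int8 dot products and the scaling by the per-block `d`) is unchanged between the two formats, which is why one implementation can serve both.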

TODOs:

Test the __AVX512F__ path after the refactoring

ggerganov · 2025-07-30T10:51:12Z

@Srihari-mcw Since you have access to AVX512, could you run this branch with an iq4_nl quantization and verify that the perplexity is within the norm?

Srihari-mcw · 2025-07-30T10:53:11Z

> @Srihari-mcw Since you have access to AVX512, could you run this branch with an iq4_nl quantization and verify that the perplexity is within the norm?

Sure, will check and get back on the same. Thanks


Srihari-mcw · 2025-08-01T05:13:22Z

Hi @ggerganov, we measured perplexity with the Meta Llama 2 7B model quantized to IQ4_NL on an AVX512 machine (AMD Ryzen 5 7600X) and observed the following results. The perplexities seem close enough.

| model | perplexity (Final estimate PPL) | Commit id |
| --- | --- | --- |
| llama 7B IQ4_NL | 5.8822 +/- 0.03282 | Base - 00131d6e |
| llama 7B IQ4_NL | 5.8828 +/- 0.03283 | PR Branch - d1788b72 |
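
One rough way to make "close enough" concrete is to compare the difference between the two estimates with their reported uncertainties. The sketch below is not part of the perplexity tool. `ppl_result` and `ppl_within_noise` are hypothetical names, and treating the two error bars as independent is a simplifying assumption.

```c
#include <stdio.h>

/* A perplexity estimate and its reported +/- uncertainty. */
typedef struct { double ppl; double err; } ppl_result;

/* Return 1 if |a - b| is no larger than the combined uncertainty
 * sqrt(a.err^2 + b.err^2). Both sides are compared squared, so no
 * square root (and no libm) is needed. */
static int ppl_within_noise(ppl_result a, ppl_result b) {
    const double diff = a.ppl - b.ppl;
    return diff * diff <= a.err * a.err + b.err * b.err;
}

/* Print the verdict for the two runs reported above. */
static void report_iq4_nl_runs(void) {
    const ppl_result base = {5.8822, 0.03282};  /* Base - 00131d6e      */
    const ppl_result pr   = {5.8828, 0.03283};  /* PR Branch - d1788b72 */
    printf("within noise: %s\n", ppl_within_noise(base, pr) ? "yes" : "no");
}
```

For the numbers in the table, the difference is 0.0006 against individual uncertainties of about 0.033, so the two runs are indistinguishable by this measure.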

github-actions bot added the "ggml" label (changes relating to the ggml tensor library for machine learning) on Jul 27, 2025

ggerganov mentioned this pull request Jul 28, 2025

repack : optimize mul_mat_id path #14918

Open

1 task

ggml : repack block_iq4_nlx8

d1788b7

ggml-ci

ggerganov force-pushed the gg/repack-iq4_nl-avx2 branch from e2661ed to d1788b7 on July 30, 2025 12:36
