imatrix : use GGUF by default #14842

compilade · 2025-07-24T02:37:07Z

The GGUF format for imatrix (added in #9400) is a saner default. The old imatrix.dat format doesn't store the per-expert evaluation counts for MoE models, which would make future improvements like #9400 (comment) less accurate.

Previously, the behavior was to only use GGUF when the output filename ended with .gguf. That is too strict in some cases (e.g. when using an additional ~ suffix to mark temporary files), and can also lead to people using the legacy format accidentally.

Since the GGUF-based imatrix format is very close to the internal state of llama-imatrix, converting imatrix.gguf to the imatrix.dat format gives the same result as generating the imatrix.dat file directly. The reverse is not necessarily true (e.g. for MoE models), because imatrix.dat does not store the shape of the evaluation counts.

llama-quantize already doesn't use the imatrix filename to guess its type; it attempts to load the file as GGUF and falls back to the other format when that fails, so the name of the imatrix file doesn't technically matter.
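To illustrate why the filename does not need to carry the type, here is a minimal sketch of content-based detection. It is a simplification and not the llama-quantize code: the real tool attempts a full GGUF load and falls back when that fails, whereas this sketch only inspects the 4-byte `GGUF` magic at the start of the data. The names `imatrix_file_kind` and `detect_imatrix_kind` are hypothetical.

```cpp
#include <cstring>
#include <string>

// Hypothetical labels for the two on-disk imatrix formats.
enum class imatrix_file_kind { gguf, legacy_dat };

// Classify a file by its leading bytes rather than by its name.
// GGUF files begin with the 4-byte magic "GGUF"; in this sketch,
// anything else is assumed to be the legacy imatrix.dat format.
static imatrix_file_kind detect_imatrix_kind(const std::string & head) {
    static const char magic[4] = {'G', 'G', 'U', 'F'};
    if (head.size() >= 4 && std::memcmp(head.data(), magic, 4) == 0) {
        return imatrix_file_kind::gguf;
    }
    return imatrix_file_kind::legacy_dat;
}
```

Under this kind of check, a GGUF imatrix saved as `imatrix.gguf~` or even `imatrix.dat` would still be recognized by its contents.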

The new default imatrix output format is GGUF regardless of the output filename. The legacy imatrix.dat format can be produced with --output-format dat.
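The difference between the previous rule and the new one can be sketched as follows. This is an illustration under assumptions, not the code in tools/imatrix/imatrix.cpp: the enum `imatrix_output_format` and the functions `old_rule_uses_legacy` and `new_rule_uses_legacy` are hypothetical names, and `ends_with` is a local stand-in for the `string_ends_with` helper seen in the diff.

```cpp
#include <string>

// Hypothetical stand-in for the value selected by --output-format.
enum class imatrix_output_format { gguf, dat };

// Local helper, similar in spirit to string_ends_with in the diff.
static bool ends_with(const std::string & s, const std::string & suffix) {
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Previous rule: the legacy format was used whenever the output name
// did not end with ".gguf".
static bool old_rule_uses_legacy(const std::string & fname) {
    return !ends_with(fname, ".gguf");
}

// New rule: the filename is ignored; only the explicit option
// selects the legacy format.
static bool new_rule_uses_legacy(const std::string & /*fname*/, imatrix_output_format fmt) {
    return fmt == imatrix_output_format::dat;
}
```

With the previous rule, a temporary name such as `imatrix.gguf~` would have silently produced the legacy format; with the new rule it produces GGUF unless `--output-format dat` is given.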



compilade · 2025-07-24T02:58:04Z

tools/imatrix/imatrix.cpp

-    // TODO: use the new format in more cases
-    if (!string_ends_with(fname, ".gguf")) {
-        LOG_WRN("\n%s: saving to legacy imatrix format because output suffix is not .gguf\n", __func__);
+    if ((imat_type == COMMON_IMATRIX_FORMAT_AUTO && string_ends_with(fname, ".dat")) ||


It might be better to instead simply use GGUF regardless of the file name by default.

I don't know why I'm hesitating.

Generating new imatrix.dat files has limited uses (reading them, however, has many). The main user who would benefit doesn't really use mainline llama.cpp for this anymore (see ikawrakow/ik_llama.cpp#15 (reply in thread)).

This simplification could also remove the need for the common_imatrix_format_type enum, which could be a bool instead.

EDIT: I've changed this in 1ef3cc1, the format is no longer decided with the output filename.

The legacy format can only be produced with --output-format dat

CISC

LGTM, if you're still hesitant, add a warning if filename doesn't end with .gguf?

* imatrix : use GGUF by default
* imatrix : use GGUF regardless of the output filename

The legacy format can only be produced with --output-format dat

imatrix : use GGUF by default

53f65c3

Still uses the old format when the output filename ends with .dat but this can be overridden with --output-format

github-actions bot added the examples label Jul 24, 2025

compilade commented Jul 24, 2025


imatrix : use GGUF regardless of the output filename

1ef3cc1

The legacy format can only be produced with --output-format dat

CISC approved these changes Jul 24, 2025


CISC merged commit d31192b into master Aug 3, 2025
47 checks passed

This was referenced Aug 4, 2025

Bug: imatrix quantization failing for nvidia Nemotron 49B v1.5 ikawrakow/ik_llama.cpp#659

Open

imatrix : use GGUF to store importance matrices #9400

Merged

imatrix : warn when GGUF imatrix is saved without .gguf suffix #15076

Merged

compilade mentioned this pull request Aug 11, 2025

Bug: imatrix is now encapsulated in a GGUF file in mainline. ikawrakow/ik_llama.cpp#664

Closed
