### Feature request
Add support for SINQ quantization of Hugging Face-compatible models, enabling users to apply it directly through the quantization configuration. The SINQ method, recently introduced in the paper SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLM Weights, has quickly gained significant attention. It demonstrates superior quantization quality compared to existing approaches such as HQQ, while also offering substantially faster quantization times.
### Motivation
Integrating the SINQ quantization algorithm into the Transformers library (as has already been done for HQQ, AWQ, HIGGS, ...) would allow users to quantize models simply by specifying the desired quantization method and parameters in the configuration, removing the need to consult and directly use custom code from the SINQ repository. This integration aims to streamline and simplify the quantization process while leveraging the existing features and infrastructure of the Transformers library.
### Your contribution
I’m going to submit a pull request that includes the implementation and tests for the SINQ quantization integration. It enables users to specify the quantization method directly through the configuration, as shown below:
```python
cfg = SinqConfig(
    nbits=4,
    group_size=64,
    tiling_mode="1D",
    method="sinq",
    dtype="auto",
    modules_to_not_convert=["lm_head"],
    device="cuda:1",
)
```

Once the configuration is defined, the model can be quantized simply by calling the `from_pretrained()` function with the specified configuration settings.
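For illustration, a minimal loading sketch under the proposed API: `SinqConfig` is the class this PR would add, the checkpoint name is only a placeholder, and passing the config via the `quantization_config` keyword follows the convention of the existing quantizers (HQQ, AWQ, ...) already supported by `from_pretrained()`.

```python
from transformers import AutoModelForCausalLM

# Assumes `cfg` is the SinqConfig defined above. As with other
# calibration-free quantizers (e.g. HQQ), the weights would be
# quantized on the fly while the model is loaded; the checkpoint
# name below is just an example.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    quantization_config=cfg,
)
```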