Support intern-s1 #14875
Conversation
@CISC hi, could you tell me how to fix this error? It doesn't seem reasonable to me.

Referenced: llama.cpp/convert_hf_to_gguf.py, lines 3002 to 3005 in 5eba3e3
convert_hf_to_gguf.py (outdated):

```python
        self._set_vocab_gpt2()

    def _set_vocab_interns1(self):
        tokens, toktypes, tokpre = self.get_vocab_base()
```
This does not work because Intern-S1 requires custom code; you must handle that here instead of calling the base-class get_vocab_base().

The Intern-S1 tokenizer looks like it's fairly special, so I think it requires custom handling to work as intended, not just using AutoTokenizer.
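For the conversion-script side alone, a minimal sketch of what "handling it here" (rather than calling the base-class get_vocab_base()) might look like, assuming the Intern-S1 tokenizer can at least be loaded with trust_remote_code=True. This is not the PR's code, the pre-tokenizer name is a placeholder, and it only covers exporting the merged vocabulary; it does not solve the runtime sub-vocab behaviour discussed below:

```python
# Illustrative sketch only. It mirrors the shape of _set_vocab_gpt2() /
# get_vocab_base() in convert_hf_to_gguf.py, but loads the remote-code
# tokenizer itself; token-type classification is deliberately simplified.
from transformers import AutoTokenizer
import gguf


def _set_vocab_interns1(self):
    tokenizer = AutoTokenizer.from_pretrained(self.dir_model, trust_remote_code=True)
    vocab = tokenizer.get_vocab()
    vocab_size = max(vocab.values()) + 1
    reverse_vocab = {tok_id: tok for tok, tok_id in vocab.items()}

    tokens: list[str] = []
    toktypes: list[int] = []
    for i in range(vocab_size):
        if i not in reverse_vocab:
            tokens.append(f"[PAD{i}]")              # fill holes in the id space
            toktypes.append(gguf.TokenType.UNUSED)
        elif i in tokenizer.all_special_ids:
            tokens.append(reverse_vocab[i])
            toktypes.append(gguf.TokenType.CONTROL)
        else:
            tokens.append(reverse_vocab[i])
            toktypes.append(gguf.TokenType.NORMAL)

    self.gguf_writer.add_tokenizer_model("gpt2")
    self.gguf_writer.add_tokenizer_pre("default")   # placeholder pre-tokenizer name
    self.gguf_writer.add_token_list(tokens)
    self.gguf_writer.add_token_types(toktypes)

    special_vocab = gguf.SpecialVocab(self.dir_model, load_merges=True)
    special_vocab.add_to_gguf(self.gguf_writer)
```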
@CISC Hi, thanks for the reminder. Indeed, the Intern-S1 tokenizer is special: it is based on the Qwen3 BPE tokenizer and extended with three SPM tokenizer models, using regex patterns to decide which sub-vocab to use when tokenizing. I don't know how to implement this in llama.cpp. Do you have any suggestions for this special case? Thanks!
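To illustrate the routing just described (the structure, names, and patterns here are hypothetical, not Intern-S1's actual implementation), a regex-driven sub-vocab dispatch could look roughly like this:

```python
import re
from typing import Callable

# A sub-tokenizer is anything that turns a string into token pieces,
# e.g. an SPM model for SMILES strings or the base Qwen3-style BPE.
SubTokenizer = Callable[[str], list[str]]


def route_and_tokenize(
    text: str,
    base_bpe: SubTokenizer,
    sub_vocabs: list[tuple[re.Pattern[str], SubTokenizer]],
) -> list[str]:
    """Spans matched by a sub-vocab pattern go to that sub-tokenizer;
    everything else falls through to the base BPE tokenizer."""
    # Collect all sub-vocab matches in order of appearance in the text.
    matches = sorted(
        ((m, tok) for pattern, tok in sub_vocabs for m in pattern.finditer(text)),
        key=lambda mt: mt[0].start(),
    )
    pieces: list[str] = []
    pos = 0
    for m, tok in matches:
        if m.start() < pos:
            continue  # skip matches overlapping an earlier one
        if m.start() > pos:
            pieces += base_bpe(text[pos:m.start()])
        pieces += tok(m.group(0))
        pos = m.end()
    if pos < len(text):
        pieces += base_bpe(text[pos:])
    return pieces
```

In llama.cpp itself, the equivalent of this dispatch would have to live in the C++ tokenizer (llama-vocab.cpp), which is the hard part noted below.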
No easy feat, I'm afraid, SMILES especially; you will have to add a special case for it to do the sub-vocab matching and implement the tokenizer in llama-vocab.cpp.
Agreed. After consideration, the sub-vocab feature will not be added in this PR.
Support internlm/Intern-S1