Halved bge-large model gives NaN embeddings when used from Swift #356

@jrturton

I created a half-precision (float16) version of bge-large using the following Python code:

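from transformers import AutoModel, AutoTokenizer

# Load the full-precision model, cast its weights to float16, and save it.
hf_model = AutoModel.from_pretrained("BAAI/bge-large-en-v1.5")
hf_model.half()
hf_model.save_pretrained(model_path)

# Save the matching tokenizer.
tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-large-en-v1.5")
tokenizer.save_pretrained(tokenizer_path)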

The conversion seems to work fine. However, loading this model using MLXEmbedders:

let config = MLXEmbedders.ModelConfiguration(directory: url)
let model = try await MLXEmbedders.loadModelContainer(configuration: config)

produces NaN embeddings for certain texts when used as follows:

let embedding = await model.perform { (model: EmbeddingModel, tokenizer, _) -> [Double] in
    let inputTokens = tokenizer.encode(text: source, addSpecialTokens: true)
    let padded = MLXArray(inputTokens)
    let mask = MLXArray.ones(like: padded).asType(.bool)
    let tokenTypes = MLXArray.zeros(like: padded)
    let modelOutput = model(
        padded.expandedDimensions(axis: 0),
        positionIds: nil,
        tokenTypeIds: tokenTypes.expandedDimensions(axis: 0),
        attentionMask: mask.expandedDimensions(axis: 0)
    )
    let pooler = Pooling(strategy: .first)
    let result = pooler(modelOutput, normalize: true, applyLayerNorm: false)
    result.eval()
    let squeezed = result.squeezed()
    return vDSP.floatToDouble(squeezed.asArray(Float.self))
}

The same halved model files, with the same text, work fine in Python:

model = EmbeddingModel(model_path=model_path,
                       pooling_strategy="first",
                       normalize=True,
                       max_length=512)
embs = model.encode(texts, show_progress=False)

Using the original model in Swift (via `MLXEmbedders.loadModelContainer(configuration: .bge_large)`) works fine.
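Since only certain texts are affected, it may help to flag exactly which embeddings contain non-finite values. The following is a minimal sketch, not part of any library: `non_finite_rows` is a hypothetical helper, and it assumes the embeddings can be iterated as rows of numbers (for example a nested list or a NumPy array).

```python
import math


def non_finite_rows(embeddings):
    """Return the indices of embeddings that contain NaN or infinite values.

    Hypothetical helper: `embeddings` is assumed to be an iterable of rows,
    each row an iterable of numbers.
    """
    return [
        i
        for i, row in enumerate(embeddings)
        if any(not math.isfinite(float(x)) for x in row)
    ]
```

With the Python snippet above, `[texts[i] for i in non_finite_rows(embs)]` would list the affected inputs, and the same texts could then be fed to the Swift code for comparison.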

The model files are too large to attach to an issue, see rdar://154959818 for details.
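For anyone reproducing this without the attached files, one way to confirm that a regenerated checkpoint really is stored in half precision is to read the dtypes from the safetensors header. This is a rough sketch under two assumptions: that `save_pretrained` wrote a single `model.safetensors` file into `model_path`, and that the file follows the standard safetensors layout (an 8-byte little-endian header length followed by a JSON header). `safetensors_dtype_counts` is a hypothetical helper name.

```python
import json
import struct
from collections import Counter


def safetensors_dtype_counts(path):
    """Count tensors per dtype by reading only the safetensors header."""
    with open(path, "rb") as f:
        # First 8 bytes: unsigned 64-bit little-endian length of the JSON header.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return Counter(
        entry["dtype"] for name, entry in header.items() if name != "__metadata__"
    )
```

Calling it on the saved `model.safetensors` should report mostly `F16` entries for the halved model and `F32` for the original; anything else would suggest the conversion did not do what was expected.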
