Creating a halved (float16) version of bge-large with the following Python code:

```python
from transformers import AutoModel, AutoTokenizer

hf_model = AutoModel.from_pretrained("BAAI/bge-large-en-v1.5")
hf_model.half()
hf_model.save_pretrained(model_path)

tokenizer = AutoTokenizer.from_pretrained("BAAI/bge-large-en-v1.5")
tokenizer.save_pretrained(tokenizer_path)
```
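For context on why halving can be lossy: float16's largest finite value is about 65504, so any float32 weight or intermediate value beyond that range becomes inf after conversion (a minimal numpy sketch, not the model's actual weights):

```python
import numpy as np

# float16's largest finite value is ~65504; anything beyond overflows to inf.
w = np.array([1.0, 65000.0, 70000.0], dtype=np.float32)
print(w.astype(np.float16))  # the third value overflows to inf
```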
Seems to work just fine. However, loading this model with MLXEmbedders:

```swift
let config = MLXEmbedders.ModelConfiguration(directory: url)
let model = try await MLXEmbedders.loadModelContainer(configuration: config)
```
will produce NaN embeddings for certain texts when used as follows:

```swift
let embedding = await model.perform { (model: EmbeddingModel, tokenizer, _) -> [Double] in
    let inputTokens = tokenizer.encode(text: source, addSpecialTokens: true)
    let padded = MLXArray(inputTokens)
    let mask = MLXArray.ones(like: padded).asType(.bool)
    let tokenTypes = MLXArray.zeros(like: padded)
    let modelOutput = model(
        padded.expandedDimensions(axis: 0),
        positionIds: nil,
        tokenTypeIds: tokenTypes.expandedDimensions(axis: 0),
        attentionMask: mask.expandedDimensions(axis: 0))
    let pooler = Pooling(strategy: .first)
    let result = pooler(modelOutput, normalize: true, applyLayerNorm: false)
    result.eval()
    let squeezed = result.squeezed()
    return vDSP.floatToDouble(squeezed.asArray(Float.self))
}
```
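A plausible source of the NaNs (an assumption on my part, not something I have traced in MLXEmbedders): dot products computed and stored entirely in float16, as in attention scores, can overflow to inf once they exceed ~65504, and inf - inf (e.g. during a softmax max-subtraction) yields NaN. A minimal numpy sketch:

```python
import numpy as np

# Dot products stored in fp16 overflow once they exceed ~65504:
q = np.full((1, 64), 30.0, dtype=np.float16)
k = np.full((64, 1), 35.0, dtype=np.float16)
scores = q @ k          # 64 * 30 * 35 = 67200 -> inf in fp16
# inf - inf (as in a softmax max-subtraction) produces NaN:
print(scores - scores)
```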
The same halved model files, with the same text, work fine in Python:

```python
model = EmbeddingModel(model_path=model_path,
                       pooling_strategy="first",
                       normalize=True,
                       max_length=512)
embs = model.encode(texts, show_progress=False)
```
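One guess (not verified) as to why the Python path succeeds with the exact same files: if inference there upcasts the fp16 weights to float32 before computing, the overflow never happens. A numpy sketch of the difference:

```python
import numpy as np

w = np.full(64, 200.0, dtype=np.float16)
x = np.full(64, 200.0, dtype=np.float16)
# Result kept in fp16: 64 * 200 * 200 = 2,560,000, far past fp16's ~65504 max
print(np.dot(w, x))                                        # inf
# Same values upcast to float32 stay finite
print(np.dot(w.astype(np.float32), x.astype(np.float32)))  # 2560000.0
```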
Using the original model in Swift (via `MLXEmbedders.loadModelContainer(configuration: .bge_large)`) works fine.
The model files are too large to attach to an issue, see rdar://154959818 for details.