Description
Is there an existing issue for this?
- I have searched the existing issues
Describe the bug
Title: client.query sometimes returns empty vector for the last entity, while data exists in DB
Description:
When using client.query to retrieve rows that include a FLOAT_VECTOR field, the last entity in the result may contain an empty vector ([]). However, the same entity queried with client.get or via query_iterator returns the correct vector. This suggests a bug in the client-side deserialization of query responses.
Minimal reproducible example:
from pymilvus import MilvusClient, DataType
import uuid
import numpy as np
# 1. Connect to Milvus
cli = MilvusClient("http://localhost:19530")
collection_name = "repro_chunks"
# 2. Clean up and create collection
if cli.has_collection(collection_name):
    cli.drop_collection(collection_name)
cli.create_collection(
    collection_name,
    fields=[
        {"name": "id", "type": DataType.VARCHAR, "is_primary": True, "max_length": 64},
        {"name": "doc_id", "type": DataType.VARCHAR, "max_length": 64},
        {"name": "dense_vec", "type": DataType.FLOAT_VECTOR, "dim": 8},
    ],
)
# 3. Insert some rows (use small vectors, 8D)
doc_id = str(uuid.uuid4())
rows = [
    {
        "id": str(uuid.uuid4()),
        "doc_id": doc_id,
        "dense_vec": np.random.rand(8).tolist(),
    }
    for _ in range(5)
]
cli.insert(collection_name=collection_name, data=rows)
# 4. Query all vectors by doc_id with `query`
res = cli.query(
    collection_name=collection_name,
    output_fields=["dense_vec"],
    filter=f'doc_id in ["{doc_id}"]',
)
print("Length:", len(res))
print("Last row:", res[-1])
# Check for empty vectors
empties = [r for r in res if not r["dense_vec"]]
print("Empty vectors found:", empties)
# 5. Double-check same entity via `get`
last_id = res[-1]["id"]
res2 = cli.get(collection_name=collection_name, ids=[last_id], output_fields=["dense_vec"])
print("get() result for last id:", res2)
# 6. Now try query_iterator (works fine)
results = []
iterator = cli.query_iterator(
    collection_name=collection_name,
    output_fields=["dense_vec"],
    filter=f'doc_id in ["{doc_id}"]',
    batch_size=2,
)
while batch := iterator.next():
    results.extend(batch)
iterator.close()
print("query_iterator last row:", results[-1])
Observed behavior:
- cli.query returns an entity with "dense_vec": [].
- cli.get for the same entity returns the correct non-empty vector.
- cli.query_iterator also returns the correct vector.
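To check whether only the last entity is affected, the positions of empty vectors in a result list can be collected. The following is a minimal sketch; `empty_vector_positions` is a hypothetical helper (not part of pymilvus) that works on the plain list of dicts returned by the calls above.

```python
from typing import Any, Dict, List


def empty_vector_positions(
    rows: List[Dict[str, Any]], vector_field: str = "dense_vec"
) -> List[int]:
    """Return the indices of rows whose vector field is missing or empty.

    Hypothetical helper for this report; it only inspects the result dicts.
    """
    positions = []
    for index, row in enumerate(rows):
        vector = row.get(vector_field)
        # Use len() instead of truthiness so array-like values are handled too.
        if vector is None or len(vector) == 0:
            positions.append(index)
    return positions
```

Calling `empty_vector_positions(res)` on the `cli.query` result and comparing the output with `[len(res) - 1]` shows whether the empty vector is confined to the last row.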
Expected Behavior
All methods should consistently return the stored vector.
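The expected consistency can be expressed as a small check that compares two result sets by primary key. This is only a sketch: `mismatched_ids` is a hypothetical helper, and it assumes both result sets are lists of dicts that carry the primary key under `id` and the vector under `dense_vec`, as in the reproduction script.

```python
import math
from typing import Any, Dict, List


def mismatched_ids(
    reference_rows: List[Dict[str, Any]],
    candidate_rows: List[Dict[str, Any]],
    pk_field: str = "id",
    vector_field: str = "dense_vec",
    tol: float = 1e-6,
) -> List[Any]:
    """Return primary keys whose vectors differ between the two result sets."""
    reference = {row[pk_field]: row[vector_field] for row in reference_rows}
    mismatches = []
    for row in candidate_rows:
        expected = reference.get(row[pk_field])
        actual = row.get(vector_field)
        # A missing vector or a different length counts as a mismatch.
        if expected is None or actual is None or len(expected) != len(actual):
            mismatches.append(row[pk_field])
            continue
        # Compare element-wise with a small absolute tolerance.
        if not all(math.isclose(a, b, abs_tol=tol) for a, b in zip(expected, actual)):
            mismatches.append(row[pk_field])
    return mismatches
```

With the variables from the script, `mismatched_ids(results, res)` would be expected to return an empty list if `cli.query` and `cli.query_iterator` agree.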
Steps/Code To Reproduce behavior
res = cli.query(
    collection_name=collection_name,
    output_fields=["dense_vec"],
    filter=f'doc_id in ["{doc_id}"]',
)
Environment details
- Hardware/Software conditions: Ubuntu 22.04 Server
- Method of installation: Docker, Standalone
- Milvus version: v2.5.6
- Milvus configuration: unchanged after installation
- pymilvus version: 2.6.0
- Python version: 3.12
Anything else?
The bug always seems to affect the last entity in the result set.
No vectors in the collection are actually empty.
This points to a client-side bug in deserialization of the query response (possibly a buffer boundary issue).
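Until the cause is identified, one possible workaround is to re-fetch any row whose vector came back empty, since `get` returns the correct data in the reproduction. The sketch below is an assumption-laden illustration, not a fix: `fill_empty_vectors` is a hypothetical helper, it calls `client.get` with the same keyword arguments used in the script above, and it assumes each returned row includes the primary key under `id`.

```python
from typing import Any, Dict, List


def fill_empty_vectors(
    client: Any,
    collection_name: str,
    rows: List[Dict[str, Any]],
    pk_field: str = "id",
    vector_field: str = "dense_vec",
) -> List[Dict[str, Any]]:
    """Return a copy of rows with empty vectors replaced by values from client.get."""
    missing_ids = [
        row[pk_field]
        for row in rows
        if row.get(vector_field) is None or len(row[vector_field]) == 0
    ]
    if not missing_ids:
        return list(rows)
    # Assumption: get() returns dicts containing both pk_field and vector_field.
    refetched = client.get(
        collection_name=collection_name,
        ids=missing_ids,
        output_fields=[vector_field],
    )
    by_id = {row[pk_field]: row[vector_field] for row in refetched}
    patched = []
    for row in rows:
        if row[pk_field] in by_id:
            # Build a new dict so the caller's original rows stay untouched.
            row = {**row, vector_field: by_id[row[pk_field]]}
        patched.append(row)
    return patched
```

In the script above this would be called as `fill_empty_vectors(cli, collection_name, res)`. It costs one extra round trip only when an empty vector is present.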