[Bug]: client.query sometimes returns empty vector for the last entity, while data exists in DB #2986

@majorli

Is there an existing issue for this?

  • I have searched the existing issues

Describe the bug

Title: client.query sometimes returns empty vector for the last entity, while data exists in DB

Description:

When using client.query to retrieve rows that include a FLOAT_VECTOR field, the last entity in the result may contain an empty vector ([]). However, the same entity queried with client.get or via query_iterator returns the correct vector. This suggests a bug in the client-side deserialization of query responses.

Minimal reproducible example:

from pymilvus import MilvusClient, DataType
import uuid
import numpy as np

# 1. Connect to Milvus
cli = MilvusClient("http://localhost:19530")

collection_name = "repro_chunks"

# 2. Clean up and create collection
if cli.has_collection(collection_name):
    cli.drop_collection(collection_name)

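# Build an explicit schema; MilvusClient.create_collection expects a schema
# object (or a dimension for the quick-setup path), not a `fields` list.
schema = MilvusClient.create_schema(auto_id=False, enable_dynamic_field=False)
schema.add_field(field_name="id", datatype=DataType.VARCHAR, is_primary=True, max_length=64)
schema.add_field(field_name="doc_id", datatype=DataType.VARCHAR, max_length=64)
schema.add_field(field_name="dense_vec", datatype=DataType.FLOAT_VECTOR, dim=8)

# Index the vector field so the collection is loaded on creation and can be queried.
index_params = cli.prepare_index_params()
index_params.add_index(field_name="dense_vec", index_type="AUTOINDEX", metric_type="L2")

cli.create_collection(collection_name, schema=schema, index_params=index_params)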

# 3. Insert some rows (use small vectors, 8D)
doc_id = str(uuid.uuid4())
rows = [
    {
        "id": str(uuid.uuid4()),
        "doc_id": doc_id,
        "dense_vec": np.random.rand(8).tolist(),
    }
    for _ in range(5)
]

cli.insert(collection_name=collection_name, data=rows)

# 4. Query all vectors by doc_id with `query`
res = cli.query(
    collection_name=collection_name,
    output_fields=["dense_vec"],
    filter=f'doc_id in ["{doc_id}"]',
)

print("Length:", len(res))
print("Last row:", res[-1])

# Check for empty vectors
empties = [r for r in res if not r["dense_vec"]]
print("Empty vectors found:", empties)

# 5. Double-check same entity via `get`
last_id = res[-1]["id"]
res2 = cli.get(collection_name=collection_name, ids=[last_id], output_fields=["dense_vec"])
print("get() result for last id:", res2)

# 6. Now try query_iterator (works fine)
results = []
iterator = cli.query_iterator(
    collection_name=collection_name,
    output_fields=["dense_vec"],
    filter=f'doc_id in ["{doc_id}"]',
    batch_size=2,
)
while batch := iterator.next():
    results.extend(batch)
iterator.close()

print("query_iterator last row:", results[-1])

Observed behavior:

cli.query returns an entity with "dense_vec": [].

cli.get for the same entity returns the correct non-empty vector.

cli.query_iterator also returns the correct vector.

Expected Behavior

All methods should consistently return the stored vector.
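The expectation above can be written as a small check. This is a minimal sketch, not part of pymilvus: `vector_mismatches` is a hypothetical helper, and it assumes each result row is a dict that carries the primary key under `id` and the vector under `dense_vec`, as in the repro script.

```python
def vector_mismatches(query_rows, reference_rows, pk_field="id", vector_field="dense_vec"):
    """Return primary keys whose vector differs between two result sets.

    Hypothetical helper: `query_rows` would be the output of cli.query and
    `reference_rows` the output of cli.get or the collected query_iterator batches.
    """
    # Index the reference rows by primary key for direct lookup.
    reference = {row[pk_field]: list(row[vector_field]) for row in reference_rows}

    mismatched = []
    for row in query_rows:
        pk = row[pk_field]
        # A row is inconsistent if it is missing from the reference set
        # or if its vector (including an empty one) differs.
        if pk not in reference or list(row[vector_field]) != reference[pk]:
            mismatched.append(pk)
    return mismatched
```

With the variables from the repro script, `vector_mismatches(res, results)` would be expected to return an empty list if `query` and `query_iterator` agree; any primary key it returns identifies a row where they do not.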

Steps/Code To Reproduce behavior

res = cli.query(
    collection_name=collection_name,
    output_fields=["dense_vec"],
    filter=f'doc_id in ["{doc_id}"]',
)

Environment details

- Hardware/Software conditions: Ubuntu 22.04 Server
- Method of installation: Docker, Standalone
- Milvus version: v2.5.6
- Milvus configuration: unchanged after installation
- pymilvus version: 2.6.0
- Python version: 3.12

Anything else?

The bug always seems to affect the last entity in the result set.

No vectors in the collection are actually empty.

This points to a client-side bug in the deserialization of the query response (possibly a buffer-boundary issue).
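Until the cause is confirmed, one possible stopgap is to re-fetch any row whose vector came back empty, since `get` returns the correct data in the repro. The sketch below is an assumption-laden workaround, not a fix: `refetch_empty_vectors` is a hypothetical helper, and it assumes `client.get` returns dict rows that include the primary key under `id`.

```python
def refetch_empty_vectors(client, collection_name, rows, pk_field="id", vector_field="dense_vec"):
    """Patch rows whose vector is empty using a follow-up client.get() call.

    Hypothetical helper; `client` is expected to behave like MilvusClient.
    """
    def is_empty(row):
        vec = row.get(vector_field)
        return vec is None or len(vec) == 0

    missing_ids = [row[pk_field] for row in rows if is_empty(row)]
    if not missing_ids:
        return list(rows)

    # Fetch only the affected entities by primary key.
    fetched = client.get(
        collection_name=collection_name,
        ids=missing_ids,
        output_fields=[vector_field],
    )
    vectors = {row[pk_field]: row[vector_field] for row in fetched}

    patched = []
    for row in rows:
        pk = row[pk_field]
        if is_empty(row) and pk in vectors:
            # Keep the other fields and replace only the vector.
            patched.append({**row, vector_field: vectors[pk]})
        else:
            patched.append(row)
    return patched
```

In the repro script this would be called as `refetch_empty_vectors(cli, collection_name, res)`. It costs one extra round trip when an empty vector is present and none otherwise.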

Labels: kind/bug
