milvus: [Bug]: Range Search Limited to 6400 Results

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: 2.3.2
- Deployment mode(standalone or cluster): cluster
- MQ type(rocksmq, pulsar or kafka): kafka
- SDK version(e.g. pymilvus v2.0.0rc2): pymilvus==2.3.2
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

When using range search with the search param radius, Milvus only returns up to 6400 results, even though topk/limit can be up to 16,384.

Expected Behavior

I expect to get up to 16384 results, as long as those results fall within the range_filter and radius.
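For reference, a range search that bounds results on both sides would look roughly like this (values are illustrative, and the collection c / query vector ref_embd are the same as in the reproduction below; for L2, radius is the outer distance bound and range_filter the inner one):

# Illustrative only: range search with both bounds for the L2 metric
search_params = {
    'metric_type': 'L2',
    'params': {
        'radius': 10.0,        # outer distance bound
        'range_filter': 0.0,   # inner distance bound
    },
}

results = c.search(
    data=[ref_embd],
    anns_field='img',
    param=search_params,
    limit=16384,
    output_fields=['id'],
)[0]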

Steps To Reproduce

I have a collection with around 1 million records.

If I perform a search with no range filter and limit=16384, I get 16384 results.

results = c.search(
    data=[ref_embd],
    #expr="candidate_task_id != ' '",
    anns_field='img',
    param={'params': {}, 'metric_type': 'L2'},
    limit=16384,
    offset=0,
    output_fields=['id']
)[0]

ids = [result.id for result in results]
len(ids)
>> 16384

If I now add radius into params, while keeping limit=16384, I only get 6400 results.

results = c.search(
    data=[ref_embd],
    #expr="candidate_task_id != ' '",
    anns_field='img',
    param={'params': {'radius': 10}, 'metric_type': 'L2'},
    limit=16384,
    offset=0,
    output_fields=['id']
)[0]

ids = [result.id for result in results]
len(ids)
>> 6400

I have verified that this happens for ALL collections I have.

I have also verified that when I get 16384 records, the largest distance value is < 1.0, so using a radius of 10.0 should still return all 16384 results.
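For reference, that check looks roughly like this (each hit in the pymilvus results exposes a distance attribute):

# Largest distance among the 16384 hits from the plain top-k search above
distances = [result.distance for result in results]
print(max(distances))  # < 1.0 on my data, so radius=10.0 should cover all of them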

Another note: search speed is orders of magnitude slower when using range search, which I don't understand.

For example, when I use a smaller limit like 30, this same search without radius runs in around 120ms, and when I add radius to params it takes 7 seconds!
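A minimal sketch of how the timing comparison can be reproduced (simple wall-clock timing around the same search call; the 120ms vs 7s numbers above are from my environment):

import time

# Time the range search; drop 'radius' from params to time the plain search instead
start = time.perf_counter()
results = c.search(
    data=[ref_embd],
    anns_field='img',
    param={'params': {'radius': 10}, 'metric_type': 'L2'},
    limit=30,
    offset=0,
    output_fields=['id']
)[0]
print(f"search took {time.perf_counter() - start:.3f}s")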



Milvus Log

_No response_

Anything else?

_No response_

Most upvoted comments

When the segment gets sealed, the reason it is slow is that the search runs sequentially over it. QPS will actually increase as the number of segments gets smaller.

  1. Is there anything we can do for now to improve performance with DiskANN?
  2. Should we expect this PR to help? (Seems like maybe, if anything, it might return the correct number of records? But not necessarily improve performance?)
  3. @xiaofan-luan mentioned HNSW_PQ/SQ, but it seems these are no longer available. Am I understanding that correctly?

Hi @pakelley

  1. The algorithm needs to be refactored and we are working on it. Changing this default value to 1.0 can help accelerate it, with a small trade-off.
  2. This only helps with the number of results that get returned.
  3. We removed the old implementation and are working on a new one. It will still take some time.

@liliu-z yep, that’s correct. The example dataset in Hakan’s code (which has 17,760 records) was “fast” 2 days ago, and yesterday became “slow” and would only return 6400 records.

/assign @congqixia could you please also take a look?

/assign @jiaoew1991 /unassign

@hakan458 which index type are you running? Could you please provide the full Milvus logs? Please refer to this doc to export the whole set of Milvus logs for investigation. Also, please attach the etcd backup, which would help us understand the "slow" issue. See https://github.com/milvus-io/birdwatcher for details on how to back up etcd with birdwatcher.
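For reference, a quick way to print the index type and params from pymilvus (assuming the same Collection object c as in the report):

# Print each index definition on the collection, including its index_type
for idx in c.indexes:
    print(idx.field_name, idx.params)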

/assign @hakan458