milvus: [Bug]: Query performance is very low (below 150 qps) despite a large number of query nodes

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: 2.2.2
- Deployment mode(standalone or cluster): cluster AWS 
- MQ type(rocksmq, pulsar or kafka):  pulsar  
- SDK version(e.g. pymilvus v2.0.0rc2): github.com/milvus-io/milvus-sdk-go/v2 v2.2.0
- OS(Ubuntu or CentOS): CentOS
- CPU/Memory: c5.4xlarge => 16 VCPU, 32GB
- GPU: 
- Others:

Current Behavior

When querying from a new client node (another AWS server) and running a random benchmark, I see very low throughput.

I’m running 4 processes across 2 machines (1 local + 1 AWS server), each sending batches of 1000 queries with 10 goroutines. Each goroutine has a single connection to a single proxy node (chosen randomly from 2 proxy nodes).

Here’s my collection info:

  • name: embedding
  • FloatVector: 384
  • _default_idx: IVF_FLAT
  • metric_type: IP nlist:1800
  • Approx Entity Count :11,032,326
  • Collection is loaded and replicated on 30 query nodes.
  • Collection gains about 3k-4k vectors/second (through another process inserting new records)

Collection ID: 438693529927026545
Collection Name: mycollection
Partitions:
Fields:

  • Field ID: 0 Field Name: RowID Field Type: Int64
  • Field ID: 1 Field Name: Timestamp Field Type: Int64
  • Field ID: 100 Field Name: entity_id Field Type: Int64 - Primary Key, AutoID: false
  • Field ID: 101 Field Name: text_count Field Type: Int16
  • Field ID: 102 Field Name: hash_count Field Type: Int8
  • Field ID: 103 Field Name: other_count Field Type: Int8
  • Field ID: 104 Field Name: mention_count Field Type: Int8
  • Field ID: 105 Field Name: media_count Field Type: Int8
  • Field ID: 106 Field Name: is_bool Field Type: Bool
  • Field ID: 107 Field Name: has_quote Field Type: Bool
  • Field ID: 108 Field Name: creation_timestamp Field Type: Int32
  • Field ID: 109 Field Name: embedding Field Type: FloatVector - Type Param dim: 384

Consistency Level: Bounded

Start position for channel by-dev-rootcoord-dml_32: [8 42 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_35: [8 45 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_56: [8 66 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_30: [8 40 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_47: [8 57 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_48: [8 58 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_51: [8 61 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_52: [8 62 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_34: [8 44 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_38: [8 48 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_42: [8 52 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_43: [8 53 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_44: [8 54 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_55: [8 65 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_37: [8 47 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_46: [8 56 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_50: [8 60 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_53: [8 63 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_54: [8 64 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_41: [8 51 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_40: [8 50 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_45: [8 55 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_49: [8 59 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_59: [8 69 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_36: [8 46 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_39: [8 49 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_57: [8 67 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_31: [8 41 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_58: [8 68 16 0 24 0 32 0]
Start position for channel by-dev-rootcoord-dml_33: [8 43 16 0 24 0 32 0]

Here’s the query:

	vec2search := []entity.Vector{searchParams.vector}
	start := time.Now()

	sp, _ := entity.NewIndexIvfFlatSearchParam(1)
	sRet, err := milvusClient.Search(ctx, "mycollection", nil, "", []string{"entity_id"},
		vec2search, "embedding", entity.IP, 20, sp,
		client.WithSearchQueryConsistencyLevel(entity.ClBounded), client.WithLimit(20))
	if err != nil {
		fmt.Println(err)
		return false, 0, 0, fmt.Errorf("failed to search collection, err: %v", err)
	}

Process 1:

Queried nbSearch: 1000, nbError: 0 records in 29.460374951s (total:4m53.267963685s) => 33.94 qps (avg latency: 293.27ms, avg results: 20.00) 
Queried nbSearch: 1000, nbError: 0 records in 29.839000831s (total:4m55.789648876s) => 33.51 qps (avg latency: 295.79ms, avg results: 20.00) 
Queried nbSearch: 1000, nbError: 0 records in 30.076635863s (total:4m59.234612841s) => 33.25 qps (avg latency: 299.23ms, avg results: 20.00) 
Queried nbSearch: 1000, nbError: 0 records in 29.815182548s (total:4m56.846613236s) => 33.54 qps (avg latency: 296.85ms, avg results: 20.00) 
Queried nbSearch: 1000, nbError: 0 records in 30.022772186s (total:4m58.777589485s) => 33.31 qps (avg latency: 298.78ms, avg results: 20.00) 
Queried nbSearch: 1000, nbError: 0 records in 29.117218885s (total:4m49.991001462s) => 34.34 qps (avg latency: 289.99ms, avg results: 20.00) 
Queried nbSearch: 1000, nbError: 0 records in 27.628981886s (total:4m34.572110472s) => 36.20 qps (avg latency: 274.57ms, avg results: 20.00)

Process 2

Queried nbSearch: 1000, nbError: 0 records in 27.778894894s (total:4m36.510751041s) => 36.00 qps (avg latency: 276.51ms, avg results: 20.00) 
Queried nbSearch: 1000, nbError: 0 records in 27.79649825s (total:4m36.765207525s) => 35.98 qps (avg latency: 276.76ms, avg results: 20.00) 
Queried nbSearch: 1000, nbError: 0 records in 27.916158123s (total:4m38.357692297s) => 35.82 qps (avg latency: 278.36ms, avg results: 20.00) 
Queried nbSearch: 1000, nbError: 0 records in 27.888368544s (total:4m37.913993133s) => 35.86 qps (avg latency: 277.91ms, avg results: 20.00) 
Queried nbSearch: 1000, nbError: 0 records in 28.066101082s (total:4m39.291657502s) => 35.63 qps (avg latency: 279.29ms, avg results: 20.00) 

Process 3

Queried nbSearch: 1000, nbError: 0 records in 29.68494171s (total:4m55.683594435s) => 33.69 qps (avg latency: 295.68ms, avg results: 20.00) 
Queried nbSearch: 1000, nbError: 0 records in 29.485262268s (total:4m53.881224462s) => 33.92 qps (avg latency: 293.88ms, avg results: 20.00) 
Queried nbSearch: 1000, nbError: 0 records in 28.993143544s (total:4m48.817353677s) => 34.49 qps (avg latency: 288.82ms, avg results: 20.00) 
Queried nbSearch: 1000, nbError: 0 records in 29.305538861s (total:4m51.002148977s) => 34.12 qps (avg latency: 291.00ms, avg results: 20.00) 
Queried nbSearch: 1000, nbError: 0 records in 29.322604115s (total:4m50.888327916s) => 34.10 qps (avg latency: 290.89ms, avg results: 20.00) 
Queried nbSearch: 1000, nbError: 0 records in 29.495849313s (total:4m53.244181178s) => 33.90 qps (avg latency: 293.24ms, avg results: 20.00) 

Process 4

Queried nbSearch: 1000, nbError: 0 records in 1m14.986758708s (total:6m13.327321304s) => 13.34 qps (avg latency: 373.33ms, avg results: 20.00) 
Queried nbSearch: 1000, nbError: 0 records in 1m17.293226959s (total:6m24.871092199s) => 12.94 qps (avg latency: 384.87ms, avg results: 20.00) 
Queried nbSearch: 1000, nbError: 0 records in 1m16.322780041s (total:6m19.836449932s) => 13.10 qps (avg latency: 379.84ms, avg results: 20.00) 
Queried nbSearch: 1000, nbError: 0 records in 1m16.321409708s (total:6m19.74849057s) => 13.10 qps (avg latency: 379.75ms, avg results: 20.00) 
Queried nbSearch: 1000, nbError: 0 records in 1m17.547682833s (total:6m25.682184784s) => 12.90 qps (avg latency: 385.68ms, avg results: 20.00) 

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

No response

About this issue

  • State: closed
  • Created a year ago
  • Comments: 17 (9 by maintainers)

Most upvoted comments

/assign @liliu-z pls investigate on this issue when you have time