milvus: [Bug]: The specified key does not exist.

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: v2.0 GA
- Deployment mode (standalone or cluster): cluster
- SDK version (e.g. pymilvus v2.0.0rc2): milvus-sdk-go v2
- OS (Ubuntu or CentOS): Ubuntu
- CPU/Memory: 48 cores / 384 GB
- GPU: V100
- Others:

Current Behavior

[2022/02/21 06:42:40.940 +00:00] [ERROR] [impl.go:342] [“The specified key does not exist.”] [stack=“github.com/milvus-io/milvus/internal/querynode.(*QueryNode).LoadSegments.func1\n\t/go/src/github.com/milvus-io/milvus/internal/querynode/impl.go:342\ngithub.com/milvus-io/milvus/internal/querynode.(*QueryNode).LoadSegments\n\t/go/src/github.com/milvus-io/milvus/internal/querynode/impl.go:351\ngithub.com/milvus-io/milvus/internal/distributed/querynode.(*Server).LoadSegments\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/service.go:356\ngithub.com/milvus-io/milvus/internal/proto/querypb._QueryNode_LoadSegments_Handler.func1\n\t/go/src/github.com/milvus-io/milvus/internal/proto/querypb/query_coord.pb.go:3456\ngithub.com/grpc-ecosystem/go-grpc-middleware/tracing/opentracing.UnaryServerInterceptor.func1\n\t/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.3.0/tracing/opentracing/server_interceptors.go:38\ngithub.com/milvus-io/milvus/internal/proto/querypb._QueryNode_LoadSegments_Handler\n\t/go/src/github.com/milvus-io/milvus/internal/proto/querypb/query_coord.pb.go:3458\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/grpc@v1.38.0/server.go:1286\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/grpc@v1.38.0/server.go:1609\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.2\n\t/go/pkg/mod/google.golang.org/grpc@v1.38.0/server.go:934”]

Expected Behavior

QueryNode should automatically reload the segments after a restart. Attached log: query_node.log (querynode_log)

Steps To Reproduce

No response

Anything else?

No response

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 30 (21 by maintainers)

Most upvoted comments

Seems like #12465. Can you paste your DataCoord configuration here?

This is the log of the DataCoord node. Thank you for your help. Uploading datacoord.log…

Thanks, and I'd also like to see your config file.

OK, here is our configuration file:

dataCoordinator:
  enabled: true
  replicas: 1           # Run Data Coordinator mode with replication disabled
  resources: {}
  nodeSelector: {}
  affinity: {}
  tolerations: []
  extraEnv:
  - name: GODEBUG
    value: "madvdontneed=1"

  enableCompaction: true
  enableGarbageCollection: true

  segment:
    maxSize: 512  # Maximum size of a segment in MB

  compaction:
    enableAutoCompaction: true
    retentionDuration: 432000  # 5 days in seconds

  gc:
    interval: 60   # gc interval in seconds
    missingTolerance: 3600  # file meta missing tolerance duration in seconds (3600 = 1 hour)
    dropTolerance: 3600  # tolerance duration for files belonging to dropped entities, in seconds (3600 = 1 hour)


  service:
    port: 13333
    annotations: {}
    labels: {}
    clusterIP: ""

dataNode:
  enabled: true
  replicas: 3
  resources: {}
  nodeSelector: {}
  affinity: {}
  tolerations: []
  extraEnv:
  - name: GODEBUG
    value: "madvdontneed=1"
  flush:
    insertBufSize: "16777216"  ## bytes, 16MB
  autoscaling:
    enabled: false
    minReplicas: 2
    maxReplicas: 5
    metrics:
      - type: Resource
        resource:
          name: cpu
          target:
            type: Utilization
            averageUtilization: 60
      - type: Resource
        resource:
          name: memory
          target:
            type: Utilization
            averageUtilization: 60

values.yaml.log

datacoord.gc.dropTolerance is set too small, only one hour. According to our 1-billion-entity benchmark, it takes about an hour to load 170 million entities. If compaction occurs during the load, and the pre-compaction segments are dropped and garbage-collected before the QueryNode has loaded them into memory, the load will fail with "The specified key does not exist."
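
A hedged sketch of the corresponding config change, assuming the same Helm values layout pasted above and using the "1 day" value (86400 seconds) that the original comments in the values file referred to (exact defaults may differ between Milvus/chart versions): raise gc.dropTolerance, and ideally gc.missingTolerance, so that dropped pre-compaction segments survive in object storage longer than the worst-case load time.

dataCoordinator:
  gc:
    interval: 60              # unchanged from the values above
    missingTolerance: 86400   # 1 day; was 3600 (1 hour) in the pasted values
    dropTolerance: 86400      # 1 day; segments dropped by compaction remain in object
                              # storage long enough for loads that take around an hour

Any value comfortably larger than the slowest expected load should work; one day simply matches the comments left in the pasted values file.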