milvus: [Bug]: Milvus stops responding after a couple of hours of light load

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: v2.0.2
- Deployment mode(standalone or cluster): cluster
- SDK version(e.g. pymilvus v2.0.0rc2): pymilvus v2.0.2
- OS(Ubuntu or CentOS): Container-Optimized OS with containerd (cos_containerd) ( GCP k8s cluster)
- CPU/Memory: 4 cpu / 27GB memory
- GPU: No
- Others: Running in GCP Kubernetes cluster version 1.20.15-gke.3400

Current Behavior

I’ve loaded approximately 1.4M embeddings of dimension 512 into a new collection and created an RHNSW_PQ index. The collection is then loaded onto 4 query nodes, and I start a loop: for each embedding, search for the 100 most similar candidates.
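
Roughly, the workload looks like the sketch below (simplified; the collection name, field names, and index/search parameters are placeholders for illustration, not my exact code):

# Simplified pymilvus 2.0.x sketch of the workload; names and parameters
# below are placeholders, not the exact values used in my setup.
from pymilvus import (
    connections, Collection, CollectionSchema, FieldSchema, DataType,
)

connections.connect(host="localhost", port="19530")

schema = CollectionSchema([
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True),
    FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=512),
])
collection = Collection(name="my_embeddings", schema=schema)

# ~1.4M embeddings are inserted here (omitted), then indexed and loaded.
collection.create_index(
    field_name="embedding",
    index_params={
        "index_type": "RHNSW_PQ",
        "metric_type": "L2",
        "params": {"M": 16, "efConstruction": 200, "PQM": 64},
    },
)
collection.load()

# Loop: for each stored embedding, search for the 100 most similar candidates.
for vector in iter_embeddings():  # placeholder generator yielding 512-d vectors
    collection.search(
        data=[vector],
        anns_field="embedding",
        param={"metric_type": "L2", "params": {"ef": 128}},
        limit=100,
    )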

This works fine for a few hours, after which it simply hangs when fetching new embeddings. This is the log from one of the 4 query nodes:

l/querynode.(*QueryNode).ReleaseCollection.func1\n\t/go/src/github.com/milvus-io/milvus/internal/querynode/impl.go:388\ngithub.com/milvus-io/milvus/internal/querynode.(*QueryNode).ReleaseCollection\n\t/go/src/github.com/milvus-io/milvus/internal/querynode/impl.go:392\ngithub.com/milvus-io/milvus/internal/distributed/querynode.(*Server).ReleaseCollection\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/service.go:287\ngithub.com/milvus-io/milvus/internal/proto/querypb._QueryNode_ReleaseCollection_Handler.func1\n\t/go/src/github.com/milvus-io/milvus/internal/proto/querypb/query_coord.pb.go:3558\ngithub.com/grpc-ecosystem/go-grpc-middleware/tracing/opentracing.UnaryServerInterceptor.func1\n\t/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.3.0/tracing/opentracing/server_interceptors.go:38\ngithub.com/milvus-io/milvus/internal/proto/querypb._QueryNode_ReleaseCollection_Handler\n\t/go/src/github.com/milvus-io/milvus/internal/proto/querypb/query_coord.pb.go:3560\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/grpc@v1.38.0/server.go:1286\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/grpc@v1.38.0/server.go:1609\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.2\n\t/go/pkg/mod/google.golang.org/grpc@v1.38.0/server.go:934"}
{"level":"INFO","time":"2022/05/06 07:38:42.775 +00:00","caller":"grpclog.go:37","message":"[transport]transport: loopyWriter.run returning. connection error: desc = \"transport is closing\""}
{"level":"WARN","time":"2022/05/06 07:39:31.838 +00:00","caller":"query_service.go:174","message":"stopQueryCollection failed, collection doesn't exist","collectionID":432927636982333441}
{"level":"WARN","time":"2022/05/06 07:39:31.839 +00:00","caller":"task_scheduler.go:63","message":"release collection failed, collectionID = 432927636982333441, err = collection hasn't been loaded or has been released, collection id = 432927636982333441"}
{"level":"ERROR","time":"2022/05/06 07:39:31.839 +00:00","caller":"impl.go:388","message":"release collection failed, collectionID = 432927636982333441, err = collection hasn't been loaded or has been released, collection id = 432927636982333441","stack":"github.com/milvus-io/milvus/internal/querynode.(*QueryNode).ReleaseCollection.func1\n\t/go/src/github.com/milvus-io/milvus/internal/querynode/impl.go:388\ngithub.com/milvus-io/milvus/internal/querynode.(*QueryNode).ReleaseCollection\n\t/go/src/github.com/milvus-io/milvus/internal/querynode/impl.go:392\ngithub.com/milvus-io/milvus/internal/distributed/querynode.(*Server).ReleaseCollection\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/service.go:287\ngithub.com/milvus-io/milvus/internal/proto/querypb._QueryNode_ReleaseCollection_Handler.func1\n\t/go/src/github.com/milvus-io/milvus/internal/proto/querypb/query_coord.pb.go:3558\ngithub.com/grpc-ecosystem/go-grpc-middleware/tracing/opentracing.UnaryServerInterceptor.func1\n\t/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.3.0/tracing/opentracing/server_interceptors.go:38\ngithub.com/milvus-io/milvus/internal/proto/querypb._QueryNode_ReleaseCollection_Handler\n\t/go/src/github.com/milvus-io/milvus/internal/proto/querypb/query_coord.pb.go:3560\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/grpc@v1.38.0/server.go:1286\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/grpc@v1.38.0/server.go:1609\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.2\n\t/go/pkg/mod/google.golang.org/grpc@v1.38.0/server.go:934"}
{"level":"INFO","time":"2022/05/06 07:40:22.943 +00:00","caller":"grpclog.go:37","message":"[transport]transport: loopyWriter.run returning. connection error: desc = \"transport is closing\""}
{"level":"WARN","time":"2022/05/06 07:41:55.851 +00:00","caller":"query_service.go:174","message":"stopQueryCollection failed, collection doesn't exist","collectionID":432927636982333441}
{"level":"WARN","time":"2022/05/06 07:41:55.851 +00:00","caller":"task_scheduler.go:63","message":"release collection failed, collectionID = 432927636982333441, err = collection hasn't been loaded or has been released, collection id = 432927636982333441"}
{"level":"ERROR","time":"2022/05/06 07:41:55.851 +00:00","caller":"impl.go:388","message":"release collection failed, collectionID = 432927636982333441, err = collection hasn't been loaded or has been released, collection id = 432927636982333441","stack":"github.com/milvus-io/milvus/internal/querynode.(*QueryNode).ReleaseCollection.func1\n\t/go/src/github.com/milvus-io/milvus/internal/querynode/impl.go:388\ngithub.com/milvus-io/milvus/internal/querynode.(*QueryNode).ReleaseCollection\n\t/go/src/github.com/milvus-io/milvus/internal/querynode/impl.go:392\ngithub.com/milvus-io/milvus/internal/distributed/querynode.(*Server).ReleaseCollection\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/service.go:287\ngithub.com/milvus-io/milvus/internal/proto/querypb._QueryNode_ReleaseCollection_Handler.func1\n\t/go/src/github.com/milvus-io/milvus/internal/proto/querypb/query_coord.pb.go:3558\ngithub.com/grpc-ecosystem/go-grpc-middleware/tracing/opentracing.UnaryServerInterceptor.func1\n\t/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.3.0/tracing/opentracing/server_interceptors.go:38\ngithub.com/milvus-io/milvus/internal/proto/querypb._QueryNode_ReleaseCollection_Handler\n\t/go/src/github.com/milvus-io/milvus/internal/proto/querypb/query_coord.pb.go:3560\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/grpc@v1.38.0/server.go:1286\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/grpc@v1.38.0/server.go:1609\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.2\n\t/go/pkg/mod/google.golang.org/grpc@v1.38.0/server.go:934"}
{"level":"INFO","time":"2022/05/06 07:42:46.957 +00:00","caller":"grpclog.go:37","message":"[transport]transport: loopyWriter.run returning. connection error: desc = \"transport is closing\""}
{"level":"WARN","time":"2022/05/06 07:45:33.879 +00:00","caller":"query_service.go:174","message":"stopQueryCollection failed, collection doesn't exist","collectionID":432927636982333441}
{"level":"WARN","time":"2022/05/06 07:45:33.879 +00:00","caller":"task_scheduler.go:63","message":"release collection failed, collectionID = 432927636982333441, err = collection hasn't been loaded or has been released, collection id = 432927636982333441"}
{"level":"ERROR","time":"2022/05/06 07:45:33.879 +00:00","caller":"impl.go:388","message":"release collection failed, collectionID = 432927636982333441, err = collection hasn't been loaded or has been released, collection id = 432927636982333441","stack":"github.com/milvus-io/milvus/internal/querynode.(*QueryNode).ReleaseCollection.func1\n\t/go/src/github.com/milvus-io/milvus/internal/querynode/impl.go:388\ngithub.com/milvus-io/milvus/internal/querynode.(*QueryNode).ReleaseCollection\n\t/go/src/github.com/milvus-io/milvus/internal/querynode/impl.go:392\ngithub.com/milvus-io/milvus/internal/distributed/querynode.(*Server).ReleaseCollection\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querynode/service.go:287\ngithub.com/milvus-io/milvus/internal/proto/querypb._QueryNode_ReleaseCollection_Handler.func1\n\t/go/src/github.com/milvus-io/milvus/internal/proto/querypb/query_coord.pb.go:3558\ngithub.com/grpc-ecosystem/go-grpc-middleware/tracing/opentracing.UnaryServerInterceptor.func1\n\t/go/pkg/mod/github.com/grpc-ecosystem/go-grpc-middleware@v1.3.0/tracing/opentracing/server_interceptors.go:38\ngithub.com/milvus-io/milvus/internal/proto/querypb._QueryNode_ReleaseCollection_Handler\n\t/go/src/github.com/milvus-io/milvus/internal/proto/querypb/query_coord.pb.go:3560\ngoogle.golang.org/grpc.(*Server).processUnaryRPC\n\t/go/pkg/mod/google.golang.org/grpc@v1.38.0/server.go:1286\ngoogle.golang.org/grpc.(*Server).handleStream\n\t/go/pkg/mod/google.golang.org/grpc@v1.38.0/server.go:1609\ngoogle.golang.org/grpc.(*Server).serveStreams.func1.2\n\t/go/pkg/mod/google.golang.org/grpc@v1.38.0/server.go:934"}

I don’t know what to do when this happens, but I could also see from the index coordinator that it had errored with:

{"level":"INFO","time":"2022/05/06 07:28:38.623 +00:00","caller":"grpclog.go:37","message":"[core]pickfirstBalancer: UpdateSubConnState: 0xc0004e0e80, {TRANSIENT_FAILURE connection error: desc = \"transport: Error while dialing dial tcp 10.48.10.6:21121: i/o timeout\"}"}

and there is no pod running with IP 10.48.10.6, so I thought I would try to restart all the coordinators. When I did this, the querycoord got stuck in a crash loop:

time="2022-05-06T08:28:05Z" level=error msg="[Failed to create producer]" error="server error: ProducerBlockedQuotaExceededException: Cannot create producer on topic with backlog quota exceeded" topic="persistent://public/default/milvus-search-0"
panic: Failed to create producer milvus-search-0, error = All attempts results:
attempt #1:server error: ProducerBlockedQuotaExceededException: Cannot create producer on topic with backlog quota exceeded
attempt #2:server error: ProducerBlockedQuotaExceededException: Cannot create producer on topic with backlog quota exceeded
attempt #3:server error: ProducerBlockedQuotaExceededException: Cannot create producer on topic with backlog quota exceeded
attempt #4:server error: ProducerBlockedQuotaExceededException: Cannot create producer on topic with backlog quota exceeded
attempt #5:server error: ProducerBlockedQuotaExceededException: Cannot create producer on topic with backlog quota exceeded
attempt #6:server error: ProducerBlockedQuotaExceededException: Cannot create producer on topic with backlog quota exceeded
attempt #7:server error: ProducerBlockedQuotaExceededException: Cannot create producer on topic with backlog quota exceeded
attempt #8:server error: ProducerBlockedQuotaExceededException: Cannot create producer on topic with backlog quota exceeded
attempt #9:server error: ProducerBlockedQuotaExceededException: Cannot create producer on topic with backlog quota exceeded
attempt #10:server error: ProducerBlockedQuotaExceededException: Cannot create producer on topic with backlog quota exceeded
attempt #11:server error: ProducerBlockedQuotaExceededException: Cannot create producer on topic with backlog quota exceeded


goroutine 190 [running]:
github.com/milvus-io/milvus/internal/msgstream.(*mqMsgStream).AsProducer(0xc0007f0680, 0xc00063a090, 0x1, 0x1)
	/go/src/github.com/milvus-io/milvus/internal/msgstream/mq_msgstream.go:124 +0x278
github.com/milvus-io/milvus/internal/querycoord.(*MetaReplica).getQueryStreamByID(0xc0007140f0, 0x60211645fc80001, 0xc0001563f0, 0xf, 0x0, 0x0, 0x0, 0x0)
	/go/src/github.com/milvus-io/milvus/internal/querycoord/meta.go:869 +0x3e2
github.com/milvus-io/milvus/internal/querycoord.(*MetaReplica).sendSealedSegmentChangeInfos(0xc0007140f0, 0x60211645fc80001, 0xc0001563f0, 0xf, 0xc000848f80, 0xc0005b2700, 0x10, 0x20)
	/go/src/github.com/milvus-io/milvus/internal/querycoord/meta.go:641 +0x6a
github.com/milvus-io/milvus/internal/querycoord.(*MetaReplica).removeGlobalSealedSegInfos(0xc0007140f0, 0x60211645fc80001, 0x0, 0x0, 0x0, 0x40, 0xc000565400, 0x203000)
	/go/src/github.com/milvus-io/milvus/internal/querycoord/meta.go:583 +0x332
github.com/milvus-io/milvus/internal/querycoord.updateSegmentInfoFromTask(0x36693d0, 0xc0007cc080, 0x3692720, 0xc0007cc800, 0x36913f8, 0xc0007140f0, 0x0, 0x0)
	/go/src/github.com/milvus-io/milvus/internal/querycoord/task_scheduler.go:891 +0xacc
github.com/milvus-io/milvus/internal/querycoord.(*TaskScheduler).scheduleLoop(0xc00067cf80)
	/go/src/github.com/milvus-io/milvus/internal/querycoord/task_scheduler.go:657 +0xd73
created by github.com/milvus-io/milvus/internal/querycoord.(*TaskScheduler).Start
	/go/src/github.com/milvus-io/milvus/internal/querycoord/task_scheduler.go:867 +0x65

and won’t start back up.

Expected Behavior

I expect Milvus not to hang when I try to fetch embeddings, whether via the primary key or via search. I also expect the Milvus cluster to recover if any of its pods gets rescheduled in the k8s cluster, since pods are ephemeral and can be restarted at any time.

Steps To Reproduce

No response

Milvus Log

No response

Anything else?

No response

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 20 (11 by maintainers)

Most upvoted comments

The main reason is that we didn’t unregister the search channel, and the search channel backlog filled up.

In 2.1 we removed the search channel, so similar issues should no longer occur. I also want to fix this issue in 2.0.3, which should be released this week.

Hi friends! When are you planning to release a fixed version for this bug? I am having the same problem 😕

It seems that 2.0.2 is not production-stable, though I can see movement in the 2.1 RC.

@xiaofan-luan

You can try to use this command:

bin/pulsar-admin topics clear-backlog public/default/milvus-search-0 -s <subscription-name>

You can find the subscription name in the output of pulsar-admin topics stats, such as milvus-queryNode-432927636982333441-11

I’ve now been able to run clear-backlog on all 4 subscriptions found under the topic persistent://public/default/milvus-search-0:

./bin/pulsar-admin topics clear-backlog persistent://public/default/milvus-search-0 -s milvus-queryNode-432927636982333441-11
./bin/pulsar-admin topics clear-backlog persistent://public/default/milvus-search-0 -s milvus-queryNode-432927636982333441-10
./bin/pulsar-admin topics clear-backlog persistent://public/default/milvus-search-0 -s milvus-queryNode-432927636982333441-9
./bin/pulsar-admin topics clear-backlog persistent://public/default/milvus-search-0 -s milvus-queryNode-432927636982333441-8

Then I deleted the querycoord pod so that it would restart, and now it starts up with the following logs:

2022/05/06 09:46:35 maxprocs: Leaving GOMAXPROCS=4: CPU quota undefined

    __  _________ _   ____  ______
   /  |/  /  _/ /| | / / / / / __/
  / /|_/ // // /_| |/ / /_/ /\ \
 /_/  /_/___/____/___/\____/___/

Welcome to use Milvus!
Version:   v2.0.2
Built:     Sat Apr  2 03:29:24 UTC 2022
GitCommit: 898533c5
GoVersion: go version go1.16.9 linux/amd64

open pid file: /run/milvus/querycoord.pid
lock pid file: /run/milvus/querycoord.pid
[2022/05/06 09:46:35.128 +00:00] [DEBUG] [base_table.go:78] ["config directory"] [configDir=/milvus/configs/]
[2022/05/06 09:46:35.130 +00:00] [DEBUG] [component_param.go:592] ["init cacheSize"] ["cacheSize (GB)"=0]
[2022/05/06 09:46:35.131 +00:00] [DEBUG] [component_param.go:624] ["query node init gracefulTime"] [gracefulTime=0]
[2022/05/06 09:46:35.131 +00:00] [DEBUG] [query_coord.go:302] ["query coordinator"] [queryChannels="[{}]"]
[2022/05/06 09:46:35.131 +00:00] [DEBUG] [base_table.go:78] ["config directory"] [configDir=/milvus/configs/]
[2022/05/06 09:46:35.134 +00:00] [DEBUG] [grpc_param.go:118] [initServerMaxSendSize] [role=querycoord] [grpc.serverMaxSendSize=2147483647]
[2022/05/06 09:46:35.134 +00:00] [DEBUG] [grpc_param.go:141] [initServerMaxRecvSize] [role=querycoord] [grpc.serverMaxRecvSize=2147483647]
[2022/05/06 09:46:35.136 +00:00] [DEBUG] [service.go:244] [network] [port=19531]
[2022/05/06 09:46:35.237 +00:00] [DEBUG] [session_util.go:86] ["Session try to connect to etcd"]
[2022/05/06 09:46:35.241 +00:00] [DEBUG] [session_util.go:101] ["Session connect to etcd success"]
[2022/05/06 09:46:35.241 +00:00] [DEBUG] [base_table.go:78] ["config directory"] [configDir=/milvus/configs/]
[2022/05/06 09:46:35.243 +00:00] [DEBUG] [grpc_param.go:186] [initClientMaxSendSize] [role=rootcoord] [grpc.clientMaxSendSize=104857600]
[2022/05/06 09:46:35.243 +00:00] [DEBUG] [grpc_param.go:209] [initClientMaxRecvSize] [role=rootcoord] [grpc.clientMaxRecvSize=104857600]
[2022/05/06 09:46:35.243 +00:00] [DEBUG] [service.go:153] ["QueryCoord try to wait for RootCoord ready"]
[2022/05/06 09:46:35.246 +00:00] [DEBUG] [session_util.go:294] ["SessionUtil GetSessions "] [prefix=rootcoord] [key=rootcoord] [address=10.48.39.18:53100]
[2022/05/06 09:46:35.246 +00:00] [DEBUG] [client.go:96] ["RootCoordClient GetSessions success"] [address=10.48.39.18:53100]
[2022/05/06 09:46:35.249 +00:00] [DEBUG] [service.go:163] ["QueryCoord report RootCoord ready"]
[2022/05/06 09:46:35.249 +00:00] [DEBUG] [session_util.go:86] ["Session try to connect to etcd"]
[2022/05/06 09:46:35.252 +00:00] [DEBUG] [session_util.go:101] ["Session connect to etcd success"]
[2022/05/06 09:46:35.252 +00:00] [DEBUG] [base_table.go:78] ["config directory"] [configDir=/milvus/configs/]
[2022/05/06 09:46:35.253 +00:00] [DEBUG] [grpc_param.go:186] [initClientMaxSendSize] [role=datacoord] [grpc.clientMaxSendSize=104857600]
[2022/05/06 09:46:35.253 +00:00] [DEBUG] [grpc_param.go:209] [initClientMaxRecvSize] [role=datacoord] [grpc.clientMaxRecvSize=104857600]
[2022/05/06 09:46:35.253 +00:00] [DEBUG] [service.go:182] ["QueryCoord try to wait for DataCoord ready"]
[2022/05/06 09:46:35.256 +00:00] [DEBUG] [session_util.go:294] ["SessionUtil GetSessions "] [prefix=datacoord] [key=datacoord] [address=10.48.14.6:13333]
[2022/05/06 09:46:35.259 +00:00] [DEBUG] [service.go:191] ["QueryCoord report DataCoord ready"]
[2022/05/06 09:46:35.259 +00:00] [DEBUG] [session_util.go:86] ["Session try to connect to etcd"]
[2022/05/06 09:46:35.261 +00:00] [DEBUG] [session_util.go:101] ["Session connect to etcd success"]
[2022/05/06 09:46:35.261 +00:00] [DEBUG] [base_table.go:78] ["config directory"] [configDir=/milvus/configs/]
[2022/05/06 09:46:35.262 +00:00] [DEBUG] [grpc_param.go:186] [initClientMaxSendSize] [role=indexcoord] [grpc.clientMaxSendSize=104857600]
[2022/05/06 09:46:35.262 +00:00] [DEBUG] [grpc_param.go:209] [initClientMaxRecvSize] [role=indexcoord] [grpc.clientMaxRecvSize=104857600]
[2022/05/06 09:46:35.263 +00:00] [DEBUG] [service.go:212] ["QueryCoord try to wait for IndexCoord ready"]
[2022/05/06 09:46:35.264 +00:00] [DEBUG] [session_util.go:294] ["SessionUtil GetSessions "] [prefix=indexcoord] [key=indexcoord] [address=10.48.14.5:31000]
[2022/05/06 09:46:35.264 +00:00] [DEBUG] [client.go:97] ["IndexCoordClient GetSessions success"] [key=indexcoord] [msess="{\"indexcoord\":{\"ServerID\":49,\"ServerName\":\"indexcoord\",\"Address\":\"10.48.14.5:31000\",\"Exclusive\":true,\"TriggerKill\":true}}"]
[2022/05/06 09:46:35.268 +00:00] [DEBUG] [service.go:218] ["QueryCoord report IndexCoord is ready"]
[2022/05/06 09:46:35.268 +00:00] [DEBUG] [service.go:225] [QueryCoord] [State=Initializing]
[2022/05/06 09:46:35.268 +00:00] [DEBUG] [query_coord.go:132] ["query coordinator start init, session info"] [metaPath=milvus/meta] [address=10.48.39.19:19531]
[2022/05/06 09:46:35.269 +00:00] [DEBUG] [session_util.go:86] ["Session try to connect to etcd"]
[2022/05/06 09:46:35.270 +00:00] [DEBUG] [session_util.go:101] ["Session connect to etcd success"]
[2022/05/06 09:46:35.279 +00:00] [DEBUG] [session_util.go:175] ["Session get serverID success"] [key=id] [ServerId=82]
[2022/05/06 09:46:35.279 +00:00] [INFO] [base_table.go:425] ["Set log file to "] [path=]
{"level":"INFO","time":"2022/05/06 09:46:35.284 +00:00","caller":"tso.go:121","message":"sync and save timestamp","last":"2022/05/06 09:36:34.503 +00:00","save":"2022/05/06 09:46:38.281 +00:00","next":"2022/05/06 09:46:35.281 +00:00"}
{"level":"ERROR","time":"2022/05/06 09:46:35.316 +00:00","caller":"task_scheduler.go:236","message":"reloadFromKV: taskStateInfo and triggerTaskInfo are inconsistent","stack":"github.com/milvus-io/milvus/internal/querycoord.(*TaskScheduler).reloadFromKV\n\t/go/src/github.com/milvus-io/milvus/internal/querycoord/task_scheduler.go:236\ngithub.com/milvus-io/milvus/internal/querycoord.newTaskScheduler\n\t/go/src/github.com/milvus-io/milvus/internal/querycoord/task_scheduler.go:174\ngithub.com/milvus-io/milvus/internal/querycoord.(*QueryCoord).Init.func1\n\t/go/src/github.com/milvus-io/milvus/internal/querycoord/query_coord.go:187\nsync.(*Once).doSlow\n\t/usr/local/go/src/sync/once.go:68\nsync.(*Once).Do\n\t/usr/local/go/src/sync/once.go:59\ngithub.com/milvus-io/milvus/internal/querycoord.(*QueryCoord).Init\n\t/go/src/github.com/milvus-io/milvus/internal/querycoord/query_coord.go:134\ngithub.com/milvus-io/milvus/internal/distributed/querycoord.(*Server).init\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querycoord/service.go:226\ngithub.com/milvus-io/milvus/internal/distributed/querycoord.(*Server).Run\n\t/go/src/github.com/milvus-io/milvus/internal/distributed/querycoord/service.go:95\ngithub.com/milvus-io/milvus/cmd/components.(*QueryCoord).Run\n\t/go/src/github.com/milvus-io/milvus/cmd/components/query_coord.go:50\ngithub.com/milvus-io/milvus/cmd/roles.(*MilvusRoles).runQueryCoord.func1\n\t/go/src/github.com/milvus-io/milvus/cmd/roles/roles.go:178"}
{"level":"INFO","time":"2022/05/06 09:47:35.320 +00:00","caller":"grpclog.go:37","message":"[core]parsed scheme: \"\""}
{"level":"INFO","time":"2022/05/06 09:47:35.320 +00:00","caller":"grpclog.go:37","message":"[core]scheme \"\" not registered, fallback to default scheme"}
{"level":"INFO","time":"2022/05/06 09:47:35.320 +00:00","caller":"grpclog.go:37","message":"[core]ccResolverWrapper: sending update to cc: {[{10.48.86.3:21123  <nil> 0 <nil>}] <nil> <nil>}"}
{"level":"INFO","time":"2022/05/06 09:47:35.320 +00:00","caller":"grpclog.go:37","message":"[core]ClientConn switching balancer to \"pick_first\""}
{"level":"INFO","time":"2022/05/06 09:47:35.320 +00:00","caller":"grpclog.go:37","message":"[core]Channel switches to new LB policy \"pick_first\""}
{"level":"INFO","time":"2022/05/06 09:47:35.320 +00:00","caller":"grpclog.go:37","message":"[core]Subchannel Connectivity change to CONNECTING"}
{"level":"INFO","time":"2022/05/06 09:47:35.320 +00:00","caller":"grpclog.go:37","message":"[core]Subchannel picks a new address \"10.48.86.3:21123\" to connect"}
{"level":"INFO","time":"2022/05/06 09:47:35.320 +00:00","caller":"grpclog.go:37","message":"[core]pickfirstBalancer: UpdateSubConnState: 0xc0009129c0, {CONNECTING <nil>}"}
{"level":"INFO","time":"2022/05/06 09:47:35.321 +00:00","caller":"grpclog.go:37","message":"[core]Channel Connectivity change to CONNECTING"}
{"level":"INFO","time":"2022/05/06 09:47:35.322 +00:00","caller":"grpclog.go:37","message":"[core]Subchannel Connectivity change to READY"}
{"level":"INFO","time":"2022/05/06 09:47:35.323 +00:00","caller":"grpclog.go:37","message":"[core]pickfirstBalancer: UpdateSubConnState: 0xc0009129c0, {READY <nil>}"}
{"level":"INFO","time":"2022/05/06 09:47:35.323 +00:00","caller":"grpclog.go:37","message":"[core]Channel Connectivity change to READY"}
{"level":"WARN","time":"2022/05/06 09:47:35.328 +00:00","caller":"query_coord.go:532","message":"loadBalanceSegmentLoop: there are too few available query nodes to balance","onlineNodeIDs":[44],"availableNodeIDs":[44]}

But it seems to start up “OK” anyway; k8s reports it as ready!