milvus: [Bug]: Milvus crashed with panic when enabling compaction and GC
Is there an existing issue for this?
- I have searched the existing issues
Environment
- Milvus version: latest
- Deployment mode(standalone or cluster): standalone
- SDK version(e.g. pymilvus v2.0.0rc2): latest
- OS(Ubuntu or CentOS):
- CPU/Memory:
- GPU:
- Others:
Current Behavior
Milvus crashed with a panic:
d-dml_234_429327338027614209v0 is not watched on node 7\nattempt #2:data service save bin log path failed, reason = channel by-dev-rootcoord-dml_234_429327338027614209v0 is not watched on node 7\nattempt #3:data service save bin log path failed, reason = channel by-dev-rootcoord-dml_234_429327338027614209v0 is not watched on node 7\nattempt #4:data service save bin log path failed, reason = channel by-dev-rootcoord-dml_234_429327338027614209v0 is not watched on node 7\nattempt #5:data service save bin log path failed, reason = channel by-dev-rootcoord-dml_234_429327338027614209v0 is not watched on node 7\nattempt #6:data service save bin log path failed, reason = channel by-dev-rootcoord-dml_234_429327338027614209v0 is not watched on node 7\nattempt #7:data service save bin log path failed, reason = channel by-dev-rootcoord-dml_234_429327338027614209v0 is not watched on node 7\nattempt #8:data service save bin log path failed, reason = channel by-dev-rootcoord-dml_234_429327338027614209v0 is not watched on node 7\nattempt #9:data service save bin log path failed, reason = channel by-dev-rootcoord-dml_234_429327338027614209v0 is not watched on node 7\nattempt #10:data service save bin log path failed, reason = channel by-dev-rootcoord-dml_234_429327338027614209v0 is not watched on node 7\n"]
2021-11-24T11:37:32.426529677Z stderr F panic: All attempts results:
2021-11-24T11:37:32.426567598Z stderr F attempt #1:data service save bin log path failed, reason = channel by-dev-rootcoord-dml_234_429327338027614209v0 is not watched on node 7
2021-11-24T11:37:32.426572509Z stderr F attempt #2:data service save bin log path failed, reason = channel by-dev-rootcoord-dml_234_429327338027614209v0 is not watched on node 7
2021-11-24T11:37:32.426574623Z stderr F attempt #3:data service save bin log path failed, reason = channel by-dev-rootcoord-dml_234_429327338027614209v0 is not watched on node 7
2021-11-24T11:37:32.426594442Z stderr F attempt #4:data service save bin log path failed, reason = channel by-dev-rootcoord-dml_234_429327338027614209v0 is not watched on node 7
2021-11-24T11:37:32.42659899Z stderr F attempt #5:data service save bin log path failed, reason = channel by-dev-rootcoord-dml_234_429327338027614209v0 is not watched on node 7
2021-11-24T11:37:32.426602122Z stderr F attempt #6:data service save bin log path failed, reason = channel by-dev-rootcoord-dml_234_429327338027614209v0 is not watched on node 7
2021-11-24T11:37:32.426604116Z stderr F attempt #7:data service save bin log path failed, reason = channel by-dev-rootcoord-dml_234_429327338027614209v0 is not watched on node 7
2021-11-24T11:37:32.426606019Z stderr F attempt #8:data service save bin log path failed, reason = channel by-dev-rootcoord-dml_234_429327338027614209v0 is not watched on node 7
2021-11-24T11:37:32.426624374Z stderr F attempt #9:data service save bin log path failed, reason = channel by-dev-rootcoord-dml_234_429327338027614209v0 is not watched on node 7
2021-11-24T11:37:32.426628487Z stderr F attempt #10:data service save bin log path failed, reason = channel by-dev-rootcoord-dml_234_429327338027614209v0 is not watched on node 7
2021-11-24T11:37:32.426630482Z stderr F
2021-11-24T11:37:32.426633033Z stderr F
2021-11-24T11:37:32.426634914Z stderr F goroutine 247170 [running]:
2021-11-24T11:37:32.426688175Z stderr F github.com/milvus-io/milvus/internal/datanode.flushNotifyFunc.func1(0xc001cf6690)
2021-11-24T11:37:32.426695234Z stderr F /go/src/github.com/milvus-io/milvus/internal/datanode/flush_manager.go:509 +0x1439
2021-11-24T11:37:32.426697399Z stderr F github.com/milvus-io/milvus/internal/datanode.(*flushTaskRunner).waitFinish(0xc006768600, 0xc00a7c9020, 0xc00c779c90)
2021-11-24T11:37:32.426719453Z stderr F /go/src/github.com/milvus-io/milvus/internal/datanode/flush_task.go:186 +0xbd
2021-11-24T11:37:32.426759007Z stderr F created by github.com/milvus-io/milvus/internal/datanode.(*flushTaskRunner).init.func1
2021-11-24T11:37:32.426763191Z stderr F /go/src/github.com/milvus-io/milvus/internal/datanode/flush_task.go:118 +0xb0
Expected Behavior
No panic
Steps To Reproduce
1. Start Milvus with compaction and GC enabled:
--set dataCoordinator.enableCompaction="true" \
--set dataCoordinator.enableGarbageCollection="true" \
--set dataCoordinator.gc.interval=60 \
--set dataCoordinator.gc.missingTolerance=60 \
--set dataCoordinator.gc.dropTolerance=60 \
2. Run the CI test cases.
Anything else?
No response
About this issue
- State: closed
- Created 3 years ago
- Comments: 20 (20 by maintainers)
@binbinlv Waiting for the final PR