milvus: [Bug]: Suspected data loss during batch ingestion.
Is there an existing issue for this?
- I have searched the existing issues
Environment
- Milvus version: milvusdb/milvus-nightly, tag: nightly-20230205-ae305a5
- Deployment mode: cluster
- MQ type - pulsar
- SDK version - Java sdk 2.2.0
- OS(Ubuntu or CentOS): CentOS
- CPU/Memory: Based on sizing tools
- GPU:
- Others: Collection with 3 fields: id, a 384-dim float vector, and a varchar field.
There are around 455 partitions in the collection.
Data coord - 1 core/2GB
Data node - 4 instances of 2 core/16GB
Number of shards - 4
Proxy - 2 core/8GB
Current Behavior
We are bulk-ingesting data into our Milvus 2.2.x cluster, and our data nodes, data coord, query coord, and proxy keep crashing. After the ingestion, the entity count shown in Attu is 7.8M, whereas the expected count was 21.8M.
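One way to narrow down where rows go missing is to compare the number of rows sent to each partition with the count reported back for it. The sketch below is a minimal, SDK-independent helper. The function name `find_shortfalls` and both dictionaries are hypothetical; `reported` is assumed to be filled from whatever per-partition row count the SDK or Attu exposes.

```python
from typing import Dict, List, Tuple


def find_shortfalls(
    sent: Dict[str, int], reported: Dict[str, int]
) -> Tuple[int, List[Tuple[str, int, int]]]:
    """Compare rows sent per partition with rows reported per partition.

    Returns the total number of missing rows and a list of
    (partition, sent, reported) tuples for every partition that differs.
    Both inputs are hypothetical: the caller builds `sent` while ingesting
    and fills `reported` from the server side.
    """
    mismatches = []
    missing_total = 0
    for name in sorted(sent):
        got = reported.get(name, 0)  # a partition with no report counts as 0
        if got != sent[name]:
            mismatches.append((name, sent[name], got))
            missing_total += sent[name] - got
    return missing_total, mismatches
```

For the totals in this report (21.8M expected, 7.8M shown), the overall shortfall is 14.0M rows, about 64% of what was sent; a per-partition breakdown would show whether the loss is spread evenly or concentrated in a few partitions.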
Expected Behavior
No response
Steps To Reproduce
1. Create a collection.
2. Create 455 partitions in the collection.
3. Insert 1 ~ 5 rows into each partition randomly.
4. The datanode crashed occasionally with 16GB of memory; after changing to 20GB, it no longer crashed.
5. In total, 24M rows were inserted, but num_entities returns only 7M, even after calling flush().
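The insert pattern in these steps can be sketched as a small workload driver that also keeps a local ledger of rows sent per partition. This is an illustration under assumptions, not the reporter's actual script: it reads step 3 as repeated inserts of 1 to 5 rows into randomly chosen partitions, the field names `id`, `vector`, and `text` and the partition names are made up, and `insert_fn` is a hypothetical callable that would wrap the SDK's insert call.

```python
import random
from collections import Counter
from typing import Callable, Dict, List

DIM = 384             # vector dimension from the report
NUM_PARTITIONS = 455  # partition count from the report


def make_rows(start_id: int, n: int, rng: random.Random) -> List[Dict]:
    """Build n rows with hypothetical field names: id, vector, text."""
    return [
        {
            "id": start_id + i,
            "vector": [rng.random() for _ in range(DIM)],
            "text": f"row-{start_id + i}",
        }
        for i in range(n)
    ]


def run_ingestion(
    total_rows: int,
    insert_fn: Callable[[str, List[Dict]], None],
    seed: int = 0,
    num_partitions: int = NUM_PARTITIONS,
) -> Counter:
    """Send total_rows rows in batches of 1 to 5 to random partitions.

    insert_fn(partition_name, rows) is supplied by the caller (assumed to
    wrap the real SDK insert). Returns a ledger of rows sent per partition.
    """
    rng = random.Random(seed)
    partitions = [f"part_{i}" for i in range(num_partitions)]
    ledger: Counter = Counter()
    next_id = 0
    while next_id < total_rows:
        n = min(rng.randint(1, 5), total_rows - next_id)  # 1 to 5 rows per insert
        name = rng.choice(partitions)
        insert_fn(name, make_rows(next_id, n, rng))
        ledger[name] += n
        next_id += n
    return ledger
```

After a flush, the sum of the ledger is the row count the collection should report, so any gap between that sum and num_entities can be measured exactly rather than estimated.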
Milvus Log
Anything else?
No response
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 21 (17 by maintainers)
@MrPresent-Han, @xiaofan-luan Please find the etcd backup taken with birdwatcher: bw_etcd_ALL.230212-223040.bak.gz