milvus: [Bug]: [chaos][cluster] Insert data fails after etcd pod recovers to running status after being killed
Is there an existing issue for this?
- I have searched the existing issues
Current Behavior
Inserting data still fails after the etcd pod has recovered to running status after being killed. Error message:
Error: <BaseException: (code=1, message=GetSegmentID failed: SegmentIDAllocator failRemainRequest err:syncSegmentID Failed:server is not serving)>
Traceback (most recent call last):
File "hello_milvus.py", line 89, in <module>
hello_milvus()
File "hello_milvus.py", line 47, in hello_milvus
collection.insert(
File "/Users/zilliz/opt/anaconda3/lib/python3.8/site-packages/pymilvus/orm/collection.py", line 525, in insert
res = conn.insert(collection_name=self._name, entities=entities, ids=None,
File "/Users/zilliz/opt/anaconda3/lib/python3.8/site-packages/pymilvus/client/stub.py", line 61, in handler
raise e
File "/Users/zilliz/opt/anaconda3/lib/python3.8/site-packages/pymilvus/client/stub.py", line 45, in handler
return func(self, *args, **kwargs)
File "/Users/zilliz/opt/anaconda3/lib/python3.8/site-packages/pymilvus/client/stub.py", line 931, in insert
return handler.bulk_insert(collection_name, entities, partition_name, timeout, **kwargs)
File "/Users/zilliz/opt/anaconda3/lib/python3.8/site-packages/pymilvus/client/grpc_handler.py", line 65, in handler
raise e
File "/Users/zilliz/opt/anaconda3/lib/python3.8/site-packages/pymilvus/client/grpc_handler.py", line 57, in handler
return func(self, *args, **kwargs)
File "/Users/zilliz/opt/anaconda3/lib/python3.8/site-packages/pymilvus/client/grpc_handler.py", line 505, in bulk_insert
raise err
File "/Users/zilliz/opt/anaconda3/lib/python3.8/site-packages/pymilvus/client/grpc_handler.py", line 501, in bulk_insert
raise BaseException(response.status.error_code, response.status.reason)
pymilvus.client.exceptions.BaseException: <BaseException: (code=1, message=GetSegmentID failed: SegmentIDAllocator failRemainRequest err:syncSegmentID Failed:server is not serving)>
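For reference, a minimal sketch of the call path that raises this error, assuming a Milvus proxy reachable at localhost:19530; the collection name, field names, and vector dimension mirror hello_milvus.py but are illustrative rather than copied from the test script:

```python
# Hedged repro sketch: the schema below is an assumption modeled on hello_milvus.py.
import random
from pymilvus import connections, Collection, CollectionSchema, FieldSchema, DataType

connections.connect("default", host="localhost", port="19530")

fields = [
    FieldSchema(name="pk", dtype=DataType.INT64, is_primary=True),
    FieldSchema(name="embeddings", dtype=DataType.FLOAT_VECTOR, dim=8),
]
collection = Collection("hello_milvus", CollectionSchema(fields, "chaos repro sketch"))

entities = [
    [i for i in range(3000)],                                    # primary keys
    [[random.random() for _ in range(8)] for _ in range(3000)],  # vectors
]
# After the etcd pod is killed and comes back, this call keeps raising
# "GetSegmentID failed: ... server is not serving".
collection.insert(entities)
```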
Expected Behavior
All operations work well after the etcd pod recovers.
Steps To Reproduce
1. Deploy Milvus with helm: `cd tests/python_client/chaos && helm install --wait --timeout 360s milvus-chaos milvus/milvus -f cluster-values.yaml -n=chaos-testing`
2. Run the script before chaos: `python hello_milvus.py`
3. Delete the etcd pod: `kubectl delete pod ${pod_name}`
4. Run the script after chaos: `python hello_milvus.py` (a retry probe sketch follows this list)
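As a hedged verification sketch (not part of the chaos test itself): once `kubectl get pod` shows the etcd pod Running again, one can retry the insert for a bounded window to check whether the proxy's SegmentIDAllocator ever recovers. The connection address, retry window, and interval below are illustrative assumptions:

```python
# Retry probe sketch: keeps attempting a small insert after etcd recovery and
# reports whether the "server is not serving" error ever clears.
import time
from pymilvus import connections, Collection

connections.connect("default", host="localhost", port="19530")
collection = Collection("hello_milvus")  # assumes the collection from step 2 still exists

deadline = time.time() + 600  # arbitrary 10-minute recovery window
while True:
    try:
        # Single-row insert matching the assumed two-field schema from the sketch above.
        collection.insert([[0], [[0.0] * 8]])
        print("insert succeeded: SegmentIDAllocator recovered")
        break
    except Exception as exc:  # pymilvus raises its own exception types here
        if time.time() > deadline:
            print(f"insert still failing after the recovery window: {exc}")
            break
        time.sleep(10)
```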
Environment
- Milvus version: d54f342
- Deployment mode (standalone or cluster): cluster
- SDK version (e.g. pymilvus v2.0.0rc2): 2.0.0rc8.dev9
- OS (Ubuntu or CentOS):
- CPU/Memory:
- GPU:
- Others:
Anything else?
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 20 (19 by maintainers)
No! The Milvus system still can't work even though etcd has been back up for a long time.