milvus: [Bug]: Deleted entities could still be searched

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: PR-20568-20221114-16bf42dae (2.2 branch)
- Deployment mode(standalone or cluster):standalone
- SDK version(e.g. pymilvus v2.0.0rc2):2.2.0.dev71
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

Deleted entities could still be searched https://jenkins.milvus.io:18080/blue/organizations/jenkins/Milvus HA CI/detail/PR-20568/1/pipeline/

Expected Behavior

Deleted entities could not be searched

Steps To Reproduce

def test_delete_search_with_string(self):
        """
        target: test delete and search
        method: search entities after it was deleted, string field is primary
        expected: deleted entity is not in the search result
        """
        # init collection with nb default data
        collection_w, _, _, ids = self.init_collection_general(prefix, insert_data=True,
                                                               primary_field=ct.default_string_field_name)[0:4]
        entity, _ = collection_w.query(default_string_expr, output_fields=["%"])
        search_res, _ = collection_w.search([entity[0][ct.default_float_vec_field_name]],
                                            ct.default_float_vec_field_name,
                                            ct.default_search_params, ct.default_limit)
        # assert search results contains entity
        assert "0" in search_res[0].ids

        expr = f'{ct.default_string_field_name} in {ids[:ct.default_nb // 2]}'
        expr = expr.replace("'", "\"")
        collection_w.delete(expr)
        search_res_2, _ = collection_w.search([entity[0][ct.default_float_vec_field_name]],
                                              ct.default_float_vec_field_name,
                                              ct.default_search_params, ct.default_limit)
        # assert search result is not equal to entity
        log.debug(f"Second search result ids: {search_res_2[0].ids}")
        inter = set(ids[:ct.default_nb // 2]
                    ).intersection(set(search_res_2[0].ids))
        # Using bounded staleness, we could still search the "deleted" entities,
        # since the search requests arrived query nodes earlier than query nodes consume the delete requests.
        assert len(inter) == 0

Milvus Log

artifacts-milvus-standalone-PR-20568-1-pymilvus-e2e-logs.tar.gz

collection name: delete_kWtVotoB

Anything else?

No response

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 20 (20 by maintainers)

Most upvoted comments

if it’s bounded, delete will disappeared after certain seconds and we don’t guarantee data to be deleted immediately.

could we try to refetch after bounded window? by default is 3 seconds.