backstage: Provider-ingested entities not being eager deleted

Expected Behavior

We expect provider-ingested entities to be eagerly deleted.

According to the docs:

Entity providers - not processors - are subject to eager deletion of entities… When a provider issues a deletion of an entity in its bucket, that entity as well as the entire tree of entities processed out of it, if any, are considered for immediate deletion. Note “considered” - they are deleted if and only if they would otherwise have become orphaned (no other parent entities emitting them).
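To make the quoted rule concrete, here is a toy model of it. This is only a sketch under stated assumptions, not Backstage's actual implementation: the `Reference` shape, the `eagerDelete` helper, and the sample provider keys and entity refs are all hypothetical and exist only for illustration.

```typescript
// Toy model of the quoted rule; NOT Backstage's real code.
// Each reference says "source emits target". The source is either a
// provider bucket (sourceKey) or a parent entity (sourceEntityRef).
interface Reference {
  sourceKey?: string;
  sourceEntityRef?: string;
  targetEntityRef: string;
}

// Hypothetical helper: drop the provider's reference to `entityRef`, then
// delete that entity and its descendants, but only those left with no
// remaining incoming reference (i.e. those that would become orphaned).
function eagerDelete(
  refs: Reference[],
  providerKey: string,
  entityRef: string,
): { remaining: Reference[]; deleted: string[] } {
  let remaining = refs.filter(
    r => !(r.sourceKey === providerKey && r.targetEntityRef === entityRef),
  );
  const deleted: string[] = [];
  const candidates: string[] = [entityRef];

  while (candidates.length > 0) {
    const candidate = candidates.pop()!;
    if (deleted.includes(candidate)) continue;

    // "Considered" for deletion: keep it if anything else still emits it.
    if (remaining.some(r => r.targetEntityRef === candidate)) continue;

    deleted.push(candidate);

    // The entities processed out of the deleted one are considered next.
    const children = remaining
      .filter(r => r.sourceEntityRef === candidate)
      .map(r => r.targetEntityRef);
    remaining = remaining.filter(r => r.sourceEntityRef !== candidate);
    candidates.push(...children);
  }

  return { remaining, deleted };
}

// Illustrative data: 'user:default/shared' has a second parent (provider-b).
const refs: Reference[] = [
  { sourceKey: 'provider-a', targetEntityRef: 'location:default/team' },
  { sourceEntityRef: 'location:default/team', targetEntityRef: 'group:default/team' },
  { sourceEntityRef: 'location:default/team', targetEntityRef: 'user:default/shared' },
  { sourceKey: 'provider-b', targetEntityRef: 'user:default/shared' },
];

const result = eagerDelete(refs, 'provider-a', 'location:default/team');
console.log(result.deleted);
```

In this model the location and the group are deleted, while the shared user survives because another source still emits it.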

Actual Behavior

We are seeing a mix of behaviors:

Some entities are marked as orphaned but are not eagerly deleted.
Some entities are not marked as orphaned even though the provider no longer ingests them (they have been removed from the external source).

DB Behavior

For both behaviors above, we noticed that the refresh_state_references table has 0 entries for the problem entities (whether as the source_entity_ref or the target_entity_ref).

Note (maybe unrelated): The refresh_state_references table has multiple entries for provider-ingested entities we expect to be neither orphaned nor deleted. Example:

id	source_key	source_entity_ref	target_entity_ref
441616	LdapOrgEntityProvider:okta-ldap	NULL	user:default/jon-doe
499381	LdapOrgEntityProvider:okta-ldap	NULL	user:default/jon-doe
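A quick way to spot rows like the two above is to group references by everything except id. The sketch below does that in memory over rows already fetched from the table; the `RefRow` type and the `findDuplicateReferences` helper are hypothetical names, and only the column names and the two sample rows come from this report.

```typescript
// Row shape mirroring the columns shown above.
interface RefRow {
  id: number;
  source_key: string | null;
  source_entity_ref: string | null;
  target_entity_ref: string;
}

// Hypothetical helper: returns groups of rows that are identical in every
// column except id.
function findDuplicateReferences(rows: RefRow[]): RefRow[][] {
  const groups = new Map<string, RefRow[]>();
  for (const row of rows) {
    const key = JSON.stringify([
      row.source_key,
      row.source_entity_ref,
      row.target_entity_ref,
    ]);
    const group = groups.get(key) || [];
    group.push(row);
    groups.set(key, group);
  }
  return Array.from(groups.values()).filter(group => group.length > 1);
}

// The two rows from the example above.
const sampleRows: RefRow[] = [
  {
    id: 441616,
    source_key: 'LdapOrgEntityProvider:okta-ldap',
    source_entity_ref: null,
    target_entity_ref: 'user:default/jon-doe',
  },
  {
    id: 499381,
    source_key: 'LdapOrgEntityProvider:okta-ldap',
    source_entity_ref: null,
    target_entity_ref: 'user:default/jon-doe',
  },
];

const duplicates = findDuplicateReferences(sampleRows);
console.log(duplicates.map(group => group.map(row => row.id)));
```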

Steps to Reproduce

We are unable to reproduce this locally.

Locally, we created a custom EntityProvider in an attempt to reproduce. The expected behavior occurred.

Context

Despite using the LdapOrgEntityProvider, we have 500+ orphaned user entities. We also have user entities that should have been eagerly deleted but are not even flagged as orphaned.

Also, we have seen the same behavior from a custom EntityProvider used in production.

We did not notice this erroneous behavior until after our upgrade to Backstage 1.4.0; not sure if this is a coincidence. We are positive that some months ago eager deletion of provider-ingested entities was occurring as expected.

Your Environment

Backstage version: 1.4.0
Postgres version: 13.3

About this issue

State: closed
Created 2 years ago
Comments: 20 (20 by maintainers)

Most upvoted comments

Potential fix now over at #15146 instead.

Rugvip on Dec 12, 2022

Got a failing test for this over at #15111, so it’s a fairly obvious issue and easy to reproduce. Haven’t dug deep into what fix makes sense for it though, I have a feeling that switching over to a unique constraint might do it, but not sure what kind of impact that would have.

Rugvip on Dec 8, 2022

The LdapOrgEntityProvider (and our custom provider, which ingests a different type of entity) are scheduled using env.scheduler.createScheduledTaskRunner():

return LdapOrgEntityProvider.fromConfig(env.config, {
    id: 'okta-ldap',
    target: ldapTarget,
    logger: env.logger,
    // Run the provider every 15 minutes; abort a run after 5 minutes.
    schedule: env.scheduler.createScheduledTaskRunner({
      frequency: { minutes: 15 },
      timeout: { minutes: 5 },
    }),
});

The backend is running on a single pod in a k8s environment, so there is only one host.

GoFightNguyen on Aug 8, 2022