backstage: 🐛 Bug Report: GitHub Discovery Catalogs via Wildcard config

📜 Description

Behavior is inconsistent between a manual catalog import via /catalog-import and the GitHub Discovery feature with a wildcard entry.

I have a file with 48 OpenAPI catalog entities in it. When I import it through the web UI, it validates and imports successfully.

However, we have automated background processes that auto-generate catalog-info files, and when the Backstage background process performs catalog discovery, I get an error and the components fail to be added or updated.

I noticed that the GitHub URL path it tries to call to retrieve the artifact is different from what is defined in the $text placeholder.

FYI: I'm currently running 2 instances of Backstage in my environment.

๐Ÿ‘ Expected behavior

Catalog added/updated successfully.

👎 Actual Behavior with Screenshots

Console log

"2023-02-03T02:00:46.189Z catalog warn Processor PlaceholderProcessor threw an error while preprocessing; caused by Error: Placeholder $text could not read location https://github.com/org/file.yaml, Error: Incorrect URL: https://github.com/org//tree/-/specs/file.yaml, Error: Invalid GitHub URL or file path type=plugin entity=api:default/<api_name>"

👟 Reproduction steps

Have a YAML file containing over 40 catalog entities. (Note: we generated more than 50 entities but noticed that this resulted in a timeout, so we capped it at 50.)

Config

wildcardProviderId:
  organization: '<orgname>'
  catalogPath: '/backstage/**/*.yaml'

Sample Catalog

...
spec:
  system: <system_name>
  owner: <team_name>
  type: openapi
  lifecycle: experimental
  definition:
    $text: https://github.com/org/repo/blob/main/specs/file.yaml

GitHub Discovery

  1. Configure GitHub discovery with wildcard matching.
  2. Run Backstage and let the GitHub discovery process run.

Via web UI

  1. Go to the '/catalog-import' page.
  2. Import the generated file.
  3. Notice that validation is successful and the file is ready to import.

📃 Provide the context for the Bug.

No response

๐Ÿ–ฅ๏ธ Your Environment

OS: Darwin 22.3.0 - darwin/x64
node: v16.13.0
yarn: 1.22.19
cli: 0.20.0 (installed)
backstage: 1.7.2

Dependencies:
  @backstage/app-defaults 1.0.7
  @backstage/backend-app-api 0.2.2
  @backstage/backend-common 0.15.2
  @backstage/backend-plugin-api 0.1.3
  @backstage/backend-tasks 0.3.6
  @backstage/backend-test-utils 0.1.29
  @backstage/catalog-client 1.1.1
  @backstage/catalog-model 1.1.2
  @backstage/cli-common 0.1.10
  @backstage/cli 0.20.0
  @backstage/config-loader 1.1.5
  @backstage/config 1.0.3
  @backstage/core-app-api 1.1.1
  @backstage/core-components 0.11.2
  @backstage/core-plugin-api 1.0.7
  @backstage/dev-utils 1.0.7
  @backstage/errors 1.1.2
  @backstage/integration-react 1.1.5
  @backstage/integration 1.3.2
  @backstage/plugin-api-docs 0.8.10
  @backstage/plugin-app-backend 0.3.37
  @backstage/plugin-auth-backend 0.17.0
  @backstage/plugin-auth-node 0.2.6
  @backstage/plugin-badges-backend 0.1.31
  @backstage/plugin-badges 0.2.34
  @backstage/plugin-bazaar-backend 0.2.0
  @backstage/plugin-bazaar 0.1.25
  @backstage/plugin-catalog-backend-module-github 0.1.8
  @backstage/plugin-catalog-backend 1.5.0
  @backstage/plugin-catalog-common 1.0.7
  @backstage/plugin-catalog-graph 0.2.22
  @backstage/plugin-catalog-import 0.9.0
  @backstage/plugin-catalog-node 1.2.0
  @backstage/plugin-catalog-react 1.2.0
  @backstage/plugin-catalog 1.6.0
  @backstage/plugin-cost-insights-common 0.1.1
  @backstage/plugin-cost-insights 0.11.32
  @backstage/plugin-explore-react 0.0.22
  @backstage/plugin-explore 0.3.41
  @backstage/plugin-github-actions 0.5.10
  @backstage/plugin-github-issues 0.1.2
  @backstage/plugin-github-pull-requests-board 0.1.4
  @backstage/plugin-home 0.4.26
  @backstage/plugin-kubernetes-backend 0.7.3
  @backstage/plugin-kubernetes-common 0.4.3
  @backstage/plugin-kubernetes 0.7.3
  @backstage/plugin-org 0.5.10
  @backstage/plugin-permission-common 0.7.0
  @backstage/plugin-permission-node 0.7.0
  @backstage/plugin-permission-react 0.4.6
  @backstage/plugin-proxy-backend 0.2.31
  @backstage/plugin-scaffolder-backend 1.7.0
  @backstage/plugin-scaffolder-common 1.2.1
  @backstage/plugin-scaffolder 1.7.0
  @backstage/plugin-search-backend-module-elasticsearch 1.0.3
  @backstage/plugin-search-backend-module-pg 0.4.1
  @backstage/plugin-search-backend-node 1.0.3
  @backstage/plugin-search-backend 1.1.0
  @backstage/plugin-search-common 1.1.0
  @backstage/plugin-search-react 1.2.0
  @backstage/plugin-search 1.0.3
  @backstage/plugin-shortcuts 0.3.2
  @backstage/plugin-sonarqube-backend 0.1.2
  @backstage/plugin-sonarqube 0.4.2
  @backstage/plugin-stack-overflow 0.1.6
  @backstage/plugin-tech-insights-backend-module-jsonfc 0.1.21
  @backstage/plugin-tech-insights-backend 0.5.3
  @backstage/plugin-tech-insights-common 0.2.7
  @backstage/plugin-tech-insights-node 0.3.5
  @backstage/plugin-tech-insights 0.3.2
  @backstage/plugin-tech-radar 0.5.17
  @backstage/plugin-techdocs-addons-test-utils 1.0.5
  @backstage/plugin-techdocs-backend 1.4.0
  @backstage/plugin-techdocs-module-addons-contrib 1.0.5
  @backstage/plugin-techdocs-node 1.4.1
  @backstage/plugin-techdocs-react 1.0.5
  @backstage/plugin-techdocs 1.3.3
  @backstage/plugin-todo-backend 0.1.34
  @backstage/plugin-todo 0.2.12
  @backstage/plugin-user-settings 0.5.0
  @backstage/release-manifests 0.0.6
  @backstage/test-utils 1.2.1
  @backstage/theme 0.2.16
  @backstage/types 1.0.0
  @backstage/version-bridge 1.0.1

✨ Done in 3.71s.

👀 Have you spent some time to check if this bug has been raised before?

  • I checked and didn't find similar issue

๐Ÿข Have you read the Code of Conduct?

Are you willing to submit PR?

None

About this issue

  • State: closed
  • Created a year ago
  • Comments: 15 (3 by maintainers)

Most upvoted comments

@mscam - That's great! I was curious about the issue and was about to suggest another possibility in the config, one I ran into when my org had templates in different repos: we had to add another rule for Location because the templates were not being imported.

Good that you have resolved the issue.

Definitely yes. I would try providing a Location entity that acts as an "index" for all service entities 💯 Maybe it helps.
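The "index" idea above could look roughly like the following. This is a minimal sketch, not the commenter's actual file: the entity name and the target file paths are hypothetical, and it assumes the generated entity files live next to the index file in the same repository.

```yaml
# Hypothetical index file, e.g. backstage/index.yaml
apiVersion: backstage.io/v1alpha1
kind: Location
metadata:
  name: generated-api-index   # hypothetical name
  description: Index of auto-generated API entity files
spec:
  targets:
    # Relative targets are resolved against the location of this file.
    # Both paths below are placeholders for the generated files.
    - ./apis/first-api.yaml
    - ./apis/second-api.yaml
```

With a file like this, the discovery catalogPath could point at the single index file instead of a wildcard, so each generated file is registered through an explicit target.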

@mscam - I'm not near my work PC atm, but from memory, I've increased mine to 1 hour with the frequency set at 1 hour and 10 minutes.

In terms of data, I have nearly triple the amount that you have (including APIs and documentation). I haven't gone into multi events for processing yet. IIRC, it's the latest feature that is available for implementation.

UPDATE: If you manually import the catalog and the Location entity succeeds but the component is missing, I think there's an actual error happening behind the scenes here? Are you able to find any logs related to the missing component? Apologies, AWS integration for Backstage isn't my forte.
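For reference, here is a sketch of how those values might be expressed in app-config.yaml. It reads the comment as a 1 hour timeout with a 1 hour 10 minute frequency, which is an interpretation rather than something the comment states outright. It also assumes a version of the GitHub catalog module that accepts a schedule block in config. Older releases, possibly including the 0.1.8 listed in this report, may need the schedule passed in backend code instead.

```yaml
catalog:
  providers:
    github:
      wildcardProviderId:
        organization: '<orgname>'
        catalogPath: '/backstage/**/*.yaml'
        # Assumed values, taken from the comment above
        schedule:
          frequency: { minutes: 70 }  # how often discovery runs
          timeout: { minutes: 60 }    # how long a single run may take
```

Keeping the frequency longer than the timeout leaves a gap between the end of one run and the start of the next.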