backstage: Using integration for `GithubOrgReaderProcessor` fails when user email is hidden

When using the GitHub integration as the client for the GithubOrgReaderProcessor, as implemented in #5602, the github-org catalog location fails to sync when the organization contains one or more members whose email is set to hidden.

Expected Behavior

I expected the github-org location sync to work with the integration client.

Current Behavior

The sync fails with the following error:

Failed item in location github-org:https://github.com/<redacted>, Error: Processor GithubOrgReaderProcessor threw an error while reading location github-org:https://github.com/<redacted>, GraphqlError: Resource not accessible by integration

Possible Solution

The TL;DR here is that the integration does not have access to email addresses when they are set to hidden. This seems to be by design, and the general workaround is to make a user-to-server request, which is not possible in this case. One solution would be to remove the email field from the query, although this feels like a sledgehammer approach with a big breaking change.
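To make the first option concrete, here is a minimal sketch of a query builder that only asks for the email field when the caller opts in. The function name `buildMembersQuery` and its flag are hypothetical and not taken from the Backstage code base; the field names are the ones I believe GitHub's GraphQL schema exposes for organization members.

```typescript
// Hypothetical helper (not the actual Backstage implementation):
// builds the organization members query and includes the `email`
// field only when explicitly requested.
function buildMembersQuery(includeEmail: boolean): string {
  // Assumption: an App installation token would pass `false` here,
  // while a personal access token could pass `true`.
  const emailField = includeEmail ? 'email' : '';
  return `
    query members($org: String!, $cursor: String) {
      organization(login: $org) {
        membersWithRole(first: 100, after: $cursor) {
          pageInfo { hasNextPage endCursor }
          nodes { login name avatarUrl ${emailField} }
        }
      }
    }`;
}
```

With something like this, the breaking change could be limited to setups that authenticate as a GitHub App, since those are the ones that hit the error.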

Another option would be to see if this field can be made optional in the query, so it returns an empty value instead of throwing an error. This would need to be implemented by GitHub though, if I’m not mistaken.

Handling this case in Backstage looks to be problematic, as the error bubbles up from the @octokit GraphQL client: https://github.com/octokit/graphql.js/blob/master/src/graphql.ts#L73. The response will have null values for all nodes that have a hidden email, which would result in an incomplete sync.
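For illustration, this is roughly what salvaging the partial response could look like. It is only a sketch under the assumption that the thrown error object carries the `data` and `errors` fields of the GraphQL response body; the types and the `recoverMembers` name are made up for this example. It also shows the drawback mentioned above: members with a hidden email come back as null and are dropped.

```typescript
// Assumed shapes; only the parts needed for this sketch.
interface MemberNode {
  login: string;
  email?: string | null;
}

interface PartialGraphqlError {
  errors?: { message: string }[];
  data?: {
    organization?: { membersWithRole?: { nodes?: (MemberNode | null)[] } };
  } | null;
}

// Hypothetical helper: extracts the usable member nodes from a failed
// GraphQL call and reports how many nodes were null (hidden email).
function recoverMembers(error: unknown): {
  members: MemberNode[];
  skipped: number;
} {
  const nodes = (error as PartialGraphqlError)?.data?.organization
    ?.membersWithRole?.nodes;
  if (!Array.isArray(nodes)) {
    // No partial data attached, so there is nothing to salvage.
    throw error;
  }
  const members = nodes.filter((n): n is MemberNode => n !== null);
  return { members, skipped: nodes.length - members.length };
}
```

A caller would wrap the GraphQL request in try/catch and pass the caught error to `recoverMembers`, logging `skipped` so the incomplete sync is at least visible.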

Steps to Reproduce

  1. Configure the GitHub App integration as described here
  2. Ensure that your GitHub App has all the permissions it needs for the github-org location integration
  3. Check that you are part of the configured organization and set email to private in your settings
  4. Enable the github-org location integration
  5. Wait for the sync to be triggered

Context

The context is that we would like to sync user and team data using the GitHub App integration so that the catalog shows items as owned by the currently logged-in user. See Possible Solution for more info.

Your Environment

  • NodeJS Version (v12): v14.17.0
  • Operating System and Version (e.g. Ubuntu 14.04): Ubuntu 20.10
  • Browser Information: -
  • Backend dependencies:
  "dependencies": {
    "app": "0.0.0",
    "@backstage/backend-common": "^0.8.0",
    "@backstage/catalog-model": "^0.7.8",
    "@backstage/catalog-client": "^0.3.11",
    "@backstage/config": "^0.1.5",
    "@backstage/plugin-app-backend": "^0.3.12",
    "@backstage/plugin-auth-backend": "^0.3.9",
    "@backstage/plugin-catalog-backend": "^0.9.0",
    "@backstage/plugin-proxy-backend": "^0.2.7",
    "@backstage/plugin-scaffolder-backend": "^0.11.0",
    "@backstage/plugin-techdocs-backend": "^0.8.0",
    "@gitbeaker/node": "^28.0.2",
    "@octokit/rest": "^18.5.3",
    "dockerode": "^3.2.1",
    "express": "^4.17.1",
    "express-promise-router": "^4.1.0",
    "knex": "^0.21.6",
    "sqlite3": "^5.0.0",
    "winston": "^3.2.1"
  },

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 6
  • Comments: 25 (21 by maintainers)

Most upvoted comments

Closing as this was fixed by https://github.com/backstage/backstage/pull/5769, which will be released on Thursday

This got an official-ish response from GitHub. Interestingly, someone replied that the behavior changed quite recently from returning a null email to throwing the error. 🙁 Unfortunate.

I’m wondering if maybe just skipping the email entirely is easier. Since these are GitHub profiles, how often will these emails be useful? I may be going out on a limb here, but wouldn’t the majority of them be personal rather than work related? If so, I’m not sure how useful the email is for an organization.

We’re running into this issue as well, just a +1.