backstage: [catalog] Users should get clear information about refresh loop warnings/errors
Feature Suggestion
As the entity refresh loop runs, a lot of information is emitted as logs - for example, if entities fail validation or if the source location could not be read. This information is not made trivially available for the end user, which makes it very hard to fix a problem e.g. with a syntax error in your YAML file.
All refresh loop information that is relevant to end users should be viewable directly in the catalog interface. At the very least, on an entity overview page you should see any warnings and errors related to your entity declaration. It would also be helpful to be able to see a clear status marker of the high level state when looking at entities in a list view (for example, an orange warning marker telling you that something needs your attention).
Possible Implementation
The catalog frontend plugin should show warning/error info on the entity overview pages.
The catalog backend plugin should gather up information as it processes entities and persist that information to the database as well as writing it to the logs. The information should be tagged with the current location and the current entity, where available, so that it can easily be queried after the fact.
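As a rough illustration of the tagging idea, here is a minimal sketch in TypeScript. All names (StatusEntry, RefreshStatusLog, record, forLocation, forEntity) are hypothetical and not taken from the Backstage codebase, and an in-memory array stands in for the database table.

```typescript
// Hypothetical types for illustration only; not part of the Backstage API.
type Severity = 'warning' | 'error';

interface StatusEntry {
  severity: Severity;
  message: string;
  location?: string; // the location being processed, when known
  entityRef?: string; // the entity being processed, when known
  timestamp: Date;
}

class RefreshStatusLog {
  // Stand-in for a database table.
  private readonly entries: StatusEntry[] = [];
  private readonly writeLog: (line: string) => void;

  constructor(writeLog: (line: string) => void = line => console.log(line)) {
    this.writeLog = writeLog;
  }

  // Persist the entry and also write it to the regular log output.
  record(entry: Omit<StatusEntry, 'timestamp'>): void {
    this.entries.push({ ...entry, timestamp: new Date() });
    const location = entry.location ?? '-';
    const entityRef = entry.entityRef ?? '-';
    this.writeLog(`[${entry.severity}] ${location} ${entityRef}: ${entry.message}`);
  }

  // Query by tag after the fact.
  forLocation(location: string): StatusEntry[] {
    return this.entries.filter(e => e.location === location);
  }

  forEntity(entityRef: string): StatusEntry[] {
    return this.entries.filter(e => e.entityRef === entityRef);
  }
}
```

Because both tags are optional, a location-level failure such as a YAML syntax error can be recorded before any entity is known.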
The catalog /entities and /locations APIs should emit at least some summary of this status information as part of their output.
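One way such a summary could be derived is sketched below. The StatusSummary shape and the summarize function are assumptions made for illustration; the issue does not define a response format.

```typescript
// Hypothetical summary shape; the real API output is not specified in this issue.
type Level = 'ok' | 'warning' | 'error';

interface Issue {
  level: 'warning' | 'error';
  message: string;
}

interface StatusSummary {
  level: Level; // could drive a green / orange / red marker in a list view
  warnings: number;
  errors: number;
  latestMessage: string | undefined;
}

// Collapse a list of issues (oldest first) into a compact summary.
function summarize(issues: Issue[]): StatusSummary {
  const errors = issues.filter(i => i.level === 'error').length;
  const warnings = issues.length - errors;
  const level: Level = errors > 0 ? 'error' : warnings > 0 ? 'warning' : 'ok';
  const last = issues[issues.length - 1];
  return { level, warnings, errors, latestMessage: last?.message };
}
```

The worst level wins, so a single error is not hidden by any number of warnings.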
Future extension
Additional APIs could be added to extract more detailed information, such as all log entries relevant to the refresh run, in a future addition.
Additional APIs could be added to let users “dry run” an ingestion which goes through the full processing loop but does not write out the entities. This can then be used for PR feedback and other validation tools.
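A dry run could be sketched roughly as follows. The validate rule here is a simplified stand-in (not Backstage's actual validation), and ingest, IngestResult, and the Map-based store are hypothetical names used only for this example.

```typescript
interface Entity {
  kind: string;
  metadata: { name: string };
}

interface IngestResult {
  valid: number; // entities that passed validation
  written: number; // entities actually stored (always 0 in a dry run)
  errors: string[];
}

// Simplified stand-in for real validation; the actual rules are richer.
function validate(entity: Entity): string[] {
  const errors: string[] = [];
  if (entity.kind.length === 0) {
    errors.push('kind is required');
  }
  if (!/^[a-zA-Z0-9._-]+$/.test(entity.metadata.name)) {
    errors.push(`invalid name "${entity.metadata.name}"`);
  }
  return errors;
}

// Runs the full validation loop; only writes when dryRun is false.
function ingest(
  entities: Entity[],
  store: Map<string, Entity>,
  options: { dryRun: boolean },
): IngestResult {
  const result: IngestResult = { valid: 0, written: 0, errors: [] };
  for (const entity of entities) {
    const problems = validate(entity);
    if (problems.length > 0) {
      result.errors.push(...problems);
      continue;
    }
    result.valid++;
    if (!options.dryRun) {
      store.set(`${entity.kind}:${entity.metadata.name}`.toLowerCase(), entity);
      result.written++;
    }
  }
  return result;
}
```

A PR check could call this with dryRun set to true and fail the build when errors is non-empty.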
Context
Entity declaration files are a major contact surface for users, and the “asynchronous black box” of refreshes must expose enough information for users to be effective.
About this issue
- State: closed
- Created 4 years ago
- Reactions: 4
- Comments: 17 (15 by maintainers)
I hijacked the issue a bit with the dry run 😉 Let’s get back to the original topic: showing the current status of ingestion. I looked around a bit at what is already in place:
- /api/catalog/locations already contains {"currentStatus":{"message":null,"status":"success","timestamp":"2020-11-25 10:12:27"} in its response. For now it isn’t working correctly. I couldn’t make it display failures, for example for syntax errors.
- [...] locationsCatalog.logUpdateFailure therefore they aren’t included. If I fix that, locations are still always set to success afterwards here.
- [...] refreshSingleLocation. If I fix that, I still have the issue that the status is not only logged for every location, but also every entity here. The successful entities override the broken location. I can remove that too and have a result: "currentStatus":{"message":"YAML error, YAMLSyntaxError: All collection items must start at the same column","status":"fail","timestamp":"2020-11-25 10:36:31"}.
- currentStatus only returns the last logged status. I wonder if we should include some kind of “run-id” (or timestamp of the run) that is used to collect all status details from a run of a location and return them together in /api/catalog/locations.
- [...] /api/catalog/entities with the currentStatus, we can filter the log by entity and include the entity specific status. Right now it only contains the entity name, should it be the uid instead? Entity names are only unique per kind? We might also include the logs for the location here. Syntax errors only affect the location and not the entity. They will cause the entity ingestion to be skipped without an error.

I would like to have some opinions on that and could create a PR.
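To make the “run-id” idea concrete, here is a minimal sketch in TypeScript. The RunStatus shape and the latestRun function are hypothetical, and the sketch assumes log entries are stored in the order they were written.

```typescript
// Hypothetical log row; field names are assumptions, not the real schema.
interface RunStatus {
  runId: string; // shared by everything logged during one refresh of a location
  location: string;
  entityRef?: string;
  status: 'success' | 'fail';
  message: string | null;
}

interface RunSummary {
  runId: string;
  status: 'success' | 'fail';
  entries: RunStatus[];
}

// Collects every entry of the most recent run for a location.
// Assumes `log` is ordered oldest first.
function latestRun(log: RunStatus[], location: string): RunSummary | undefined {
  const forLocation = log.filter(e => e.location === location);
  const last = forLocation[forLocation.length - 1];
  if (!last) {
    return undefined;
  }
  const entries = forLocation.filter(e => e.runId === last.runId);
  // One failure marks the whole run as failed, so later successes cannot hide it.
  const status: 'success' | 'fail' = entries.some(e => e.status === 'fail')
    ? 'fail'
    : 'success';
  return { runId: last.runId, status, entries };
}
```

Grouping by run means a successful entity logged after a broken location no longer overrides the failure, since the overall status is computed from all entries of the run rather than from the last one.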
To show the status in the about card like Stefan suggested, we have two options:
- Parse managed-by-location in the frontend and retrieve the status of a location. This needs an additional call, but is possible without changing the backend.
- Extend the /api/catalog/entities endpoint to resolve the currentStatus in the backend. Probably the long-term solution, as parsing managed-by-location feels strange.

Besides that, I would go with the UI Stefan suggested. If it’s not green, the user can select “Show More” and see the error message in a dialog.
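The frontend-only option could look roughly like this sketch. It assumes the backstage.io/managed-by-location annotation has the form "<type>:<target>" and that each item returned by /api/catalog/locations carries data.type, data.target and currentStatus; the function names are hypothetical, and the response shape beyond currentStatus is an assumption.

```typescript
interface LocationStatus {
  status: string;
  message: string | null;
  timestamp: string;
}

// Assumed shape of one item in the /api/catalog/locations response.
interface LocationRecord {
  data: { type: string; target: string };
  currentStatus: LocationStatus;
}

// Splits "<type>:<target>" at the first colon, e.g. "url:https://...".
function parseLocationRef(ref: string): { type: string; target: string } {
  const idx = ref.indexOf(':');
  if (idx <= 0) {
    throw new Error(`Malformed location reference: ${ref}`);
  }
  return { type: ref.slice(0, idx), target: ref.slice(idx + 1) };
}

// Finds the status of the location that manages an entity, if any.
function statusForEntity(
  annotations: Record<string, string>,
  locations: LocationRecord[],
): LocationStatus | undefined {
  const ref = annotations['backstage.io/managed-by-location'];
  if (!ref) {
    return undefined;
  }
  const { type, target } = parseLocationRef(ref);
  return locations.find(l => l.data.type === type && l.data.target === target)
    ?.currentStatus;
}
```

The locations list would come from the extra call mentioned above; it is passed in as a plain array here so the lookup logic stays independent of how the data is fetched.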
One option is to use this UX:
If the status is not green, maybe we show a “More details” button that pops up a dialog with more details. @katz95 any thoughts?
Hey @alzafacon - this is actually being actively worked on right now by the maintainers!
@freben The notification for your response has been sitting in my inbox for months… 😆 That is also some nice input.
I don’t have any new progress to add, but I noticed that the proposed dialog might also be a good place to host some kind of “entity source” view: not the actual source that was at the location, but how the entity looks once it has been ingested. That would sometimes be nice for debugging. But for now dev tools are fine.