backstage: Backstage doesn't recognize branches that contain forward slashes
Backstage relies on git-url-parse to parse URLs. This library has a bug that causes it to incorrectly parse URLs that point to branches with forward slashes in them.
Expected Behavior
I expect to be able to use branches with / because it’s a useful way to organize branches into groups.
Current Behavior
If I attempt to create a template from a branch that uses a / in its name, the operation will fail because Backstage (via git-url-parse) will assume that I’m trying to get code from a branch called tm. It will ignore everything past the first /.
Possible Solution
🤷♀️
Steps to Reproduce
- Try to create a template from https://github.com/spotify/backstage/blob/dependabot/npm_and_yarn/testing-library/jest-dom-5.11.4/plugins/scaffolder-backend/sample-templates/create-react-app/template.yaml
Context
Some of the projects I’m involved in require that each team member prefixes their branch with their initials. It’s a habit at this point to start every branch name with tm/.
About this issue
- Original URL
- State: open
- Created 4 years ago
- Reactions: 11
- Comments: 47 (36 by maintainers)
De-stale-inating. This is still a bug. We had someone else run into it again today.
Evening, I’m not a developer, but looping to work out whether it’s a branch or a path seems inelegant.
In other projects I’ve seen, the URL for git repos is normally constructed with parameters that get extracted to let the client libs know what the branch is.
For example:
Terraform
git::https://example.com/vpc.git?ref=v1.2.0
Helmfile
name: remote-repo
url: git+https://github.com/repo/test@deploy/helm?ref=master
kustomize
github.com/kubernetes-sigs/kustomize/examples/multibases?ref=v1.0.6
@mtlewis and I spent a hot second looking at the code in question. As mentioned, there’s not enough information in the URL to tell branch name versus path when the branch has a slash. I didn’t find any alternative URL formats that work with GitHub where this might be possible.
This code could have some additional logic, though. I think we could either:
Since (1) could suffer on a repo with a ton of branches, and could have false positives, the second seems preferable to me. We could retry N times, expanding to the next slash each time, where N is a fairly low number.
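To make the retry idea concrete, here is a rough sketch of "expand to the next slash each time". It is not Backstage or git-url-parse code: `candidateSplits`, `resolveSplit`, and the `exists` callback are hypothetical names, and `exists` stands in for whatever lookup would confirm a guess (for example a request to the GitHub API).

```typescript
// A possible (ref, path) reading of the part of a blob URL after /blob/.
interface RefPathSplit {
  ref: string;
  path: string;
}

// Hypothetical helper: list candidate splits, shortest branch name first,
// capped at maxTries so a deep path does not trigger many lookups.
function candidateSplits(refAndPath: string, maxTries: number): RefPathSplit[] {
  const segments = refAndPath.split("/").filter(s => s.length > 0);
  const candidates: RefPathSplit[] = [];
  // Always leave at least one segment for the file path.
  const limit = Math.min(maxTries, segments.length - 1);
  for (let i = 1; i <= limit; i++) {
    candidates.push({
      ref: segments.slice(0, i).join("/"),
      path: segments.slice(i).join("/"),
    });
  }
  return candidates;
}

// Hypothetical resolver: try each candidate until the caller-supplied
// check succeeds. `exists` is an assumption, not an existing API.
async function resolveSplit(
  refAndPath: string,
  exists: (split: RefPathSplit) => Promise<boolean>,
  maxTries = 5,
): Promise<RefPathSplit | undefined> {
  for (const candidate of candidateSplits(refAndPath, maxTries)) {
    if (await exists(candidate)) {
      return candidate;
    }
  }
  return undefined;
}
```

For `tm/feature/template.yaml` this would first try branch `tm`, then branch `tm/feature`. Note that the resolver has to be async, so it only helps in call sites that can wait for a lookup.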
Like I commented in https://github.com/IonicaBizau/git-url-parse/pull/114#issuecomment-706091677, I don’t think there is a way to split the url itself into branch name and filename… Pinging @izuzak here just in case he has any ideas how this can be improved. ❇️
I am very interested in resolving this issue. What do you think of this proposal? https://github.com/backstage/backstage/pull/16407 The PR leverages the GitHub url of type https://github.com/santunioni/backstage/blob/fix/integration/github/branches-with-slashs-on-their-name?path=catalog-info.yaml to differentiate between the branch name and catalog path.
The PR doesn’t make any additional requests to the GitHub API.
I’d like some feedback and insights there.
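For readers following along, a minimal sketch of how a URL of that shape could be split locally and synchronously. This is an illustration only and not the code from the PR; `parseBlobUrlWithPathQuery` and `ParsedBlobUrl` are hypothetical names, and it assumes everything after /blob/ is the branch whenever a `path` query parameter is present.

```typescript
interface ParsedBlobUrl {
  owner: string;
  repo: string;
  ref: string;
  path: string;
}

// Hypothetical parser for URLs of the form
//   https://github.com/<owner>/<repo>/blob/<ref with slashes>?path=<file>
// Returns undefined when the URL does not match that shape.
function parseBlobUrlWithPathQuery(input: string): ParsedBlobUrl | undefined {
  const url = new URL(input);
  // The file path comes from the query string, so it cannot be confused
  // with the tail of the branch name.
  const path = url.searchParams.get("path");
  const [owner, repo, kind, ...refSegments] = url.pathname
    .split("/")
    .filter(s => s.length > 0);
  if (!owner || !repo || kind !== "blob" || refSegments.length === 0 || !path) {
    return undefined;
  }
  // Everything after /blob/ is treated as the ref, slashes included.
  return { owner, repo, ref: refSegments.join("/"), path };
}

const parsed = parseBlobUrlWithPathQuery(
  "https://github.com/santunioni/backstage/blob/fix/integration/github/branches-with-slashs-on-their-name?path=catalog-info.yaml",
);
console.log(parsed?.ref, parsed?.path);
```

Because no network call is involved, a parser like this could be used from code that is not async ready. A plain copy+pasted URL without the `path` parameter returns undefined here and would still need the existing handling.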
The way forward with this issue is what was started in https://github.com/backstage/backstage/pull/16407, which was unfortunately abandoned.
@santunioni Sorry that we (I) have been slow - love your idea, will get to reviewing it properly eventually
And to answer your question, we’d appreciate help with it.
@freben I like the idea of an “alternative form” which would solve the problem without requiring any groundbreaking changes. I like the format where ref is supplied as a query string. That solution would also make it extensible for other information in the future, if needed. Do you guys at @backstage prefer to implement this yourself? Or is it okay for this to be done by the community? Do you have any requirements and tips on how you would like this to be implemented (and where)?
Sorry but this is still a current issue. There’s some discussion in the comments above. Having unencoded slashes in the URL makes it super problematic to parse and understand GitHub URLs: there’s no surefire way to actually know where the branch ends and the path begins without resorting to hammering the GitHub API with queries to resolve the situation, and this needs to be done in a lot of places where the code isn’t even async ready, but rather expects to be able to do that parsing instantly and locally.
@eperper thanks for those pointers - interesting. One thing to take into account here is convenience of copy+paste as mentioned by @Rugvip above. Having to form a custom url might be error prone. That being said, I think that approach can be worth considering nonetheless, especially if it’s an “alternative form” that you can use only when necessary or if it’s easily available somehow in the web interface besides the address bar of the browser.