containerd: including git attributes in "vendor" makes archive checksum change
Description
Due to including git attributes, the archive gets different checksums:
vendor/k8s.io/client-go/pkg/version/.gitattributes
This is because the number of “significant digits” in the abbreviated git rev varies.
vendor/k8s.io/client-go/pkg/version/base.go
Steps to reproduce the issue
Describe the results you received and expected
ERROR: v1.5.8.tar.gz has wrong sha256 hash:
ERROR: expected: a41ab8d39393c9456941b477c33bb1b221a29b635f1c9a99523aab2f5e74f790
ERROR: got : 0890f7b0ee8e20a279a617c60686874b3c7a99e064adb2b38d884499b5284c43
ERROR: Incomplete download, or man-in-the-middle (MITM) attack
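For reference, the check that produces an error like the one above boils down to hashing the downloaded bytes and comparing against a recorded value. A minimal sketch, assuming the archive is already in memory; the function names and the sample byte strings are made up for illustration and are not taken from any real download tool:

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def matches(data: bytes, expected_hex: str) -> bool:
    """Compare the digest of data with a recorded digest, ignoring case."""
    return sha256_hex(data) == expected_hex.strip().lower()


# Hypothetical stand-ins for two downloads of the "same" tag.
first_download = b"archive bytes as served when the hash was recorded"
later_download = b"archive bytes as served some time later"

recorded = sha256_hex(first_download)
print(matches(first_download, recorded))  # same bytes as recorded
print(matches(later_download, recorded))  # any byte change fails the check
```

The check cannot tell a regenerated archive from a tampered one, which is why the error message mentions both an incomplete download and a MITM attack.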
What version of containerd are you using?
v1.5.8
Any other relevant information
No response
Show configuration if it is related to CRI plugin.
No response
About this issue
- State: closed
- Created 3 years ago
- Comments: 40 (23 by maintainers)
Resolving. The `main` branch doesn’t have the issue anymore and we wouldn’t upgrade Kubernetes dependencies in 1.6 and/or 1.5.
Having git commits and build dates in source code and in binary releases is mostly useless, except for causing confusion.
The commit is not really needed when versions are tagged, as seen by having a commit from the wrong git repository.
And the build date makes it hard to do reproducible builds. It is also frequently wrong, making Go binaries live in the 70’s.
To be useful, there would need to be some kind of `make dist` and distfiles, instead of exporting git archives on GitHub.
After docker upgrading containerd to 1.5.10, this started happening again.
Note that the historic releases mentioned above (1.5.8, 1.5.9) also broke…
So only “fixed” for containerd 1.6.
@thaJeztah @afbjorklund @jonyhy96 thanks for helping nail that down! I’ve created https://github.com/kubernetes/publishing-bot/pull/285 in Kubernetes to stop including `.gitattributes` files in k8s libraries. That should solve the issue for the future.
It was calculated the same way, just some time ago (the contents vary, over time)
This is because the length of the git hash varies, due to random factors on GitHub.
It might be `1e5ef943e` today, and then it could be `1e5ef943eb` tomorrow?
The “long” hash remains at:
1e5ef943eb76627a6d3b6de8cd1ef6537f393a71
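To make the relation between the two short forms and the long hash concrete, here is a small sketch. The helper names are hypothetical; the only data used are the hashes quoted above. It only illustrates that both abbreviations are prefixes of the same commit id while being different strings.

```python
FULL_HASH = "1e5ef943eb76627a6d3b6de8cd1ef6537f393a71"


def abbreviate(full_hash: str, length: int) -> str:
    """Return the first `length` hex digits, the way a short hash is formed."""
    return full_hash[:length]


def same_commit(short_hash: str, full_hash: str) -> bool:
    """A short hash refers to the commit if it is a prefix of the full hash."""
    return full_hash.startswith(short_hash)


today = abbreviate(FULL_HASH, 9)
tomorrow = abbreviate(FULL_HASH, 10)

print(today, same_commit(today, FULL_HASH))
print(tomorrow, same_commit(tomorrow, FULL_HASH))
print(today == tomorrow)  # different text, even though the commit is the same
```

Both values identify the same commit, but any file that embeds one or the other ends up with different bytes.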
Ps, for `wget` the flag is called `--content-disposition`.
The workarounds only last for “so long”, until the number of significant digits in the commit changes again:
They also flip back and forth, depending on which server the GitHub workload ends up running on, etc.
Which lessens the confidence in having checksums in the first place
Minikube sorta made it worse by using the wrong file name (forgot the `-J` option to `curl`, or a make option)
And by not stating clearly that it was “computed locally”, like our OS upstream so carefully did (and we ignored)
So a much better checksum file looks like: (it wasn’t used because of the older version, 1.4.4 and not 1.5.8)
https://github.com/buildroot/buildroot/blob/2021.02.4/package/docker-containerd/docker-containerd.hash
When upstream doesn’t publish the checksums of a tarball, it is normally computed locally at the time of import.
This also goes if upstream uses a different checksum algorithm, like if you want sha512 but it only has sha256
But ultimately, it’s even signed.
Note that it is not the checksum of the source code, that would be contained in the git commit itself (via tree etc)
It is the checksum after first doing dist transformations, and then applying compression (maybe another timestamp)
Debian uses “pristine-tar” for this.
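The point about compression can be shown in isolation. The sketch below compresses the same payload twice with Python's `gzip` module, changing only the timestamp stored in the gzip header. Whether a given GitHub archive actually carries a varying timestamp is not something this example establishes; it only shows that identical content can yield different compressed bytes, and therefore different checksums.

```python
import gzip
import hashlib

# Hypothetical payload standing in for an uncompressed tar stream.
payload = b"identical source tree contents\n" * 100

# Same input, different modification time recorded in the gzip header.
first = gzip.compress(payload, mtime=0)
second = gzip.compress(payload, mtime=1_000_000_000)

print(gzip.decompress(first) == gzip.decompress(second))  # content is equal
print(hashlib.sha256(first).hexdigest())
print(hashlib.sha256(second).hexdigest())
```

This is why a checksum over the `.tar.gz` says nothing stable about the source tree itself, which the git commit already identifies.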
https://git-scm.com/docs/gitattributes#_export_subst
https://github.com/containerd/containerd/blob/v1.5.8/vendor/k8s.io/client-go/pkg/version/base.go#L59
This will potentially change the output, every time that GitHub does a “git archive” for you
The alternative would be to generate and attach a static tarball, which is not really practical (and wasteful)
The killer here is using the “short” hash.
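A rough simulation of the effect, without calling git: the sketch below does the `$Format:...$` replacement by hand and packs the result into an in-memory tar with fixed metadata. The template lines are an assumed shape for illustration, not a copy of `base.go`, and `export_subst` and `archive_digest` are made-up helpers. Only the `%h` (abbreviated hash) and `%H` (full hash) placeholders are handled.

```python
import hashlib
import io
import tarfile

FULL_HASH = "1e5ef943eb76627a6d3b6de8cd1ef6537f393a71"

# Assumed template lines, for illustration only.
SHORT_TEMPLATE = 'gitVersion = "v0.0.0-master+$Format:%h$"\n'
LONG_TEMPLATE = 'gitCommit = "$Format:%H$"\n'


def export_subst(text: str, full_hash: str, abbrev_len: int) -> str:
    """Hand-rolled stand-in for git's export-subst expansion of %h and %H."""
    text = text.replace("$Format:%h$", full_hash[:abbrev_len])
    return text.replace("$Format:%H$", full_hash)


def archive_digest(name: str, content: str) -> str:
    """SHA-256 of an uncompressed tar holding one file with fixed metadata."""
    data = content.encode()
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        info = tarfile.TarInfo(name)
        info.size = len(data)
        info.mtime = 0  # keep everything but the content constant
        tar.addfile(info, io.BytesIO(data))
    return hashlib.sha256(buf.getvalue()).hexdigest()


short_9 = archive_digest("base.go", export_subst(SHORT_TEMPLATE, FULL_HASH, 9))
short_10 = archive_digest("base.go", export_subst(SHORT_TEMPLATE, FULL_HASH, 10))
long_9 = archive_digest("base.go", export_subst(LONG_TEMPLATE, FULL_HASH, 9))
long_10 = archive_digest("base.go", export_subst(LONG_TEMPLATE, FULL_HASH, 10))

print(short_9 == short_10)  # abbreviation length leaks into the archive
print(long_9 == long_10)    # the full hash does not depend on that length
```

In this simulation, only the placeholder that expands to the short hash makes the archive digest depend on the abbreviation length.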