go: runtime: time goes backwards on windows-arm64 (frequent TestGcLastTime failures)
$ greplogs --dashboard -md -l -e FAIL: TestGcLastTime
2021-08-26T19:07:27-1f8d456/windows-arm64-10 2021-08-22T13:14:19-bd68459/windows-arm64-10 2021-08-19T09:11:02-3bdc179/windows-arm64-10 2021-08-16T23:03:03-df9c5d8/windows-arm64-10 2021-08-13T00:20:28-4c8ffb3/windows-arm64-10 2021-06-29T19:30:24-e294b8a/windows-arm64-10 2021-06-08T05:02:19-c20bcb6/windows-arm64-aws 2021-06-05T19:52:26-e1fa260/windows-arm64-aws
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Comments: 33 (25 by maintainers)
Commits related to this issue
- runtime: on windows, read nanotime with one instruction or issue barrier On 64-bit, this is more efficient, and on ARM64, this prevents the time from moving backwards due to the weaker memory model. ... — committed to golang/go by zx2c4 3 years ago
- [release-branch.go1.17] runtime: on windows, read nanotime with one instruction or issue barrier On 64-bit, this is more efficient, and on ARM64, this prevents the time from moving backwards due to t... — committed to golang/go by zx2c4 3 years ago
- [release-branch.go1.17] runtime: use correct constant when computing nsec remainder A code comment on amd64 for windows and plan9 contained a snippet for splitting apart the sec and nsec components o... — committed to golang/go by zx2c4 3 years ago
For the modulo?
In runtime·nanotime1, the high-low-high algorithm is used to read the current 64-bit interrupt time as three 32-bit values. This code was probably copied from the AMD64 version. However, ARM64 has a looser memory model than x86/AMD64, so you need some extra memory barriers to make the algorithm work. These are missing (as @aclements points out).
The data points above are consistent with a reordering of the loads, such that the low 32 bits is not correctly matched to the high 32 bits. This can cause the low 32 bits to move backward between measurements, or more generally be wrong when the high bits are changing. The one confusing entry was where it appeared that high value was moving backward too, but that’s because aclements’s values were multiplied by 100 after they were read.
The fix is not to add memory barriers but to do a single 64-bit read of the time in one instruction. That’s what we do in Windows (and I recommend changing the AMD64 code to match, even though there’s no race condition due to the stricter memory model).
Looks like on the remaining two platforms, the constant is fine and doesn’t match the comment. But on ARM64 the constant and comment are both wrong.
Sending a patch.
Wait, they both look wrong to me…
I believe that this did not actually fix the problem. On arm64 time still moves backwards. Looking at the assembler, it appears to me that it’s rather a typo - it calculates (x / 1000000000, x % 100000000) instead of (x / 1000000000, x % 1000000000) (one zero to few in the x % expression). That I think fits perfectly with the symptoms as well.
Or am I reading it completely wrong?