cc65: Regression: "0 < i--" is optimized to "0 < --i".
I have tried to build “Sir Ababol” using the latest version of cc65. For some reason, it had a very glitchy title screen. I have investigated it a bit, and it turned out that the issue is in this function:
void __fastcall__ un_rle_screen (unsigned char *rle) {
wx = 0; wy = 0; i = 0;
while (i < 240) {
t = (*rle);
rle ++;
if (t & 128) {
top = t & 127;
t = (*rle);
rle ++;
while (0 < top--) {
draw_tile (wx, wy, t);
wx = (wx + 2) & 31; if (!wx) wy += 2;
i ++;
}
} else {
draw_tile (wx, wy, t);
wx = (wx + 2) & 31; if (!wx) wy +=2;
i ++;
}
}
// 64 attributes
i = 64; gp_addr = 0x23c0;
while (0 < i--) {
vram_adr (gp_addr++);
vram_put (*rle++);
}
}
Just look at this part of generated assembly:
;
; while (0 < top--) {
;
L330D: lda #$00
dec _top
cmp _top
bcc L330B
;
; } else {
;
jmp L3B6B
It decrements _top; and only then, it does the comparison. But, it should do comparison first; and only then, it should decrement _top. I have replaced it to this code:
L330D:
lda _top
dec _top
cmp #$00
beq L3B68
bcs L330B
jmp L3B68
And now, it works as expected.
Old cc65 had generated this monstrous code:
;
; while (0 < top--) {
;
L330D: jsr push0
lda _top
pha
sec
sbc #$01
sta _top
pla
jsr tosicmp0
bcc L330B
;
; } else {
;
jmp L3B6B
It seems that someone had tried to add a great optimization, and forgot to take into account the difference between --i and i--. It would be really nice to preserve the optimization, but to do it in a right way.
About this issue
- Original URL
- State: closed
- Created 6 years ago
- Comments: 17 (16 by maintainers)
Commits related to this issue
- Fixed Issue #784. — committed to acqn/cc65 by acqn 4 years ago
- Improved fix for Issue #167 and #784, and somehow #781. — committed to acqn/cc65 by acqn 4 years ago
- Fixed Issue #784. — committed to acqn/cc65 by acqn 4 years ago
- Improved fix for Issues #167 and #784 and somehow #781. — committed to acqn/cc65 by acqn 4 years ago
- Fixed Issue #784. — committed to cc65/cc65 by acqn 4 years ago
- Improved fix for Issues #167 and #784 and somehow #781. — committed to cc65/cc65 by acqn 4 years ago
Fixed with https://github.com/cc65/cc65/pull/1053.
Register usage isn’t currently tracked correctly for OptStackOps transformations, which is the cause of this issue as well as #167. I’m working on an improved way to handle this.
The error is introduced in the compiler pass “OptStackOps”. I used a slightly simplified version of the test program posted by @greg-king5 .
Last good pass:
After
OptStackOps:On line (A), a new
cmpinstruction is inserted by the optimizer pass, testingcountafter it has already been decremented, despite the fact that it was post-decremented.The unoptimized program succeeds. The optimized program fails.
its still close to impossible to fix anything without a minimal testcase (and then the testcase will go into the regression tests, to make sure it wont break again)