LensorCompilerCollection: Variables with the same name as registers break Intel assembly

rax :: 69
putchar : ext integer(ch : integer)
putchar(rax)

When compiling with --dialect intel, no warnings or errors are produced, and the following code is generated:

.section .data
rax: .byte 69,0,0,0,0,0,0,0
.intel_syntax noprefix
.section .text

.global main
.global putchar

main:
    push rbp
    mov rbp, rsp
    sub rsp, 0
.L0:
    lea rax, [rip + rax]
    mov rdi, [rip + rax]
    sub rsp, 8
    push rcx
    push rsi
    push rdi
    call putchar
    pop rdi
    pop rsi
    pop rcx
    add rsp, 8
    mov rcx, rax
    mov rax, rcx
    mov rsp, rbp
    pop rbp
    ret

As in #46, the problem only shows up when the output is assembled:

code.S: Assembler messages:
code.S:14: Error: `[rip+rax]' is not a valid base/index expression
code.S:15: Error: `[rip+rax]' is not a valid base/index expression

About this issue

  • State: closed
  • Created a year ago
  • Comments: 18 (4 by maintainers)

Most upvoted comments

I feel like if GCC doesn’t handle this case, then we don’t really need to care too much about handling it either. I recall LLVM-generated assembly not being particularly compilable either if you use… less orthodox label names…

so that shouldn’t be too bad

I forgot for a second that this also entails assembling the code, so er, that might take a bit longer haha

I guess the ‘solution’ to this is to disallow .intel_syntax noprefix and require .intel_syntax prefix instead when emitting assembly for GAS as GAS doesn’t seem to have a way of handling these kinds of symbols… If we ever add a NASM backend, we can solve this problem as described in the previous paragraph.

I guess the ‘easiest’ way of solving this issue would be to simply add a flag to the codegen context (bool intel_syntax_use_prefix or something like that) and then, depending on that flag, choose whether to emit a register prefix in femit_*(). The flag can then be set depending on a command line option. Speaking of which, we should probably add an option like --assembler that accepts different values (GAS being the only one at the moment); based on that and on the assembly dialect, we can choose whether to use prefixes, which directives to emit, and so on.
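
A rough sketch of what that flag could look like, in C. Only the name intel_syntax_use_prefix comes from the comment above; the struct layout and the helpers format_register, format_lea_rip_relative and syntax_directive are hypothetical stand-ins, since the real femit_*() signatures are not shown in this thread. It also assumes that GAS under .intel_syntax prefix expects registers written with a leading %, so that a bare rax inside the brackets is read as a symbol rather than a register.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical codegen context; only the flag name is taken from the discussion. */
typedef struct CodegenContext {
    bool intel_syntax_use_prefix;
} CodegenContext;

/* Directive to emit before the code, matching the chosen mode. */
const char *syntax_directive(const CodegenContext *ctx) {
    return ctx->intel_syntax_use_prefix ? ".intel_syntax prefix"
                                        : ".intel_syntax noprefix";
}

/* Write a register operand into out, adding '%' only in prefix mode. */
int format_register(const CodegenContext *ctx, const char *reg,
                    char *out, size_t size) {
    const char *prefix = ctx->intel_syntax_use_prefix ? "%" : "";
    return snprintf(out, size, "%s%s", prefix, reg);
}

/* Format `lea dst, [rip + symbol]`. The symbol is never prefixed, so in
 * prefix mode it stays distinguishable from a register of the same name. */
int format_lea_rip_relative(const CodegenContext *ctx, const char *dst,
                            const char *symbol, char *out, size_t size) {
    char dst_buf[16];
    char rip_buf[16];
    format_register(ctx, dst, dst_buf, sizeof dst_buf);
    format_register(ctx, "rip", rip_buf, sizeof rip_buf);
    return snprintf(out, size, "lea %s, [%s + %s]", dst_buf, rip_buf, symbol);
}
```

With the flag set, the load of the global named rax from the example would come out as lea %rax, [%rip + rax]; with it cleared, the output is the ambiguous lea rax, [rip + rax] that the assembler rejects.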

It seems that GCC avoids dealing with this by simply emitting object files instead 👁️