Assembly emitter
x86-64 has a lot of instructions. They are described in Volume 2 of the 5 Volume "Intel® 64 and IA-32 Architectures Software Developer’s Manual". This volume alone is over 2000 pages, which would take forever to fully implement. As a result, we will use only a subset of these instructions. This is the rough plan:
- Most instructions like `add` will only be implemented with `r64 r64` versions.
- To accomplish something like `add rax, 1`, we will use a temporary register `X`:
  - `mov X, 1`
  - `add rax, X`
  - The constant propagation system will be able to provide enough information that we could eventually use `add r64 immX` and similar if needed.
  - Register allocation should handle the case `(set! x (+ 3 y))` as:
    - `mov x, 3`
    - `add x, y`
  - but `(set! x (+ y 3))`, in cases where `y` is needed afterwards and `x` can't take its place, will become the inefficient:
    - `mov x, y`
    - `mov rtemp, 3`
    - `add x, rtemp`
- Loading constants into registers will be done efficiently, using the same strategy used by modern versions of `gcc` and `clang`.
- Memory access will be done in the form `mov rdest, [roff + raddr]`, where `roff` is the offset register. Doing memory access in this form was found to be much faster in a simple benchmark test.
- Memory access to the stack will need an extra `sub` and a more complicated dereference. GOAL code seems to avoid using the stack in most places, and I suspect the programmers attempted to avoid stack spills.
  - `mov rdest, rsp`: coloring move for upcoming subtract
  - `sub rdest, roff`: convert real pointer to GOAL pointer
  - `mov rdest, [rdest + roff + variable_offset]`: access memory through normal GOAL deref.
  - Note: we should check that the register allocator always gets this right, eliminating moves and avoiding a temporary register.
  - Again, constant propagation should give us enough information if we ever want/need to implement more efficient `mov rdest, [rsp + variable_offset]` type instructions.
- Memory access to static data should use `rip` addressing, like `mov rdest, [rip + offset]`. Creating pointers to static data could be `lea rdest, [rip - roff + offset]`.
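
The register-only `add` plan can be illustrated with a small Python sketch. This is not the emitter's actual code: the function name `lower_add`, the representation of operands as strings (registers) or integers (constants), and the operand swap for the case where the destination already holds the right-hand operand are all assumptions made for illustration.

```python
def lower_add(dest, lhs, rhs, temp="rtemp"):
    """Lower `dest = lhs + rhs` using only register-register `add`.

    Hypothetical helper: operands are register names (str) or
    integer constants (int). Returns a list of assembly strings.
    """
    # add is commutative, so if dest already holds the right operand,
    # swap the operands instead of clobbering it with the first mov.
    if rhs == dest and lhs != dest:
        lhs, rhs = rhs, lhs

    code = []
    # Put the left operand in the destination unless it is already there.
    if lhs != dest:
        code.append(f"mov {dest}, {lhs}")

    if isinstance(rhs, int):
        # No immediate form of add: go through a temporary register.
        code.append(f"mov {temp}, {rhs}")
        code.append(f"add {dest}, {temp}")
    else:
        code.append(f"add {dest}, {rhs}")
    return code


# (set! x (+ 3 y)): the constant lands directly in the destination.
print(lower_add("x", 3, "y"))
# (set! x (+ y 3)): needs the temporary register.
print(lower_add("x", "y", 3))
```

The sketch ignores register allocation entirely; in the plan above, the allocator decides whether `x` can take the place of `y`, which is what would remove the extra `mov` in the second case.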
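
The stack access sequence works because subtracting `roff` and then adding it back inside the dereference cancels out. The following Python sketch models that arithmetic with 64-bit wraparound. The function name, the dictionary standing in for memory, and the concrete addresses are all made up for illustration and do not come from the compiler.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide, arithmetic wraps


def stack_load_via_goal_pointer(memory, rsp, roff, variable_offset):
    """Model the three-instruction stack load described in the list above.

    Hypothetical model: `memory` maps real addresses to values.
    """
    rdest = rsp                        # mov rdest, rsp
    rdest = (rdest - roff) & MASK64    # sub rdest, roff  (real -> GOAL pointer)
    # mov rdest, [rdest + roff + variable_offset]  (normal GOAL deref)
    address = (rdest + roff + variable_offset) & MASK64
    return memory[address]


# Made-up addresses: the value lives at rsp + 0x10 in real memory.
memory = {0x7FFD0010: 42}
value = stack_load_via_goal_pointer(
    memory, rsp=0x7FFD0000, roff=0x20000000, variable_offset=0x10
)
print(value)
```

The effective address is always `rsp + variable_offset`, which is why a direct `mov rdest, [rsp + variable_offset]` would be an equivalent, cheaper form once constant propagation can prove the access is to the stack.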