This gives some more optimization opportunities to LLVM, because it
understands these intrinsics. For example, it might convert
llvm.sqrt.f64 to llvm.sqrt.f32 if possible.
This changes the compiler from treating calls to sync/atomic.* functions
as special calls (emitted directly at the call site) to actually
defining their declarations when there is no Go SSA implementation. And
rely on the inliner to inline these very small functions.
This works a bit better in practice. For example, this makes it possible
to use these functions in deferred function calls.
This commit is a bit large because it also needs to refactor a few
things to make it possible to define such intrinsic functions.
This gives the optimizer a bit more information about what the calls do.
This should result in slightly better generated code.
Code size sometimes goes up and sometimes goes down. I blame the code
size going up on the inliner which inlines more functions, because
compiling the smoke tests in the drivers repository with -opt=1 results
in a slight code size reduction in all cases.
This replaces the custom runtime.memcpy and runtime.memmove functions
with calls to LLVM builtins that should hopefully allow LLVM to better
optimize such calls. They will be lowered to regular libc memcpy/memmove
when they can't be optimized away.
When testing this change with some smoke tests, I found that many smoke
tests resulted in slightly larger binary sizes with this commit applied.
I looked into it and it appears that machine.sendUSBPacket was not
inlined before while it is with this commit applied. Additionally, when
I compared all driver smoke tests with -opt=1 I saw that many were
reduced slightly in binary size and none increased in size.