This commit adds support for LLVM 16 and switches to it by default. That
means three LLVM versions are supported at the same time: LLVM 14, 15,
and 16.
This commit includes work by QuLogic:
* Part of this work was based on a PR by QuLogic:
https://github.com/tinygo-org/tinygo/pull/3649
But I also had parts of this already implemented in an old branch I
already made for LLVM 16.
* QuLogic also provided a CGo fix here, which is also incorporated in
this commit:
https://github.com/tinygo-org/tinygo/pull/3869
The difference with the original PR by QuLogic is that this commit is
more complete:
* It switches to LLVM 16 by default.
* It updates some things to also make it work with a self-built LLVM.
* It fixes the CGo bug in a slightly different way, and also fixes
another one not included in the original PR.
* It does not keep compiler tests passing on older LLVM versions. I
have found this to be quite burdensome and therefore don't generally
do this - the smoke tests should hopefully catch most regressions.
This gives a small improvement now, and is needed to be able to use the
Heap2Stack transform that's available in the Attributor pass. This
Heap2Stack transform could replace our custom OptimizeAllocs pass.
Most of the changes are just IR that changed, the actual change is
relatively small.
To give an example of why this is useful, here is the code size before
this change:
$ tinygo build -o test -size=short ./testdata/stdlib.go
code data bss | flash ram
95620 1812 968 | 97432 2780
$ tinygo build -o test -size=short ./testdata/stdlib.go
code data bss | flash ram
95380 1812 968 | 97192 2780
That's a 0.25% reduction. Not a whole lot, but nice for such a small
patch.
I found that when I enable ThinLTO, a miscompilation triggers that had
been hidden all the time previously. The bug appears to happen as
follows:
1. TinyGo generates a function with a runtime.trackPointer call, but
without an alloca (or the alloca gets optimized away).
2. LLVM sees that no alloca needs to be kept alive across the
runtime.trackPointer call, and therefore it adds the 'tail' flag.
One of the effects of this flag is that it makes it undefined
behavior to keep allocas alive across the call (which is still safe
at that point).
3. The GC lowering pass adds a stack slot alloca and converts
runtime.trackPointer calls into alloca stores.
The last step triggers the bug: the compiler inserts an alloca where
there was none before but that's not valid as long as the 'tail' flag is
present.
This patch fixes the bug in a somewhat dirty way, by always creating a
dummy alloca so that LLVM won't do the optimization in step 2 (and
possibly other optimizations that rely on there being no alloca
instruction).