This attribute is also set by Clang when it compiles C source files
(unless -fexceptions is set). The advantage is that no unwind tables are
emitted on Linux (and perhaps other systems). It also avoids
__aeabi_unwind_cpp_pr0 on ARM when using the musl libc.
This patch adds a new pragma for functions and globals to set the
section name. This can be useful to place a function or global in a
special device-specific section, for example:
* Functions may be placed in RAM to make them run faster, or in flash
(if RAM is the default) to not let them take up RAM.
* DMA memory may only be placed in a special memory area.
* Some RAM may be faster than other RAM, and some globals may be
performance critical, so placing them in this special RAM area can
help.
* Some (large) global variables may need to be placed in external RAM,
which can be done by placing them in a special section.
To use it, annotate the function or global with the name of the section
it should be placed in, for example:
    //go:section .externalram
    var externalRAMBuffer [1024]byte
This can then be placed in a special section of the linker script, for
example something like this:
    .bss.extram (NOLOAD) : {
        *(.externalram)
    } > ERAM
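The same pragma can be applied to a function. The following is a minimal sketch, assuming the pragma attaches to the declaration that immediately follows it, as in the global example above. The section name `.ramfuncs` and the function `checksum` are made up for illustration, and the linker script would need a matching entry for `.ramfuncs`. A toolchain that does not implement this pragma is assumed to treat the line as an ordinary comment.

```go
package main

import "fmt"

// checksum is a hypothetical hot function that we would like to run from RAM.
// The section name .ramfuncs is an assumption and must exist in the linker script.
//
//go:section .ramfuncs
func checksum(data []byte) uint32 {
	var sum uint32
	for _, b := range data {
		sum += uint32(b)
	}
	return sum
}

// externalRAMBuffer is the global from the commit message example.
//
//go:section .externalram
var externalRAMBuffer [1024]byte

func main() {
	externalRAMBuffer[0] = 1
	externalRAMBuffer[1] = 2
	fmt.Println(checksum(externalRAMBuffer[:]))
}
```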
This allows better escape analysis even without being able to see the
entire program. This makes the stack allocation test case more complete
but probably won't have much of an effect outside of that (as the
compiler is able to infer these attributes in the whole-program
functionattrs pass).
This commit optimizes string literals and globals by setting the
appropriate alignment and using a nil pointer in zero-length strings.
- Setting the alignment for string values has a surprisingly large
effect, up to around 2% in binary size. I suspect that LLVM will
pick some default alignment for larger byte arrays if no alignment
has been specified and forcing an alignment of 1 will pack all
strings closer together.
- Using nil for zero-length strings also has a positive effect, but
I'm not sure why. Perhaps it makes some optimizations more trivial.
- Always setting the alignment on globals improves code size slightly,
probably for the same reasons setting the alignment of string
literals improves code size. The effect is much smaller, however.
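The packing hypothesis above can be illustrated with a little arithmetic. This is only a sketch of the suspected effect, not a model of what LLVM actually does: `packedSize` and the literal lengths are hypothetical, and the alignment of 8 is an arbitrary stand-in for whatever default might be chosen.

```go
package main

import "fmt"

// packedSize returns the end offset after laying out objects of the given
// sizes one after another, each starting at a multiple of align.
func packedSize(sizes []int, align int) int {
	offset := 0
	for _, size := range sizes {
		// Round the start offset up to the next multiple of align.
		offset = (offset + align - 1) / align * align
		offset += size
	}
	return offset
}

func main() {
	// Lengths of three hypothetical string literals.
	literals := []int{3, 5, 1}
	fmt.Println(packedSize(literals, 1)) // 9: no padding between strings
	fmt.Println(packedSize(literals, 8)) // 17: padding before the 2nd and 3rd
}
```

With an alignment of 1 the three literals occupy exactly the sum of their lengths, while a larger alignment inserts padding between them.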
This commit might also affect performance. If it does, that should be
tested separately, and such a large win in binary size should
definitely not be ignored for small embedded systems.
This commit switches from the previous behavior of compiling the whole
program at once, to compiling every package in parallel and linking the
LLVM bitcode files together for further whole-program optimization.
This is a small performance win, but it has several advantages in the
future:
- There are many more things that can be done per package in parallel,
avoiding the bottleneck at the end of the compiler phase. This
should speed up the compiler further.
- This change is a necessary step towards a non-LTO build mode for
fast incremental builds that only rebuild the changed package, when
compiler speed is more important than binary size.
- This change refactors the compiler in such a way that it will be
easier to inspect the IR for one package only. Inspecting this IR
will be very helpful for compiler developers.
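The overall shape of "compile every package in parallel, then link" can be sketched as follows. This is an illustration only, not the compiler's actual code: `compileAll` and the `compile` callback are hypothetical names, and the callback stands in for producing one LLVM bitcode file per package.

```go
package main

import (
	"fmt"
	"sync"
)

// compileAll runs compile for every package concurrently and returns the
// results in the same order as pkgs, ready to be linked together.
func compileAll(pkgs []string, compile func(pkg string) string) []string {
	results := make([]string, len(pkgs))
	var wg sync.WaitGroup
	for i, pkg := range pkgs {
		wg.Add(1)
		go func(i int, pkg string) {
			defer wg.Done()
			// Each goroutine writes only to its own slot, so no lock is needed.
			results[i] = compile(pkg)
		}(i, pkg)
	}
	wg.Wait()
	return results
}

func main() {
	// Stand-in for compiling a package to a bitcode file.
	compile := func(pkg string) string { return pkg + ".bc" }
	modules := compileAll([]string{"runtime", "fmt", "main"}, compile)
	fmt.Println(modules) // [runtime.bc fmt.bc main.bc]
}
```

Keeping the results in input order makes the later link step deterministic regardless of which package finishes first.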
The SimpleDCE pass was previously used to only compile the parts of the
program that were in use. However, lately the only real purpose has been
to speed up the compiler a bit by only compiling the necessary
functions.
This pass, however, is a problem for compiling (and caching) packages in
parallel. Therefore, this commit removes it as a preparatory step
towards that goal.
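For context, a pass of this kind needs the whole call graph to decide what is in use, which is why it gets in the way of per-package compilation. The following is a generic reachability sketch under that assumption, not the actual SimpleDCE implementation; `reachable` and the call graph contents are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// reachable returns the names of all functions reachable from root in the
// given call graph, sorted alphabetically.
func reachable(calls map[string][]string, root string) []string {
	seen := map[string]bool{root: true}
	worklist := []string{root}
	for len(worklist) > 0 {
		// Pop the last function from the worklist.
		fn := worklist[len(worklist)-1]
		worklist = worklist[:len(worklist)-1]
		for _, callee := range calls[fn] {
			if !seen[callee] {
				seen[callee] = true
				worklist = append(worklist, callee)
			}
		}
	}
	names := make([]string, 0, len(seen))
	for name := range seen {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	calls := map[string][]string{
		"main":   {"println"},
		"unused": {"println", "helper"},
	}
	// Only functions in this list would need to be compiled.
	fmt.Println(reachable(calls, "main")) // [main println]
}
```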
This doesn't yet add support for actually making use of variadic
functions, but at least allows (unintended) variadic functions like the
following to work:
    void foo();
Moving settings to a separate config struct has two benefits:
- It decouples the compiler a bit from other packages, most
importantly the compileopts package. Decoupling is generally a good
thing.
- Perhaps more importantly, it precisely specifies which settings are
used while compiling and affect the resulting LLVM module. This will
be necessary for caching the LLVM module.
While it would have been possible to cache without this refactor, it
would have been very easy to miss a setting and thus let the
compiler work with invalid/stale data.
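One way such a config struct could feed a cache is by hashing it into a key, so that any change in a setting yields a different cache entry. This is a sketch under assumptions: the `Config` fields and the `cacheKey` helper below are hypothetical and do not reflect the real struct or the real caching scheme.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// Config is a hypothetical set of settings that affect the compiled module.
// The field names are illustrative only.
type Config struct {
	Triple string
	GOOS   string
	GOARCH string
	Debug  bool
}

// cacheKey derives a stable key from every field in the config, so a
// setting cannot be forgotten as long as it lives in the struct.
func cacheKey(c Config) (string, error) {
	data, err := json.Marshal(c)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:]), nil
}

func main() {
	release := Config{Triple: "thumbv7em-none-eabi", GOOS: "linux", GOARCH: "arm"}
	debug := release
	debug.Debug = true

	k1, _ := cacheKey(release)
	k2, _ := cacheKey(debug)
	fmt.Println(k1 != k2) // true: changing a setting changes the key
}
```

Because the key is computed from the whole struct, adding a new setting to the struct automatically makes it part of the key.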
This is a small refactor to move code away from compiler.Compile,
with the goal that compiler.Compile will eventually be removed
entirely in favor of compiler.CompilePackage.
This package has long made the design of the compiler more complicated
than it needs to be. Previously this package implemented several
optimization passes, but those passes have since moved to work directly
with LLVM IR instead of Go SSA. The only remaining pass is the SimpleDCE
pass.
This commit removes the *ir.Function type that permeated the whole
compiler and instead switches to use *ssa.Function directly. The
SimpleDCE pass is kept but is far less tightly coupled to the rest of
the compiler so that it can easily be removed once the switch to
building and caching packages individually happens.
This commit merges NewCompiler and Compile into one, simplifying the
external interface. More importantly, it does away with the entire
Compiler object so the public API becomes a lot smaller.
The refactor is not complete: eventually, the compiler should just
compile a single package without trying to load it first (that should be
done by the builder package).
This is a fairly big commit, but it actually changes very little.
getValue should really be a property of the builder (or frame), where
the previously created instructions are kept.
This results in a link error in the following commit (undefined
reference to runtime.trackedGlobalsBitmap from .debug_info). Solution:
don't emit debug info for declared but not defined symbols.
This is part of a larger refactor that tries to shrink the ir package
and in general tries to shrink the amount of state that is kept around
in the compiler. The end goal is being able to compile packages
independent of each other, linking them together in a later stage. Along
the way, it cleans up lots of old cruft that has accumulated over the
months.
This refactor also results in globals being loaded lazily. This may be a
problem for some specific programs but will probably change back in a
commit in the near future.