machine/stm32, nrf: implement machine.Flash
Implements the machine.Flash interface using the same definition as the tinyfs BlockDevice.
This implementation covers the stm32f4, stm32l4, stm32wlx, nrf51, nrf52, and nrf528xx processors.
Blocking inside an interrupt is always unsafe and will lead to all kinds
of bad behavior (even if it might appear to work sometimes). So disallow
it, just like allocating heap memory inside an interrupt is not allowed.
I suspect this will usually be caused by channel sends, like this:
ch <- someValue
The easy workaround is to make it a non-blocking send instead:
select {
case ch <- someValue:
default:
}
This does mean the application might become a bit more complex to be
able to deal with this case, but the alternative (undefined behavior) is
IMHO much worse.
Some vector registers must be preserved across calls, but this wasn't
happening on Linux and MacOS. When I added support for windows/arm64, I
saw that it required these vector registers to be preserved and assumed
this was Windows deviating from the standard calling convention. But
actually, Windows was just implementing the standard calling convention
and the bug was on Linux and MacOS.
This commit fixes the bug on Linux and MacOS and at the same time merges
the Go and assembly files as they no longer need to be separate.
This is a big commit that changes the way runtime type information is stored in
the binary. Instead of compressing it and storing it in a number of sidetables,
it is stored similar to how the Go compiler toolchain stores it (but still more
compactly).
This has a number of advantages:
* It is much easier to add new features to reflect support. They can simply
be added to these structs without requiring massive changes (especially in
the reflect lowering pass).
* It removes the reflect lowering pass, which was a large amount of hard to
understand and debug code.
* The reflect lowering pass also required merging all LLVM IR into one
module, which is terrible for performance especially when compiling large
amounts of code. See issue 2870 for details.
* It is (probably!) easier to reason about for the compiler.
The downside is that it increases code size a bit, especially when reflect is
involved. I hope to fix some of that in later patches.
This was actually surprising once I got TinyGo to build on Windows 11
ARM64. All the changes are exactly what you'd expect for a new
architecture, there was no special weirdness just for arm64.
Actually getting TinyGo to build was kind of involved though. The very
short summary is: install arm64 versions of some pieces of software
(like golang, cmake) instead of installing them though choco. In
particular, use the llvm-mingw[1] toolchain instead of using standard
mingw.
[1]: https://github.com/mstorsjo/llvm-mingw/releases
I found that when I enable ThinLTO, a miscompilation triggers that had
been hidden all the time previously. The bug appears to happen as
follows:
1. TinyGo generates a function with a runtime.trackPointer call, but
without an alloca (or the alloca gets optimized away).
2. LLVM sees that no alloca needs to be kept alive across the
runtime.trackPointer call, and therefore it adds the 'tail' flag.
One of the effects of this flag is that it makes it undefined
behavior to keep allocas alive across the call (which is still safe
at that point).
3. The GC lowering pass adds a stack slot alloca and converts
runtime.trackPointer calls into alloca stores.
The last step triggers the bug: the compiler inserts an alloca where
there was none before but that's not valid as long as the 'tail' flag is
present.
This patch fixes the bug in a somewhat dirty way, by always creating a
dummy alloca so that LLVM won't do the optimization in step 2 (and
possibly other optimizations that rely on there being no alloca
instruction).