This commit adds support for LLVM 16 and switches to it by default. That
means three LLVM versions are supported at the same time: LLVM 14, 15,
and 16.
This commit includes work by QuLogic:
* Part of this work was based on a PR by QuLogic:
https://github.com/tinygo-org/tinygo/pull/3649
But I also had parts of this already implemented in an old branch I
had made earlier for LLVM 16.
* QuLogic also provided a CGo fix here, which is also incorporated in
this commit:
https://github.com/tinygo-org/tinygo/pull/3869
The difference with the original PR by QuLogic is that this commit is
more complete:
* It switches to LLVM 16 by default.
* It updates some things to also make it work with a self-built LLVM.
* It fixes the CGo bug in a slightly different way, and also fixes
another one not included in the original PR.
* It does not keep compiler tests passing on older LLVM versions. I
have found this to be quite burdensome and therefore don't generally
do this - the smoke tests should hopefully catch most regressions.
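For anyone who needs to build the tinygo binary against one of the
older supported LLVM versions, build tags should still do the trick. I
would expect something along these lines (the exact tag name is an
assumption, by analogy with the existing -tags=llvm13 build tag):
$ go install -tags=llvm15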
This adds true GOOS=wasip1 support in addition to our existing
-target=wasi support. The old support for WASI isn't removed, but should
be treated as deprecated and will likely be removed eventually to reduce
the test burden.
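For example, building for the new target follows the usual Go
conventions (a usage sketch; ./mypackage is a placeholder):
$ GOOS=wasip1 GOARCH=wasm tinygo build -o main.wasm ./mypackage
The deprecated route remains available for now:
$ tinygo build -target=wasi -o main.wasm ./mypackage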
It looks like this on my system, for example:
{
  "target": {
    "llvm-target": "aarch64-unknown-linux",
    "cpu": "generic",
    "features": "+neon",
    "goos": "linux",
    "goarch": "arm64",
    "build-tags": [
      "linux",
      "arm64"
    ],
    "gc": "precise",
    "scheduler": "tasks",
    "linker": "ld.lld",
    "rtlib": "compiler-rt",
    "libc": "musl",
    "default-stack-size": 65536,
    "ldflags": [
      "--gc-sections"
    ],
    "extra-files": [
      "src/runtime/asm_arm64.S",
      "src/internal/task/task_stack_arm64.S"
    ],
    "gdb": [
      "gdb"
    ],
    "flash-1200-bps-reset": "false"
  },
  "goroot": "/home/ayke/.cache/tinygo/goroot-23c311bcaa05f188affa3c42310aba343acc82562d5e5f04dea9d5b79ac35f7e",
  "goos": "linux",
  "goarch": "arm64",
  "goarm": "6",
  ...
}
This can be very useful while working on the automatically generated
target object for example (in my case, GOOS=wasip1).
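For reference, the dump above comes from the info subcommand's JSON
output; I believe the invocation is along these lines (the -json flag
spelling is taken from current TinyGo, not from this message):
$ GOOS=linux GOARCH=arm64 tinygo info -json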
Browsers previously didn't support the WebAssembly i64 type, so we had
to work around that limitation by converting the LLVM i64 type to
something else. Some people used a pair of i32 values, but we used a
pointer to a stack-allocated i64.
Now however, all major browsers and Node.js do support WebAssembly
BigInt integration so that i64 values can be passed back and forth
between WebAssembly and JavaScript easily. Therefore, I think the time
has come to drop support for this workaround.
For more information: https://v8.dev/features/wasm-bigint (note that
TinyGo has used a slightly different way of passing i64 values between
JS and Wasm).
For information on browser support: https://webassembly.org/roadmap/
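To illustrate what this means for user code (a sketch of my own, not
code from this change): an exported function taking or returning int64
can now be called from JavaScript with BigInt values directly, instead
of going through a pointer to a stack-allocated i64.

package main

// add64 is exported to the WebAssembly module. With BigInt
// integration, JavaScript passes both i64 parameters and reads the
// i64 result as BigInt values, with no pointer indirection.
//export add64
func add64(a, b int64) int64 {
	return a + b
}

func main() {}

On the JavaScript side a call then ends up looking like
instance.exports.add64(2n, 3n), which evaluates to 5n.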
Some vector registers must be preserved across calls, but this wasn't
happening on Linux and MacOS. When I added support for windows/arm64, I
saw that it required these vector registers to be preserved and assumed
this was Windows deviating from the standard calling convention. But
actually, Windows was just implementing the standard calling convention
and the bug was on Linux and MacOS.
This commit fixes the bug on Linux and MacOS and at the same time merges
the Go and assembly files as they no longer need to be separate.
This was actually surprising once I got TinyGo to build on Windows 11
ARM64. All the changes are exactly what you'd expect for a new
architecture; there was no special weirdness just for arm64.
Actually getting TinyGo to build was kind of involved though. The very
short summary is: install arm64 versions of some pieces of software
(like golang, cmake) instead of installing them through choco. In
particular, use the llvm-mingw[1] toolchain instead of using standard
mingw.
[1]: https://github.com/mstorsjo/llvm-mingw/releases
This implements the block-based GC as a partially precise GC. This means
that for most heap allocations it is known which words contain a pointer
and which don't. This should in theory make the GC faster (because it
can skip non-pointer objects) and have fewer false positives in a GC
cycle. It does however use a bit more RAM to store the layout of each
object.
Right now this GC seems to be slower than the conservative GC, but
should be less likely to run out of memory as a result of false
positives.
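Conceptually, precise scanning boils down to something like the sketch
below (an illustration only, not the actual GC code): every object gets
a layout bitmap with one bit per word, and only words whose bit is set
are treated as potential pointers.

// scanObject is an illustrative sketch, not the real implementation.
// layout has one bit per word of the object; a set bit means the
// corresponding word may contain a pointer.
func scanObject(words []uintptr, layout uintptr, mark func(uintptr)) {
	for i, w := range words {
		if layout&(uintptr(1)<<i) != 0 {
			mark(w) // possible pointer: follow it
		}
		// Words with a cleared bit are skipped entirely, which is
		// where the speedup and the fewer false positives come from.
	}
}

The per-object layouts are also the extra RAM cost mentioned above.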
Before this patch, `tinygo lldb path/to/package` would result in an
error:
(lldb) target create --arch=arm64-unknown-macosx10.12.0 "/var/folders/17/btmpymwj0wv6n50cmslwyr900000gn/T/tinygo2731663853/main"
error: the specified architecture 'arm64-unknown-macosx10.12.0' is not compatible with 'arm64-apple-macosx10.12.0' in '/var/folders/17/btmpymwj0wv6n50cmslwyr900000gn/T/tinygo2731663853/main'
This patch fixes this error.
Unfortunately, it doesn't get debug information to work yet. I still
haven't figured out what's going wrong here. But it's progress, I guess.
This flag is necessary in LLVM 15 because it appears that LLVM 15 has
changed the default target ABI from lp64 to lp64d. This results in a
linker failure. Setting the "target-abi" forces the RISC-V backend to
use the intended target ABI.
This prefix isn't actually used and only adds noise, so remove it.
It may have been useful on Linux, which makes a distinction between
/dev/ttyACM* and /dev/ttyUSB*, but it isn't anymore. Also, it's unlikely that
the same vid/pid pair will be shared between an acm and usb driver
anyway.
This is just a papercut, and not really something important. But I
noticed something weird:
$ GOOS=windows GOARCH=arm tinygo info ""
LLVM triple: armv7-unknown-windows-gnueabihf-gnu
GOOS: windows
GOARCH: arm
That -gnueabihf-gnu ending is weird; it should pick one of the two. I've
fixed it as follows:
$ GOOS=windows GOARCH=arm tinygo info ""
LLVM triple: armv7-unknown-windows-gnu
GOOS: windows
GOARCH: arm
[...]
We're probably never going to support windows/arm (this is 32-bit arm,
not arm64) so it doesn't really matter which one we pick. And this patch
shouldn't affect any other system.
This matches the flash-command and is generally a bit easier to work
with.
This commit also prepares for allowing multiple formats to be used in
the emulator command, which is necessary for the esp32.
Switch over to LLVM 14 for static builds. Keep using LLVM 13 for regular
builds for now.
This uses a branch of the upstream Espressif branch to fix an issue,
see: https://github.com/espressif/llvm-project/pull/59
This means that it will be possible to generate a Darwin binary on any
platform (Windows, Linux, and MacOS itself), including binaries that
use CGo. The resulting binaries can of course only run on MacOS.
The binary links against libSystem.dylib, which is a shared library. The
macos-minimal-sdk repository contains open source header files and
generated symbol stubs so we can generate a stub libSystem.dylib without
copying any closed source code.
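In practice a cross build from, say, Linux should look like this (a
usage sketch; ./mypackage is a placeholder):
$ GOOS=darwin GOARCH=arm64 tinygo build -o hello ./mypackage
and the same goes for GOARCH=amd64.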
Some clang builds (e.g., Fedora's) enable unwind tables by default. As
tinygo neither needs nor supports them, this leads to undefined symbols
when linking.
This adds support for building with `-tags=llvm13` and switches to LLVM
13 for tinygo binaries that are statically linked against LLVM.
Some notes on this commit:
* Added `-mfloat-abi=soft` to all Cortex-M targets because otherwise
nrfx would complain that floating point was enabled on Cortex-M0.
That's not the case, but with `-mfloat-abi=soft` the `__SOFTFP__`
macro is defined which silences this warning.
See: https://reviews.llvm.org/D100372
* Changed from `--sysroot=<root>` to `-nostdlib -isystem <root>` for
musl because with Clang 13, even with `--sysroot` some system
libraries are used which we don't want.
* Changed all `-Xclang -internal-isystem -Xclang` to simply
`-isystem`, for consistency with the above change. It appears to
have the same effect.
* Moved WebAssembly function declarations to the top of the file in
task_asyncify_wasm.S because (apparently) the assembler has become
more strict.
This environment variable can be set to 5, 6, or 7 and controls which
ARM version (ARMv5, ARMv6, ARMv7) is used when compiling for GOARCH=arm.
I have picked the default value ARMv6, which I believe is supported on
most common single board computers including all Raspberry Pis. The
difference in code size is pretty big.
We could even go further and support ARMv4 if anybody is interested. It
should be pretty simple to add this if needed.
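Usage presumably mirrors the main Go toolchain, assuming this is the
GOARM variable as the values suggest (a sketch; ./mypackage is a
placeholder):
$ GOOS=linux GOARCH=arm GOARM=7 tinygo build -o app ./mypackage
Leaving it unset picks the new ARMv6 default.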
This change implements a new "scheduler" for WebAssembly using binaryen's asyncify transform.
This is more reliable than the current "coroutines" transform, and works with non-Go code in the call stack.
runtime (js/wasm): handle scheduler nesting
If WASM calls into JS which calls back into WASM, it is possible for the scheduler to nest.
The event from the callback must be handled immediately, so the task cannot simply be deferred to the outer scheduler.
This creates a minimal scheduler loop which is used to handle such nesting.
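A minimal sketch of the nesting scenario using plain syscall/js code
(not code from this change):

package main

import "syscall/js"

func main() {
	cb := js.FuncOf(func(this js.Value, args []js.Value) interface{} {
		ch := make(chan int, 1)
		go func() { ch <- 1 }()
		// This receive blocks inside a JS->Wasm callback, so the
		// scheduler has to run nested within the outer scheduler
		// to make progress.
		return <-ch
	})
	js.Global().Call("setTimeout", cb, 0)
	select {} // keep main alive so the callback can run
}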
This makes sure that the LLVM target features match the ones generated
by Clang:
- This fixes a bug introduced when setting the target CPU for all
targets: Cortex-M4 would now start using floating point operations
while they were disabled in C.
- This will make it possible in the future to inline C functions in Go
and vice versa. This will need some more work though.
There is a code size impact. Cortex-M4 targets increase slightly in
binary size while Cortex-M0 targets tend to shrink a little bit. Other
than that, there is little impact.
With this fix, `cflags` in the target JSON files is correctly ordered.
Previously, the cflags of a parent JSON file would come after the ones
in the child JSON file, which makes it hard to override properties in
the child JSON file.
Specifically, this fixes the case where targets/riscv32.json sets
`-march=rv32imac` and targets/esp32c3.json wants to override this using
`-march=rv32imc` but can't do this because its `-march` comes before the
riscv32.json one.
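For illustration, a simplified parent/child pair like the following
(heavily trimmed, not the real files; "inherits" is how target JSON
files chain):
targets/riscv32.json (simplified):
{
  "cflags": ["-march=rv32imac"]
}
targets/esp32c3.json (simplified):
{
  "inherits": ["riscv32"],
  "cflags": ["-march=rv32imc"]
}
now yields the cflags in parent-then-child order, so the child's
-march=rv32imc comes last and wins.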
The target triples mostly have to match to be able to link LLVM
modules. Linking LLVM modules is already possible (the triples already
match well enough), but testing becomes much easier when they match
exactly.
For macOS, I picked "macosx10.12.0". That's an old and unsupported
version, but I had to pick _something_. Clang by default uses
"macos10.4.0", which is much older.
This commit adds support for musl-libc and uses it by default on Linux.
The main benefit of it is that binaries are always statically linked
instead of depending on the host libc, even when using CGo.
Advantages:
- The resulting binaries are always statically linked.
- No need for any tools on the host OS, like a compiler, linker, or
libc in a release build of TinyGo.
- This also simplifies cross compilation as no cross compiler is
needed (it's all built into the TinyGo release build).
Disadvantages:
- Binary size increases by 5-6 kilobytes if -no-debug is used. Binary
size increases by a much larger margin when debugging symbols are
included (the default behavior) because musl is built with debugging
symbols enabled.
- Musl does things a bit differently than glibc, and some CGo code
might rely on the glibc behavior.
- The first build takes a bit longer because musl needs to be built.
As an additional bonus, time is now obtained from the system in a way
that fixes the Y2038 problem, because musl has been a bit more
aggressive in switching to a 64-bit time_t.
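A quick way to check the static linking (a sketch; ./mypackage is a
placeholder, and the second line is the usual ldd output for a static
binary):
$ tinygo build -o hello ./mypackage
$ ldd hello
        not a dynamic executable
This holds even when the package uses CGo.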
This is for consistency with Clang, which always adds a CPU flag even if
it's not specified in CFLAGS.
This commit also adds some tests to make sure the Clang target-cpu
matches the CPU property in the JSON files.
This does have an effect on the generated binaries. The effect is very
small though: on average just 0.2% increase in binary size, apparently
because Cortex-M3 and Cortex-M4 are compiled a bit differently. However,
when rebased on top of https://github.com/tinygo-org/tinygo/pull/2218
(minsize), the difference drops to -0.1% (a slight decrease on average).
It is better to use environment variables (GOOS and GOARCH) for
consistency instead of providing two slightly incompatible ways. This
-target flag should only be used to specify a .json file (either
directly or in the TinyGo targets directory). Previously it was possible
to specify the LLVM target as well but that was never really fully
supported.
So:
- To specify a different OS/arch like you would in regular Go, use
GOOS and GOARCH.
- To specify a microcontroller chip or board, use the -target flag.
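Concretely (a usage sketch; microbit is just one example of a board
name and ./mypackage is a placeholder):
$ GOOS=linux GOARCH=arm64 tinygo build -o app ./mypackage
$ tinygo build -target=microbit -o app.hex ./mypackage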
Also remove the old `os.Setenv` which might have had a purpose long ago
but doesn't have a purpose now.
... instead of setting a special -target= value. This is more robust and
makes sure that the test actually tests different architectures as they
would be compiled by TinyGo. As an example, the bug fixed in the
previous commit ("arm: use armv7 instead of thumbv7") would have been
caught if this change had been applied earlier.
I've decided to put GOOS/GOARCH in compileopts.Options, as it makes
sense to me to treat them the same way as command line parameters.
At the moment, thumbv7 is crashing. I'm not exactly sure why, but it
appears that there is an unknown instruction in __aeabi_uldivmod
(probably from libgcc).
I've fixed this by switching to armv7, which is also somewhat modern.
Maybe we can switch back to Thumb2 (aka thumbv7) once we start using
musl and compiler-rt. In the meantime, this does fix a miscompilation
(illegal instruction).
This commit changes a target triple like "armv6m-none-eabi" to
"armv6m-unknown-unknown-eabi". The reason is that while the former is
correctly parsed in Clang (due to normalization), it wasn't parsed
correctly in LLVM, meaning that the environment wasn't set to EABI.
This change normalizes all target triples and uses the EABI environment
(-eabi in the triple) for Cortex-M targets.
This change also drops the `--target=` flag in the target JSON files,
the flag is now added implicitly in `(*compileopts.Config).CFlags()`.
This removes some duplication in target JSON files.
Unfortunately, this change also increases code size for Cortex-M
targets. It looks like LLVM now emits calls like __aeabi_memmove instead
of memmove, which pull in slightly more code (they basically just call
the regular C functions), and the calls themselves don't seem to be as
efficient as they could be. Perhaps this is an LLVM bug that will be
fixed in the future, as this is a very common occurrence.
This reduces binary size substantially, for two reasons:
- It switches to a much more modern architecture (ARMv7 vs ARMv4).
- It switches to Thumb2, which is a lot denser than regular ARM.
Practically all modern and not-so-modern ARM chips support Thumb2, so
this seems like a safe change to me.
The size in numbers:
- Code size for testdata/stdlib.go is reduced by about 35%.
- Binary size for testdata/stdlib.go (when compiling with -no-debug to
strip debug information) is reduced by about 16%.
Previously we used the i386 target, probably with all optional features
disabled. However, the Pentium 4 was released a _long_ time ago and it
seems reasonable to me to take that as a minimum requirement.
Upstream Go now also seems to be moving in this direction:
https://github.com/golang/go/issues/40255
The main motivation for this is that there were floating point issues
when running the tests for the math package:
GOARCH=386 tinygo test math
I haven't investigated what the issue is, but I strongly suspect it's
caused by the weird x87 80-bit floating point format. This could perhaps
be fixed in a different way (by setting the FPU precision to 64 bits)
but I figured that just setting the minimum requirement to the Pentium 4
would probably be fine. If needed, we can respect the GO386 environment
variable to support these very old CPUs.
To support this newer CPU, I had to make sure that the stack is aligned
to 16 bytes everywhere. This was not yet always the case.
This can be very useful for some purposes:
* It makes it possible to disable the UART in cases where it is not
needed or needs to be disabled to conserve power.
* It makes it possible to disable the serial output to reduce code
size, which may be important for some chips. Sometimes, a few kB can
be saved this way.
* It makes it possible to override the default; for example, you might
want to use an actual UART to debug the USB-CDC implementation.
It also lowers the dependency on having machine.Serial defined, which is
often not defined when targeting a chip. Eventually, we might want to
make it possible to write `-target=nrf52` or `-target=atmega328p` for
example to target the chip itself with no board specific assumptions.
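As an illustration (the -serial flag spelling and its values are taken
from current TinyGo releases, they aren't spelled out in this message;
pca10040 is just an example board):
$ tinygo flash -target=pca10040 -serial=none ./examples/blinky1
$ tinygo flash -target=pca10040 -serial=uart ./examples/blinky1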
The defaults don't change. I checked this by running `make smoketest`
before and after and comparing the results.