This change implements a new "scheduler" for WebAssembly using binaryen's asyncify transform.
This is more reliable than the current "coroutines" transform, and works with non-Go code in the call stack.
runtime (js/wasm): handle scheduler nesting
If WASM calls into JS which calls back into WASM, it is possible for the scheduler to nest.
The event from the callback must be handled immediately, so the task cannot simply be deferred to the outer scheduler.
This creates a minimal scheduler loop which is used to handle such nesting.
Instead of doing everything in the interrupt lowering pass, generate
some more code in gen-device to declare interrupt handler functions and
do some work in the compiler so that interrupt lowering becomes a lot
simpler.
This has several benefits:
- Overall code is smaller, in particular the interrupt lowering pass.
- The code should be a bit less "magical" and instead a bit easier to
read. In particular, instead of having a magic
runtime.callInterruptHandler (that is fully written by the interrupt
lowering pass), the runtime calls a generated function like
device/sifive.InterruptHandler where this switch already exists in
code.
- Debug information is improved. This can be helpful during actual
debugging but is also useful for other uses of DWARF debug
information.
For an example on debug information improvement, this is what a
backtrace might look like before this commit:
Breakpoint 1, 0x00000b46 in UART0_IRQHandler ()
(gdb) bt
#0 0x00000b46 in UART0_IRQHandler ()
#1 <signal handler called>
[..etc]
Notice that the debugger doesn't see the source code location where it
has stopped.
After this commit, breaking at the same line might look like this:
Breakpoint 1, (*machine.UART).handleInterrupt (arg1=..., uart=<optimized out>) at /home/ayke/src/github.com/tinygo-org/tinygo/src/machine/machine_nrf.go:200
200 uart.Receive(byte(nrf.UART0.RXD.Get()))
(gdb) bt
#0 (*machine.UART).handleInterrupt (arg1=..., uart=<optimized out>) at /home/ayke/src/github.com/tinygo-org/tinygo/src/machine/machine_nrf.go:200
#1 UART0_IRQHandler () at /home/ayke/src/github.com/tinygo-org/tinygo/src/device/nrf/nrf51.go:176
#2 <signal handler called>
[..etc]
By now, the debugger sees an actual source location for UART0_IRQHandler
(in the generated file) and an inlined function.
This generally means that code size is reduced, especially when the os
package is not imported.
Specifically:
- On Linux (which currently statically links musl), it avoids calling
malloc, which avoids including the musl C heap for small programs
saving around 1.6kB.
- On WASI, it avoids initializing the args slice when the os package
is not used. This reduces binary size by around 1kB.
This commit adds support for musl-libc and uses it by default on Linux.
The main benefit of it is that binaries are always statically linked
instead of depending on the host libc, even when using CGo.
Advantages:
- The resulting binaries are always statically linked.
- No need for any tools on the host OS, like a compiler, linker, or
libc in a release build of TinyGo.
- This also simplifies cross compilation as no cross compiler is
needed (it's all built into the TinyGo release build).
Disadvantages:
- Binary size increases by 5-6 kilobytes if -no-debug is used. Binary
size increases by a much larger margin when debugging symbols are
included (the default behavior) because musl is built with debugging
symbols enabled.
- Musl does things a bit differently than glibc, and some CGo code
might rely on the glibc behavior.
- The first build takes a bit longer because musl needs to be built.
As an additional bonus, time is now obtained from the system in a way
that fixes the Y2038 problem because musl has been a bit more agressive
in switching to 64-bit time_t.
This layout parameter is currently always nil and ignored, but will
eventually contain a pointer to a memory layout.
This commit also adds module verification to the transform tests, as I
found out that it didn't (and therefore didn't initially catch all
bugs).
This commit simplifies the IR a little bit: instead of calling
pseudo-functions runtime.interfaceImplements and
runtime.interfaceMethod, real declared functions are being called that
are then defined in the interface lowering pass. This should simplify
the interaction between various transformation passes. It also reduces
the number of lines of code, which is generally a good thing.
The division and remainder operations were lowered directly to LLVM IR.
This is wrong however because the Go specification defines exactly what
happens on a divide by zero or signed integer overflow and LLVM IR
itself treats those cases as undefined behavior. Therefore, this commit
implements divide by zero and signed integer overflow according to the
Go specification.
This does have an impact on the generated code, but it is surprisingly
small. I've used the drivers repo to test the code before and after, and
to my surprise most driver smoke tests are not changed at all. Those
that are, have only a small increase in code size. At the same time,
this change makes TinyGo more compliant to the Go specification.
This adds support for stdio in picolibc and fixes wasm_exec.js so that
it can also support C puts. With this, C stdout works on all supported
platforms.
This chip can run so much faster! Let's update the default frequency.
Also, change the UART implementation to be more fexible regarding the
clock frequency.
There were a few issues that were causing qemu-system-arm and
qemu-system-riscv to give the wrong exit codes. They are in fact capable
of exiting with 0 or 1 signalled from the running application, but this
functionality wasn't used. This commit changes this in the following
ways:
* It fixes SemiHosting codes, which were incorrectly written in
decimal while they should have been written in hexadecimal (oops!).
* It modifies all the baremetal main functions (aka reset handlers) to
exit with `exit(0)` instead of `abort()`.
* It changes `syscall.Exit` to call `exit(code)` instead of `abort()`
on baremetal targets.
* It adds these new exit functions where necessary, implemented in a
way that signals the correct exit status if running under QEMU.
All in all, this means that `tinygo test` doesn't have to look at the
output of a test to determine the outcome. It can simply look at the
exit code.
This is necessary to support the ESP32-C3, which lacks the A (atomic)
extension and thus requires these 32-bit atomic operations.
With this commit, flashing ./testdata/atomic.go to the ESP32-C3 works
correctly and produces the expected output on the serial console.
This change adds support for the ESP32-C3, a new chip from Espressif. It
is a RISC-V core so porting was comparatively easy.
Most peripherals are shared with the (original) ESP32 chip, but with
subtle differences. Also, the SVD file I've used gives some
peripherals/registers a different name which makes sharing code harder.
Eventually, when an official SVD file for the ESP32 is released, I
expect that a lot of code can be shared between the two chips.
More information: https://www.espressif.com/en/products/socs/esp32-c3
TODO:
- stack scheduler
- interrupts
- most peripherals (SPI, I2C, PWM, etc)
At startup, a large chunk of virtual memory is used up by the heap. This
works fine in emulation (qemu-arm), but doesn't work so well on an
actual Raspberry Pi. Therefore, this commit reduces the requested amount
until a heap size is found that works on the system.
This can certainly be improved, but for now it's an important fix
because it allows TinyGo built binaries to actually run on a Raspberry
Pi with just 1GB RAM.
This allows the assembly routines in these files to be stripped as dead
code if they're not referenced. This solves the link issues on MacOS
when the `leaking` garbage collector or the `coroutines` scheduler
are selected.
Fixes#2081
heapptr is assinged to heapStart (which is 0) when it's declared, but preinit()
may have moved the heap somewhere else. Set heapptr to the proper value
of heapStart when we initialize the heap properly.
This allows the leaking allocator to work on unix.
This is mainly useful to be able to run `tinygo test`, for example:
tinygo test -target=cortex-m-qemu -v math
This is not currently supported, but will be in the future.
This makes them more flexible, especially with Go 1.17 making the
situation more complicated (see
1d20a362d0).
It also makes it possible to do the same for many other functions, such
as assembly implementations of cryptographich functions which are
similarly dependent on the architecture.
Previously we used the i386 target, probably with all optional features
disabled. However, the Pentium 4 has been released a _long_ time ago and
it seems reasonable to me to take that as a minimum requirement.
Upstream Go now also seems to move in this direction:
https://github.com/golang/go/issues/40255
The main motivation for this is that there were floating point issues
when running the tests for the math package:
GOARCH=386 tinygo test math
I haven't investigated what's the issue, but I strongly suspect it's
caused by the weird x87 80-bit floating point format. This could perhaps
be fixed in a different way (by setting the FPU precision to 64 bits)
but I figured that just setting the minimum requirement to the Pentium 4
would probably be fine. If needed, we can respect the GO386 environment
variable to support these very old CPUs.
To support this newer CPU, I had to make sure that the stack is aligned
to 16 bytes everywhere. This was not yet always the case.
The math package failed the package tests on arm64 and wasm:
GOARCH=arm64 tinygo test math
Apparently the builtins llvm.maximum.f64 and llvm.minimum.f64 have
slightly different behavior on arm64 and wasm compared to what Go
expects.
This commit fixes two things:
* It changes the alignment to 16 bytes (from 4), to match max_align_t
in C.
* It manually aligns heapStart on WebAssembly, to work around a bug in
wasm-ld with --stack-first (see https://reviews.llvm.org/D106499).
This function previously returned the atomic time, that isn't affected
by system time changes but also has a time base at some arbitrary time
in the past. This makes sense for baremetal platforms (which typically
don't know the wall time) but it gives surprising results on Linux and
macOS: time.Now() usually returns a time somewhere near the start of
1970.
This commit fixes this by obtaining both time values: the monotonic time
and the wall clock time. This is also how the Go runtime implements the
time.now function.
These two heaps conflict with each other, so that if any function uses
the dlmalloc heap implementation it will eventually result in memory
corruption.
This commit fixes this by implementing all heap-related functions. This
overrides the functions that are implemented in wasi-libc. That's why
all of them are implemented (even if they just panic): to make sure no
program accidentally uses the wrong one.
The wasm build tag together with GOARCH=arm was causing problems in the
internal/cpu package. In general, I think having two architecture build
tag will only cause problems (in this case, wasm and arm) so I've
removed the wasm build tag and replaced it with tinygo.wasm.
This is similar to the tinygo.riscv build tag, which is used for older
Go versions that don't yet have RISC-V support in the standard library
(and therefore pretend to be GOARCH=arm instead).
Because arm.SVCall1 lets pointers escape, the return value of
sd_softdevice_is_enabled (passed as a pointer in a parameter) will
escape and thus this value will be heap allocated.
Use a global variable for this purpose instead to avoid the heap
allocation. This is safe as waitForEvent may only be called outside of
interrupts.
Previously, the machine.UART0 object had two meanings:
- it was the first UART on the chip
- it was the default output for println
These two meanings conflict, and resulted in workarounds like:
- Defining UART0 to refer to the USB-CDC interface (atsamd21,
atsamd51, nrf52840), even though that clearly isn't an UART.
- Defining NRF_UART0 to avoid a conflict with UART0 (which was
redefined as a USB-CDC interface).
- Defining aliases like UART0 = UART1, which refer to the same
hardware peripheral (stm32).
This commit changes this to use a new machine.Serial object for the
default serial port. It might refer to the first or second UART
depending on the board, or even to the USB-CDC interface. Also, UART0
now really refers to the first UART on the chip, no longer to a USB-CDC
interface.
The changes in the runtime package are all just search+replace. The
changes in the machine package are a mixture of search+replace and
manual modifications.
This commit does not affect binary size, in fact it doesn't affect the
resulting binary at all.
This means that machine.UART0, machine.UART1, etc are of type
*machine.UART, not machine.UART. This makes them easier to pass around
and avoids surprises when they are passed around by value while they
should be passed around by reference.
There is a small code size impact in some cases, but it is relatively
minor.
Make the GC globals scan phase conservative instead of precise on
WebAssembly. This reduces code size at the risk of introducing some
false positives.
This is a stopgap measure to mitigate an issue with the precise scanning
of globals that doesn't track all pointers. It works for regular globals
but globals created in the interp package don't always have a type and
therefore may be missed by the AddGlobalsBitmap pass.
The same issue is present on Linux and macOS, but is not as noticeable
there.
There is no reason to specialize this per chip as it is only ever used
for JavaScript. Not only that, it is causing confusion and is yet
another quirk to learn when porting the runtime to a new
microcontroller.
This commit improves the timers on various microcontrollers to better
deal with counter wraparound. The result is a reduction in RAM size of
around 12 bytes and a small effect (sometimes positive, sometimes
negative) on flash consumption. But perhaps more importantly: getting
the current time is now interrupt-safe (it previously could result in a
race condition) and the timer will now be correct when the timer isn't
retrieved for a long duration. Before this commit, a call to `time.Now`
more than 8 minutes after the previous call could result in an incorrect
time.
For more details, see:
https://www.eevblog.com/forum/microcontrollers/correct-timing-by-timer-overflow-count/msg749617/#msg749617