The only reason a callback was used, was so that the temporary directory
gets removed once `Build` returns. But that is honestly a really bad
reason: the parent function can simply create a temporary function and
remove it when it returns. It wasn't worth the code complexity that this
callback created.
This change should not cause any observable differences in behavior (it
should be a non-functional change).
I have no reason to do this now, but this unclean code has been bugging
me and I just wanted to get it fixed.
The -x flag prints commands as they are executed. Unfortunately, for the link
command, they were printed too early: before the ThinLTO flags were added.
This is somewhat confusing while debugging.
Before, on the baremetal target or MacOS, we errored if the user
provided configuration to strip debug info.
Ex.
```bash
$ $ tinygo build -o main.go -scheduler=none --no-debug main.go
error: cannot remove debug information: MacOS doesn't store debug info in the executable by default
```
This is a poor experience which results in having OS-specific CLI
behavior. Silently succeeding is good keeping with the Linux philosophy
and less distracting than logging the same without failing.
Signed-off-by: Adrian Cole <adrian@tetrate.io>
Go 1.19 started reformatting code in a way that makes it more obvious
how it will be rendered on pkg.go.dev. It gets it almost right, but not
entirely. Therefore, I had to modify some of the comments so that they
are formatted correctly.
Show the correct error message when trying to strip debug information.
Also, remove the special case for GOOS=linux that was probably dead
code: it was only reachable on baremetal systems which were already
checked before.
The transform package is the more appropriate location for package-level
optimizations, to match `transform.Optimize` for whole-program
optimizations.
This is just a refactor, to make later changes easier to read.
This commit moves the calculation of the package action ID (cache key)
into a separate job. At the moment, this won't have a big effect but
this change is necessary for some future changes I want to make.
This adds the `Version()` function of the `runtime` package which embeds
the go version that was used to build tinygo.
For programs that are compiled with tinygo the version can be overriden
via the:
`tinygo build -ldflags="-X 'runtime.buildVersion=abc'"` flag.
Otherwise it will continue to use the go version with which tinygo was
compiled.
ThinLTO optimizes across LLVM modules at link time. This means that
optimizations (such as inlining and const-propagation) are possible
between C and Go. This makes this change especially useful for CGo, but
not just for CGo. By doing some optimizations at link time, the linker
can discard some unused functions and this leads to a size reduction on
average. It does increase code size in some cases, but that's true for
most optimizations.
I've excluded a number of targets for now (wasm, avr, xtensa, windows,
macos). They can probably be supported with some more work, but that
should be done in separate PRs.
Overall, this change results in an average 3.24% size reduction over all
the tinygo.org/x/drivers smoke tests.
TODO: this commit runs part of the pass pipeline twice. We should set
the PrepareForThinLTO flag in the PassManagerBuilder for even further
reduced code size (0.7%) and improved compilation speed.
This means that it will be possible to generate a Darwin binary on any
platform (Windows, Linux, and MacOS of course), including CGo. Of
course, the resulting binaries can only run on MacOS itself.
The binary links against libSystem.dylib, which is a shared library. The
macos-minimal-sdk repository contains open source header files and
generated symbol stubs so we can generate a stub libSystem.dylib without
copying any closed source code.
This removes the parentHandle argument from the internal calling convention.
It was formerly used to implment coroutines.
Now that coroutines have been removed, it is no longer necessary.
Switching to a shared semaphore allows multi-build operations (compiler tests, package tests, etc.) to use the expected degree of parallelism efficiently.
While refactoring the job runner, the time complexity was also reduced from O(n^2) to O(n+m) (where n is the number of jobs, and m is the number of dependencies).
Instead of storing an increasing version number in relevant packages
(compiler.Version, interp.Version, cgo.Version, ...), read the build ID
from the currently running executable. This has several benefits:
* All changes relevant to the compiled packages are caught.
* No need to bump the version for each change to these packages.
This avoids merge conflicts.
* During development, `go install` is enough. No need to run
`tinygo clean` all the time.
Of course, the drawback is that it might be updated a bit more often
than necessary but I think the overall benefit is big.
Regular release users shouldn't see any difference. Because the tinygo
binary stays the same, the cache works well.
This change uses flock (when available) to acquire locks for build operations.
This allows multiple tinygo processes to run concurrently without building the same thing twice.
This change implements a new "scheduler" for WebAssembly using binaryen's asyncify transform.
This is more reliable than the current "coroutines" transform, and works with non-Go code in the call stack.
runtime (js/wasm): handle scheduler nesting
If WASM calls into JS which calls back into WASM, it is possible for the scheduler to nest.
The event from the callback must be handled immediately, so the task cannot simply be deferred to the outer scheduler.
This creates a minimal scheduler loop which is used to handle such nesting.
This makes sure that the LLVM target features match the one generated by
Clang:
- This fixes a bug introduced when setting the target CPU for all
targets: Cortex-M4 would now start using floating point operations
while they were disabled in C.
- This will make it possible in the future to inline C functions in Go
and vice versa. This will need some more work though.
There is a code size impact. Cortex-M4 targets are increased slightly in
binary size while Cortex-M0 targets tend to be reduced a little bit.
Other than that, there is little impact.
Previously, libclang was run on each fragment (import "C") separately.
However, in regular Go it's possible for later fragments to refer to
types in earlier fragments so they must have been parsed as one.
This commit changes the behavior to run only one C parser invocation for
each Go file.
This commit adds support for musl-libc and uses it by default on Linux.
The main benefit of it is that binaries are always statically linked
instead of depending on the host libc, even when using CGo.
Advantages:
- The resulting binaries are always statically linked.
- No need for any tools on the host OS, like a compiler, linker, or
libc in a release build of TinyGo.
- This also simplifies cross compilation as no cross compiler is
needed (it's all built into the TinyGo release build).
Disadvantages:
- Binary size increases by 5-6 kilobytes if -no-debug is used. Binary
size increases by a much larger margin when debugging symbols are
included (the default behavior) because musl is built with debugging
symbols enabled.
- Musl does things a bit differently than glibc, and some CGo code
might rely on the glibc behavior.
- The first build takes a bit longer because musl needs to be built.
As an additional bonus, time is now obtained from the system in a way
that fixes the Y2038 problem because musl has been a bit more agressive
in switching to 64-bit time_t.
This is really just a preparatory commit for musl support. The idea is
to store not just the archive file (.a) but also an include directory.
This is optional for picolibc but required for musl, so the main purpose
of this commit is the refactor needed for this change.
This commit improves accuracy of the -size=full flag in a big way.
Instead of relying on symbol names to figure out by which package
symbols belong, it will instead mostly use DWARF debug information
(specifically, debug line tables and debug information for global
variables) relying on symbols only for some specific things. This is
much more accurate: it also accounts for inlined functions.
For example, here is how it looked previously when compiling a personal
project:
code rodata data bss | flash ram | package
1902 333 0 0 | 2235 0 | (bootstrap)
46 256 0 0 | 302 0 | github
0 454 0 0 | 454 0 | handleHardFault$string
154 24 4 4 | 182 8 | internal/task
2498 83 5 2054 | 2586 2059 | machine
0 16 24 130 | 40 154 | machine$alloc
1664 32 12 8 | 1708 20 | main
0 0 0 200 | 0 200 | main$alloc
2476 79 0 36 | 2555 36 | runtime
576 0 0 0 | 576 0 | tinygo
9316 1277 45 2432 | 10638 2477 | (sum)
11208 - 48 6548 | 11256 6596 | (all)
And here is how it looks now:
code rodata data bss | flash ram | package
------------------------------- | --------------- | -------
1509 0 12 23 | 1521 35 | (unknown)
660 0 0 0 | 660 0 | C compiler-rt
58 0 0 0 | 58 0 | C picolibc
0 0 0 4096 | 0 4096 | C stack
174 0 0 0 | 174 0 | device/arm
6 0 0 0 | 6 0 | device/sam
598 256 0 0 | 854 0 | github.com/aykevl/ledsgo
320 24 0 4 | 344 4 | internal/task
1414 99 24 2181 | 1537 2205 | machine
726 352 12 208 | 1090 220 | main
3002 542 0 36 | 3544 36 | runtime
848 0 0 0 | 848 0 | runtime/volatile
70 0 0 0 | 70 0 | time
550 0 0 0 | 550 0 | tinygo.org/x/drivers/ws2812
------------------------------- | --------------- | -------
9935 1273 48 6548 | 11256 6596 | total
There are some notable differences:
* Odd packages like main$alloc and handleHardFault$string are gone,
instead their code is put in the correct package.
* C libraries and the stack are now included in the list, they were
previously part of the (bootstrap) pseudo-package.
* Unknown bytes are slightly reduced. It should be possible to reduce
it significantly more in the future: most of it is now caused by
interface invoke wrappers.
* Inlined functions are now correctly attributed. For example, the
runtime/volatile package is normally entirely inlined.
* There is no difference between (sum) and (all) anymore. A better
code size algorithm now counts the code/data sizes correctly.
* And last (but not least) there is a stylistic change: the table now
looks more like a table. Especially the summary should be clearer
now.
Future goals:
* Improve debug information so that the (unknown) pseudo-package is
reduced in size or even eliminated altogether.
* Add support for other file formats, most importantly WebAssembly.
* Perhaps provide a way to expand this report per file, or in a
machine-readable format like JSON or CSV.
This commit has a few related changes:
* It sets the optsize attribute immediately in the compiler instead of
adding it to each function afterwards in a loop. This seems to me
like the more appropriate way to do it.
* It centralizes setting the optsize attribute in the transform
package, to make later changes easier.
* It sets the optsize in a few more places: to runtime.initAll and to
WebAssembly i64 wrappers.
This commit does not affect the binary size of any of the smoke tests,
so should be risk-free.
For example, the following did not work before but does work with this
change:
// int add(int a, int b) {
// return a + b;
// }
import "C"
func main() {
println("add:", C.add(3, 5))
}
Even better, the functions in the header are compiled together with the
rest of the Go code and so they can be optimized together! Currently,
inlining is not yet allowed but const-propagation across functions
works. This should be improved in the future.
Instead of keeping a slice of jobs to run, let the runJobs function
determine which jobs should be run by investigating all dependencies.
This has two benefits:
- The code is somewhat cleaner, as no 'jobs' slice needs to be
maintained while constructing the dependency graph.
- Eventually, some jobs might not be required by any dependency.
While it's possible to avoid adding them to the slice, the simpler
solution is to build a new slice from the dependencies which will
only include required dependencies by design.
This change adds support for the ESP32-C3, a new chip from Espressif. It
is a RISC-V core so porting was comparatively easy.
Most peripherals are shared with the (original) ESP32 chip, but with
subtle differences. Also, the SVD file I've used gives some
peripherals/registers a different name which makes sharing code harder.
Eventually, when an official SVD file for the ESP32 is released, I
expect that a lot of code can be shared between the two chips.
More information: https://www.espressif.com/en/products/socs/esp32-c3
TODO:
- stack scheduler
- interrupts
- most peripherals (SPI, I2C, PWM, etc)
Stripping debug information at link time also allows relocation
compression (aka linker relaxations). Keeping debug information at
compile time and optionally stripping it at link time has some
advantages:
* Automatic stack sizes on Cortex-M rely on the presence of debug
information.
* Some parts of the compiler now rely on the presence of debug
information for proper diagnostics.
* It works better with the cache: there is no distinction between
debug and no-debug builds.
* It makes it easier (or possible at all) to enable debug information
in the wasi-libc library without big downsides.
Static libraries should be added at the end of the linker command, after
all object files. If that isn't done, that's _usually_ not a problem,
unless there are duplicate symbols. In that case, weird dependency
issues can arise. To solve that, object files (that may include symbols
to override symbols in the library) should be listed first on the
command line and then the static libraries should be listed.
This fixes an issue with overriding some symbols in wasi-libc.
On ARM, the stack has to be aligned to 8 bytes on function calls, but
not necessarily within a function. Leaf functions can take advantage of
this by not keeping the stack aligned so they can avoid pushing one
register. However, because regular functions might expect an aligned
stack, the interrupt controller will forcibly re-align the stack when an
interrupt happens in such a leaf function (controlled by the STKALIGN
flag, defaults to on). This means that stack size calculation (as used
in TinyGo) needs to make sure this extra space for stack re-alignment is
available.
This commit fixes this by aligning the stack size that will be used for
new goroutines.
Additionally, it increases the stack canary size from 4 to 8 bytes, to
keep the stack aligned. This is not strictly necessary but is required
by the AAPCS so let's do it anyway just to be sure.