Implement defer in a different way, which results in smaller binaries.
The binary produced from testdata/calls.go (the only test case with
defer) is reduced a bit in size, but the savings in bytes vary greatly
by architecture:
Cortex-M0: -96 .text / flash
WebAssembly: -215 entire file
Linux x64: -32 .text
Deferred functions in TinyGo were implemented by creating a linked list
of struct objects that contain a function pointer to a thunk, a pointer
to the next object, and a list of parameters. When it was time to run
deferred functions, a helper runtime function called each function
pointer (the thunk) with the struct pointer as a parameter. This thunk
would then in turn extract the saved function parameters from the
struct and call the real function.
What this commit changes is that the loop that calls deferred functions
is moved to the end of the function (practically inlining it), and the
thunks are replaced with direct calls inside this loop. This makes it
much easier for LLVM to perform all kinds of optimizations like inlining
and dead argument elimination.
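A minimal sketch of the new shape, with hypothetical names (the real
lowering happens on LLVM IR, not Go source):

    package main

    type deferFrame struct {
        callback int // selects which deferred call to run
        next     *deferFrame
        arg      int // saved parameter
    }

    func printInt(n int) { println(n) }

    func run() {
        // the frame that a `defer printInt(42)` would push onto the list
        deferred := &deferFrame{callback: 0, arg: 42}

        // ... function body ...

        // at the end of the function, a loop with direct calls replaces
        // the old thunk-based runtime helper:
        for d := deferred; d != nil; d = d.next {
            switch d.callback {
            case 0:
                printInt(d.arg) // direct call, visible to the inliner
            }
        }
    }

    func main() { run() }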
The default priority is 0 (the highest), which is reserved by the SoftDevice.
For normal operation the exact priority level doesn't matter; only the
relative priority does. So this change makes the code compatible with
the SoftDevice without actually changing the behavior.
This commit changes many things:
* Most interface-related operations are moved into an optimization
pass for more modularity. IR construction creates pseudo-calls which
are lowered in this pass.
* Type codes are assigned in this interface lowering pass, after DCE.
* Type codes are sorted by usage: types more often used in type
asserts are assigned lower numbers to ease jump table construction
during machine code generation.
* Interface assertions are optimized: they are replaced by a constant
false, a comparison against a constant, or (in the general case) a
type switch over only concrete types.
* Interface calls are replaced with unreachable, a direct call, or a
concrete type switch with direct calls, depending on the number of
implementing types. This hopefully makes some interface patterns
zero-cost (see the sketch below).
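For example, when only one concrete type ever ends up in an interface,
the indirect call can be devirtualized. A sketch of the effect at the
Go level (names invented for illustration):

    package main

    type Printer interface {
        Print()
    }

    type ConsolePrinter struct{}

    func (ConsolePrinter) Print() { println("hello") }

    func run(p Printer) {
        // with ConsolePrinter as the only implementing type left after
        // DCE, this lowers to a direct call to ConsolePrinter.Print,
        // which LLVM can then inline
        p.Print()
    }

    func main() { run(ConsolePrinter{}) }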
These changes lead to a ~0.5K reduction in code size on Cortex-M for
testdata/interface.go. It appears that a major cause for this is the
replacement of function pointers with direct calls, which are far more
susceptible to optimization. Also, not having a fixed global array of
function pointers greatly helps dead code elimination.
This change also makes future optimizations easier, like optimizations
on interface value comparisons.
This can be used in the future to trigger garbage collection. For now,
it provides a more useful error message in case the heap is completely
filled up.
Make sure every to-be-implemented GC can use the same interface. As a
result, a 1MB chunk of RAM is allocated on Unix systems on init instead
of allocating on demand.
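A sketch of the kind of common interface meant here, with hypothetical
signatures (the actual runtime functions may differ):

    package main

    import "unsafe"

    // every GC implementation provides the same small set of functions
    // (bodies stubbed out here)
    func alloc(size uintptr) unsafe.Pointer { return nil } // allocate zeroed memory
    func free(ptr unsafe.Pointer)           {}             // may be a no-op
    func GC()                               {}             // run a collection cycle

    func main() {}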
* Use 64-bit integers on 64-bit platforms, just like gc and gccgo:
https://golang.org/doc/go1.1#int
* Do not use a separate length type. Instead, use uintptr everywhere a
length is expected.
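Both points are visible in a small runnable check:

    package main

    import "unsafe"

    func main() {
        // int is now 64 bits wide on 64-bit platforms, matching gc/gccgo
        println(unsafe.Sizeof(int(0)))     // 8 on e.g. linux/amd64
        // uintptr doubles as the length type, so it has the same size
        println(unsafe.Sizeof(uintptr(0))) // 8
    }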
Let the standard library think that it is compiling for js/wasm.
The most correct way of supporting bare metal Cortex-M targets would be
using the 'arm' build tag and specifying no OS or an 'undefined' OS
(perhaps GOOS=noos?). However, there is no build tag for specifying no
OS at all; the closest option is GOOS=js, which makes very few
assumptions.
Sadly GOOS=js also makes some assumptions: it assumes it is running
with GOARCH=wasm. That by itself would not be such a problem: just add
js, wasm, and arm as build tags. However, having two GOARCH build tags
leads to an error in internal/cpu: it defines variables for both
architectures, which then conflict.
To work around these problems, the 'arm' target has been renamed to
'tinygo.arm'. In the future, a GOOS=noos (or similar) should be added
which can work with any architecture and doesn't implement OS-specific
stuff.
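A file meant only for this bare-metal target could then use the
old-style build constraint syntax (hypothetical file, for
illustration):

    // +build tinygo.arm

    package runtime

    // declarations here are only compiled for the renamed target, so
    // they no longer clash with the js/wasm files in the standard
    // library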
When doing a slice operation on a slice, use the capacity instead of
the length as the upper bound. Of course, for strings and arrays, the
slice operation checks against the length because there is no capacity.
But according to the spec, this check should be based on cap for slices
instead of len:
> For slices, the upper index bound is the slice capacity cap(a) rather
> than the length.
https://golang.org/ref/spec#Slice_expressions
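A small runnable illustration of the rule:

    package main

    func main() {
        s := make([]int, 2, 10) // len 2, cap 10
        t := s[:5]              // valid: 5 <= cap(s), even though 5 > len(s)
        println(len(t), cap(t)) // 5 10
    }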
Fixes: https://github.com/aykevl/tinygo/issues/65
It is stubbed out currently, but may be useful in the future.
Note that this function is implemented for a future change to the init
system; it is not yet useful.
It is allowed to index with an int64 even on a 32-bit platform, so we
have to handle that case. But make sure the normal case isn't
penalized, by using 32-bit numbers when possible.
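What the spec permits, as a runnable example (also valid on a 32-bit
platform):

    package main

    func main() {
        buf := []byte{10, 20, 30}
        var i int64 = 2 // an index may have any integer type
        println(buf[i]) // 30; needs a 64-bit bounds check on 32-bit targets
    }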
Bigger hashmaps (size > 8) use multiple buckets in a chain. The lookup
code walked this bucket chain, but kept checking the first bucket for
key equality.
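A sketch of the bug pattern, with hypothetical field names (the real
code works with raw pointers):

    package main

    type bucket struct {
        keys   [8]string
        values [8]int
        next   *bucket
    }

    func buggyLookup(first *bucket, key string) (int, bool) {
        for b := first; b != nil; b = b.next { // the chain is walked...
            for i := 0; i < 8; i++ {
                if first.keys[i] == key { // ...but the bug: should be b.keys[i]
                    return first.values[i], true
                }
            }
        }
        return 0, false
    }

    func main() {}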
Let each target handle its own initialization/finalization sequence
instead of providing one in the runtime with hooks for memory
initialization etc. This is much more flexible, although it causes a
little bit of code duplication.
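A sketch of what a per-target entry point might look like, with
hypothetical function names:

    package main

    func preinit()     {} // e.g. zero .bss, copy .data from flash
    func initAll()     {} // run package initializers
    func mainWrapper() {} // call the user's main.main
    func abort()       {} // target-specific: e.g. sleep forever or reset

    // each target spells out its own sequence instead of calling into a
    // generic runtime entry point with hooks
    func main() {
        preinit()
        initAll()
        mainWrapper()
        abort()
    }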
A few changes to make sure compiler-rt is correctly compiled (and
doesn't include host headers, for example).
This improves support for AVR, but it still doesn't work. Compiler-rt
itself doesn't really work for AVR either.
This code:

    foo & 0xffffff

is equivalent to this code:

    foo % 0x1000000

However, to drop the high 8 bits, this calculation was used:

    foo % 0xffffff
This is far more expensive (and incorrect): it needs an actual modulo
operation, which increases code size and probably reduces speed on a
Cortex-M4, and which needs library functions on a Cortex-M0, increasing
code size by a much bigger amount.
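The difference is easy to check (runnable Go):

    package main

    func main() {
        var foo uint32 = 0xdeadbeef
        println(foo & 0xffffff)  // keeps the low 24 bits
        println(foo % 0x1000000) // same value: the modulus is a power of two
        println(foo % 0xffffff)  // different value, and needs a real modulo
    }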
This is one step towards removing unnecessary special cases in the
compiler. It is also part of removing as much magic as possible (the
pragma is explicit, the special name is not).