Closure variables are allocated in a parent function and are thus never
nil. Don't do a nil check before reading or modifying the value.
This commit results in a slight reduction in code size in some test
cases: calls.go, channel.go, goroutines.go, json.go, sort.go -
presumably wherever closures are used.
These two passes are related, but can definitely work independently.
Which is what this change does: it splits the two passes. This should
make it easier to change these two new passes in the future.
This change now also enables slightly better testing by testing these
two passes independently. In particular, the reflect lowering pass got
some actual tests: it was barely unit-tested before.
I have verified that this doesn't really change code size, at least not
on the microbit target. Two tests do change, but in a very minor way
(and in opposite direction).
In many cases, position information is not stored in Go SSA instructions
because they don't exit directly in the source code. This includes
implicit type conversions, implicit returns at the end of a function,
the creation of a (hidden) slice when calling a variadic function, and
many other cases. I'm not sure where this information is supposed to
come from, but this patch takes the value (usually) from the value the
instruction refers to. This seems to work well for these implicit
conversions.
I've also added a few extra tests to the heap-to-stack transform pass,
of which one requires this improved position information.
This allows better escape analysis even without being able to see the
entire program. This makes the stack allocation test case more complete
but probably won't have much of an effect outside of that (as the
compiler is able to infer these attributes in the whole-program
functionattrs pass).
This flag, if set, is a regexp for function names. If there are heap
allocations in the matching function names, these heap allocations will
be printed with an explanation why the heap allocation exists (and why
the object can't be stack allocated).
This allows for adding more advanced tests, for example tests that use
the compiler package so that test sources can be written in Go instead
of LLVM IR.
There is no good reason for func values to refer to interface type
codes. The only thing they need is a stable identifier for function
signatures, which is easily created as a new kind of globals. Decoupling
makes it easier to change interface related code.
The LLVM CoroFrame pass appears to be tripping over this zero-sized
alloca. Therefore, do what the runtime would do: return a pointer to
runtime.zeroSizedAlloc. Or just don't deal with this case. But don't
emit a zero sized alloca to avoid this LLVM bug.
More information: https://bugs.llvm.org/show_bug.cgi?id=49916
The func-lowering pass has started to fail in the dev branch, probably
as a result of replacing the ConstPropagation pass with the IPSCCP pass.
This commit makes the code a bit more robust and should be able to
handle all possible cases (not just ptrtoint).
The constant propagation pass is removed in LLVM 12, so this pass needs
to be replaced anyway. The direct replacement would be the SCCP (sparse
conditional constant propagation) pass, but perhaps a better replacement
is the IPSCCP pass, which is an interprocedural version of the SCCP
pass and propagates constants across function calls if possible.
This is not always a code size reduction, but it appears to reduce code
size in a majority of cases. It certainly reduces code size in almost
all WebAssembly tests I did.
This should result in a small compile time reduction for incremental
builds, somewhere around 5-9%.
This commit, while small, required many previous commits to not regress
binary size. Right now binary size is basically identical with very few
changes in size (the only baremetal program that changed in size did so
with a 4 byte increase).
This commit is one extra step towards doing as much work as possible in
the parallel and cached package build step, out of the serial LTO phase.
Later improvements in this area have this change as a prerequisite.
Sometimes, LLVM may rename named structs when merging modules.
Therefore, we can't rely on typecodeID structs to retain their struct
names.
This commit changes the interface lowering pass to not rely on these
names. The interp package does however still rely on this name, but I
hope to fix that in the future.
This simplifies future changes. While the move itself is very simple, it
required some other changes to a few transforms that create new
functions to add the optsize attribute manually. It also required
abstracting away the optimization level flags (based on the -opt flag)
so that it can easily be retrieved from the config object.
This commit does not impact binary size on baremetal and WebAssembly.
I've seen a few tests on linux/amd64 grow slightly in size, but I'm not
too worried about those.
In rare cases the signature might change as a result of LLVM renaming
some named struct types when multiple LLVM modules are merged. The
easiest workaround is to detect such mismatched signatures and adding a
bitcast: this should be safe as the underlying data is effectively of
the same type.
This commit adds a new transform that converts reflect Implements()
calls to runtime.interfaceImplements. At the moment, the Implements()
method is not yet implemented (how ironic) but if the value passed to
Implements is known at compile time the method call can be optimized to
runtime.interfaceImplements to make it a regular interface assert.
This commit is the last change necessary to add basic support for the
encoding/json package. The json package is certainly not yet fully
supported, but some trivial objects can be converted to JSON.
Previously there was code to avoid impossible type asserts but it wasn't
great and in fact was too aggressive when combined with reflection.
This commit improves this by checking all types that exist in the
program that may appear in an interface (even struct fields and the
like) but without creating runtime.typecodeID objects with the type
assert. This has two advantages:
* As mentioned, it optimizes impossible type asserts away.
* It allows methods on types that were only asserted on (in
runtime.typeAssert) but never used in an interface to be optimized
away using GlobalDCE. This may have a cascading effect so that other
parts of the code can be further optimized.
This sometimes massively improves code size and mostly negates the code
size regression of the previous commit.
This distinction was useful before when reflect wasn't properly
supported. Back then it made sense to only include method sets that were
actually used in an interface. But now that it is possible to get to
other values (for example, by extracting fields from structs) and it is
possible to turn them back into interfaces, it is necessary to preserve
all method sets that can possibly be used in the program in a type
assert, interface assert or interface method call.
In the future, this logic will need to be revisited again when
reflect.New or reflect.Zero gets implemented.
Code size increases a bit in some cases, but usually in a very limited
way (except for one outlier in the drivers smoke tests). The next commit
will improve the situation significantly.
This commit switches from the previous behavior of compiling the whole
program at once, to compiling every package in parallel and linking the
LLVM bitcode files together for further whole-program optimization.
This is a small performance win, but it has several advantages in the
future:
- There are many more things that can be done per package in parallel,
avoiding the bottleneck at the end of the compiler phase. This
should speed up the compiler futher.
- This change is a necessary step towards a non-LTO build mode for
fast incremental builds that only rebuild the changed package, when
compiler speed is more important than binary size.
- This change refactors the compiler in such a way that it will be
easier to inspect the IR for one package only. Inspecting this IR
will be very helpful for compiler developers.
This optimization level wasn't working before because some passes expect
some globals to be cleaned up afterwards. Cleaning these globals is
easy, just add the pass necessary for it. This shouldn't reduce the
usefulness of the -opt=0 build flag as most optimizations are still
skipped.
A common error is when someone tries to export a blocking function. This
is not possible with the coroutines scheduler. Previously, it resulted
in an error like this:
panic: trying to make exported function async: messageHandler
With this change, the error is much better and shows where it comes from
exactly:
/home/ayke/tmp/export-async.go:8: blocking operation in exported function: messageHandler
traceback:
messageHandler
/home/ayke/tmp/export-async.go:9:5
main.foo
/home/ayke/tmp/export-async.go:15:2
runtime.chanSend
/home/ayke/src/github.com/tinygo-org/tinygo/src/runtime/chan.go:494:12
This should make it easier to identify and fix the problem. And it
avoids a compiler panic, which is a really bad way of showing
diagnostics.
Moving settings to a separate config struct has two benefits:
- It decouples the compiler a bit from other packages, most
importantly the compileopts package. Decoupling is generally a good
thing.
- Perhaps more importantly, it precisely specifies which settings are
used while compiling and affect the resulting LLVM module. This will
be necessary for caching the LLVM module.
While it would have been possible to cache without this refactor, it
would have been very easy to miss a setting and thus let the
compiler work with invalid/stale data.
This fixes an issue where a normal suspending call followed by a plain tail call would result in the tail return value being written to the return pointer of the normal suspending call.
This is fixed by saving the return pointer at the start of the function and restoring it before initiating a plain tail call.
Unfortunately, the .rodata section can't be stored in flash. Instead, an
explicit .progmem section should be used, which is supported in LLVM as
address space 1 but not exposed to normal programs.
Eventually a pass should be written that converts trivial const globals
of which all loads are visible to be in addrspace 1, to get the benefits
of storing those globals directly in ROM.
This can be useful to test improvements in LLVM master and to make it
possible to support LLVM 11 for the most part already before the next
release. That also allows catching LLVM bugs early to fix them upstream.
Note that tests do not yet pass for this LLVM version, but the TinyGo
compiler can be built with the binaries from apt.llvm.org (at the time
of making this commit).
* initial commit for WASI support
* merge "time" package with wasi build tag
* override syscall package with wasi build tag
* create runtime_wasm_{js,wasi}.go files
* create syscall_wasi.go file
* create time/zoneinfo_wasi.go file as the replacement of zoneinfo_js.go
* add targets/wasi.json target
* set visbility hidden for runtime extern variables
Accodring to the WASI docs (https://github.com/WebAssembly/WASI/blob/master/design/application-abi.md#current-unstable-abi),
none of exports of WASI executable(Command) should no be accessed.
v0.19.0 of bytecodealliance/wasmetime, which is often refered to as the reference implementation of WASI,
does not accept any exports except functions and the only limited variables like "table", "memory".
* merge syscall_{baremetal,wasi}.go
* fix js target build
* mv wasi functions to syscall/wasi && implement sleepTicks
* WASI: set visibility hidden for globals variables
* mv back syscall/wasi/* to runtime package
* WASI: add test
* unexport wasi types
* WASI test: fix wasmtime path
* stop changing visibility of runtime.alloc
* use GOOS=linux, GOARCH=arm for wasi target
Signed-off-by: mathetake <takeshi@tetrate.io>
* WASI: fix build tags for os/runtime packages
Signed-off-by: mathetake <takeshi@tetrate.io>
* run WASI test only on Linux
Signed-off-by: mathetake <takeshi@tetrate.io>
* set InternalLinkage instead of changing visibility
Signed-off-by: mathetake <takeshi@tetrate.io>
This is a big change that will determine the stack size for many
goroutines automatically. Functions that aren't recursive and don't call
function pointers can in many cases have an automatically determined
worst case stack size. This is useful, as the stack size is usually much
lower than the previous hardcoded default of 1024 bytes: somewhere
around 200-500 bytes is common.
A side effect of this change is that the default stack sizes (including
the stack size for other architectures such as AVR) can now be changed
in the config JSON file, making it tunable per application.
It appears that LLVM is turning bitcasts into 0-index GEPs.
This caused stuff to not be tracked, resulting in use-after-free issues.
This solution is sub-optimal, but is the most reasonable solution I could come up with without redesigning the stack slots pass.
This option was broken for a long time, in part because we didn't test
for it. This commit fixes that and adds a test to make sure it won't
break again unnoticed.
It was `avr-atmel-none`, which is incorrect. It must be
`avr-unknown-unknown`.
Additionally, there is no reason to specify the target triple per chip,
it can be done for all AVR chips at once as it doesn't vary like
Cortex-M chips.
I ran into an issue where I did a method call on a nil interface and it
resulted in a HardFault. Luckily I quickly realized what was going on so
I could fix it, but I think undefined behavior is definitely the wrong
behavior in this case. This commit therefore changes such calls to cause
a nil panic instead of introducing undefined behavior.
This does have a code size impact. It's relatively minor, much lower
than I expected. When comparing the before and after of the drivers
smoke tests (probably the most representative sample available), I found
that most did not change at all and those that did change, normally not
more than 100 bytes (16 or 32 byte changes are typical).
Right now the pattern is the following:
switch typecode {
case 1:
call method 1
case 2:
call method 2
default:
nil panic
}
I also tried the following (in the hope that it would be easier to
optimize), but it didn't really result in a code size reduction:
switch typecode {
case 1:
call method 1
case 2:
call method 2
case 0:
nil panic
default:
unreachable
}
Some code got smaller, while other code (the majority) got bigger. Maybe
this can be improved once range[1] is finally allowed[2] on function
parameters, but it'll probably take a while before that is implemented.
[1]: https://llvm.org/docs/LangRef.html#range-metadata
[2]: https://github.com/rust-lang/rust/issues/50156
This is a common case, but it also complicates the code. Removing this
special case does have a negative effect on code size in rare cases, but
I don't think it's worth keeping around (and possibly causing bugs) for
such uncommon cases.
This should not result in functional changes, although the output (as
stated above) sometimes changes a little bit.
This commit fixes errors like the following:
inlinable function call in a function with debug info must have a !dbg location
call void @runtime.nilPanic(i8* undef, i8* null)
inlinable function call in a function with debug info must have a !dbg location
%24 = call fastcc %runtime._interface @"(*github.com/vugu/vugu/domrender.JSRenderer).render$1"(%"github.com/vugu/vugu.VGNode"** %19, i32 %22, i32 %23, i8* %15, i8* undef)
error: optimizations caused a verification failure
Not all instructions had a debug location, which apparently caused
issues for the inliner.
Previously, the function value lowering pass had special cases for when there were 0 or 1 function implementations.
However, the results of the pass were incorrect in both of these cases.
This change removes the specializations and fixes the transformation.
In the case that there was a single function implementation, the compiler emitted a select instruction to obtain the function pointer.
This selected between null and the implementing function pointer.
While this was technically correct, it failed to eliminate indirect function calls.
This prevented discovery of these calls by the coroutine lowering pass, and caused async function calls to be passed through unlowered.
As a result, the generated code had undefined behavior (usually resulting in a segfault).
In the case of no function implementations, the lowering code was correct.
However, the lowering code was not run.
The discovery of function signatures was accomplished by scanning implementations, and when there were no implementations nothing was discovered or lowered.
For maintainability reasons, I have removed both specializations rather than fixing them.
This substantially simplifies the code, and reduces the amount of variation that we need to worry about for testing purposes.
The IR now generated in the cases of 0 or 1 function implementations can be efficiently simplified by LLVM's optimization passes.
Therefore, there should not be a substantial regression in terms of performance or machine code size.
The coroutine lowering pass had issues where it iterated over maps, sometimes resulting in non-deterministic output.
This change removes many of the maps and ensures that the transformations are deterministic.
This commit also adds a bit of version independence, in particular for
external commands. It also adds the LLVM version to the `tinygo version`
command, which might help while debugging.
This hack was originally introduced in
https://github.com/tinygo-org/tinygo/pull/251 to fix an escape analysis
regression after https://github.com/tinygo-org/tinygo/pull/222
introduced nil checks. Since a new optimization in LLVM (see
https://reviews.llvm.org/D60047) this hack is not necessary anymore and
can be removed.
I've compared all regular tests and smoke tests before and after to
check the size. In most cases this change was an improvement although
there are a few regressions.
It appears that LLVM can sometimes recognize that multiple calls to
runtime.interfaceMethod can be merged into one. When that happens, the
interface lowering pass shows an error as it didn't expect that
situation.
Luckily the fix is very easy.
Panics are bad for usability: whenever something breaks, the user is
shown a (not very informative) backtrace. Replace it with real error
messages instead, that even try to display the Go source location.
This avoids problems with goroutines in WebAssembly, and is generally a
good thing. It fixes some cases of the following problem:
LLVM ERROR: Coroutines cannot handle non static allocas yet