Some vector registers must be preserved across calls, but this wasn't
happening on Linux and MacOS. When I added support for windows/arm64, I
saw that it required these vector registers to be preserved and assumed
this was Windows deviating from the standard calling convention. But
actually, Windows was just implementing the standard calling convention
and the bug was on Linux and MacOS.
This commit fixes the bug on Linux and MacOS and at the same time merges
the Go and assembly files as they no longer need to be separate.
This is a small change to make it easier to support architectures that
need to restore more than just the sp and pc registers. In particular,
it is needed for the AVR architecture that needs to restore the frame
pointer (Y register).