This patch adds support for generating GOOS=darwin GOARCH=arm64
binaries. This means that it will become possible to run `go test` on
recent Macs, for example.
This results in smaller and likely more efficient code. It does require
some architecture specific code for each architecture, but I've kept the
amount of code as small as possible.