go: runtime, cgo: programs using Cocoa/OpenGL/Metal APIs on macOS exhibit problems at tip not seen in 1.19.4

What version of Go are you using (go version)?

$ go version
go version go1.20rc1 darwin/arm64

Does this issue reproduce with the latest release?

Not with the latest stable release.

This appears to be a regression that happens only with Go 1.20 RC 1 and at tip (as of yesterday), but doesn’t happen at all with Go 1.19.4 or any older stable Go release.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE=""
GOARCH="arm64"
GOBIN=""
GOCACHE="/Users/gopher/Library/Caches/go-build"
GOENV="/Users/gopher/Library/Application Support/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="arm64"
GOHOSTOS="darwin"
GOINSECURE=""
GOMODCACHE="/Users/gopher/go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="darwin"
GOPATH="/Users/gopher/go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/darwin_arm64"
GOVCS=""
GOVERSION="go1.20rc1"
GCCGO="gccgo"
AR="ar"
CC="clang"
CXX="clang++"
CGO_ENABLED="1"
GOMOD="/dev/null"
GOWORK=""
CGO_CFLAGS="-O2 -g"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-O2 -g"
CGO_FFLAGS="-O2 -g"
CGO_LDFLAGS="-O2 -g"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -fdebug-prefix-map=/var/folders/s_/5sjqzr0j6xggz_xtzmq_8r4m00jwcj/T/go-build1742320560=/tmp/go-build -gno-record-gcc-switches -fno-common"

This is a close-to-default install of Go 1.20 RC 1 on an M1-based Mac running the latest macOS (13.0.1/22A400) and Xcode (14.1/14B47b).

What did you do?

I tried running various Go programs that use Cocoa and either OpenGL or Metal APIs via cgo to open a window and render graphics. This problem affects all of them in the same way. The smallest reproduction I can share at this time is the set of simple example programs in the go-gl org:

$ cd $(mktemp -d)
$ git clone https://github.com/go-gl/example && cd example
$ go run ./gl21-cube
$ go run ./gl41core-cube

There’s no problem running simple cgo programs that don’t use the Cocoa/OpenGL/Metal APIs.

What did you expect to see?

Normal program execution, no warnings or errors, just like with Go 1.19.4 or older.

What did you see instead?

Almost always, there are warnings/log messages printed including:

2022-12-12 11:40:49.511 gl21-cube[64718:25499557] +[NSXPCSharedListener endpointForReply:withListenerName:replyErrorCode:]: an error occurred while attempting to obtain endpoint for listener 'ClientCallsAuxiliary': Connection invalid
2022-12-12 11:40:49.518 gl21-cube[64718:25499557] Error received in message reply handler: Connection invalid
2022-12-12 11:40:49.518 gl21-cube[64718:25499593] Connection Invalid error for service com.apple.hiservices-xpcservice.

(Those log messages may show up only after the Cocoa window is selected.)

Sometimes it also prints:

FALLBACK (log once): Fallback to SW vertex processing because buildPipelineState failed
FALLBACK (log once): Fallback to SW fragment processing because buildPipelineState failed
UNSUPPORTED (log once): UNEXPECTED/FATAL?: buildPipelineState failed to build fragment-fallback PSO, m_disable_code: 1001000
FALLBACK (log once): Fallback to SW vertex processing, m_disable_code: 1000
FALLBACK (log once): Fallback to SW fragment processing, m_disable_code: 1000000

Also observed:

2022/12/11 14:08:22 Compiler encountered XPC_ERROR_CONNECTION_INVALID (is the OS shutting down?)

(The OS was not shutting down. Restarting had no effect.)

It’s not completely deterministic: sometimes the program exits with an error, and sometimes it keeps running but fails to render graphics properly. However, running the same program built with Go 1.19.4 works fine, and re-running it afterwards with Go 1.20 RC 1 then also works (still with warnings, but graphics render normally). From a quick look, modifying a shader source causes the failure to return, so it’s possible that running with Go 1.19.4 lets the shaders be built successfully, cached, and reused.

This might very well be a problem with the Go program accessing those macOS APIs in an unsafe way, or a problem in macOS itself, but it only started happening at tip and doesn’t happen when reverting to an older stable Go version.

CC @golang/runtime.

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 21 (10 by maintainers)

Most upvoted comments

After re-reading #56784 I realized this likely has to do with fork/new process being involved during shader compilation.

In case it ends up being helpful, here’s a more self-contained repro that uses the Metal API:

package main

/*
#cgo CFLAGS: -x objective-c
#cgo LDFLAGS: -framework Metal -framework CoreGraphics -framework Foundation

#import <Metal/Metal.h>

void CompileMetalShader() {
	id<MTLDevice> device = MTLCreateSystemDefaultDevice();
	if (!device) {
		printf("no Metal device\n");
		return;
	}

	NSError *err = nil; // Initialize explicitly: without ARC, an uninitialized local is indeterminate.
	id<MTLLibrary> library = [device newLibraryWithSource:@"// Empty shader 123."
	                                              options:NULL
	                                                error:&err];
	if (err) {
		printf("newLibraryWithSource error: %s\n", err.localizedDescription.UTF8String);
		return;
	}

	printf("ok\n");
}
*/
import "C"

func main() {
	C.CompileMetalShader()

	// Output (with CL 451735, with shader source that hasn't been previously compiled):
	// newLibraryWithSource error: Compiler encountered XPC_ERROR_CONNECTION_INVALID (is the OS shutting down?)

	// Output (without CL 451735):
	// ok
}

Edit: Added -framework Foundation; it’s needed for linking to succeed in some, albeit not all, macOS environments.
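Since the failure above only shows up with shader source that hasn't been compiled before, it may help to vary the source on every run. The following is a minimal sketch, not part of the original repro: `uniqueShaderSource` is a hypothetical helper that appends a random nonce to the comment-only shader so a previously cached result is unlikely to be reused.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// uniqueShaderSource is a hypothetical helper (not from the original repro).
// It returns a comment-only shader source containing a random 8-byte nonce,
// so each run presents source text that has not been compiled before.
func uniqueShaderSource() (string, error) {
	var nonce [8]byte
	if _, err := rand.Read(nonce[:]); err != nil {
		return "", err
	}
	return fmt.Sprintf("// Empty shader %s.", hex.EncodeToString(nonce[:])), nil
}

func main() {
	src, err := uniqueShaderSource()
	if err != nil {
		fmt.Println("nonce error:", err)
		return
	}
	fmt.Println(src)
}
```

To use it with the repro, CompileMetalShader would need to be changed to accept the source as a C string (an assumption; the version above takes no arguments) and convert it to an NSString before calling newLibraryWithSource.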

Thank you for the quick confirmation.

https://github.com/golang/go/issues/57419 is another issue that appears to be caused by go.dev/cl/451735.

Reproducer:

package main

// #cgo darwin LDFLAGS: -framework PCSC
// #include <PCSC/winscard.h>
// #include <PCSC/wintypes.h>
import "C"
import "fmt"

func main() {
	var ctx C.SCARDCONTEXT
	rc := int64(C.SCardEstablishContext(C.SCARD_SCOPE_SYSTEM, nil, nil, &ctx))
	if rc < 0 {
		// Fix overflow.
		rc += (1 << 32)
	}
	fmt.Printf("0x%08x\n", rc)
	if rc == C.SCARD_S_SUCCESS {
		C.SCardReleaseContext(ctx)
	}
}

Without CL 451735, or when built with Go 1.19 or earlier, this prints 0x00000000 (SCARD_S_SUCCESS). With Go 1.20rc1 it prints 0x8010001d (SCARD_E_NO_SERVICE).
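The "Fix overflow" step in the reproducer can also be written as a plain conversion. Here is a small sketch, assuming the status code arrives as a signed 32-bit value (which is what the `rc < 0` check in the reproducer suggests); `formatStatus` is a hypothetical helper name, not part of PCSC or the reproducer.

```go
package main

import "fmt"

// formatStatus is a hypothetical helper. It reinterprets a signed 32-bit
// status code as unsigned, which gives the same result as adding 1<<32 to a
// negative int64, and formats it as 8 hex digits.
func formatStatus(rc int32) string {
	return fmt.Sprintf("0x%08x", uint32(rc))
}

func main() {
	// 0x8010001d is the SCARD_E_NO_SERVICE value quoted above; converting it
	// through int32 mimics how it looks as a signed return value.
	u := uint32(0x8010001d)
	fmt.Println(formatStatus(int32(u))) // prints 0x8010001d
	fmt.Println(formatStatus(0))        // prints 0x00000000
}
```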

Tentatively adding release-blocker since this might be preventing some cgo programs from running on a first class port.

My app now works perfectly after reinstalling Go 1.19.4 and rebuilding.

On Dec 13, 2022, at 3:43 PM, Dmitri Shuralyov wrote:

@rsc Sure, I’ll run a bisect and see if I can spot what Go commit introduces this, thanks. (It should be easy as long as this stays reproducible for me.)

@gtownsend Thanks. In your case, does compiling the Go program with Go 1.19.4 make any difference? For me, when I tried it last, the problem would immediately go away whenever the program was built with 1.19.4.

I’ll also see if today’s macOS 13.1 release brings any new differences.

Can you bisect to when it happened?