sentry-cocoa: Higher crash rate with EXC_BAD_ACCESS since 6.2.1 -> 7.3.0 update of Sentry

Environment

How do you use Sentry? Sentry SaaS (sentry.io)

Which SDK and version? sentry-cocoa 7.3.0

Steps to Reproduce

Since this is a memory crash that only affects a fraction of our users, I can’t give clear steps to reproduce it. All I can provide is our config and the fact that we updated from 6.2.1 to 7.3.0. This is also the first release in which we have performance monitoring enabled.

options.dsn = "https://..."
options.environment = "AppStore"
options.attachStacktrace = true
options.add(inAppInclude: "InternalFramework")
options.sampleRate = NSNumber(value: 1.0)
options.tracesSampleRate = NSNumber(value: 1.0)
options.beforeSend = { event in
    // ... e.g.
    event.tags = event.tags ?? [:]
    event.tags?[".device.model"] = UIDevice.current.model
    event.tags?[".network.status"] = AFNetworkReachabilityManager.shared().currentReachableNetwork
    // ...
    var extra: [String: Any] = event.extra ?? [:]
    extra["log"] = log
    event.extra = extra
    // beforeSend must return the event (or nil to drop it)
    return event
}

Expected Result

No memory crashes

Actual Result

Excerpts from our Sentry account:

OS Version: iOS 14.7.1 (18G82)
Report Version: 104

Exception Type: EXC_BAD_ACCESS (SIGBUS)
Exception Codes: BUS_NOOP at 0x003f836300000000
Crashed Thread: 14

Application Specific Information:
ersistRepairRecord: > setState: >
Attempted to dereference garbage pointer 0x3f836300000000.

Thread 14 Crashed:
0   CoreFoundation                  0x308ae55dc         CFDictionaryGetValue
1   Foundation                      0x30b288c90         [inlined] _NSSetLongLongValueAndNotify
2   Foundation                      0x30b288c90         _NSSetLongLongValueAndNotify
3   CFNetwork                       0x309ab2358         _CFNetworkHTTPConnectionCacheSetLimit
4   Foundation                      0x30b2b2f84         __NSBLOCKOPERATION_IS_CALLING_OUT_TO_A_BLOCK__
5   Foundation                      0x30b1a0624         -[NSBlockOperation main]
6   Foundation                      0x30b2b53ac         __NSOPERATION_IS_INVOKING_MAIN__
7   Foundation                      0x30b1a02ac         -[NSOperation start]
8   Foundation                      0x30b2b5e50         __NSOPERATIONQUEUE_IS_STARTING_AN_OPERATION__
9   Foundation                      0x30b2b58dc         __NSOQSchedule_f
10  libdispatch.dylib               0x3084b1480         _dispatch_block_async_invoke2
11  libdispatch.dylib               0x3084a2818         _dispatch_client_callout
12  libdispatch.dylib               0x3084a5cf0         _dispatch_continuation_pop
13  libdispatch.dylib               0x3084a5380         _dispatch_async_redirect_invoke
14  libdispatch.dylib               0x3084b3fdc         _dispatch_root_queue_drain
15  libdispatch.dylib               0x3084b47d4         _dispatch_worker_thread2
16  libsystem_pthread.dylib         0x3a05a9764         _pthread_wqthread

OS Version: iOS 14.7.1 (18G82)
Report Version: 104

Exception Type: EXC_BAD_ACCESS (SIGBUS)
Exception Codes: BUS_NOOP at 0x003f836300000000
Crashed Thread: 18

Application Specific Information:
ingForObject: > setState: >
Attempted to dereference garbage pointer 0x3f836300000000.

Thread 18 Crashed:
0   CoreFoundation                  0x313c235dc         CFDictionaryGetValue
1   Foundation                      0x3163c6c90         [inlined] _NSSetLongLongValueAndNotify
2   Foundation                      0x3163c6c90         _NSSetLongLongValueAndNotify
3   CFNetwork                       0x314bf0358         _CFNetworkHTTPConnectionCacheSetLimit
4   Foundation                      0x3163f0f84         __NSBLOCKOPERATION_IS_CALLING_OUT_TO_A_BLOCK__
5   Foundation                      0x3162de624         -[NSBlockOperation main]
6   Foundation                      0x3163f33ac         __NSOPERATION_IS_INVOKING_MAIN__
7   Foundation                      0x3162de2ac         -[NSOperation start]
8   Foundation                      0x3163f3e50         __NSOPERATIONQUEUE_IS_STARTING_AN_OPERATION__
9   Foundation                      0x3163f38dc         __NSOQSchedule_f
10  libdispatch.dylib               0x3135ef480         _dispatch_block_async_invoke2
11  libdispatch.dylib               0x3135e0818         _dispatch_client_callout
12  libdispatch.dylib               0x3135e3cf0         _dispatch_continuation_pop
13  libdispatch.dylib               0x3135e3380         _dispatch_async_redirect_invoke
14  libdispatch.dylib               0x3135f1fdc         _dispatch_root_queue_drain
15  libdispatch.dylib               0x3135f27d4         _dispatch_worker_thread2
16  libsystem_pthread.dylib         0x3a9ead764         _pthread_wqthread

Additional Notes

I did a basic white-box code review of the differences between these two Sentry versions to get an idea of what happened. I watched out for the usual pitfalls that commonly cause memory crashes (a block that is not copied, an object that is not retained, etc.).

I found a few likely false positives: object properties declared assign. Because the getter is implemented manually, I think these should not result in any bad ARC code (right?), but I would still propose changing them to strong/retain:

/sentry-cocoa-master/Sources/Sentry/include/SentryFramesTracker.h:
   17: @property (nonatomic, assign, readonly) SentryScreenFrames *currentFrames;
/sentry-cocoa-master/Sources/Sentry/Public/PrivateSentrySDKOnly.h:
   61: @property (class, nonatomic, assign, readonly) SentryScreenFrames *currentScreenFrames;

Another case that I think is not problematic, because the compiler should choose copy automatically, but where I would again propose adding copy explicitly:

/sentry-cocoa-master/Sources/Sentry/Public/SentryOptions.h:
  191: @property (nullable, nonatomic) SentryTracesSamplerCallback tracesSampler;

I also found one instance where a block isn’t copied, which IMHO could lead to memory problems. However, it only affects the tests, so it can’t be the cause of our crashes.

/sentry-cocoa-master/Tests/SentryTests/Networking/NSURLProtocolSwizzle.h:
   11: @property (nullable, nonatomic, strong) ClassCallback registerCallback;
...
   13: @property (nullable, nonatomic, strong) ClassCallback unregisterCallback;

Any help or info would be appreciated 🙏

About this issue

  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 15 (8 by maintainers)

Most upvoted comments

Hi Philipp, sadly I don’t know how to verify this without activating it in production. The crash was intermittent and we could never reproduce it on our own devices, so I can’t give you any info right now. We will think about activating performance tracking again at some point, but presumably not within the next weeks or months.

We planned to deploy enableAutoPerformanceTracking = false, but we will reduce this to the above, which only deactivates network tracking. Thanks for your swift response on this matter! We will report back if this improves the situation.
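
A rough sketch of what the two configurations discussed here might look like. It assumes that "the above" refers to the SDK's enableNetworkTracking option, which is not named in the comment itself; the DSN is elided as in the original report.

```swift
import Foundation
import Sentry

SentrySDK.start { options in
    options.dsn = "https://..."
    options.tracesSampleRate = NSNumber(value: 1.0)

    // Narrow switch (assumed to be the option meant by "the above"):
    // performance monitoring stays on, only the automatic tracking
    // of HTTP requests is turned off.
    options.enableNetworkTracking = false

    // Broader switch that was planned originally: turns off all
    // automatic performance tracking.
    // options.enableAutoPerformanceTracking = false
}
```

The narrow switch is the smaller change: both crash reports go through setState: and CFNetwork frames, so disabling only the network tracking targets the suspected area while keeping the rest of the performance data.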