clrmd: Unable to analyze 4GB dumps of 32-bit processes
It seems ClrMD is not able to process 32-bit memory dumps larger than 2GB due to inconsistent address conversion between coreclr and ClrMD.
As @leculver and @mikem8361 already discussed in #280, using ulong types for address values may lead to unexpected behavior on particular platforms.
For your information, I am developing for Tizen wearable devices which run 32-bit (armel) .NET processes on 64-bit Linux kernels. Unlike normal 32-bit processes on other platforms, each user process has 4GB of address space.
As far as I can see in the DAC implementation of coreclr, most of the exposed address values are of type CLRDATA_ADDRESS, so we can safely assume they are sign-extended according to this comment. When the values are passed to ClrMD, however, they are (implicitly) converted to ulong (instead of long), which does not seem correct for negative CLRDATA_ADDRESS values.
For example, when the DAC calls into ClrMD using DataTargetAdapter::ReadVirtual() and the value of address is larger than 0x7FFFFFFF (let’s say 0xFFCCBBAA), the ulong value passed to DacDataTargetWrapper appears to be much larger than expected (0xFFFFFFFFFFCCBBAA).
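To make the arithmetic concrete, here is a minimal C++ sketch of the two ways a 32-bit address can be widened to 64 bits. The function names SignExtend32 and ZeroExtend32 are made up for illustration and are not taken from coreclr or ClrMD; the sketch only reproduces the numbers above.

```cpp
#include <cstdint>

// Hypothetical helper: widen a 32-bit address by copying bit 31 into the
// upper 32 bits (sign extension), as the issue describes for CLRDATA_ADDRESS.
constexpr std::uint64_t SignExtend32(std::uint32_t address) {
    return (address & 0x80000000u)
               ? (0xFFFFFFFF00000000ull | address)
               : static_cast<std::uint64_t>(address);
}

// Hypothetical helper: widen a 32-bit address by filling the upper 32 bits
// with zeros (zero extension), which is what an unsigned reader expects.
constexpr std::uint64_t ZeroExtend32(std::uint32_t address) {
    return static_cast<std::uint64_t>(address);
}

// Above 0x7FFFFFFF the two conventions disagree.
static_assert(SignExtend32(0xFFCCBBAAu) == 0xFFFFFFFFFFCCBBAAull,
              "sign extension sets the upper 32 bits");
static_assert(ZeroExtend32(0xFFCCBBAAu) == 0x00000000FFCCBBAAull,
              "zero extension leaves the upper 32 bits clear");

// At or below 0x7FFFFFFF they agree, which is why small dumps are unaffected.
static_assert(SignExtend32(0x7FFFFFFFu) == ZeroExtend32(0x7FFFFFFFu),
              "no difference when bit 31 is clear");
```

The two results differ only when bit 31 is set, so only addresses in the upper 2GB of the 32-bit address space are affected.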
Is this conversion intentional (should I assume sign-extension for ulong)? Or am I missing something?
In my experiment, my sample code (which just prints a managed stack trace) worked after I modified the implementation of the TO_CDADDR macro in the coreclr runtime. I don’t think this is the right fix, however. Changing all occurrences of ulong in the DAC interfaces of ClrMD to long also looks bad, since it would require a lot of work and additional arithmetic in the code.
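For comparison, a third option would be to keep the unsigned type and normalize addresses at the boundary between the DAC and the reader. The C++ sketch below is only an illustration under that assumption: ToTargetAddress and ToSignExtended are hypothetical names, the pointerSize parameter is assumed to be known from the dump, and this is not necessarily how ClrMD or any pull request handles the problem.

```cpp
#include <cstdint>

// Hypothetical helper: collapse a possibly sign-extended 64-bit value to the
// address a target with the given pointer size actually uses. For a 32-bit
// target this simply drops the upper 32 bits.
constexpr std::uint64_t ToTargetAddress(std::uint64_t value, unsigned pointerSize) {
    return pointerSize == 4 ? (value & 0xFFFFFFFFull) : value;
}

// Hypothetical inverse: re-apply sign extension before handing an address
// back to a component that expects the sign-extended convention.
constexpr std::uint64_t ToSignExtended(std::uint64_t address, unsigned pointerSize) {
    return (pointerSize == 4 && (address & 0x80000000ull))
               ? (address | 0xFFFFFFFF00000000ull)
               : address;
}

// A sign-extended address from a 32-bit target maps back into the 4GB range.
static_assert(ToTargetAddress(0xFFFFFFFFFFCCBBAAull, 4) == 0xFFCCBBAAull,
              "upper bits are dropped for 32-bit targets");

// 64-bit targets are left untouched.
static_assert(ToTargetAddress(0xFFFFFFFFFFCCBBAAull, 8) == 0xFFFFFFFFFFCCBBAAull,
              "64-bit addresses pass through");

// The round trip restores the original sign-extended value.
static_assert(ToSignExtended(ToTargetAddress(0xFFFFFFFFFFCCBBAAull, 4), 4)
                  == 0xFFFFFFFFFFCCBBAAull,
              "round trip is lossless for 32-bit targets");
```

The trade-off is that every entry point that receives or returns an address has to apply the conversion consistently, which is the same kind of library-wide audit that changing the types would need.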
About this issue
- State: closed
- Created 5 years ago
- Comments: 23 (23 by maintainers)
@swift-kim: Can you test whether PR #535 fixes the issue or not? I have no way to test whether my changes were correct or if I missed something.
Sorry for the delay. I am taking a look at this now and I will likely have a pull request ready (for 2.0) by tonight or tomorrow for review.
I’d like to approach this in a more methodical way to fix it for the entire library and not try to spot-fix the specific locations that make it work for this specific problem. I won’t be able to test this fully on arm64 this week, but hopefully you can try it out and let me know if it works.
You might be talking about different platforms (in case it makes a difference for sign-extension expectations).