runtime: Tail Call bug in RyuJIT - Incorrect parameters passed
The following critical bug is known and fixed internally (not released) by Microsoft (per private discussions while security was evaluated). In this bug “we” refers to Marc Gravell (@mgravell) and I who tracked down the issue on the Stack Overflow side on the fence. I am posting it here for public visibility and to help users hitting the same issue. I have also created a blog post addressing discovery and severity here: Why you should wait on upgrading to .Net 4.6.
There is an issue in the .Net 4.6 RTM RyuJIT implementation that incorrectly hooks up parameters during some tail call optimizations. The net result of this is that tail call methods can get the wrong parameters passed. Here’s our code exhibiting the issue:
void Set<T>(string key, T val, int? durationSecs, bool sliding, bool broadcastRemoveFromCache = false)
{
SetWithPriority<T>(key, val, durationSecs, sliding, CacheItemPriority.Default);
}
void SetWithPriority<T>(string key, T val, int? durationSecs, bool isSliding, CacheItemPriority priority)
{
key = KeyInContext(key);
RawSet(key, val, durationSecs, isSliding, priority);
}
void RawSet(string cacheKey, object val, int? durationSecs, bool isSliding, CacheItemPriority priority)
{
var absolute = !isSliding && durationSecs.HasValue
? DateTime.UtcNow.AddSeconds(durationSecs.Value)
: Cache.NoAbsoluteExpiration;
var sliding = isSliding && durationSecs.HasValue
? TimeSpan.FromSeconds(durationSecs.Value)
: Cache.NoSlidingExpiration;
HttpRuntime.Cache.Insert(cacheKey, val, null, absolute, sliding, priority, null);
}
The RawSet()
is the last in the chain, and only when optimizations are enabled, is incorrectly optimized. If for example we call Set<T>
with a durationSecs
of 3600, we would expect that value to go all the way down the method chain. This doesn’t happen. Instead, the value passed to the tail method (RawSet
) is seemingly random (our assumption is it pulls some other value from the stack). For example, it may be 30, or 57, or null
.
The net result for us is that items are getting cached for drastically shorter durations, or not at all (null
often gets passed). On Stack Overflow this causes unpredictable local HTTP cache usage and more hits and heavier load to anything populating it. This is a production blocker on deployment for us, and (we believe) should be for anyone else.
We have collapsed the reproduction of this bug into a project you can run locally, available on GitHub. Here’s an example test run (see the repo README for a full description):
I can’t stress how serious of a bug this is, due to the subtleness of the occurrence, the “only production” likelihood given the RELEASE-only nature, and the scary examples we can easily come up with. When the parameters you’re passing aren’t the ones the method is getting, all sanity goes out the window.
What if your method says how much stock to buy? What if it gives dosing information to a patient? What if it tells a plane what altitude to climb to?
This bug is critical, I am posting the issue for several reasons:
- We’re not jerks
- To raise awareness
- To prevent as many people as possible from deploying affected code
- To usher a fix and deployment from Microsoft as soon as possible
If you have already deployed .Net 4.6, our recommendation at this point is to disable RyuJIT immediately.
About this issue
- Original URL
- State: closed
- Created 9 years ago
- Comments: 63 (37 by maintainers)
It will not break process isolation as Ayende said, because reading from another process like that is not allowed in Windows. However, applications in the same application pool can affect each other since an app domain in an application pool cannot provide isolation for this, I think.
There is no way for the process to be just randomly reading some other process memory. You need ReadMemory() calls to do that, and those don’t just happen, even with this issue.
*Hibernating Rhinos Ltd *
Oren Eini* l CEO l *Mobile: + 972-52-548-6969
Office: +972-4-622-7811 *l *Fax: +972-153-4-622-7811
On Tue, Jul 28, 2015 at 4:00 PM, Tim Rayburn notifications@github.com wrote:
For those who have been deep into the internals for this bug, can anyone confirm that the random memory address that is being read to get the “incorrect” value is 100% Absolutely Without A Doubt coming from within the memory space of the .NET Process? If not, as horrible as this is already, this becomes much worse as an angle of attack to read and/or write random memory locations on boxes running .NET Framework 4.6.
Not being limited to the memory space of the process means this could introduce instability into systems running 4.6, and break the process isolation of most shared hosting environments.
I have found some troubles with asm code. Part 1.
The call of
OnLogDuration
fromSetWithPriority
:We can see that
durationSecs
stores at the address[rsp+70h]
. So far, everything looks fine.Part 2.
The call of
OnLogDuration
fromRawSet
:We can see that RyuJIT tries to read
durationSecs
from[rsp+0C8h]
. But:As we can see, the calculated
durationSecs
offset is0C8h
, but the actual offset is0D0h
. In this case, the TCO calculation error is 8 bytes.