runtime: JIT\Regression\CLR-x86-JIT\V1-M11-Beta1\b48990\b48990a regression

This regression is caused by dotnet/coreclr#17996.

If we have IL like this:

    [ 0]  17 (0x011) ldloc.0
    [ 1]  18 (0x012) ldc.r4 2.0000000000000000
    [ 2]  23 (0x017) div
    [ 1]  24 (0x018) conv.r4 <- cast from double to float.
    [ 1]  25 (0x019) ldc.r4 0.00000000000000000
    [ 2]  30 (0x01e) ceq
    [ 1]  32 (0x020) stloc.3

then before the change we had:

               [000017] ------------              *  STMT      void  (IL 0x011...  ???)
               [000013] ------------              |     /--*  CNS_DBL   float  0.00000000000000000
               [000014] ------------              |  /--*  EQ        int
               [000012] ------------              |  |  \--*  CAST      float <- float
               [000010] ------------              |  |     |  /--*  CNS_DBL   float  2.0000000000000000
               [000011] ------------              |  |     \--*  DIV       float
               [000009] ------------              |  |        \--*  LCL_VAR   float  V00 loc0
               [000016] -A----------              \--*  ASG       int
               [000015] D------N----                 \--*  LCL_VAR   int    V03 loc3

but now we do not have this cast [000012].

This cast protected us on x86 with the x87 floating-point stack. The next tree in this test is:

    [ 0]  33 (0x021) ldloc.3
    [ 1]  34 (0x022) brtrue.s

               [000022] ------------              *  STMT      void  (IL 0x021...  ???)
               [000021] ------------              \--*  JTRUE     void
               [000019] ------------                 |  /--*  CNS_INT   int    0
               [000020] ------------                 \--*  NE        int
               [000018] ------------                    \--*  LCL_VAR   int    V03 loc3

When we fold the fp operator with constant nodes into an fp constant for [000011] in https://github.com/dotnet/coreclr/blob/f69b623c9f41c5d358d52f40e5cdad808b005bed/src/jit/gentree.cpp#L14136-L14145, the C++ compiler generates the following on x87:

0F407222  fld         dword ptr [f1]  
0F407225  fdiv        dword ptr [f2] <- leaves the quotient in ST(0); it is not narrowed to float.
0F407228  fstp        qword ptr [d1] <- stores the result as a qword (double).

That leaves d1 = 7.0064923216240854e-46, so the comparison of 7.0064923216240854e-46 with 0 ([000014] EQ int) folds to false, and we propagate that result as LCL_VAR int V03 loc3.

On x86 with SIMD, the C++ compiler generates:

00007FFA95D2A819  movss       xmm0,dword ptr [f1]  
00007FFA95D2A81F  divss       xmm0,dword ptr [f2]  <- stores the single-precision floating-point result in the destination operand.
00007FFA95D2A828  cvtss2sd    xmm0,xmm0  
00007FFA95D2A82C  movsd       mmword ptr [d1],xmm0 

That sets d1=0 and works fine.

So on x87 the test returns 1 (because everything is collapsed into the false branch); on other platforms it returns 100.
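
The difference between the two listings can be reproduced in portable C++. The sketch below is not the JIT's folding code: `fold_div_wide`, `fold_div_rounded`, `ceq_zero`, and `assumed_loc0` are hypothetical names, and the input is an assumption. The issue does not say what loc0 holds, but 7.0064923216240854e-46 is exactly half of the smallest positive subnormal float, so that value is used here.

```cpp
#include <cassert>
#include <limits>

// Assumption: loc0 holds the smallest positive subnormal float (2^-149).
// The issue does not state the value; this one is consistent with d1.
inline float assumed_loc0()
{
    return std::numeric_limits<float>::denorm_min();
}

// Mimics the x87 listing: the quotient is kept wider than float and
// stored directly as a double, with no narrowing step in between.
inline double fold_div_wide(float f1, float f2)
{
    return static_cast<double>(f1) / static_cast<double>(f2);
}

// Mimics the SSE listing (or an explicit conv.r4): the quotient is
// rounded to float first and only then widened to double.
inline double fold_div_rounded(float f1, float f2)
{
    // volatile forces an actual 32-bit store, so the narrowing cannot be
    // skipped even on targets that keep excess precision in registers.
    volatile float q =
        static_cast<float>(static_cast<double>(f1) / static_cast<double>(f2));
    return static_cast<double>(q);
}

// The "ceq" against 0 from the IL: 1 if equal, 0 otherwise.
inline int ceq_zero(double folded)
{
    return folded == 0.0 ? 1 : 0;
}

// Self-check showing how the two folds disagree for the assumed input.
inline void check_fold_example()
{
    const float f1 = assumed_loc0();
    const float f2 = 2.0f;
    assert(ceq_zero(fold_div_wide(f1, f2)) == 0);    // x87-style: nonzero
    assert(ceq_zero(fold_div_rounded(f1, f2)) == 1); // rounded: exactly zero
}
```

With the narrowing step the quotient rounds to zero and the comparison yields 1; without it the quotient stays nonzero and the comparison yields 0, which matches the divergence described above.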

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 16 (16 by maintainers)

Most upvoted comments

FWIW it’s possible to get even the latest VC++ x86 compiler to do math in x87 if it’s optimizing and trying to accommodate the unfortunate ABI:

>type t.cpp
double divdff(float a, float b)
{
    return a/b;
}


>cl -c -O2 t.cpp -FAs
Microsoft (R) C/C++ Optimizing Compiler Version 19.16.26830 for x86
Copyright (C) Microsoft Corporation.  All rights reserved.

t.cpp

>dumpbin /disasm t.obj
Microsoft (R) COFF/PE Dumper Version 14.16.26830.0
Copyright (C) Microsoft Corporation.  All rights reserved.


Dump of file t.obj

File Type: COFF OBJECT

?divdff@@YANMM@Z (double __cdecl divdff(float,float)):
  00000000: D9 44 24 04        fld         dword ptr [esp+4]
  00000004: D8 74 24 08        fdiv        dword ptr [esp+8]
  00000008: C3                 ret

It may not be relevant to this discussion. If not, FYI.
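
For comparison, one common way to ask for a float-rounded quotient in source code is to store it to a `volatile float` before widening. This is only a sketch: `divdff_rounded` is a hypothetical name, and the code generated for it by the compiler version shown above has not been checked here.

```cpp
#include <cassert>
#include <limits>

// Same shape as divdff in t.cpp, except that the quotient is written to
// a volatile float first. The store has to produce a 32-bit value, so
// the result is narrowed to float before it is widened to double.
inline double divdff_rounded(float a, float b)
{
    volatile float q = a / b; // forces a single-precision store
    return q;                 // widen the rounded value to double
}

// Self-check with an assumed input (smallest positive subnormal float):
// the float-rounded quotient is exactly zero.
inline void check_divdff_rounded()
{
    const float tiny = std::numeric_limits<float>::denorm_min();
    assert(divdff_rounded(tiny, 2.0f) == 0.0);
    assert(divdff_rounded(1.0f, 4.0f) == 0.25);
}
```

The cost is an extra store and load, which is the kind of narrowing step the removed CAST node provided in the JIT tree.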