Fun with the 387

Filed GCC bug 39228:

#include <stdio.h>
#include <math.h>
int main()
{
        double a= 10.0;
        double b= 1e+308;
        printf("%d %d %d\n", isinf(a*b), __builtin_isinf(a*b), __isinf(a*b));
        return 0;
}

mtaylor@drizzle-dev:~$ gcc -o test test.c
mtaylor@drizzle-dev:~$ ./test
0 0 1
mtaylor@drizzle-dev:~$ gcc -o test test.c -std=c99
mtaylor@drizzle-dev:~$ ./test
1 0 1
mtaylor@drizzle-dev:~$ gcc -o test test.c -mfpmath=sse -march=pentium4
mtaylor@drizzle-dev:~$ ./test
1 1 1
mtaylor@drizzle-dev:~$ g++ -o test test.c
mtaylor@drizzle-dev:~$ ./test
1 0 1

Originally I found the simple isinf() case to behave differently on x86 than on x86-64, ppc32 and sparc (32 and 64).

After more research, I found that x86-64 uses SSE instructions for the floating point math (and using SSE is the only way for __builtin_isinf() to produce correct results here). The g++ build calls __isinf() instead of inlining the check and, as can be seen, the __isinf() version is always correct.

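Specifically, the discrepancy arises because the optimised 387 code does the math in double extended precision inside the FPU. 10.0*1e308 fits in 80 bits but not in 64 bits: the result, 1e309, exceeds the largest double (about 1.8e308) but is well within the extended range. Any code that forces the result to be stored to memory and loaded back gets the correct answer too, e.g.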

mtaylor@drizzle-dev:~$ cat test-simple.c

#include <stdio.h>
#include <math.h>
int main()
{
        double a= 10.0;
        double b= 1e+308;
        volatile double c= a*b;
        printf("%d\n", isinf(c));
        return 0;
}

mtaylor@drizzle-dev:~$ gcc -o test-simple test-simple.c
mtaylor@drizzle-dev:~$ ./test-simple
1

With this code you can easily see the load and store:

 8048407:       dc 0d 18 85 04 08       fmull  0x8048518
 804840d:       dd 5d f0                fstpl  -0x10(%ebp)
 8048410:       dd 45 f0                fldl   -0x10(%ebp)
 8048413:       d9 e5                   fxam

If you remove volatile, the load and store don’t happen (at least at -O3; at -O0 they haven’t been optimised away):

 8048407:       dc 0d 18 85 04 08       fmull  0x8048518
 804840d:       c7 44 24 04 10 85 04    movl   $0x8048510,0x4(%esp)
 8048414:       08
 8048415:       c7 04 24 01 00 00 00    movl   $0x1,(%esp)
 804841c:       d9 e5                   fxam

This is also a regression from 4.2.4, which just calls isinf() and doesn’t expand the 387 code inline. My guess is that the 387 optimisation was added in 4.3.

Recommended fix: store and load in the 387 version so as to operate on the same precision as elsewhere.

Now I just have to come up with a patch I like that makes Drizzle behave in the face of this (it showed up as a failure in the SQL func_math test) and then submit it to MySQL as well… as this may happen there if “correctly” built.

Ramblings

Ramblings which occasionally resemble reality. This is the blog of Stewart Smith.
