Summary: | [llvmpipe] lp_test_arit fails on old CPUs | ||
---|---|---|---|
Product: | Mesa | Reporter: | ken moffat <zarniwhoop> |
Component: | Other | Assignee: | Roland Scheidegger <sroland> |
Status: | RESOLVED FIXED | QA Contact: | |
Severity: | normal | ||
Priority: | medium | CC: | james.cook, jfonseca, lordheavym, nikoli, vcunat |
Version: | 10.0 | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
/proc/cpuinfo from james.cook@utoronto.ca
proposed fix |
Description
ken moffat
2013-08-02 14:33:18 UTC
Which llvm version are you using?
It works for me with version 3.2 and 3.3.

(In reply to comment #1)
> Which llvm version are you using?
> It works for me with version 3.2 and 3.3.

3.3 (I'm using an r600 so I had to upgrade from 3.2 to build mesa-9.2)

Does this also happen on master?
If so do you have a cpu with sse2 but not sse3 by chance? I think there might potentially be a problem there with some tests because we set the FTZ but not the DAZ flag (though just about all cpus except some very early p4 support that flag even with only just sse2, but since trying to set it if it's not supported results in a crash we don't try).
Some of these tests use denorms as inputs and I wouldn't expect reference to really match generated code in this case (certainly those failing here all do use denorms otherwise the reference would make no sense). Though I am actually surprised to see reference giving values which look right for "ordinary" denormal handling (as FTZ would still be set) but it would depend entirely on what exactly the math library function does. In any case the failures should be pretty harmless, but I don't know what would be the best way to fix them (other than just to get rid of the denorm test cases).

(In reply to comment #3)
> Does this also happen on master?
> If so do you have a cpu with sse2 but not sse3 by chance? I think there
> might potentially be a problem there with some tests because we set the FTZ
> but not the DAZ flag (though just about all cpus except some very early p4
> support that flag even with only just sse2, but since trying to set it if
> it's not supported results in a crash we don't try).
> Some of these tests use denorms as inputs and I wouldn't expect reference to
> really match generated code in this case (certainly those failing here all
> do use denorms otherwise the reference would make no sense). Though I am
> actually surprised to see reference giving values which look right for
> "ordinary" denormal handling (as FTZ would still be set) but it would depend
> entirely on what exactly the math library function does. In any case the
> failures should be pretty harmless, but I don't know what would be the best
> way to fix them (other than just to get rid of the denorm test cases).

yes and yes.

model name : AMD Phenom(tm) II X4 965 Processor
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc rep_good nopl nonstop_tsc extd_apicid pni monitor cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt nodeid_msr hw_pstate npt lbrv svm_lock nrip_save

FWIW I've sent out a patch which should address this (http://lists.freedesktop.org/archives/mesa-dev/2013-August/042729.html) but honestly I don't think it's 9.2 worthy.

(In reply to comment #5)
> FWIW I've sent out a patch which should address this
> http://lists.freedesktop.org/archives/mesa-dev/2013-August/042729.html) but
> honestly I don't think it's 9.2 worthy.

Didn't seem to make any difference. Pasted the patch from the link (no html in it) and applied. Reran make, same results in the testsuite. Ran 'make distclean', reran configure, make, make check but still the same.

(In reply to comment #6)
> (In reply to comment #5)
> > FWIW I've sent out a patch which should address this
> > http://lists.freedesktop.org/archives/mesa-dev/2013-August/042729.html) but
> > honestly I don't think it's 9.2 worthy.
>
> Didn't seem to make any difference. Pasted the patch from the link (no html
> in it) and applied. Reran make, same results in the testsuite. Ran 'make
> distclean', reran configure, make, make check but still the same,

That's odd, daz test seemed to work here (though of course I had to hack around the ifdefs and conditions).
What does it print out if you set GALLIUM_DUMP_CPU=1 env var (with a debug build)?
Though I guess depending on math library it might in theory not work either, I suspect there's no guarantee if you use non-standard flags that the result has to be correct according to these non-standard flags. Thinking about this I suspect it would actually never work on x86-32 (on all cpus) since the math library might not use sse at all hence be unaffected by this flag.
It is really more of a test case problem though (but trying to set DAZ should still make sense).

(In reply to comment #7)
> What does it print out if you set GALLIUM_DUMP_CPU=1 env var (with a debug
> build)?

How do I get output? I've got the following in .xinitrc:

export GALLIUM_LOG_FILE=/home/ken/gallium.log
export GALLIUM_PRINT_OPTIONS=1
export GALLIUM_DUMP_CPU=1

(started with just DUMP_CPU and capturing stderr).

I've tried running the xscreensaver-demo previews (i.e. fullscreen) for GLHanoi and GLPlanet, also ran glxinfo and glxgears - but gallium.log doesn't get created.

This is for a build of master from 1st August plus your patch, with both CFLAGS and CXXFLAGS not set in the environment, so the standard CFLAGS: -g -O2 -Wall -std=c99 -Werror=implicit-function-declaration -Werror=missing-prototypes -fno-strict-aliasing -fno-builtin-memcmp and CXXFLAGS: -g -O2 -Wall -fno-strict-aliasing -fno-builtin-memcmp.

(In reply to comment #8)
> (In reply to comment #7)
>
> > What does it print out if you set GALLIUM_DUMP_CPU=1 env var (with a debug
> > build)?
>
> How do I get output ?

Just place it in the environment, i.e.
GALLIUM_DUMP_CPU=1 ./lp_test_arit
(or glxgears or whatever) should be enough.
Though actually since you're using x86_64 it should definitely set the has_daz flag in any case.

sorry, nothing. Maybe something in the way I configured it?

PATH=$PATH:/opt/llvm-33/bin/ ./configure --prefix=/usr --sysconfdir=/etc --enable-texture-float --enable-gles1 --enable-gles2 --enable-openvg --enable-osmesa --enable-xa --enable-gbm --enable-gallium-egl --enable-gallium-gbm --enable-r600-llvm-compiler --enable-glx-tls --with-egl-platforms="drm,x11" --with-gallium-drivers=r600,svga,swrast --enable-gallium-llvm --with-llvm-shared-libs --enable-gallium-tests

(In reply to comment #10)
> sorry, nothing. Maybe something in the way I configured it ?

Yes as said this will only work with a debug build. --enable-debug should do it (though I only tried with scons).

(In reply to comment #11)
> (In reply to comment #10)
> > sorry, nothing. Maybe something in the way I configured it ?
>
> Yes as said this will only work with a debug build. --enable-debug should do
> it (though I only tried with scons).

Oh. I had assumed -g was a debug build.

Fails to build:

Run 'make' to build Mesa
Making all in src
make[1]: Entering directory `/scratch/working/mesa-master-20130801/src'
Making all in gtest
make[2]: Entering directory `/scratch/working/mesa-master-20130801/src/gtest'
make[2]: Nothing to be done for `all'.
make[2]: Leaving directory `/scratch/working/mesa-master-20130801/src/gtest'
Making all in mapi
make[2]: Entering directory `/scratch/working/mesa-master-20130801/src/mapi'
Making all in glapi/gen
make[3]: Entering directory `/scratch/working/mesa-master-20130801/src/mapi/glapi/gen'
GEN ../../../../src/mapi/glapi/glprocs.h
GEN ../../../../src/mapi/glapi/glapitemp.h
GEN ../../../../src/mapi/glapi/glapi_mapi_tmp.h
GEN ../../../../src/mapi/glapi/glapitable.h
/bin/sh: line 1: 17440 Segmentation fault python2 gl_table.py -f ./gl_and_es_API.xml > ../../../../src/mapi/glapi/glapitable.h
make[3]: *** [../../../../src/mapi/glapi/glapitable.h] Error 139
make[3]: *** Waiting for unfinished jobs....
make[3]: Leaving directory `/scratch/working/mesa-master-20130801/src/mapi/glapi/gen'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/scratch/working/mesa-master-20130801/src/mapi'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/scratch/working/mesa-master-20130801/src'
make: *** [all-recursive] Error 1

(In reply to comment #12)
> Oh. I had assumed -g was a debug build.

Some debug features depend on explicitly defined DEBUG var.

>
> Fails to build:
> Run 'make' to build Mesa
> GEN ../../../../src/mapi/glapi/glapitable.h
> /bin/sh: line 1: 17440 Segmentation fault python2 gl_table.py -f

No idea why this would crash.

*** Bug 67910 has been marked as a duplicate of this bug. ***

I can reproduce the problem on my machine.

I'm using the tarball at ftp://ftp.freedesktop.org/pub/mesa/${version}/MesaLib-${version}.tar.bz2 , where version is 9.2.2, with some distribution-specific patches and configuration options (NixOS x-updates branch). If you think these might be interfering, let me know and I'll see if I can build without the changes.

Here's my output for GALLIUM_DUMP_CPU=1 ./lp_test_arit (with LD_LIBRARY_PATH set for annoying reasons): $ LD_LIBRARY_PATH=/tmp/nix-build-mesa-noglu-9.2.2.drv-0/Mesa-9.2.2/src/gallium/auxiliary/gallivm/.libs/lp_bld_init.o0000000000000000 GALLIUM_DUMP_CPU=1 ./lp_test_arit util_cpu_caps.nr_cpus = 3 util_cpu_caps.x86_cpu_type = 9 util_cpu_caps.cacheline = 64 util_cpu_caps.has_tsc = 1 util_cpu_caps.has_mmx = 1 util_cpu_caps.has_mmx2 = 1 util_cpu_caps.has_sse = 1 util_cpu_caps.has_sse2 = 1 util_cpu_caps.has_sse3 = 1 util_cpu_caps.has_ssse3 = 0 util_cpu_caps.has_sse4_1 = 0 util_cpu_caps.has_sse4_2 = 0 util_cpu_caps.has_avx = 0 util_cpu_caps.has_3dnow = 1 util_cpu_caps.has_3dnow_ext = 1 util_cpu_caps.has_altivec = 0 floor(-0): ref = -1, out = 0, precision = -0.000000 bits, FAIL ceil(0): ref = 1, out = 0, precision = -0.000000 bits, FAIL fract(-0): ref = 0.99999994, out = -0, precision = -0.000000 bits, FAIL Here's the end of the testing output, from before I ran the above command: Testing PIPE_FORMAT_B4G4R4X4_UNORM (unorm8) ... 
PASS: lp_test_format floor(-0): ref = -1, out = 0, precision = -0.000000 bits, FAIL ceil(0): ref = 1, out = 0, precision = -0.000000 bits, FAIL fract(-0): ref = 0.99999994, out = -0, precision = -0.000000 bits, FAIL FAIL: lp_test_arit PASS: lp_test_blend PASS: lp_test_conv hello, world print 5 6: 5 6 PASS: lp_test_printf ======================================================================== 1 of 5 tests failed Please report to https://bugs.freedesktop.org/enter_bug.cgi?product=Mesa ======================================================================== Here's my /proc/cpuinfo: processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 58 model name : Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz stepping : 9 microcode : 0x12 cpu MHz : 1200.000 cache size : 4096 KB physical id : 0 siblings : 4 core id : 0 cpu cores : 2 apicid : 0 initial apicid : 0 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms bogomips : 5787.10 clflush size : 64 cache_alignment : 64 address sizes : 36 bits physical, 48 bits virtual power management: processor : 1 vendor_id : GenuineIntel cpu family : 6 model : 58 model name : Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz stepping : 9 microcode : 0x12 cpu MHz : 1200.000 cache size : 4096 KB physical id : 0 siblings : 4 core id : 0 cpu cores : 2 apicid : 1 initial apicid : 1 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx 
rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms bogomips : 5786.57 clflush size : 64 cache_alignment : 64 address sizes : 36 bits physical, 48 bits virtual power management: processor : 2 vendor_id : GenuineIntel cpu family : 6 model : 58 model name : Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz stepping : 9 microcode : 0x12 cpu MHz : 1200.000 cache size : 4096 KB physical id : 0 siblings : 4 core id : 1 cpu cores : 2 apicid : 2 initial apicid : 2 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms bogomips : 5786.58 clflush size : 64 cache_alignment : 64 address sizes : 36 bits physical, 48 bits virtual power management: processor : 3 vendor_id : GenuineIntel cpu family : 6 model : 58 model name : Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz stepping : 9 microcode : 0x12 cpu MHz : 1200.000 cache size : 4096 KB physical id : 0 siblings : 4 core id : 1 cpu cores : 2 apicid : 3 initial apicid : 3 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 
monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms bogomips : 5786.58 clflush size : 64 cache_alignment : 64 address sizes : 36 bits physical, 48 bits virtual power management:

Please let me know how else I can help.

As for the patches, IMHO the only one that could interfere is the one adding --enable-shared-gallium, which was taken from Ubuntu (I think).

(In reply to comment #15)
> I can reproduce the problem on my machine.
>
> I'm using the tarball at
> ftp://ftp.freedesktop.org/pub/mesa/${version}/MesaLib-${version}.tar.bz2 ,
> where version is 9.2.2, with some distribution-specific patches and
> configuration options (NixOS x-updates branch). If you think these might be
> interfering, let me know and I'll see if I can build without the changes.
>
>
> Here's my output for GALLIUM_DUMP_CPU=1 ./lp_test_arit (with LD_LIBRARY_PATH
> set for annoying reasons):
>
> $
> LD_LIBRARY_PATH=/tmp/nix-build-mesa-noglu-9.2.2.drv-0/Mesa-9.2.2/src/gallium/auxiliary/gallivm/.libs/lp_bld_init.o0000000000000000
> GALLIUM_DUMP_CPU=1 ./lp_test_arit
> util_cpu_caps.nr_cpus = 3
> util_cpu_caps.x86_cpu_type = 9
> util_cpu_caps.cacheline = 64
> util_cpu_caps.has_tsc = 1
> util_cpu_caps.has_mmx = 1
> util_cpu_caps.has_mmx2 = 1
> util_cpu_caps.has_sse = 1
> util_cpu_caps.has_sse2 = 1
> util_cpu_caps.has_sse3 = 1
> util_cpu_caps.has_ssse3 = 0
> util_cpu_caps.has_sse4_1 = 0
> util_cpu_caps.has_sse4_2 = 0
> util_cpu_caps.has_avx = 0
> util_cpu_caps.has_3dnow = 1
> util_cpu_caps.has_3dnow_ext = 1
> util_cpu_caps.has_altivec = 0
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 6
> model : 58
> model name : Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
> stepping : 9
> microcode : 0x12
> cpu MHz : 1200.000
> cache size : 4096 KB
> physical id : 0
> siblings : 4
> core id : 0
> cpu cores : 2
> apicid : 0
> initial apicid : 0
> fpu : yes
> fpu_exception : yes
> cpuid level : 13
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
> pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm
> constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc
> aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16
> xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx
> f16c rdrand lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi
> flexpriority ept vpid fsgsbase smep erms

Hmm the util_cpu_caps are totally busted, I wonder what's up with that...

Created attachment 90563
/proc/cpuinfo from james.cook@utoronto.ca

Whoops, I must have sent cpuinfo from the laptop I sent the e-mail from and then forgot about this thread. I've attached /proc/cpuinfo from the computer I ran the test on. Note, I might be running a different kernel version now compared to before; not sure whether that affects the contents of /proc/cpuinfo.

Still failing on mesa-10.0.4 + llvm-3.4.

Still failing on mesa-10.1.3 + llvm-3.4, though my error is slightly different:

FAIL: lp_test_arit
==================
rcp(5.8799997e-39): ref = 1.70068035e+38, out = inf, precision = -inf bits, FAIL
rsqrt(5.8799997e-39): ref = 1.30410138e+19, out = inf, precision = -inf bits, FAIL
floor(-1.40129846e-45): ref = -1, out = -0, precision = -0.000000 bits, FAIL
ceil(1.40129846e-45): ref = 1, out = 0, precision = -0.000000 bits, FAIL
fract(1.40129846e-45): ref = 1.40129846e-45, out = 0, precision = -0.000000 bits, FAIL
fract(-1.40129846e-45): ref = 0.99999994, out = 0, precision = -0.000000 bits, FAIL
fract(5.8799997e-39): ref = 5.8799997e-39, out = 0, precision = -0.

In one of my systems mesa-10.2.4 fails this test too:

# cat ./work/Mesa-10.2.4-abi_x86_64.amd64/src/gallium/drivers/llvmpipe/lp_test_arit.log
floor(-0): ref = -1, out = 0, precision = -0.000000 bits, FAIL
ceil(0): ref = 1, out = 0, precision = -0.000000 bits, FAIL
fract(-0): ref = 0.99999994, out = -0, precision = -0.000000 bits, FAIL

I have hardened Gentoo Linux amd64 stable, llvm-3.4.2, kernel 3.14.4-hardened-r1, AMD A4-3300M APU

Still failing in Mesa-10.3.0 + llvm-3.4 :-(

... and llvm-3.5 didn't help either.

Created attachment 106754
proposed fix

Could you try this fix?
Note the error is actually with the _reference_, not the actual driver code (this is because we want no denormals and are switching them off in the driver), so llvm updates aren't going to do anything. I guess the reason not everyone gets exactly the same error is just what the math libraries / compilers are doing (they expect "ordinary" denormal handling, hence they are not required to honor our differently set cpu flags and the results are therefore kinda undefined).
In any case, this is really more of a cosmetic error.

(In reply to Roland Scheidegger from comment #25)
> Could you try this fix?

It's straightforward to repro the bug on any modern CPU with https://software.intel.com/en-us/articles/intel-software-development-emulator :

$ GALLIUM_DUMP_CPU=1 /var/lib/hudson/tools/lin64/sde/sde64 -p4p -- build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_arit
util_cpu_caps.nr_cpus = 8
util_cpu_caps.x86_cpu_type = 8
util_cpu_caps.cacheline = 64
util_cpu_caps.has_tsc = 1
util_cpu_caps.has_mmx = 1
util_cpu_caps.has_mmx2 = 1
util_cpu_caps.has_sse = 1
util_cpu_caps.has_sse2 = 1
util_cpu_caps.has_sse3 = 1
util_cpu_caps.has_ssse3 = 0
util_cpu_caps.has_sse4_1 = 0
util_cpu_caps.has_sse4_2 = 0
util_cpu_caps.has_avx = 0
util_cpu_caps.has_avx2 = 0
util_cpu_caps.has_f16c = 0
util_cpu_caps.has_popcnt = 0
util_cpu_caps.has_3dnow = 0
util_cpu_caps.has_3dnow_ext = 0
util_cpu_caps.has_xop = 0
util_cpu_caps.has_altivec = 0
util_cpu_caps.has_daz = 1
floor(-0): ref = -1, out = 0, precision = -0.000000 bits, FAIL
ceil(0): ref = 1, out = 0, precision = -0.000000 bits, FAIL
fract(-0): ref = 0.99999994, out = -0, precision = -0.000000 bits, FAIL

And I've verified that Roland's patch fixes it:

$ GALLIUM_DUMP_CPU=1 /var/lib/hudson/tools/lin64/sde/sde64 -p4p -- build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_arit
util_cpu_caps.nr_cpus = 8
util_cpu_caps.x86_cpu_type = 8
util_cpu_caps.cacheline = 64
util_cpu_caps.has_tsc = 1
util_cpu_caps.has_mmx = 1
util_cpu_caps.has_mmx2 = 1
util_cpu_caps.has_sse = 1
util_cpu_caps.has_sse2 = 1
util_cpu_caps.has_sse3 = 1
util_cpu_caps.has_ssse3 = 0
util_cpu_caps.has_sse4_1 = 0
util_cpu_caps.has_sse4_2 = 0
util_cpu_caps.has_avx = 0
util_cpu_caps.has_avx2 = 0
util_cpu_caps.has_f16c = 0
util_cpu_caps.has_popcnt = 0
util_cpu_caps.has_3dnow = 0
util_cpu_caps.has_3dnow_ext = 0
util_cpu_caps.has_xop = 0
util_cpu_caps.has_altivec = 0
util_cpu_caps.has_daz = 1
$

Roland, I just have a few suggestions for the patch:
- let's move the FTZ/DAZ code into a helper
- we should call the helper also on the results
- we should leave the sign bit alone, i.e., `val.ui &= 0xff800000` -> `val.ui &= 0x7f800000`.

(In reply to José Fonseca from comment #26)
> (In reply to Roland Scheidegger from comment #25)
> > Could you try this fix?
>
> It's straightforward to repro the bug on any modern CPU with
> https://software.intel.com/en-us/articles/intel-software-development-emulator :
>
> $ GALLIUM_DUMP_CPU=1 /var/lib/hudson/tools/lin64/sde/sde64 -p4p --
> build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_arit

Ah, I didn't think about using it to simulate environments that lack features you have; I had only considered the other way around...

> Roland, I just have a few suggestions for the patch:
> - let's move the FTZ/DAZ code into a helper
> - we should call the helper also on the results

Makes sense I guess. Though I'd suspect since there's tolerance for the results it shouldn't matter.

> - we should leave the sign bit alone, i.e., `val.ui &= 0xff800000` -> `val.ui
> &= 0x7f800000`.

Hmm the code as is does leave the sign bit alone.
I'll send out a patch...

(In reply to Roland Scheidegger from comment #27)
> > - we should leave the sign bit alone, i.e., `val.ui &= 0xff800000` -> `val.ui
> > &= 0x7f800000`.
> Hmm the code as is does leave the sign bit alone.

You're quite right! I was thinking backwards.

> I'll send out a patch...

Thanks.

Fixed by 8148a06b8fdb734f7f9a11ce787ee6505939fdaa. |
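
For readers following the mask discussion above, here is a minimal sketch in C of flushing a denormal input to zero on the reference side while preserving its sign. It assumes IEEE-754 single precision and a 32-bit `float`; the names `flush_denorm_to_zero`, `float_bits` and `float_from_bits` are hypothetical and are not taken from the committed Mesa change.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its raw 32-bit pattern (hypothetical helper). */
static uint32_t
float_bits(float val)
{
   uint32_t bits;
   memcpy(&bits, &val, sizeof bits);
   return bits;
}

/* Build a float from a raw 32-bit pattern (hypothetical helper). */
static float
float_from_bits(uint32_t bits)
{
   float val;
   memcpy(&val, &bits, sizeof val);
   return val;
}

/*
 * Flush a denormal to a zero of the same sign, mimicking what the
 * FTZ/DAZ cpu flags do to the generated code. Illustrative sketch only.
 */
static float
flush_denorm_to_zero(float val)
{
   uint32_t bits = float_bits(val);

   /* 0x7f800000 selects the exponent field; an all-zero exponent
    * means the value is either zero or a denormal. */
   if ((bits & 0x7f800000u) == 0) {
      /* 0xff800000 keeps sign and exponent and clears the mantissa,
       * so the sign bit survives and -denormal becomes -0. */
      bits &= 0xff800000u;
   }
   return float_from_bits(bits);
}
```

With this mask a negative denormal such as the bit pattern `0x80000001` flushes to `-0` (`0x80000000`). Masking with `0x7f800000` instead would also clear the sign bit and turn it into `+0`, which is why the thread concludes that `0xff800000` is the variant that leaves the sign alone.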