Bug 95419

Summary: [HSW][regression][bisect] RPG Maker game gives "invalid floating point operation" at startup
Product: Mesa Reporter: dnord <pinguin255>
Component: Drivers/DRI/i965Assignee: Ian Romanick <idr>
Status: RESOLVED FIXED QA Contact: Intel 3D Bugs Mailing List <intel-3d-bugs>
Severity: normal    
Priority: medium CC: adi, joanbe, mattst88
Version: 11.2   
Hardware: x86-64 (AMD64)   
OS: All   
Whiteboard:
i915 platform: i915 features:
Bug Depends on:    
Bug Blocks: 96253    
Attachments: patch
patch2
patch3
patch4
orbment crash log

Description dnord 2016-05-16 08:05:25 UTC
Japanese 2D RPG Maker game called Yume Nikki (under wine) gives "invalid floating point operation" message at startup. Bisecting this discovers regression in DRI i965 driver (software rendering and non-intel videocards are working fine), introduced by https://cgit.freedesktop.org/mesa/mesa/commit/?id=af5ca43f2676bff7499f93277f908b681cb821d0
11.1.3 is unaffected, 11.2.0 is affected and latest 11.2.2 is still affected.

Distro: Ubuntu 16.04
Kernel: 4.6
Hardware: Thinkpad T440p
CPU/Video: i5-4300M (Haswell, HD 4600)
Mesa: 11.2.0 (and 11.2.2)

This game is freeware and can be found at http://archive.uboachan.net/media/src/Yume_Nikki.rar to reproduce the bug.
Comment 1 dnord 2016-05-16 08:07:45 UTC
>gives "invalid floating point operation" message at startup
And just closes after that.
Comment 2 Matt Turner 2016-05-16 17:58:11 UTC
Thank you very much for the bug report and the bisection -- that's very helpful and makes our lives much easier :)

The only floating-point operations added in that patch are

+      unsigned hs_additional = (unsigned)
+         round(hs_wants * (((double) remaining_space) / total_wants));

so I suspect we're dividing by 0.

I'll attach a patch for you to test.
Comment 3 Matt Turner 2016-05-16 18:18:14 UTC
Created attachment 123793 [details] [review]
patch

Please give this a try.
Comment 4 dnord 2016-05-16 21:06:00 UTC
Created attachment 123797 [details] [review]
patch2

Sorry, your patch did not helped, but your suggestion is right: it's division by zero, when total_wants = 0. This code uses unnecessary math tricks for trivial things for some reason. If one does the math it can be easily seen, that (remaining_space/total_wants) ratio always remains the same, so this error-prone hacky code can be replaced by equivalent clear one, which is not prone to division-by-zero errors. Patch attached (works for me). Please review.
Comment 5 Matt Turner 2016-05-16 21:26:21 UTC
(In reply to dnord from comment #4)
> Created attachment 123797 [details] [review] [review]
> patch2
> 
> Sorry, your patch did not helped, but your suggestion is right: it's
> division by zero, when total_wants = 0. This code uses unnecessary math
> tricks for trivial things for some reason. If one does the math it can be
> easily seen, that (remaining_space/total_wants) ratio always remains the
> same, so this error-prone hacky code can be replaced by equivalent clear
> one, which is not prone to division-by-zero errors. Patch attached (works
> for me). Please review.

Thank you. That looks good.

If I might make two suggestions, which you're welcome to do as a follow on patch: (1) replace the (double) cast with (float) (commit af5ca43 added new code using doubles, which was a rebasing mistake on top of commit c1da1570), and (2) use lroundf() instead of round().

Can you please send the patch to mesa-dev@lists.freedesktop.org using git send-email? Also, please make sure git is configured with your name and email.

Thanks!
Comment 6 dnord 2016-05-16 23:28:55 UTC
>Can you please send the patch to mesa-dev@lists.freedesktop.org using git send-email?
Done.
Comment 7 Matt Turner 2016-05-18 18:07:47 UTC
Pushed as

commit 2a8aa1e3deb99a1ae16d942318da648c1327ece5
Author: Ardinartsev Nikita <ardinar23@gmail.com>
Date:   Tue May 17 02:27:22 2016 +0300

    i965/urb: fixes division by zero

Thanks!
Comment 8 Matt Turner 2016-05-18 19:48:34 UTC
Sorry, I had to revert this. It broke various piglit tests.

The problem is that total_wants and remaining_space are updated between each calculation, making the transformation your patch made incorrect.
Comment 9 Matt Turner 2016-05-18 19:53:46 UTC
If you want to investigate, some of the tests broken are:

IVB GT1:

tests/spec/arb_tessellation_shader/execution/variable-indexing/tcs-output-array-vec2-index-wr.shader_test -auto

HSW GT1:

tests/spec/glsl-1.50/execution/varying-struct-basic-vs-gs.shader_test -auto

HSW GT2:

bin/glsl-1.50-geometry-end-primitive 33 -auto -fbo

HSW GT3:

tests/spec/arb_tessellation_shader/execution/variable-indexing/vs-output-array-vec4-index-wr-before-tcs.shader_test

BDW GT2:

tests/spec/arb_tessellation_shader/execution/variable-indexing/tcs-output-array-vec2-index-rd-after-barrier.shader_test -auto


They all trigger the assertion immediately below, so they should be easy to reproduce

   /* Sanity check that we haven't over-allocated. */
   assert(push_constant_chunks +
          vs_chunks + hs_chunks + ds_chunks + gs_chunks <= urb_chunks);

You can also use INTEL_DEVID_OVERRIDE=... with a PCI ID of one of these chips (found in include/pci_ids/i965_pci_ids.h) to exercise the code paths for a chip you don't have.
Comment 10 dnord 2016-05-18 20:00:05 UTC
(In reply to Matt Turner from comment #8)
> Sorry, I had to revert this. It broke various piglit tests.
> 
> The problem is that total_wants and remaining_space are updated between each
> calculation, making the transformation your patch made incorrect.
I know they are updated, but I don't understand why it's incorrect. I mean look at this:
>remaining_space -= vs_additional;
>total_wants -= vs_wants;
Because vs_additional/vs_wants = remaining_space/total_wants, remaining_space/total_wants should remain the same after update and code should be equivalent, up to rounding errors. Maybe it's just rounding up error and after replace "round" with "floor" it will work? (can you test this?) Anyway I try to look at this more closely.
Comment 11 Kenneth Graunke 2016-05-18 21:35:37 UTC
Right, I suspect it's just a rounding error.  Other portions of the code solve that problem by allocating rounded blocks for all but one stage, and just allocating the remaining space to that final stage.  floor might also be a reasonable plan.
Comment 12 dnord 2016-05-18 23:37:37 UTC
I don't understand the whole picture, but how often actually it's not enough space (remaining_space < total_wants)? If it's usually enough space to allocate, isn't it reasonable to distinguish (remaining_space >= total_wants) as a separate case, which is not uses any floating-point arithmetic and just allocates what was asked and for second case use original code, but with two explicit checks added for total_wants==0?
Or use solution with "floor" for second case only, as it is simpler. Because solution with "floor" may not allocate all requested space even in trivial case (remaining_space >= total_wants). I don't know, is it acceptable to use "floor" and allocate slightly less space than possible in second case (when remaining_space < total_wants).
Comment 13 dnord 2016-05-19 00:16:30 UTC
>Because solution with "floor" may not allocate all requested space even in trivial case (remaining_space >= total_wants).
My mistake, I thought about more complicated rounding problems. Floor will work fine when it's enough space.
Comment 14 dnord 2016-05-19 21:35:43 UTC
Created attachment 123926 [details] [review]
patch3

Unfortunately I had not been able to reproduce any broken piglet tests, which were mentioned, introduced by this patch (obviously I tested only on my hardware, but piglet results were equal). Anyway I propose two patches, which solve this bug. First is not using any floating point arithmetic, just integer division, which is equal to rounding down and therefore should not allocate more chunks than available. Second is straightforward obvious fix to the original code - checking for null before two divisions.
Comment 15 dnord 2016-05-19 21:36:14 UTC
Created attachment 123927 [details] [review]
patch4
Comment 16 Matt Turner 2016-05-31 17:00:46 UTC
*** Bug 96289 has been marked as a duplicate of this bug. ***
Comment 17 Armaël Guéneau 2016-06-06 09:56:56 UTC
Created attachment 124355 [details]
orbment crash log

FYI, I encountered this same bug while trying to run orbment (https://github.com/Cloudef/orbment), with it crashing at startup raising SIGFPE. I attached the corresponding log.

I tested both patches proposed by dnord (patch3 and patch4), and both seem to solve the issue for me. Thanks! Hopefully one of them gets merged at some point :).
Comment 18 Adrian 2016-06-18 12:41:13 UTC
I can also confirm patch4 fixes the SIGFPE issue. I ran the pigtail tests and they passed with both patches (I also couldn't reproduce the test failure btw).
Comment 19 Matt Turner 2016-06-23 17:13:24 UTC
(In reply to dnord from comment #14)
> Created attachment 123926 [details] [review] [review]
> patch3
> 
> Unfortunately I had not been able to reproduce any broken piglet tests,
> which were mentioned, introduced by this patch (obviously I tested only on
> my hardware, but piglet results were equal). Anyway I propose two patches,
> which solve this bug. First is not using any floating point arithmetic, just
> integer division, which is equal to rounding down and therefore should not
> allocate more chunks than available. Second is straightforward obvious fix
> to the original code - checking for null before two divisions.

Thank you. I've tested and committed this version of the patch.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.