Japanese 2D RPG Maker game called Yume Nikki (under wine) gives "invalid floating point operation" message at startup. Bisecting this discovers regression in DRI i965 driver (software rendering and non-intel videocards are working fine), introduced by https://cgit.freedesktop.org/mesa/mesa/commit/?id=af5ca43f2676bff7499f93277f908b681cb821d0 11.1.3 is unaffected, 11.2.0 is affected and latest 11.2.2 is still affected. Distro: Ubuntu 16.04 Kernel: 4.6 Hardware: Thinkpad T440p CPU/Video: i5-4300M (Haswell, HD 4600) Mesa: 11.2.0 (and 11.2.2) This game is freeware and can be found at http://archive.uboachan.net/media/src/Yume_Nikki.rar to reproduce the bug.
>gives "invalid floating point operation" message at startup And just closes after that.
Thank you very much for the bug report and the bisection -- that's very helpful and makes our lives much easier :) The only floating-point operations added in that patch are + unsigned hs_additional = (unsigned) + round(hs_wants * (((double) remaining_space) / total_wants)); so I suspect we're dividing by 0. I'll attach a patch for you to test.
Created attachment 123793 [details] [review] patch Please give this a try.
Created attachment 123797 [details] [review] patch2 Sorry, your patch did not helped, but your suggestion is right: it's division by zero, when total_wants = 0. This code uses unnecessary math tricks for trivial things for some reason. If one does the math it can be easily seen, that (remaining_space/total_wants) ratio always remains the same, so this error-prone hacky code can be replaced by equivalent clear one, which is not prone to division-by-zero errors. Patch attached (works for me). Please review.
(In reply to dnord from comment #4) > Created attachment 123797 [details] [review] [review] > patch2 > > Sorry, your patch did not helped, but your suggestion is right: it's > division by zero, when total_wants = 0. This code uses unnecessary math > tricks for trivial things for some reason. If one does the math it can be > easily seen, that (remaining_space/total_wants) ratio always remains the > same, so this error-prone hacky code can be replaced by equivalent clear > one, which is not prone to division-by-zero errors. Patch attached (works > for me). Please review. Thank you. That looks good. If I might make two suggestions, which you're welcome to do as a follow on patch: (1) replace the (double) cast with (float) (commit af5ca43 added new code using doubles, which was a rebasing mistake on top of commit c1da1570), and (2) use lroundf() instead of round(). Can you please send the patch to mesa-dev@lists.freedesktop.org using git send-email? Also, please make sure git is configured with your name and email. Thanks!
>Can you please send the patch to mesa-dev@lists.freedesktop.org using git send-email? Done.
Pushed as commit 2a8aa1e3deb99a1ae16d942318da648c1327ece5 Author: Ardinartsev Nikita <ardinar23@gmail.com> Date: Tue May 17 02:27:22 2016 +0300 i965/urb: fixes division by zero Thanks!
Sorry, I had to revert this. It broke various piglit tests. The problem is that total_wants and remaining_space are updated between each calculation, making the transformation your patch made incorrect.
If you want to investigate, some of the tests broken are: IVB GT1: tests/spec/arb_tessellation_shader/execution/variable-indexing/tcs-output-array-vec2-index-wr.shader_test -auto HSW GT1: tests/spec/glsl-1.50/execution/varying-struct-basic-vs-gs.shader_test -auto HSW GT2: bin/glsl-1.50-geometry-end-primitive 33 -auto -fbo HSW GT3: tests/spec/arb_tessellation_shader/execution/variable-indexing/vs-output-array-vec4-index-wr-before-tcs.shader_test BDW GT2: tests/spec/arb_tessellation_shader/execution/variable-indexing/tcs-output-array-vec2-index-rd-after-barrier.shader_test -auto They all trigger the assertion immediately below, so they should be easy to reproduce /* Sanity check that we haven't over-allocated. */ assert(push_constant_chunks + vs_chunks + hs_chunks + ds_chunks + gs_chunks <= urb_chunks); You can also use INTEL_DEVID_OVERRIDE=... with a PCI ID of one of these chips (found in include/pci_ids/i965_pci_ids.h) to exercise the code paths for a chip you don't have.
(In reply to Matt Turner from comment #8) > Sorry, I had to revert this. It broke various piglit tests. > > The problem is that total_wants and remaining_space are updated between each > calculation, making the transformation your patch made incorrect. I know they are updated, but I don't understand why it's incorrect. I mean look at this: >remaining_space -= vs_additional; >total_wants -= vs_wants; Because vs_additional/vs_wants = remaining_space/total_wants, remaining_space/total_wants should remain the same after update and code should be equivalent, up to rounding errors. Maybe it's just rounding up error and after replace "round" with "floor" it will work? (can you test this?) Anyway I try to look at this more closely.
Right, I suspect it's just a rounding error. Other portions of the code solve that problem by allocating rounded blocks for all but one stage, and just allocating the remaining space to that final stage. floor might also be a reasonable plan.
I don't understand the whole picture, but how often actually it's not enough space (remaining_space < total_wants)? If it's usually enough space to allocate, isn't it reasonable to distinguish (remaining_space >= total_wants) as a separate case, which is not uses any floating-point arithmetic and just allocates what was asked and for second case use original code, but with two explicit checks added for total_wants==0? Or use solution with "floor" for second case only, as it is simpler. Because solution with "floor" may not allocate all requested space even in trivial case (remaining_space >= total_wants). I don't know, is it acceptable to use "floor" and allocate slightly less space than possible in second case (when remaining_space < total_wants).
>Because solution with "floor" may not allocate all requested space even in trivial case (remaining_space >= total_wants). My mistake, I thought about more complicated rounding problems. Floor will work fine when it's enough space.
Created attachment 123926 [details] [review] patch3 Unfortunately I had not been able to reproduce any broken piglet tests, which were mentioned, introduced by this patch (obviously I tested only on my hardware, but piglet results were equal). Anyway I propose two patches, which solve this bug. First is not using any floating point arithmetic, just integer division, which is equal to rounding down and therefore should not allocate more chunks than available. Second is straightforward obvious fix to the original code - checking for null before two divisions.
Created attachment 123927 [details] [review] patch4
*** Bug 96289 has been marked as a duplicate of this bug. ***
Created attachment 124355 [details] orbment crash log FYI, I encountered this same bug while trying to run orbment (https://github.com/Cloudef/orbment), with it crashing at startup raising SIGFPE. I attached the corresponding log. I tested both patches proposed by dnord (patch3 and patch4), and both seem to solve the issue for me. Thanks! Hopefully one of them gets merged at some point :).
I can also confirm patch4 fixes the SIGFPE issue. I ran the pigtail tests and they passed with both patches (I also couldn't reproduce the test failure btw).
(In reply to dnord from comment #14) > Created attachment 123926 [details] [review] [review] > patch3 > > Unfortunately I had not been able to reproduce any broken piglet tests, > which were mentioned, introduced by this patch (obviously I tested only on > my hardware, but piglet results were equal). Anyway I propose two patches, > which solve this bug. First is not using any floating point arithmetic, just > integer division, which is equal to rounding down and therefore should not > allocate more chunks than available. Second is straightforward obvious fix > to the original code - checking for null before two divisions. Thank you. I've tested and committed this version of the patch.
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.