Bug 83570 - Glyphy demo throws unhandled Integer division by zero exception
Summary: Glyphy demo throws unhandled Integer division by zero exception
Status: RESOLVED FIXED
Alias: None
Product: Mesa
Classification: Unclassified
Component: Other (show other bugs)
Version: 10.2
Hardware: x86 (IA32) Windows (All)
: medium major
Assignee: mesa-dev
QA Contact:
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-09-06 20:18 UTC by rconde01
Modified: 2014-09-17 16:33 UTC (History)
0 users

See Also:
i915 platform:
i915 features:


Attachments
Windows 32-bit binary exhibiting the crash (337.63 KB, text/plain)
2014-09-06 20:18 UTC, rconde01
Details

Description rconde01 2014-09-06 20:18:05 UTC
Created attachment 105842 [details]
Windows 32-bit binary exhibiting the crash

Unhandled exception at 0x027B8FAA in glyphy-demo.exe: 0xC0000094: Integer division by zero.

>	opengl32.dll!lp_rast_shade_quads_mask(lp_rasterizer_task * task, const lp_rast_shader_inputs * inputs, unsigned int x, unsigned int y, unsigned int mask) Line 454	C
 	opengl32.dll!do_block_4_32_2(lp_rasterizer_task * task, const lp_rast_triangle * tri, const lp_rast_plane * plane, int x, int y, const __int64 * c) Line 61	C
 	opengl32.dll!do_block_16_32_2(lp_rasterizer_task * task, const lp_rast_triangle * tri, const lp_rast_plane * plane, int x, int y, const __int64 * c) Line 130	C
 	opengl32.dll!lp_rast_triangle_32_2(lp_rasterizer_task * task, const lp_rast_cmd_arg arg) Line 232	C
 	opengl32.dll!do_rasterize_bin(lp_rasterizer_task * task, const cmd_bin * bin, int x, int y) Line 595	C
 	opengl32.dll!rasterize_scene(lp_rasterizer_task * task, lp_scene * scene) Line 663	C
 	opengl32.dll!thread_function(void * init_data) Line 788	C
 	opengl32.dll!impl_thrd_routine(void * p) Line 144	C


Source code:

https://github.com/rconde01/glyphy/tree/add_some_demo_features
Comment 1 rconde01 2014-09-07 03:59:00 UTC
I haven't really gotten a grasp on the code base, but my assessment is that the issue is:

tgsi_exec.c

static void
micro_idiv(union tgsi_exec_channel *dst,
           const union tgsi_exec_channel *src0,
           const union tgsi_exec_channel *src1)
{
   dst->i[0] = src0->i[0] / src1->i[0];
   dst->i[1] = src0->i[1] / src1->i[1];
   dst->i[2] = src0->i[2] / src1->i[2];
   dst->i[3] = src0->i[3] / src1->i[3];
}

Section 4.1.3 of the 4.40 spec says:

"Division and multiplication operations resulting in overflow or underflow will not cause any exception but will result in an undefined value."

Therefore I believe it should be changed to:

static void
micro_idiv(union tgsi_exec_channel *dst,
           const union tgsi_exec_channel *src0,
           const union tgsi_exec_channel *src1)
{
   dst->i[0] = src1->i[0] != 0 ? src0->i[0] / src1->i[0] : 0;
   dst->i[1] = src1->i[1] != 0 ? src0->i[1] / src1->i[1] : 0;
   dst->i[2] = src1->i[2] != 0 ? src0->i[2] / src1->i[2] : 0;
   dst->i[3] = src1->i[3] != 0 ? src0->i[3] / src1->i[3] : 0;
}

I've confirmed that this fixes my particular issue.
Comment 2 rconde01 2014-09-07 15:55:25 UTC
Actually, that only fixes the non-llvm build. I'm not sure how to track down the equivalent piece of code in the llvm variant.
Comment 3 rconde01 2014-09-07 20:07:50 UTC
I figured out the LLVM side - I'll have a patch in the next few days.
Comment 4 Roland Scheidegger 2014-09-08 11:52:39 UTC
Ok, I just was thinking it is probably something along these lines, and could reproduce this with some hacked up piglit test doing integer division by zero. Though I got a SIGFPE instead, I guess that's due to different environment. Should be in lp_build_div().
Comment 5 Roland Scheidegger 2014-09-08 12:19:59 UTC
Actually, I think it would be a good idea if you could return 0xffffffff instead of 0. This is typically what GPUs will do, since the d3d10 docs claim this is required, both for div and mod - http://msdn.microsoft.com/en-us/library/windows/desktop/hh447244%28v=vs.85%29.aspx
Comment 6 Roland Scheidegger 2014-09-08 12:36:02 UTC
(In reply to comment #5)
> Actually, I think it would be a good idea if you could return 0xffffffff
> instead of 0. This is typically what GPUs will do, since the d3d10 docs
> claim this is required, both for div and mod -
> http://msdn.microsoft.com/en-us/library/windows/desktop/hh447244%28v=vs.
> 85%29.aspx

Though actually that might not make sense for signed div (which d3d10 doesn't have).
Comment 7 Roland Scheidegger 2014-09-08 12:43:26 UTC
(In reply to comment #6)
> (In reply to comment #5)
> > Actually, I think it would be a good idea if you could return 0xffffffff
> > instead of 0. This is typically what GPUs will do, since the d3d10 docs
> > claim this is required, both for div and mod -
> > http://msdn.microsoft.com/en-us/library/windows/desktop/hh447244%28v=vs.
> > 85%29.aspx
> 
> Though actually that might not make sense for signed div (which d3d10
> doesn't have).

And actually I didn't look closely enough, the unsigned versions already handle this (in udiv_emit_cpu() etc for gallivm), so it's just idiv_emit_cpu() which needs fixing. And zero is probably fine, I can't really think of a more useful value (maybe -MAX_INT or something would work too if anyone cares).
Comment 8 rconde01 2014-09-08 13:18:10 UTC
for the llvm version I was going to return bld->undef. I figure I might as well change tgsi_exec.c version to 0xffffffff for consistency...but I don't have strong feelings either way.
Comment 9 Roland Scheidegger 2014-09-08 15:53:36 UTC
(In reply to comment #8)
> for the llvm version I was going to return bld->undef. I figure I might as
> well change tgsi_exec.c version to 0xffffffff for consistency...but I don't
> have strong feelings either way.

I'm not sure how your code is going to look like, but I can't really see a way to return bld->undef since you will be required to replace the input vector with something sensible (similar to udiv_emit_cpu()) (*). That said, I guess using the same code as udiv_emit_cpu() but omitting the final "or" would also work - in which case the result for the idiv for x / 0 would be x / -1, why not...
I'm not really sure it makes sense to return 0xffffffff just for consistency, for uint div this value sort of makes sense, but for signed div it does not - MAX_INT or MIN_INT probably would make more sense. But since d3d10 doesn't support it, and glsl doesn't care, I think whatever is cheapest is ok (another cheap option would be to use a "andnot" instead of the "or" udiv does, in which case you'd get zero - MIN_INT/MAX_INT are going to be slightly more complicated).

(*) Because sse doesn't have int div, llvm actually breaks down the vector div to ordinary multiple int divs anyway (and yes it's bound to be very slow), thus breaking it down manually and do per element selection wouldn't really be too bad. Still wouldn't really make sense imho however.
Comment 10 rconde01 2014-09-08 17:36:05 UTC
Could be totally borked, but this is what I have at the moment:

   if (type.floating) {
      return LLVMBuildFDiv(builder, a, b, "");
   }
   else {
      //The spec says divide by zero is undefined, but shouldn't throw an exception
      LLVMValueRef zeromask = lp_build_cmp(bld,
                                           PIPE_FUNC_EQUAL,
                                           b,
                                           lp_build_zero(bld->gallivm,type));
      if (type.sign) {
         return lp_build_select(bld,
                                zeromask,
                                bld->undef,
                                LLVMBuildSDiv(builder, a, b, ""));
      }
      else {
         return lp_build_select(bld,
                                zeromask,
                                bld->undef,
                                LLVMBuildUDiv(builder, a, b, ""));
      }
   }

I'll take a look at udiv_emit_cpu
Comment 11 Roland Scheidegger 2014-09-08 20:07:09 UTC
(In reply to comment #10)
> Could be totally borked, but this is what I have at the moment:
> 
>    if (type.floating) {
>       return LLVMBuildFDiv(builder, a, b, "");
>    }
>    else {
>       //The spec says divide by zero is undefined, but shouldn't throw an
> exception
>       LLVMValueRef zeromask = lp_build_cmp(bld,
>                                            PIPE_FUNC_EQUAL,
>                                            b,
>                                            lp_build_zero(bld->gallivm,type));
>       if (type.sign) {
>          return lp_build_select(bld,
>                                 zeromask,
>                                 bld->undef,
>                                 LLVMBuildSDiv(builder, a, b, ""));
>       }
>       else {
>          return lp_build_select(bld,
>                                 zeromask,
>                                 bld->undef,
>                                 LLVMBuildUDiv(builder, a, b, ""));
>       }
>    }
> 
> I'll take a look at udiv_emit_cpu

Hmm that can't quite work since you still do the division by zero, you just don't use the result in the end.
Comment 12 rconde01 2014-09-11 00:50:41 UTC
Roland,
  Perhaps you can help me here. I'm looking at udiv_emit_cpu and I'm just not getting it. The comment says:

/* udiv by zero is guaranteed to return 0xffffffff */

but the way I read it, udiv by zero returns the numerator. Am I missing something?
Comment 13 Roland Scheidegger 2014-09-11 14:49:41 UTC
(In reply to comment #12)
> Roland,
>   Perhaps you can help me here. I'm looking at udiv_emit_cpu and I'm just
> not getting it. The comment says:
> 
> /* udiv by zero is guaranteed to return 0xffffffff */
> 
> but the way I read it, udiv by zero returns the numerator. Am I missing
> something?

Well in the end it is "ored" with the comparison mask, hence it's going to be all ones (comparison masks are either all ones or all zeros). So the first or is there to make sure the zero in the division gets replaced by something else (all ones because this is easiest) so it doesn't crash, and the second or is there to make sure the result is all ones (because d3d10 requires this result, but since d3d10 lacks signed integer division there's not really any such requirement for idiv).
Comment 14 rconde01 2014-09-11 15:21:35 UTC
(In reply to comment #13)
> (comparison masks are either all ones or all zeros).

AHH!!! That was the piece I was missing. My hair follicles thank you.
Comment 15 Roland Scheidegger 2014-09-17 16:33:23 UTC
I've fixed up the commit message and commited it, ffeb77c7b0552a8624e46e65d6347240ac5ae84d.


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.